What Is Data Bias and How to Avoid It - Deepstash
Curated from: hackernoon.com

AI models trained with biased data

Data bias can have serious consequences for both research and practical applications. For example, in one Facebook scandal, the company's AI asked users whether they wanted to keep seeing videos about primates after they had watched a video featuring Black men.

Data bias refers to data sets that don't represent the population under study. Models trained on biased data can reproduce that prejudice in their predictions. That's why AI researchers and data scientists must stay vigilant and check their models for bias.
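
As a rough illustration of "not representing the population", the sketch below compares each group's share of a data set with its share of a reference population. The function name `representation_gaps`, the group labels, and the population shares are all hypothetical and chosen only for this example; real reference shares would have to come from a trusted source such as census data.

```python
from collections import Counter


def representation_gaps(samples, population_shares):
    """Return (data set share - population share) for each group.

    A positive gap means the group is over-represented in the data set,
    a negative gap means it is under-represented.
    """
    if not samples:
        raise ValueError("samples must not be empty")
    counts = Counter(samples)
    total = len(samples)
    return {
        group: counts.get(group, 0) / total - share
        for group, share in population_shares.items()
    }


# Hypothetical group labels for 10 training samples.
samples = ["a"] * 8 + ["b"] * 2
# Hypothetical shares of each group in the population under study.
population = {"a": 0.5, "b": 0.5}

gaps = representation_gaps(samples, population)
for group, gap in gaps.items():
    print(f"{group}: {gap:+.2f}")
# prints "a: +0.30" then "b: -0.30"
```

A gap close to zero for every group does not prove a data set is unbiased, but large gaps are an early warning sign.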

Collect data from a variety of sources

The most common avenues for collecting training data:

  • Paying for data sets
  • Using public data sets
  • Sourcing open source content
  • Using in-person or field-collected data sets

Combining all four tends to provide the most representative training data.

Ensure diverse data

Ensure the data is diverse. Speakers in audio or video files should vary across a range of characteristics, including location, dialect, gender, sex, race, and nationality.

Sourcing such data can be difficult if you rely only on open-source data.
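
One way to check diversity is to look at combinations of characteristics rather than each one alone, since a data set can cover every dialect and every gender while still missing, say, male speakers of one dialect. The sketch below is a minimal example under that assumption; `missing_combinations`, the `dialect` and `gender` fields, and the speaker records are hypothetical.

```python
from collections import Counter
from itertools import product


def missing_combinations(records, attributes, min_count=1):
    """List attribute-value combinations with fewer than min_count records."""
    # Every value observed for each attribute, in a stable order.
    values = {a: sorted({r[a] for r in records}) for a in attributes}
    # How many records fall into each observed combination.
    counts = Counter(tuple(r[a] for a in attributes) for r in records)
    return [
        combo
        for combo in product(*(values[a] for a in attributes))
        if counts[combo] < min_count
    ]


# Hypothetical speaker metadata for an audio data set.
speakers = [
    {"dialect": "US", "gender": "female"},
    {"dialect": "US", "gender": "male"},
    {"dialect": "UK", "gender": "female"},
]

print(missing_combinations(speakers, ["dialect", "gender"]))
# prints [('UK', 'male')]
```

Raising `min_count` turns the same check into a test for thinly covered combinations, which can guide where to buy or collect more data.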

Monitor real-world performance

Even after you've ensured your initial data set is diverse, your model can still be biased.

That's why you should monitor your model's real-world performance. For example, does your model recognize female speech more accurately than male speech? If so, retrain it with new data sets that target the problem areas.
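
The kind of monitoring described above can be sketched as a per-group accuracy report over logged predictions. This is a simplified illustration: the function `accuracy_by_group` and the logged records are hypothetical, and exact transcript matching stands in for whatever metric a real speech system would use (word error rate is a common choice).

```python
from collections import defaultdict


def accuracy_by_group(records):
    """Compute accuracy per group from (group, predicted, actual) tuples."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for group, predicted, actual in records:
        total[group] += 1
        correct[group] += int(predicted == actual)
    return {group: correct[group] / total[group] for group in total}


# Hypothetical predictions logged from a deployed speech model.
logged = [
    ("female", "hello", "hello"),
    ("female", "world", "world"),
    ("male", "hello", "hello"),
    ("male", "word", "world"),
]

scores = accuracy_by_group(logged)
gap = max(scores.values()) - min(scores.values())
print(scores)
print(f"accuracy gap: {gap:.2f}")
# prints {'female': 1.0, 'male': 0.5} then "accuracy gap: 0.50"
```

A persistent gap between groups is a signal to collect more data for the weaker group and retrain.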

IDEAS CURATED BY

anty
