Facts are stubborn things, but statistics are pliable.

What is Simpson’s Paradox?

Simpson's Paradox, also known as the Yule-Simpson effect, is a statistical phenomenon in which a trend that appears in several groups of data disappears or reverses when the groups are combined.

Why Simpson's Paradox?

Simpson's paradox can make decision-making hard, because the same data can support opposite conclusions. Understanding and identifying the paradox is therefore important for interpreting data correctly.

Diving into the Paradox!

Suppose we observe several groups of data and find the same relationship or correlation within each group.

According to Simpson's Paradox, when we combine all the groups and look at the data in aggregate, the correlation that we noticed may reverse itself!
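To make the reversal concrete, here is a minimal sketch in Python. The two groups, the options "X" and "Y", and all of the counts are hypothetical, invented only to illustrate the effect; they do not come from any real study.

```python
# Hypothetical counts, invented purely for illustration.
# Each entry is (successes, trials) for options "X" and "Y" within a group.
groups = {
    "group_1": {"X": (9, 10), "Y": (80, 100)},
    "group_2": {"X": (30, 100), "Y": (2, 10)},
}


def rate(successes, trials):
    """Success rate as a fraction between 0 and 1."""
    return successes / trials


# Within each group, compute the success rate of each option.
within = {
    name: {option: rate(*counts) for option, counts in options.items()}
    for name, options in groups.items()
}

# Pool the groups: add up successes and trials per option, then take the rate.
pooled = {}
for option in ("X", "Y"):
    successes = sum(g[option][0] for g in groups.values())
    trials = sum(g[option][1] for g in groups.values())
    pooled[option] = rate(successes, trials)

for name, rates in within.items():
    print(name, {option: round(r, 2) for option, r in rates.items()})
print("pooled", {option: round(r, 2) for option, r in pooled.items()})
```

In this made-up data, X beats Y inside both groups (9/10 vs 80/100, and 30/100 vs 2/10), yet Y beats X once the groups are pooled (82/110 vs 39/110). The flip happens because most of X's trials sit in the group where success is rare, while most of Y's trials sit in the group where success is common.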

Why Does Simpson's Paradox Occur?

Simpson's paradox occurs when a hidden variable splits the data into several separate distributions and the groups being compared are spread unevenly across them. This hidden variable is referred to as the lurking variable.

Real-World Example of Simpson's Paradox

One of the most famous examples of Simpson's paradox is graduate school admissions at UC Berkeley in 1973. Because of the apparent gap in the figures, UC Berkeley almost got sued for sex discrimination!

When we look at overall admissions, men were more likely to be accepted than women.

Out of 8442 male applicants, 44% (about 3714) were accepted, whereas 35% (about 1512) of the 4321 female applicants were accepted.

Example Continued...

However, when overall admissions are broken down by department, women were more likely to be admitted than men in most departments.

Of the 6 largest departments, 4 admitted women at a higher rate than men.

Lurking Variable!!!

The reason for this apparent contradiction is that the overall figures ignore the following point:

Compared to men, a larger share of women applied to departments that were harder to get into. The department is the lurking variable.
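The department effect can be checked with a short Python sketch. Note the assumptions: the per-department counts below are not listed in this article. They are the figures commonly tabulated for the six largest departments (for example in R's UCBAdmissions dataset), so they cover only a subset of the 8442 and 4321 applicants mentioned above. The 50% cut-off used to call a department "hard" is also just an illustrative choice.

```python
# Assumption: (applicants, admitted) per sex for the six largest departments,
# as commonly tabulated (e.g. R's UCBAdmissions dataset). Not from this article.
admissions = {
    "A": {"men": (825, 512), "women": (108, 89)},
    "B": {"men": (560, 353), "women": (25, 17)},
    "C": {"men": (325, 120), "women": (593, 202)},
    "D": {"men": (417, 138), "women": (375, 131)},
    "E": {"men": (191, 53), "women": (393, 94)},
    "F": {"men": (373, 22), "women": (341, 24)},
}


def admit_rate(applicants, admitted):
    """Admission rate as a fraction between 0 and 1."""
    return admitted / applicants


# Departments where women were admitted at a higher rate than men.
women_ahead = [
    dept for dept, row in admissions.items()
    if admit_rate(*row["women"]) > admit_rate(*row["men"])
]

# Aggregate view: pool applicants and admits across all six departments.
overall = {}
for sex in ("men", "women"):
    applicants = sum(row[sex][0] for row in admissions.values())
    admitted = sum(row[sex][1] for row in admissions.values())
    overall[sex] = admit_rate(applicants, admitted)

# "Hard" departments: combined admission rate below 50% (illustrative cut-off).
hard = [
    dept for dept, row in admissions.items()
    if admit_rate(row["men"][0] + row["women"][0],
                  row["men"][1] + row["women"][1]) < 0.5
]

# Share of each sex's applications that went to the hard departments.
share_hard = {}
for sex in ("men", "women"):
    total = sum(row[sex][0] for row in admissions.values())
    to_hard = sum(admissions[dept][sex][0] for dept in hard)
    share_hard[sex] = to_hard / total

print("women ahead in:", women_ahead)
print("overall rates:", {s: round(r, 3) for s, r in overall.items()})
print("hard departments:", hard)
print("share applying to hard departments:",
      {s: round(r, 3) for s, r in share_hard.items()})
```

Under these figures, women are ahead in four of the six departments (A, B, D and F), men are ahead in the pooled rate, and women sent a far larger share of their applications to the low-acceptance departments. That uneven split is what produces the reversal.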

How to Resolve this Paradox?

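To resolve Simpson's Paradox, which leads us to two different interpretations of the same data, we need to choose deliberately between analysing the data split into groups and analysing it in aggregate. When a lurking variable such as the department drives the outcome, the per-group view is usually the more informative one.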

