May 20, 2023

Simpson’s Paradox, also known as the Yule-Simpson effect, is a statistical phenomenon that occurs when a trend appears in different groups of data but disappears or reverses when the groups are combined. In other words, the direction of the relationship between two variables can change or disappear once a third variable is taken into account. The paradox is named after Edward Simpson, who described it in a 1951 paper, although similar effects had been noted earlier by Karl Pearson and Udny Yule.

## Background

Simpson’s Paradox arises from a common assumption in statistical analysis: that an association between two variables seen in the combined data also holds, in the same direction, within each subpopulation. This assumption can fail when the subpopulations differ in how a confounding variable is distributed.

The paradox often occurs in the context of observational studies, where the researcher observes the relationships between variables without controlling or manipulating them. In such studies, the relationships between variables can be influenced by confounding variables, which are factors that affect both the dependent and independent variables.
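The mechanism can be written as a weighted average. The notation here is introduced only for this sketch and is not part of the original text: $r_{g,k}$ is the outcome rate for group $g$ within subpopulation $k$, $n_{g,k}$ is the number of observations in that cell, and $R_g$ is the rate for group $g$ in the combined data.

$$
R_g = \sum_{k} w_{g,k}\, r_{g,k}, \qquad w_{g,k} = \frac{n_{g,k}}{\sum_{j} n_{g,j}}, \qquad \sum_{k} w_{g,k} = 1 .
$$

A reversal means $r_{1,k} > r_{2,k}$ for every $k$ while $R_1 < R_2$. If both groups had the same weights, $w_{1,k} = w_{2,k} = w_k$, then $R_1 - R_2 = \sum_k w_k \,(r_{1,k} - r_{2,k}) > 0$, so a reversal is only possible when the groups are spread differently across the subpopulations.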

## Examples

A classic example of Simpson’s Paradox involves the admission rates of men and women to a graduate school. Suppose that a university has two departments, A and B, and that the admission rates for each department are as follows:

| Department | Men (admitted/total) | Women (admitted/total) |
|------------|----------------------|------------------------|
| A          | 10/100               | 4/80                   |
| B          | 64/80                | 75/100                 |
| Overall    | 74/180               | 79/180                 |

Within each department, men are admitted at a higher rate than women: 10% versus 5% in department A, and 80% versus 75% in department B. However, when the data is combined, the opposite trend emerges: women are admitted at a higher rate than men overall, about 44% versus 41%.

This reversal occurs because the departments have very different admission rates, and men and women apply to them in different proportions. Department A admits few of its applicants (14 of 180, about 8%), while department B admits most of them (139 of 180, about 77%). Women make up only 44% of the applicants to department A but 56% of the applicants to department B. Each group's overall rate is a weighted average of its department rates, weighted by where that group applied. Because a larger share of women applied to the more lenient department B, their overall rate is pulled above the men's, even though men have the higher rate within each department.
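The check can also be done mechanically. The following is a minimal sketch, not a definitive implementation: the counts are illustrative figures assumed for this example so that the reversal occurs, and the helper names `rate`, `pooled_rate`, and `shows_reversal` are hypothetical. It uses exact fractions so that close comparisons are not affected by rounding.

```python
from fractions import Fraction

# Illustrative (admitted, applicants) counts per department and sex.
# These figures are assumptions chosen so that the reversal occurs.
admissions = {
    "A": {"men": (10, 100), "women": (4, 80)},
    "B": {"men": (64, 80), "women": (75, 100)},
}


def rate(admitted, total):
    """Exact admission rate as a fraction."""
    return Fraction(admitted, total)


def pooled_rate(data, group):
    """Admission rate for one group after summing over all departments."""
    admitted = sum(dept[group][0] for dept in data.values())
    total = sum(dept[group][1] for dept in data.values())
    return rate(admitted, total)


def shows_reversal(data, first, second):
    """True if `first` beats `second` in every department but loses when pooled."""
    wins_each = all(
        rate(*dept[first]) > rate(*dept[second]) for dept in data.values()
    )
    return wins_each and pooled_rate(data, first) < pooled_rate(data, second)


# Print per-department rates, then pooled rates, then the reversal check.
for name, dept in admissions.items():
    print(name, {g: f"{float(rate(*c)):.1%}" for g, c in dept.items()})
print("pooled", {g: f"{float(pooled_rate(admissions, g)):.1%}" for g in ("men", "women")})
print("reversal:", shows_reversal(admissions, "men", "women"))
```

Swapping in other counts is a quick way to see that the reversal needs both ingredients: departments with very different admission rates, and groups that apply to them in different proportions.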

Another example of Simpson’s Paradox involves the relationship between a drug and a disease. Suppose that a clinical trial is conducted to test the efficacy of a new drug in reducing the symptoms of a certain disease. The trial involves two groups of patients: a treatment group that receives the drug, and a control group that receives a placebo.

Suppose that the results of the trial are as follows:

| Group   | Patients (improved/total) |
|---------|---------------------------|
| Drug    | 200/500                   |
| Placebo | 600/1000                  |
| Overall | 800/1500                  |

At first glance, it appears that the drug is less effective than the placebo, since only 40% of the patients in the treatment group improved, compared to 60% of the patients in the control group. However, when the data is disaggregated by the severity of the disease, a different picture emerges.

Suppose that the patients are divided into two categories: mild and severe, based on the severity of their symptoms. The results for each category are as follows:

| Group   | Mild (improved/total) | Severe (improved/total) |
|---------|-----------------------|-------------------------|
| Drug    | 80/100                | 120/400                 |
| Placebo | 560/800               | 40/200                  |
| Overall | 640/900               | 160/600                 |

In each category, the drug is more effective than the placebo: 80% versus 70% in the mild category, and 30% versus 20% in the severe category. However, when the data is combined, the drug appears weaker than the placebo because the two groups differ in severity: 80% of the drug group had severe symptoms (400 of 500), compared with only 20% of the placebo group (200 of 1000). Severe cases improve less often under either treatment, so the drug group's combined rate is dragged down by its patient mix rather than by the drug itself.
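To see how much of the combined result comes from the patient mix, each group's overall rate can be split into severity shares and per-category rates, and then recomputed with the same mix for both groups. This is a sketch under stated assumptions: the counts are hypothetical figures for a trial of this shape, the names `decompose` and `standardized_rate` are made up for the example, and the re-weighting step is a simple form of direct standardization, a technique the text above does not itself describe.

```python
# Hypothetical (improved, total) counts per arm and severity category.
trial = {
    "drug": {"mild": (80, 100), "severe": (120, 400)},
    "placebo": {"mild": (560, 800), "severe": (40, 200)},
}


def decompose(strata):
    """Return (overall_rate, [(category, share, rate), ...]) for one arm."""
    n_arm = sum(total for _, total in strata.values())
    parts = [
        (name, total / n_arm, improved / total)
        for name, (improved, total) in strata.items()
    ]
    # The overall rate is the share-weighted average of the category rates.
    overall = sum(share * r for _, share, r in parts)
    return overall, parts


def standardized_rate(strata, reference_shares):
    """Rate the arm would show if its severity mix matched reference_shares."""
    return sum(
        reference_shares[name] * improved / total
        for name, (improved, total) in strata.items()
    )


# Severity mix of all patients in both arms combined, used as the common reference.
n_all = sum(t for arm in trial.values() for _, t in arm.values())
pooled_mix = {
    name: sum(arm[name][1] for arm in trial.values()) / n_all
    for name in ("mild", "severe")
}

for arm, strata in trial.items():
    overall, parts = decompose(strata)
    adjusted = standardized_rate(strata, pooled_mix)
    print(f"{arm}: raw {overall:.0%}, same-mix {adjusted:.0%}")
    for name, share, r in parts:
        print(f"  {name}: share {share:.0%}, rate {r:.0%}")
```

With these assumed counts, the ordering of the two arms flips once both are evaluated on the same severity mix, which is consistent with the within-category comparison.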