
Berkson's paradox

The tendency to misinterpret statistical experiments involving conditional probabilities. Because scenarios where neither event occurs are eliminated, event A looks more common in the sample than it really is, and it looks less likely when event B is present than when B is absent.

Table of contents:
  1. What Causes Berkson's Paradox?
    1. Simple Example
  2. Examples
    1. Hospitals
    2. Dating
    3. Book Recommendations
  3. The components of Berkson's paradox
  4. Conclusion:

When you make conditional comparisons in your head, Berkson's paradox can lead you to some surprisingly catchy conclusions. There is a widespread assumption, for example, that Hollywood ruins good books: the better the source material, it seems, the worse the movie. This view is sometimes attributed to higher expectations for films based on superior source material. Another explanation is that the mental analysis itself ignores a significant share of the available data. Berkson's paradox is the name given to this situation.

When it comes to movies, people mostly recall cases where either the original material or the film was excellent. We tend to forget (ignore) examples in which both the source and the movie were awful. As a result, a spurious negative correlation between the two variables appears.

A similar assumption is that attractive people are more likely to be jerks. We notice this because we prefer to date someone who is attractive or pleasant, so we overlook those who are neither. Berkson's paradox is thus the counterintuitive result that events which appear to be connected may not be connected at all.

Consider two unrelated events, A and B (for example, lung cancer and diabetes). If a study only includes people selected because of A or B, such as hospital patients, diabetes can appear to change the likelihood of lung cancer. Intuitively, this makes little sense, yet the data appear to support the idea that there is a link.

What Causes Berkson's Paradox?

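Scenarios where neither event occurs are eliminated from the sample. As a result, event A looks more common than it really is, and it looks less likely when event B is present than when B is absent, even if the two events are independent.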
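A small exact calculation can make the effect of discarding the "neither" cases concrete. The sketch below assumes two independent events with illustrative probabilities of 1/10 and 1/5 (these numbers, and the helper name `berkson_conditionals`, are assumptions for the example, not figures from any study).

```python
from fractions import Fraction


def berkson_conditionals(p_a, p_b):
    """For independent events A and B, return (P(A | A or B), P(A | B)).

    Both values describe the sample that remains once every case with
    neither A nor B has been thrown away.
    """
    p_both = p_a * p_b                  # independence: P(A and B) = P(A) * P(B)
    p_either = p_a + p_b - p_both       # P(A or B)
    p_a_given_either = p_a / p_either   # share of the kept sample that has A
    p_a_given_b = p_both / p_b          # B already implies "A or B", so this is P(A | B)
    return p_a_given_either, p_a_given_b


# Hypothetical probabilities, chosen only for illustration.
p_a = Fraction(1, 10)
p_b = Fraction(1, 5)

given_either, given_b = berkson_conditionals(p_a, p_b)
print("P(A | A or B) =", given_either)
print("P(A | B)      =", given_b)
```

With these assumed inputs, A makes up 5/14 (about 36%) of the retained sample, but only 1/10 of the retained cases that also have B. Inside the sample, B therefore looks as if it lowers the chance of A, even though the two events were defined as independent.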

A famous instance (from Everitt's Medical Statistics from A to Z: A Guide for Clinicians and Medical Students) involved data from autopsies. Cancer and TB were found together in fewer cases than expected, which seemed to show that tuberculosis protects against cancer. However, the true explanation for the lower rate is that not all autopsies were included in the research; patients with both cancer and TB may have lower autopsy rates for many reasons.

Simple Example

Consider the situation at a certain children's hospital during an influenza outbreak. We'll show how the hospital data can make it look as if having influenza provides some protection against appendicitis, which seems counterintuitive. Influenza affects 10% of the overall population.

The rate is higher in a hospital full of ill children: say 30 percent of the children have been hospitalized for influenza. Assume that 10% of the children were hospitalized due to appendicitis.

There will be some overlap: we presume that a child with appendicitis is just as likely as any other child in the general population to have the flu (10%), and that a child with the flu can still have appendicitis. The proportion of patients with both appendicitis and influenza would then be 10% of 10%, or 1% of all hospital patients.

If you pick a child from the hospital at random, that child has a 30% chance of having influenza and a 10% chance of having appendicitis. In other words, 30 out of 100 children will have influenza, whereas 10 out of 100 will have appendicitis.

Now split the 100 children by influenza status:

Children with both appendicitis and influenza are among the thirty influenza patients. In our case, that's only one child. There were ten appendicitis patients overall among the 100 children; therefore, the other nine must be among the seventy non-influenza patients.

As a result, we can compute the new percentage: a child without influenza has a 9 / 70 = 12.9% chance of having appendicitis. This is greater than the overall rate of appendicitis, which is 10%, and far greater than the 1 / 30 = 3.3% rate among children with influenza.

Even though these two occurrences are completely unrelated, the data from within the hospital suggest that having influenza provides some protection against appendicitis.
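The arithmetic of the hospital example can be checked with a few lines of Python. This is only a sketch that restates the counts given above (100 children, 30 with influenza, 10 with appendicitis, 1 with both); the variable names are chosen for illustration.

```python
from fractions import Fraction

# Counts from the example, per 100 hospitalized children.
total_children = 100
influenza = 30       # children hospitalized with influenza
appendicitis = 10    # children hospitalized with appendicitis
both = 1             # 10% of the 10 appendicitis patients also have influenza

appendicitis_without_flu = appendicitis - both       # 9 children
children_without_flu = total_children - influenza    # 70 children

rate_overall = Fraction(appendicitis, total_children)
rate_with_flu = Fraction(both, influenza)
rate_without_flu = Fraction(appendicitis_without_flu, children_without_flu)

for label, rate in [
    ("overall", rate_overall),
    ("with influenza", rate_with_flu),
    ("without influenza", rate_without_flu),
]:
    print(f"Appendicitis rate {label}: {float(rate):.1%}")
```

The three rates come out as 10.0% overall, 3.3% among influenza patients, and 12.9% among children without influenza, which is the apparent "protection" described above.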



Examples

Hospitals

Berkson's paradox was first observed in epidemiological investigations, which study the link between a disease and its probable risk factors. The link between lung cancer and smoking is a typical example. Conducting these studies at hospitals is particularly easy, since many individuals in one location have already been diagnosed with the condition.

On the other hand, hospitals admit patients based on a mixture of symptoms, just as colleges accept students based on a combination of favorable attributes. For example, if research is done at a prenatal clinic to determine whether pregnancy shortens or lengthens the time it takes for an HIV-positive woman to develop full-blown AIDS, the study may be biased. Women attend the clinic either for a pregnancy check or for an AIDS risk assessment. As a result, a woman who visits the clinic but is not pregnant is more likely to have AIDS than a woman in the general community, because she is more likely to be there for a reason other than pregnancy.

An association may appear to exist even when there isn't one. In hospital-based research on whether pancreatic cancer and coffee consumption are linked, the control group was drawn from the patients of a gastroenterologist. However, people with gastrointestinal problems are less likely to drink coffee than the general population, precisely because of those problems. As a result, the test group's rate of coffee consumption looked artificially high compared with the deflated rate in the control group.


Dating

People choose dating partners based on a combination of physical and personality characteristics. One could draw a graph with physical attractiveness on one axis and personality on the other, and mark the region of the population that a person is willing to date. Within that region, the two traits look negatively related, which might explain the widely held (albeit unsupported) belief that attractive guys are jerks.
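The dating example can be illustrated with a short simulation. The sketch below rests on assumptions that are not in the text: attractiveness and niceness are drawn as independent standard normal scores, and a person is "dateable" only when the sum of the two scores exceeds a hypothetical `THRESHOLD` of 1.0.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


rng = random.Random(0)  # fixed seed so the run is repeatable

# Assumption: the two traits are independent in the whole population.
population = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(20_000)]

# Assumption: we only date people whose combined score clears the bar.
THRESHOLD = 1.0
dated = [(looks, nice) for looks, nice in population if looks + nice > THRESHOLD]

r_population = pearson(*zip(*population))
r_dated = pearson(*zip(*dated))

print(f"Correlation in the whole population: {r_population:+.2f}")
print(f"Correlation among people we date:    {r_dated:+.2f}")
```

In the full population the correlation is close to zero, while among the selected partners it is clearly negative. The selection rule alone produces the "attractive people are jerks" pattern; a different threshold or scoring rule would change the size of the effect but not its direction.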

Book Recommendations

Researchers discovered that quality ratings for novels that received literary honors dropped after the award was given. This appears counterintuitive, because people are more likely to enjoy books if they know other people who enjoy them. On the other hand, people choose to read a book based on a mix of its popularity and how interesting it appears to them. If a book becomes more popular (for example, by receiving prizes), more people will read it even though it does not immediately appeal to them, and those readers tend to rate it lower. If you want to read the finest books, avoid novels you only want to read because your friends are reading them.

Directed Acyclic Graphs (DAGs)

The study by Snoep et al. exemplifies the strength and beauty of Directed Acyclic Graphs (DAGs). What we used to try to comprehend using words, probabilities, and numerical examples is now much more elegantly explored using causal diagrams. This is a significant step forward, as it reveals several facets of Berkson's paradox. DAGs have clarified the previously ambiguous link between selection bias and confounding in general. Selection bias has traditionally been defined as a bias resulting from the improper selection of research subjects from the source population.

On one level, this is obvious, but the term "selection" has frequently been applied to the improper selection of a comparison group, leading to confusion about whether phenomena like the healthy worker effect are examples of selection bias or of confounding. The situation is further complicated because determinants of selection can effectively become confounders and be controlled for in the analysis, even if they were not confounders in the first place.

The components of Berkson's paradox

When the hospitalization rate for the cases is less than 100%, and the exposure is another disease (or a cause of another disease) that itself results in hospitalization, Berkson's paradox can be seen as a biased estimate of the odds of exposure among the cases, because exposed cases are identified with greater probability than non-exposed cases. The various processes involved in Berkson's paradox can be illustrated with numerical examples.
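One such numerical example is sketched below. All figures are hypothetical: the case disease and the exposure are independent in the population, each can lead to hospitalization with its own probability, and there is a small background chance of hospitalization for other reasons. The function `hospital_odds_ratio` is an illustrative helper, not a standard epidemiological routine.

```python
def hospital_odds_ratio(p_case, p_exposure, h_case, h_exposure, h_other):
    """Exposure odds ratio (cases vs. controls) among hospitalized people.

    The case disease and the exposure are independent in the population,
    so the true population odds ratio is exactly 1.
    """

    def p_hospitalized(has_case, has_exposure):
        # A person stays out of hospital only if no cause admits them.
        stay_out = 1 - h_other
        if has_case:
            stay_out *= 1 - h_case
        if has_exposure:
            stay_out *= 1 - h_exposure
        return 1 - stay_out

    def joint(has_case, has_exposure):
        # P(case status, exposure status, hospitalized)
        pc = p_case if has_case else 1 - p_case
        pe = p_exposure if has_exposure else 1 - p_exposure
        return pc * pe * p_hospitalized(has_case, has_exposure)

    exposed_cases = joint(True, True)
    unexposed_cases = joint(True, False)
    exposed_controls = joint(False, True)
    unexposed_controls = joint(False, False)
    return (exposed_cases * unexposed_controls) / (unexposed_cases * exposed_controls)


# Hypothetical inputs, chosen only for illustration.
biased_or = hospital_odds_ratio(
    p_case=0.10, p_exposure=0.20, h_case=0.50, h_exposure=0.30, h_other=0.05
)
# Same setting, except the exposure never leads to hospitalization.
unbiased_or = hospital_odds_ratio(
    p_case=0.10, p_exposure=0.20, h_case=0.50, h_exposure=0.0, h_other=0.05
)

print(f"Odds ratio when the exposure causes admissions: {biased_or:.2f}")
print(f"Odds ratio when it does not:                    {unbiased_or:.2f}")
```

Under these assumed rates, the hospital-based odds ratio falls well below 1 even though the population odds ratio is 1. Once the exposure no longer causes admissions, the odds ratio returns to 1, which suggests the distortion comes from the admission process and not from any real link between the two conditions.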


Conclusion:

The most prominent example of Berkson's paradox is the erroneous belief that there is a negative correlation between two positive attributes, implying that people with one good feature are less likely to have the other. Berkson's paradox happens when this observation appears correct even though the two traits are unrelated (or perhaps even positively correlated), because individuals who lack both traits are not observed as often as the rest. For example, a person may notice that fast-food restaurants in their area that serve good hamburgers also serve bad fries, and vice versa. However, because they would not eat anywhere where both were bad, they overlook the large number of restaurants in this category, which would weaken or even reverse the correlation.
