Week 3 Student Companion Guide
A confounder is a variable that distorts (inflates or deflates) the true association between an exposure and an outcome. To be a confounder, a variable must simultaneously satisfy three criteria:
It is associated with the exposure: it is distributed unequally between exposed and unexposed groups.
It is an independent risk factor for the outcome: it affects the risk of the outcome over and above the effect of the exposure.
It is not an intermediary step on the causal pathway: the exposure does not cause the confounder.
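The first two criteria can be checked descriptively in data. The sketch below is only an illustration: the counts are invented, and the variable names (`Smoking` as exposure, `OccExp` as candidate confounder) are borrowed from the R code later in this guide. The third criterion cannot be tested from a table; it relies on subject-matter knowledge.

```r
# Hypothetical counts, invented purely for illustration.

# Criterion 1: is OccExp distributed unequally between exposure groups?
exp_by_conf <- matrix(c(60, 40,    # smokers:     OccExp yes / no
                        20, 80),   # non-smokers: OccExp yes / no
                      nrow = 2, byrow = TRUE,
                      dimnames = list(Smoking = c("Yes", "No"),
                                      OccExp  = c("Yes", "No")))
# Proportion with occupational exposure in each smoking group
prop_conf <- prop.table(exp_by_conf, margin = 1)[, "Yes"]
print(prop_conf)

# Criterion 2: among the UNEXPOSED (non-smokers), does OccExp change outcome risk?
out_unexposed <- matrix(c(6, 14,   # OccExp yes: cases / non-cases
                          8, 72),  # OccExp no:  cases / non-cases
                        nrow = 2, byrow = TRUE,
                        dimnames = list(OccExp = c("Yes", "No"),
                                        LungCa = c("Case", "NonCase")))
risk_unexposed <- out_unexposed[, "Case"] / rowSums(out_unexposed)
risk_ratio <- unname(risk_unexposed["Yes"] / risk_unexposed["No"])
print(risk_ratio)

# Criterion 3: not checkable from these tables. Ask whether smoking could
# plausibly CAUSE occupational exposure; if so, OccExp would be a mediator.
```

Restricting the second check to the unexposed group keeps the effect of the exposure itself out of the comparison.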
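A common heuristic for deciding whether a variable is a meaningful confounder is the 10% rule: if adjusting for the variable changes the OR by 10% or more relative to the crude OR, the confounding is considered meaningful.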
Example: suppose the crude OR = 5.09 and the adjusted OR = 5.42. The relative change is |5.09 - 5.42| / 5.09 = 6.5%. Because 6.5% < 10%, this is only mild confounding.
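The same arithmetic can be wrapped in a small helper. This is a minimal sketch: the function name `pct_change_or` and the `threshold` value are illustrative choices, and the crude OR is used as the denominator to match the example above (some texts divide by the adjusted OR instead).

```r
# Percent change between crude and adjusted OR, relative to the crude OR.
# `pct_change_or` is a hypothetical helper name, not part of any package.
pct_change_or <- function(crude_or, adjusted_or) {
  abs(crude_or - adjusted_or) / crude_or * 100
}

threshold <- 10                       # the 10% rule of thumb
change <- pct_change_or(5.09, 5.42)   # values from the example above
is_meaningful <- change >= threshold

print(round(change, 1))   # 6.5
print(is_meaningful)      # FALSE: below 10%, so only mild confounding
```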
The distinction between confounding and effect modification is one of the most commonly confused points in epidemiology. Confounding is a bias to be removed; effect modification is a biological or social reality to be reported.
| Feature | Confounding (Bias) | Effect Modification (Finding) |
|---|---|---|
| What it is | A bias: the true association is distorted. | A real phenomenon: the effect genuinely differs between groups. |
| What to do | Adjust for it (regression, or pool via Mantel-Haenszel). | Report it! Present stratum-specific estimates. |
| OR across strata | Similar across strata (homogeneous), but different from the crude OR. | Differs across strata (heterogeneous). |
| Statistical Test | Compare crude vs. adjusted OR (10% rule). | Interaction term p-value (e.g., p < 0.05). |
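
The "OR across strata" row can be made concrete with a stratified table. The sketch below uses invented counts chosen so that the stratum-specific ORs agree with each other but differ from the crude OR, which is the confounding pattern. It uses `mantelhaen.test()` from base R's `stats` package to obtain the Mantel-Haenszel pooled OR; the variable names are borrowed from the R code in this guide.

```r
# Hypothetical 2 x 2 x 2 table: Smoking x LungCa, stratified by OccExp.
# Counts are invented for illustration only.
tab <- array(c(40, 10, 20, 10,    # stratum OccExp = Yes
               10, 10, 30, 60),   # stratum OccExp = No
             dim = c(2, 2, 2),
             dimnames = list(Smoking = c("Yes", "No"),
                             LungCa  = c("Case", "Control"),
                             OccExp  = c("Yes", "No")))

# Odds ratio of a single 2 x 2 table (cross-product ratio)
or_2x2 <- function(m) (m[1, 1] * m[2, 2]) / (m[1, 2] * m[2, 1])

stratum_or <- apply(tab, 3, or_2x2)              # one OR per stratum
crude_or   <- or_2x2(apply(tab, c(1, 2), sum))   # strata collapsed
mh_or      <- unname(mantelhaen.test(tab)$estimate)  # M-H pooled OR

print(stratum_or)   # similar across strata -> no effect modification
print(crude_or)     # differs from the stratum ORs -> confounding
print(mh_or)        # pooled (adjusted) estimate to report
```

Had the two stratum-specific ORs differed markedly from each other, pooling would hide a real finding, and the stratum-specific estimates should be reported instead.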
```r
# 1. Crude OR (No adjustment)
m_crude <- glm(LungCa ~ Smoking, data = df, family = binomial)
exp(coef(m_crude))

# 2. Adjusted OR (Controlling for Confounder 'OccExp')
m_adj <- glm(LungCa ~ Smoking + OccExp, data = df, family = binomial)
exp(coef(m_adj))

# 3. Testing Effect Modification (Interaction Term)
m_int <- glm(LungCa ~ Smoking * AgeGrp, data = df, family = binomial)
summary(m_int)  # Look at the 'Smoking:AgeGrp' p-value row
```
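To try these models without a real dataset, one option is to simulate a `df`. The sketch below is an assumption-heavy illustration: the sample size, prevalences, and effect sizes are all invented, and the variables are coded 0/1 so that the coefficient names are exactly `Smoking` and `Smoking:AgeGrp`. It then pulls out only the numbers the table above asks for: the crude and adjusted OR for smoking, their percent change, and the interaction p-value.

```r
set.seed(42)  # arbitrary seed, for reproducibility only

# Simulated data (all parameters invented for illustration)
n       <- 2000
OccExp  <- rbinom(n, 1, 0.3)
AgeGrp  <- rbinom(n, 1, 0.5)
Smoking <- rbinom(n, 1, plogis(-0.5 + 1.0 * OccExp))   # OccExp -> Smoking
LungCa  <- rbinom(n, 1, plogis(-2.5 + 1.2 * Smoking + 0.8 * OccExp))
df <- data.frame(LungCa, Smoking, OccExp, AgeGrp)

m_crude <- glm(LungCa ~ Smoking, data = df, family = binomial)
m_adj   <- glm(LungCa ~ Smoking + OccExp, data = df, family = binomial)
m_int   <- glm(LungCa ~ Smoking * AgeGrp, data = df, family = binomial)

# OR for smoking only (drops the intercept and other terms)
or_crude <- exp(coef(m_crude))[["Smoking"]]
or_adj   <- exp(coef(m_adj))[["Smoking"]]

# 10% rule: percent change relative to the crude OR
pct_change <- abs(or_crude - or_adj) / or_crude * 100

# p-value of the interaction term from the coefficient table
p_int <- coef(summary(m_int))["Smoking:AgeGrp", "Pr(>|z|)"]

print(c(crude = or_crude, adjusted = or_adj, pct_change = pct_change))
print(p_int)
```

With real data, `AgeGrp` is often a factor, in which case the interaction row is named after the factor level (for example `Smoking:AgeGrpOld`) and the lookup string needs adjusting.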