HPG 6104 • Epidemiological Methods II

Confounding & Effect Modification

Week 3 Student Companion Guide

01

What Is Confounding?

A confounder is a variable that distorts (inflates or deflates) the true association between an exposure and an outcome. To be a confounder, a variable must simultaneously satisfy three criteria:

1

Associated with Exposure

The confounder is distributed unequally between exposed and unexposed groups.

2

Independent Risk Factor

Affects the risk of the outcome independently of the exposure, i.e., it predicts the outcome even among the unexposed.

3

Not on Causal Pathway

It is not an intermediary step. The exposure does not cause the confounder.

The 10% Rule (Detecting Confounding)

A heuristic used to determine if a variable is a meaningful confounder:

|Crude OR - Adjusted OR| / Crude OR > 0.10

Example: If the crude OR = 5.09 and the adjusted OR = 5.42, the relative change is |5.09 - 5.42| / 5.09 ≈ 0.065, or 6.5%. Because 6.5% < 10%, the variable falls below the threshold and is at most a mild confounder.
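
The arithmetic above can be reproduced in R. This is a minimal sketch; `pct_change_or` is a hypothetical helper name chosen for illustration, not a function from any package.

```r
# Relative change between the crude and adjusted OR, as a proportion of the crude OR.
# `pct_change_or` is a hypothetical helper, defined here only for illustration.
pct_change_or <- function(crude_or, adjusted_or) {
  abs(crude_or - adjusted_or) / crude_or
}

change <- pct_change_or(crude_or = 5.09, adjusted_or = 5.42)
round(100 * change, 1)   # 6.5 (percent)
change > 0.10            # FALSE: below the 10% threshold
```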

02

Confounding vs. Effect Modification

This is one of the most commonly confused distinctions in epidemiology. Confounding is a bias to be removed; effect modification is a biological or social reality to be reported.

| Feature | Confounding (Bias) | Effect Modification (Finding) |
| --- | --- | --- |
| What it is | A bias: the true association is distorted. | A real phenomenon: the effect genuinely differs between groups. |
| What to do | Adjust for it (regression or pool via Mantel-Haenszel). | Report it! Present stratum-specific estimates. |
| OR across strata | Similar across strata (homogeneous), but different from the crude OR. | Differs significantly across strata (heterogeneous). |
| Statistical test | Compare crude vs. adjusted OR (10% rule). | Interaction term p-value (e.g., p < 0.05). |

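
The "OR across strata" row can be illustrated with a small worked example. The counts below are entirely hypothetical (not course data), and `odds_ratio` is a helper defined only for this sketch. They are chosen so that the OR is the same in both strata of a third variable Z, yet the crude OR is much larger, which is the pattern expected under confounding.

```r
# Hypothetical 2 x 2 x 2 counts: Exposure x Outcome x stratum of a third variable Z.
# R fills arrays column by column: exposed cases, unexposed cases,
# exposed non-cases, unexposed non-cases (within each stratum).
counts <- array(
  c(10, 50, 90, 900,    # stratum Z0
    60, 30, 40, 40),    # stratum Z1
  dim = c(2, 2, 2),
  dimnames = list(Exposure = c("Exposed", "Unexposed"),
                  Outcome  = c("Case", "NonCase"),
                  Z        = c("Z0", "Z1"))
)

# Cross-product odds ratio for a single 2 x 2 table
odds_ratio <- function(tab) (tab[1, 1] * tab[2, 2]) / (tab[1, 2] * tab[2, 1])

stratum_or <- apply(counts, 3, odds_ratio)              # OR within each stratum
crude_or   <- odds_ratio(apply(counts, c(1, 2), sum))   # OR ignoring Z

stratum_or   # Z0 = 2, Z1 = 2 (homogeneous)
crude_or     # about 6.33, far from the stratum-specific value
```

Had the two stratum-specific ORs differed markedly from each other, the pattern would instead suggest effect modification, and the stratum-specific estimates would be reported separately.
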
03

Controlling for Confounding

Study Design Approaches

  • Randomization: The gold standard (RCTs). Balances both measured and unmeasured confounders.
  • Restriction: Limit the study to one stratum (e.g., only non-smokers). Simple, but reduces generalisability.
  • Matching: Pair cases and controls on confounder values. Controls the matched confounder, but requires a matched analysis (e.g., conditional logistic regression) and prevents studying the effect of the matched variable itself.
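
The claim that randomization balances confounders, including unmeasured ones, can be checked with a short simulation. Everything here is simulated and hypothetical: `frailty` stands in for an unmeasured confounder, and the effect sizes are arbitrary assumptions.

```r
set.seed(1)
n <- 10000

# Hypothetical unmeasured confounder, present in about 30% of people
frailty <- rbinom(n, 1, 0.3)

# Observational setting: frail people are more likely to end up exposed
self_selected <- rbinom(n, 1, plogis(-1 + 2 * frailty))

# Randomized setting: a coin flip assigns exposure, ignoring frailty
randomized <- rbinom(n, 1, 0.5)

# Difference in confounder prevalence between exposed and unexposed
imbalance <- function(exposure, covariate) {
  mean(covariate[exposure == 1]) - mean(covariate[exposure == 0])
}

imb_obs <- imbalance(self_selected, frailty)   # large: criterion 1 is met
imb_rct <- imbalance(randomized, frailty)      # close to zero: balanced by design
```

In the randomized arm the confounder is no longer associated with exposure, so the first criterion for confounding fails even though the variable was never measured.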

Data Analysis Approaches

  • Stratification: Compute stratum-specific estimates, then pool them (Mantel-Haenszel) if they are homogeneous. Best for 1 or 2 categorical confounders.
  • Multivariable Regression: Include confounders as covariates. Can handle many simultaneously.
  • Propensity Scores: Model the probability of exposure given the measured covariates, then match, stratify, or weight on that score. Used in large observational studies. Cannot fix unmeasured confounding.

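
Mantel-Haenszel pooling from the stratification bullet can be sketched as follows. The counts are hypothetical, with stratum-specific ORs of 2.0 and 2.5; the pooled estimate is computed once from the formula and once with `mantelhaen.test()` from base R's stats package.

```r
# Hypothetical 2 x 2 x 2 counts: Exposure x Outcome x stratum of the confounder Z.
# Fill order within a stratum: exposed cases, unexposed cases,
# exposed non-cases, unexposed non-cases.
counts <- array(
  c(10, 50, 90, 900,    # stratum Z0: OR = 2.0
    60, 30, 40, 50),    # stratum Z1: OR = 2.5
  dim = c(2, 2, 2),
  dimnames = list(Exposure = c("Exposed", "Unexposed"),
                  Outcome  = c("Case", "NonCase"),
                  Z        = c("Z0", "Z1"))
)

n11 <- counts[1, 1, ]         # exposed cases, per stratum
n12 <- counts[1, 2, ]         # exposed non-cases
n21 <- counts[2, 1, ]         # unexposed cases
n22 <- counts[2, 2, ]         # unexposed non-cases
n_k <- apply(counts, 3, sum)  # stratum totals

# Mantel-Haenszel pooled OR: sum(a*d/n) / sum(b*c/n)
mh_manual <- sum(n11 * n22 / n_k) / sum(n12 * n21 / n_k)

# The same estimate from the built-in test
mh_builtin <- unname(mantelhaen.test(counts)$estimate)

c(manual = mh_manual, builtin = mh_builtin)
```

The pooled OR is a weighted average, so it falls between the two stratum-specific values. Pooling is only sensible when those values are reasonably similar.
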
04

R Code Quick Reference

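# Assumes a data frame `df` with a binary outcome LungCa, exposure Smoking,
# candidate confounder OccExp, and candidate effect modifier AgeGrp.

# 1. Crude OR (no adjustment)
m_crude <- glm(LungCa ~ Smoking, data = df, family = binomial)
exp(coef(m_crude))  # exponentiate log-odds; the (Intercept) entry is baseline odds, not an OR

# 2. Adjusted OR (controlling for the confounder 'OccExp')
m_adj <- glm(LungCa ~ Smoking + OccExp, data = df, family = binomial)
exp(coef(m_adj))    # compare the Smoking OR here with the crude OR above

# 3. Testing effect modification (interaction term)
m_int <- glm(LungCa ~ Smoking * AgeGrp, data = df, family = binomial)
summary(m_int)      # look at the p-value of the 'Smoking:AgeGrp' interaction row(s)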
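
To try the quick-reference models without course data, the sketch below simulates a data frame with the same variable names. The sample size, prevalences, and effect sizes are arbitrary assumptions, and no true interaction is built in. It also shows a likelihood-ratio test for the interaction, which tests all interaction terms jointly and is convenient when AgeGrp has more than two levels.

```r
set.seed(6104)
n <- 2000

# Simulated (hypothetical) data using the variable names from the quick reference
df <- data.frame(
  OccExp = rbinom(n, 1, 0.3),
  AgeGrp = factor(sample(c("<50", "50+"), n, replace = TRUE))
)
# Smoking is more common among the occupationally exposed (criterion 1)
df$Smoking <- rbinom(n, 1, plogis(-1 + 1.2 * df$OccExp))
# Outcome depends on both Smoking and OccExp (criterion 2)
df$LungCa <- rbinom(n, 1, plogis(-3 + 1.5 * df$Smoking + 1.0 * df$OccExp))

m_crude <- glm(LungCa ~ Smoking, data = df, family = binomial)
m_adj   <- glm(LungCa ~ Smoking + OccExp, data = df, family = binomial)

or_crude <- unname(exp(coef(m_crude))["Smoking"])
or_adj   <- unname(exp(coef(m_adj))["Smoking"])
abs(or_crude - or_adj) / or_crude   # 10% rule applied to the simulated data

# Likelihood-ratio test: does adding the Smoking x AgeGrp interaction improve fit?
m_main <- glm(LungCa ~ Smoking + AgeGrp, data = df, family = binomial)
m_int  <- glm(LungCa ~ Smoking * AgeGrp, data = df, family = binomial)
lrt <- anova(m_main, m_int, test = "Chisq")
lrt
```
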

© 2026 University of Nairobi