Odds Ratio (OR)
Definition and Concept
- The odds ratio (OR) is an epidemiological and statistical measure of association defined as the ratio of two odds, such as the odds of an event occurring in an exposed group compared to the odds of the same event in an unexposed or control group.
- Odds themselves represent the probability of a specific event taking place divided by the probability of the event not taking place.
- Mathematically, odds are calculated by dividing the number of individuals who experience the event by the number of individuals who do not experience the event.
- The odds ratio is the primary measure of the strength of association utilized in case-control studies and cross-sectional studies where incidence rates cannot be directly observed or calculated.
- Because odds are a ratio of two non-negative quantities, the value of an odds ratio can theoretically range from zero to infinity.
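The probability–odds relationship above can be sketched with illustrative numbers (20 events among 100 individuals; the counts are hypothetical):

```python
# Odds = events / non-events, which equals p / (1 - p)
events, non_events = 20, 80           # hypothetical counts
odds = events / non_events            # 0.25
p = events / (events + non_events)    # probability of the event: 0.2
assert abs(odds - p / (1 - p)) < 1e-12
```

Note that odds of 0.25 correspond to a probability of 0.2, not 0.25; the two diverge further as the event becomes more common.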
Calculation from a Contingency Table
- To calculate an odds ratio, clinical data evaluating an exposure and a disease outcome are typically arranged into a standard 2x2 contingency table.
| Exposure Status | Disease Present (Cases) | Disease Absent (Controls) |
|---|---|---|
| Exposed | a | b |
| Non-exposed | c | d |
- The odds of having the disease among the exposed group is calculated as the ratio a/b.
- The odds of having the disease among the non-exposed group is calculated as the ratio c/d.
- The odds ratio is thus formulated as OR = (a/b) / (c/d), which can be simplified algebraically to the cross-product OR = (a × d) / (b × c).
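As a minimal sketch with hypothetical counts (a = 20, b = 80, c = 10, d = 90, laid out as in the table above), the ratio-of-odds form and the cross-product form give the same value:

```python
# Cross-product odds ratio from hypothetical 2x2 counts
a, b, c, d = 20, 80, 10, 90     # layout matches the contingency table above

odds_exposed = a / b            # 20/80 = 0.25
odds_unexposed = c / d          # 10/90 ≈ 0.111
odds_ratio = odds_exposed / odds_unexposed
# Equivalent to the cross-product (a*d)/(b*c) = 1800/800 = 2.25
assert abs(odds_ratio - (a * d) / (b * c)) < 1e-12
```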
Clinical Interpretation of the Odds Ratio
- OR = 1.0: Indicates that the odds of the event are identical in both groups; there is no increase or decrease in risk, signifying no association between the exposure and the outcome.
- OR > 1.0: Indicates that the odds of the outcome occurring are higher in the exposed group, signifying that the exposure is a positive risk factor and the event is more likely.
- OR < 1.0: Indicates that the odds of the outcome occurring are lower in the exposed group, signifying that the exposure is a protective factor and the event is less likely.
- For example, an OR of 2.53 indicates that the odds of developing the disease are approximately 2.5 times higher in patients with the exposure than in those without it; reading this as "2.5 times more likely" is accurate only when the outcome is rare.
Logistic Regression and Adjusted Odds Ratios
- The odds ratio is the fundamental effect estimate generated by logistic regression analysis.
- In a logistic regression model, the natural logarithm of the odds (the logit transformation) is modelled as a linear function of the explanatory variables.
- The exponential of the partial regression coefficient (e^β) generated by the logistic regression directly provides the estimated odds ratio for that specific variable.
- When a simple logistic regression is performed with a single predictor, it yields a crude or unadjusted odds ratio, which does not account for potential confounding variables.
- When multiple independent variables are included in a multivariable logistic regression model, it yields an adjusted odds ratio.
- The adjusted odds ratio evaluates the impact of an exposure while statistically holding other predictor variables constant, thereby minimizing the effect of confounding factors.
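For a single binary predictor, the exponentiated logistic-regression coefficient reproduces the cross-product OR exactly. The sketch below, using the same hypothetical 2x2 counts as earlier, fits the model by Newton-Raphson with NumPy rather than a statistics library:

```python
import numpy as np

# Hypothetical 2x2 counts: a, b = exposed cases/controls; c, d = non-exposed
a, b, c, d = 20, 80, 10, 90

# Expand the table into one record per individual
x = np.r_[np.ones(a + b), np.zeros(c + d)]                   # exposure indicator
y = np.r_[np.ones(a), np.zeros(b), np.ones(c), np.zeros(d)]  # disease outcome
X = np.column_stack([np.ones_like(x), x])                    # intercept + exposure

# Newton-Raphson iterations for the logistic-regression maximum likelihood fit
beta = np.zeros(2)
for _ in range(25):
    p = 1.0 / (1.0 + np.exp(-X @ beta))          # fitted probabilities
    grad = X.T @ (y - p)                         # score vector
    hess = X.T @ (X * (p * (1 - p))[:, None])    # observed information
    beta = beta + np.linalg.solve(hess, grad)

print(np.exp(beta[1]))      # e^beta for the exposure: the odds ratio, ~2.25
print((a * d) / (b * c))    # cross-product OR from the table: 2.25
```

With additional covariates in `X`, the same e^β reading of each coefficient yields adjusted odds ratios, each holding the other predictors constant.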
Odds Ratio versus Relative Risk
- The following table contrasts the Odds Ratio with the Relative Risk (RR), another common measure of association:
| Feature | Odds Ratio (OR) | Relative Risk (RR) |
|---|---|---|
| Mathematical Basis | A ratio of two odds (part/part divided by part/part). | A ratio of two risks or incidences (part/total divided by part/total). |
| Study Design Suitability | Can be calculated in case-control, cross-sectional, and cohort studies. | Can only be calculated in prospective cohort studies or randomized controlled trials. |
| Approximation | Approximates the true relative risk only when the disease is rare (low prevalence). | Is the direct measure of risk probability. |
| Magnitude Bias | Always yields a value further from 1 (more extreme) than the RR when calculated from the same data. | Provides a more conservative and clinically intuitive measure of event probability. |
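To make the contrast concrete, a short sketch with hypothetical cohort counts shows the OR landing further from 1 than the RR computed from the same data:

```python
# Hypothetical cohort: a, b = exposed cases/controls; c, d = non-exposed
a, b, c, d = 20, 80, 10, 90

rr = (a / (a + b)) / (c / (c + d))   # risk ratio: 0.20 / 0.10 = 2.0
odds_ratio = (a * d) / (b * c)       # odds ratio: 2.25
# Overall prevalence here is 30/200 = 15%, so the outcome is not rare and
# the OR (2.25) overstates the RR (2.0); the two converge as the outcome
# becomes rarer.
```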
Statistical Testing and Confidence Intervals
- Even when the null hypothesis is true (OR = 1), the sampling distribution of the odds ratio is highly skewed to the right, because the OR is bounded below by zero but unbounded above.
- To calculate a 95% confidence interval (CI), a logarithmic transformation must be applied first to normalize the distribution.
- The standard error (SE) of the natural logarithm of the odds ratio is calculated as SE(ln OR) = √(1/a + 1/b + 1/c + 1/d), where a, b, c, and d are the four cell counts of the 2x2 table.
- The 95% CI is computed on the log scale as ln(OR) ± 1.96 × SE and then transformed back by taking the anti-log (exponential) of the upper and lower limits.
- If the 95% confidence interval for an odds ratio includes the null value of 1, the observed association is considered not statistically significant.
- If the 95% confidence interval does not contain 1, the odds ratio is considered statistically significant.
- The p-value for the significance of an OR generated from a 2x2 contingency table is formally obtained with Fisher's exact test; the Chi-square test provides an approximate p-value when expected cell counts are sufficiently large.
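Putting the pieces together, here is a standard-library-only sketch that computes the log-scale 95% CI and a two-sided Fisher exact p-value for hypothetical counts; `fisher_exact_p` is an illustrative helper written for this example, not a library function:

```python
import math
from math import comb

a, b, c, d = 20, 80, 10, 90   # hypothetical 2x2 counts, as in the table above

# 95% CI: work on the log scale, then back-transform the limits
or_est = (a * d) / (b * c)                         # cross-product OR: 2.25
se_log = math.sqrt(1/a + 1/b + 1/c + 1/d)          # SE of ln(OR)
lo = math.exp(math.log(or_est) - 1.96 * se_log)
hi = math.exp(math.log(or_est) + 1.96 * se_log)
significant = not (lo <= 1.0 <= hi)                # CI excluding 1?

# Two-sided Fisher exact p-value: sum the hypergeometric probabilities of
# every table with the same margins that is no more probable than the
# observed table (small tolerance guards float comparisons).
def fisher_exact_p(a, b, c, d):
    r1, r2, c1, n = a + b, c + d, a + c, a + b + c + d
    prob = lambda x: comb(r1, x) * comb(r2, c1 - x) / comb(n, c1)
    p_obs = prob(a)
    support = range(max(0, c1 - r2), min(c1, r1) + 1)
    return sum(prob(x) for x in support if prob(x) <= p_obs * (1 + 1e-9))

print(or_est, (lo, hi), significant)   # CI spans 1 for these counts
print(fisher_exact_p(a, b, c, d))
```

For these particular counts the interval stretches from just below 1 to about 5, so the CI-based and exact-test conclusions agree that the association is not statistically significant at the 0.05 level.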