6.5 Estimating an OR using a 2 \(\times\) 2 table

For the special case of a binary predictor, the OR can be computed from a 2 \(\times\) 2 table. In Table 6.1, the OR comparing the odds of “positive” between risk factor levels 2 and 1 is the cross-product ratio \((d \times a) \div (b \times c)\). The cross-product starts at d because we want the odds of “positive” at risk factor level 2 relative to level 1. To compare level 1 to level 2 instead, start at b and compute the OR as \((b \times c) \div (d \times a)\). As previously mentioned, changing the order of comparison inverts the OR.

Table 6.1: Computing an odds ratio from a two-way table
	Condition negative	Condition positive
Risk factor level 1	a	b
Risk factor level 2	c	d
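
The cross-product follows directly from the definition of odds. As a short derivation using the cell labels of Table 6.1, the odds of “positive” at each risk factor level are the positive count divided by the negative count, and the ratio of those odds simplifies to the cross-product:

```latex
\begin{align*}
\text{Odds}_{\text{level 1}} &= \frac{b}{a}, \qquad \text{Odds}_{\text{level 2}} = \frac{d}{c} \\
\text{OR}_{\text{2 vs. 1}} &= \frac{d / c}{b / a} = \frac{d \times a}{b \times c} \\
\text{OR}_{\text{1 vs. 2}} &= \frac{b / a}{d / c} = \frac{b \times c}{d \times a} = \frac{1}{\text{OR}_{\text{2 vs. 1}}}
\end{align*}
```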

Example 6.2: Using the 2019 National Survey of Drug Use and Health (NSDUH) teaching dataset (Section A.5), compute the OR comparing the odds of lifetime marijuana use (mj_lifetime) between males and females (demog_sex).

load("Data/nsduh2019_adult_sub_rmph.RData")
# Shorter name
nsduh <- nsduh_adult_sub
TAB <- table(nsduh$demog_sex, nsduh$mj_lifetime)
TAB

##         
##           No Yes
##   Male   206 260
##   Female 285 249

Because males are in row 1 and females are in row 2 of this table, the cross-product \((d \times a) \div (b \times c)\) gives the OR for “Yes” comparing females to males.

# (d*a) / (b*c)
(TAB[2,2]*TAB[1,1]) / (TAB[1,2]*TAB[2,1])

## [1] 0.6922

However, the question asked for the OR comparing males to females, so we need to use the formula \((b \times c) \div (d \times a)\).

# (b*c) / (d*a)
(TAB[1,2]*TAB[2,1]) / (TAB[2,2]*TAB[1,1])

## [1] 1.445

Thus, the estimated OR for lifetime marijuana use comparing males to females is 1.445; males have 44.5% greater odds of lifetime marijuana use than females. We could have also obtained this OR by inverting the OR comparing females to males (1 / 0.692 = 1.445).
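
As a minimal sketch, the same calculation can be wrapped in a small helper. The function name `or_2x2` and its argument `numerator_row` are hypothetical (they are not part of Functions_rmph.R), and the table is re-entered by hand from the counts printed above so the block runs without the data file. It assumes the table is laid out as in Table 6.1: rows are risk factor levels and columns are negative, then positive.

```r
# Hypothetical helper (not from Functions_rmph.R): OR from a 2 x 2 table
# laid out as in Table 6.1 (rows = risk factor levels,
# columns = negative, positive).
or_2x2 <- function(tab, numerator_row = 2) {
  stopifnot(all(dim(tab) == c(2, 2)), numerator_row %in% c(1, 2))
  odds_by_row <- tab[, 2] / tab[, 1]    # odds of "positive" in each row
  denominator_row <- 3 - numerator_row  # the other row
  unname(odds_by_row[numerator_row] / odds_by_row[denominator_row])
}

# Counts copied by hand from the table printed above
TAB_manual <- matrix(c(206, 260,
                       285, 249),
                     nrow = 2, byrow = TRUE,
                     dimnames = list(c("Male", "Female"), c("No", "Yes")))

or_FvsM <- or_2x2(TAB_manual, numerator_row = 2) # females vs. males
or_MvsF <- or_2x2(TAB_manual, numerator_row = 1) # males vs. females
round(c(or_FvsM, or_MvsF), 3)
```

Making the numerator row an explicit argument is one way to avoid having to remember which cell the cross-product should start at.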

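To review, here is a reminder of where the value of 1.445 comes from. Each result is computed from unrounded values, so redoing the arithmetic with the rounded numbers shown can differ slightly in the third decimal place.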

P(lifetime marijuana use among males) = 260 \(\div\) (206 + 260) = 0.558
P(lifetime marijuana use among females) = 249 \(\div\) (285 + 249) = 0.466
Odds for males = 0.558 \(\div\) (1 – 0.558) = 1.262 (odds \(>\) 1 because probability \(>\) 0.5)
Odds for females = 0.466 \(\div\) (1 – 0.466) = 0.874 (odds \(<\) 1 because probability \(<\) 0.5)
Odds ratio = 1.262 \(\div\) 0.874 = 1.445 (OR \(>\) 1 because odds for males \(>\) odds for females)

Below are some useful functions for converting probabilities to odds and ORs, along with the logit function and its inverse, which we will use in the next section. These functions are all in Functions_rmph.R, which you loaded at the beginning of this chapter.

odds       <- function(p)      p/(1-p)
odds.ratio <- function(p1, p2) odds(p1)/odds(p2)
logit      <- function(p)      log(p/(1-p))
ilogit     <- function(x)      exp(x)/(1+exp(x))
# exp() is the exponential function

# Example 6.2
PM <- TAB[1,2]/(TAB[1,2]+TAB[1,1]) # P(Yes | Male)
PF <- TAB[2,2]/(TAB[2,2]+TAB[2,1]) # P(Yes | Female)
OM <- odds(PM)                     # PM / (1 - PM)
OF <- odds(PF)                     # PF / (1 - PF)
OR.MvsF <- odds.ratio(PM, PF)      # Odds ratio
round(c(PM, PF, OM, OF, OR.MvsF), 3)

## [1] 0.558 0.466 1.262 0.874 1.445
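
To preview how logit and ilogit relate to the quantities above, here is a brief sketch. It assumes the helper functions are defined as shown in the text (they are re-declared here so the block is self-contained) and enters the probabilities by hand from the counts in the Example 6.2 table. It illustrates that the logit is the log of the odds, that ilogit undoes logit, and that a difference in logits is a log OR. Base R's qlogis() and plogis() compute the same logit and inverse logit.

```r
# Helper functions re-declared from the text so this block runs on its own
odds   <- function(p) p/(1-p)
logit  <- function(p) log(p/(1-p))
ilogit <- function(x) exp(x)/(1+exp(x))

# Probabilities entered by hand from the counts in the Example 6.2 table
PM <- 260 / (206 + 260) # P(Yes | Male)
PF <- 249 / (285 + 249) # P(Yes | Female)

# The logit is the log of the odds
all.equal(logit(PM), log(odds(PM)))

# ilogit() is the inverse of logit(): it recovers the probability
all.equal(ilogit(logit(PM)), PM)

# A difference in logits is a log odds ratio; exponentiating gives the OR
log_or <- logit(PM) - logit(PF)
round(exp(log_or), 3)

# Base R equivalents of logit() and ilogit()
all.equal(qlogis(PM), logit(PM))
all.equal(plogis(logit(PM)), ilogit(logit(PM)))
```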