## 6.2 Interpretation of the logistic regression coefficients

How do we interpret the logistic regression coefficients? First, we need to get into some math. In the end, we will use R to do all the computations for us; however, it is important to see the math to understand how to interpret a logistic regression model.

If $$p$$ is the probability of an event, then $$p/(1-p)$$ is the “odds” of the event, the ratio of how likely the event is to occur to how likely it is to not occur. The left-hand side of the logistic regression equation $$\ln{(p/(1-p))}$$ is the natural logarithm of the odds, also known as the “log-odds” or “logit”. To convert log-odds to odds, use the inverse of the natural logarithm which is the exponential function $$e^x$$. To convert log-odds to a probability, use the inverse logit function $$e^x / (1 + e^x)$$.
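These conversions are easy to check in R. Below is a minimal sketch using a hypothetical probability of 0.75; the variable names are ours, not part of any particular dataset.

```r
# Hypothetical probability of an event
p <- 0.75

odds     <- p / (1 - p)                        # odds = 0.75 / 0.25 = 3
log_odds <- log(odds)                          # logit: natural log of the odds
odds_back <- exp(log_odds)                     # exponential undoes the log: back to 3
p_back    <- exp(log_odds) / (1 + exp(log_odds))  # inverse logit: back to 0.75
```

Base R's `plogis()` computes the inverse logit directly, so `plogis(log_odds)` returns the same probability.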

### Intercept

Setting all the predictors to 0 in Equation (6.1), we find that the intercept $$\beta_0$$ is $$\ln(p/(1-p))$$, the log-odds of the outcome when all predictors are 0 or at their reference level. Applying the exponential function, $$e^{\beta_0}$$ is the corresponding odds of the outcome; applying the inverse logit function, $$e^{\beta_0} / (1 + e^{\beta_0})$$ is the corresponding probability of the outcome.
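For example, with a hypothetical fitted intercept of $$-1.5$$ (a made-up value, not from any model in this book), the conversions look like this in R:

```r
# Hypothetical fitted intercept from a logistic regression
b0 <- -1.5

odds_b0 <- exp(b0)                # odds when all predictors are 0 / at reference
prob_b0 <- exp(b0) / (1 + exp(b0))  # probability, via the inverse logit
```

Here `exp(b0)` is about 0.22 (the outcome is about 0.22 times as likely to occur as not), and the corresponding probability is about 0.18.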

### Predictor coefficients

In linear regression, $$\beta_k$$ was the difference in the outcome associated with a 1-unit difference in $$X_k$$ (or between a level and the reference level). Similarly, in logistic regression, $$\beta_k$$ is the difference in the log-odds of the outcome associated with a 1-unit difference in $$X_k$$. After some math, we see that $$e^{\beta_k}$$ is the odds ratio (OR) comparing individuals who differ by 1 unit in $$X_k$$. To see why exponentiating $$\beta_k$$ results in an OR, start by exponentiating both sides of the logistic regression equation to get the odds.

$$\frac{p}{1-p} = e^{\beta_0 + \beta_1 X_1 + \beta_2 X_2 + \ldots + \beta_K X_K}$$

For a continuous predictor $$X_1$$, the ratio of the odds at $$X_1 = x_1 + 1$$ to the odds at $$X_1 = x_1$$ (a one-unit difference) can be expressed as the following ratio, for which all the terms cancel except $$e^{\beta_1}$$.

$$\frac{e^{\beta_0 + \beta_1 (x_1 + 1) + \beta_2 X_2 + \ldots + \beta_K X_K}}{e^{\beta_0 + \beta_1 x_1 + \beta_2 X_2 + \ldots + \beta_K X_K}} = e^{\beta_1}$$

If the first predictor is instead categorical and we want the OR comparing the first non-reference level to the reference level, then we want the ratio of the odds at $$X_1 = 1$$ to the odds at $$X_1 = 0$$. This is a 1-unit difference, so the derivation above also applies to categorical predictors.
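The cancellation can be verified numerically. The sketch below uses hypothetical coefficient values; the point is that the odds ratio comes out to $$e^{\beta_1}$$ no matter what value of $$x_1$$ we start from.

```r
# Hypothetical coefficients and a hypothetical starting value of X1
b0 <- -1
b1 <- 0.4
x1 <- 2

odds_at_x1      <- exp(b0 + b1 * x1)        # odds at X1 = x1
odds_at_x1_plus <- exp(b0 + b1 * (x1 + 1))  # odds at X1 = x1 + 1

odds_at_x1_plus / odds_at_x1  # the ratio equals exp(b1), regardless of x1
exp(b1)
```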

### Summary of interpretation of regression coefficients

- The intercept is the log-odds of the outcome when all predictors are at 0 or their reference level. Use the exponential function $$(e^{\beta_0})$$ to convert the intercept to odds and the inverse logit function $$\left(e^{\beta_0} / (1 + e^{\beta_0})\right)$$ to convert the intercept to a probability.
- For a continuous predictor, the regression coefficient is the log of the odds ratio comparing individuals who differ in that predictor by one unit, holding the other predictors fixed.
- For a categorical predictor, the regression coefficient is the log of the odds ratio comparing individuals at a given level of the predictor to those at the reference level, holding the other predictors fixed.
- To compute an OR for $$X_k$$, exponentiate the corresponding regression coefficient, $$e^{\beta_k}$$, thus converting the log of the odds ratio to an OR.
- When there are multiple predictors, ORs are called adjusted ORs (AORs).