7.13 Separation

As with logistic regression (Section 6.10), Cox regression can have a problem with separation. Unlike logistic regression, however, the outcome in a Cox regression is not a simple binary variable, but rather a combination of a binary event indicator and an event time. In logistic regression, separation occurs when the binary outcome is always 0 or always 1 within a level of a predictor. In Cox regression, separation occurs only when the binary event indicator is always 0 within a level of a predictor (all event times censored), because with no events at that level the estimated hazard ratio is pushed toward 0. If all event times within a level are uncensored (everyone experienced the event), there is no separation.

Example 7.7 (continued): Consider the subset of mothers with pre-pregnancy hypertension. We saw earlier that this is a small sample, and there was a zero in the two-way table of RF_PPTERM vs. the event indicator.

subdat1 <- natality %>% 
  filter(RF_PHYPE == "Yes")

table(subdat1$RF_PPTERM, subdat1$preterm01)
##        0  1
##   No  27  5
##   Yes  0  2

This zero does not cause a problem with separation – coxph() runs with no errors and we get an estimated AHR and standard error that seem reasonable (yes, the AHR and upper confidence bound are large, but as we saw earlier this is due to the small sample size).

fit1 <- coxph(Surv(gestage37, preterm01) ~ RF_PPTERM,
              data = subdat1)

# Hazard ratio, 95% confidence interval, and p-value
cbind("HR"      = exp(summary(fit1)$coef[, "coef"]),
      exp(confint(fit1)),
      "p-value" = summary(fit1)$coef[, "Pr(>|z|)"])
##                 HR 2.5 % 97.5 %  p-value
## RF_PPTERMYes 14.91  2.44  91.12 0.003435

However, what if the observations at RF_PPTERM == "Yes" were all censored rather than all events?

subdat2 <- natality %>% 
  filter(RF_PHYPE == "Yes")

# For illustration, change the two times in the "Yes" row to censored
subdat2$preterm01[subdat2$RF_PPTERM == "Yes" &
                  subdat2$preterm01 == 1] <- 0

table(subdat2$RF_PPTERM, subdat2$preterm01)
##        0  1
##   No  27  5
##   Yes  2  0

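Now the zero is in the preterm01 = 1 column, indicating that all the times are censored for RF_PPTERM = "Yes". With these data, coxph() returns a warning and an estimated HR approaching 0, with an uninformative confidence interval running from 0 to \(\infty\). (Had "Yes" been the reference level, the estimated HR would instead approach \(\infty\).)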

fit2 <- coxph(Surv(gestage37, preterm01) ~ RF_PPTERM,
              data = subdat2)
## Warning in coxph.fit(X, Y, istrat, offset, init, control, weights = weights, :
## Loglik converged before variable 1 ; coefficient may be infinite.
# Hazard ratio, 95% confidence interval, and p-value
cbind("HR"      = exp(summary(fit2)$coef[, "coef"]),
      exp(confint(fit2)),
      "p-value" = summary(fit2)$coef[, "Pr(>|z|)"])
##                         HR 2.5 % 97.5 % p-value
## RF_PPTERMYes 0.00000003847     0    Inf   0.999
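
To see why the estimate runs off toward zero, it can help to evaluate the Cox partial log-likelihood directly. The following is a minimal sketch using a small made-up dataset (not the natality data) with no tied times, in which every observation with x = 1 is censored. The function name cox_pl and the toy values are assumptions introduced only for illustration.

```r
# Toy data, invented for illustration: the two x = 1 observations are censored
time  <- c(1, 2, 3, 4, 5, 6, 7, 8)
event <- c(1, 1, 0, 1, 0, 1, 0, 0)
x     <- c(0, 0, 0, 0, 1, 0, 1, 0)

# Hypothetical helper: Cox partial log-likelihood at a given coefficient
# (assumes no tied event times)
cox_pl <- function(beta) {
  sum(sapply(which(event == 1), function(i) {
    at_risk <- time >= time[i]                      # risk set at this event time
    beta * x[i] - log(sum(exp(beta * x[at_risk])))  # contribution of event i
  }))
}

# Evaluate at increasingly negative coefficients (HR = exp(beta) closer to 0)
betas <- c(0, -2, -5, -10, -20)
pl    <- sapply(betas, cox_pl)
cbind(beta = betas, HR = exp(betas), loglik = pl)
```

In this toy example the partial log-likelihood keeps increasing as the coefficient becomes more negative and levels off without ever reaching a maximum, so there is no finite estimate for an optimizer to converge to. That is the behavior the "coefficient may be infinite" warning is flagging.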

Diagnosis and resolution

Before running a Cox regression, always check for separation using a complete case dataset so that the sample is the same one that will be used in the regression. For each categorical predictor, create a two-way table of the predictor vs. the event indicator. For predictors that have any level at which all observations are censored (zero events), solve the problem using filtering, collapsing, or removing, as discussed in Section 6.10.4. An alternative, not covered here, is to use penalized Cox regression. See, for example, the coxphf package (Heinze et al. 2023a).
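
The check described above could be automated along the following lines. This is a sketch only: the helper name zero_event_levels and the toy data frame are made up for illustration, and the event indicator is assumed to be coded 0/1 as in this chapter.

```r
# Toy data, invented for illustration (not the natality data)
toy <- data.frame(
  event = c(1, 0, 1, 0, 0, 1, 0, 0, NA),
  grp   = factor(c("A", "A", "A", "B", "B", "C", "C", "C", "B")),
  sex   = factor(c("F", "M", "F", "M", "F", "M", "F", "M", "F"))
)

# Hypothetical helper: for each categorical predictor, return the levels
# that have zero events among the complete cases
zero_event_levels <- function(data, event, predictors) {
  # Restrict to complete cases so the sample matches the regression
  cc <- data[complete.cases(data[, c(event, predictors)]), ]
  lapply(setNames(predictors, predictors), function(p) {
    # Number of events within each observed level of the predictor
    n_events <- tapply(cc[[event]], droplevels(factor(cc[[p]])), sum)
    names(n_events)[n_events == 0]
  })
}

flagged <- zero_event_levels(toy, event = "event", predictors = c("grp", "sex"))
flagged
```

Here level "B" of grp is flagged because all of its complete-case observations are censored, while sex has no flagged levels. Any level returned by such a check would be a candidate for filtering, collapsing, or removing before fitting the model.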


Heinze, Georg, Meinhard Ploner, Lena Jiricka, and Gregor Steiner. 2023a. Coxphf: Cox Regression with Firth’s Penalized Likelihood. https://cemsiis.meduniwien.ac.at/kb/wf/software/statistische-software/fccoxphf/.