BioLINCC is the National Heart, Lung, and Blood Institute (NHLBI) Biologic Specimen and Data Repository (National Heart, Lung and Blood Institute 2022). BioLINCC has made available the following teaching datasets:
- Digitalis Investigation Group (DIG)
- Childhood Asthma Management Program (CAMP)
- Framingham Heart Study
These teaching datasets, as provided by BioLINCC, are not appropriate for publication purposes. Each has been rendered anonymous through the application of certain statistical processes such as permutations and/or random visit selection.
Any analyses, interpretations, or conclusions reached herein are only for the purpose of illustrating regression methods and are credited to the author, not to BioLINCC. The author makes no claim or implication that any inferences derived from these teaching datasets are valid estimates.
For the versions of these teaching datasets used in this text, the data values were not changed, but some variables were converted to factors with appropriate labels and only a subset of variables was retained. For the CAMP dataset, only data for the main study and the 48-month follow-up were retained. For the Framingham study, the dataset fram_time_invar_rmph.rData contains only time-invariant predictors and excludes individuals with prevalent coronary heart disease, angina pectoris, myocardial infarction, or stroke at baseline. Additionally, variables describing hypertension were set to missing for those with prevalent hypertension at baseline. Additional datasets containing both time-invariant and time-varying information were created for a set of outcomes.
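The kind of processing described above can be sketched in R on a toy data frame. This is an illustration only, not the author's processing script: the values are invented, and the variable names (RANDID, SEX, PREVCHD, PREVAP, PREVMI, PREVSTRK, PREVHYP, HYPERTEN, EXTRA) and the factor labels are assumptions made for the example.

```r
# Toy data standing in for a raw teaching dataset (all values are invented;
# variable names and codings are assumptions for illustration)
raw <- data.frame(
  RANDID   = 1:6,
  SEX      = c(1, 2, 2, 1, 2, 1),
  PREVCHD  = c(0, 0, 1, 0, 0, 0),
  PREVAP   = c(0, 0, 1, 0, 0, 0),
  PREVMI   = c(0, 0, 0, 0, 0, 0),
  PREVSTRK = c(0, 0, 0, 1, 0, 0),
  PREVHYP  = c(0, 1, 0, 0, 0, 1),
  HYPERTEN = c(1, 1, 0, 1, 0, 1),
  EXTRA    = letters[1:6]
)

# Convert a numeric code to a factor with labels (labels are assumed)
raw$SEX <- factor(raw$SEX, levels = c(1, 2), labels = c("Male", "Female"))

# Exclude individuals with any prevalent condition at baseline
prevalent <- with(raw, PREVCHD == 1 | PREVAP == 1 | PREVMI == 1 | PREVSTRK == 1)
processed <- raw[!prevalent, ]

# Set the hypertension variable to missing for those with prevalent hypertension
processed$HYPERTEN[processed$PREVHYP == 1] <- NA

# Retain only a subset of variables
processed <- processed[, c("RANDID", "SEX", "PREVHYP", "HYPERTEN")]
```

The underlying data values are left untouched except where a variable is deliberately set to missing, which mirrors the description above.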
Creating the Teaching Datasets
- Request the DIG, CAMP, and Framingham datasets using the "Request a teaching dataset" link at the NHLBI Teaching Datasets site.
- After your request has been granted, download the datasets, including frmgham2.csv, from the links provided by NHLBI.
- Download the R script files CAMP Process.R and Framingham Process.R from RMPH Resources.
- Run the R scripts CAMP Process.R and Framingham Process.R to process the raw data and create the teaching datasets.
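The final step, running the processing scripts, might look like the following in R. This is a minimal sketch that assumes the downloaded data files and the script files are in the current working directory; the helper `run_scripts()` is hypothetical and only wraps base R's `file.exists()` and `source()`.

```r
# Hypothetical helper: check that every script exists, then source each one
run_scripts <- function(scripts) {
  not_found <- scripts[!file.exists(scripts)]
  if (length(not_found) > 0) {
    stop("Script(s) not found: ", paste(not_found, collapse = ", "))
  }
  for (s in scripts) {
    message("Running ", s)
    source(s)  # evaluated in the global environment by default
  }
  invisible(scripts)
}

# Usage (uncomment after downloading the scripts and raw data):
# run_scripts(c("CAMP Process.R", "Framingham Process.R"))
```

Checking for the files first means a missing download is reported before any script starts running.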
Rows and columns
These files have the following numbers of rows and columns:
##  6800 71
##  629 15
##  4215 20
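Row and column counts like these can be obtained with `dim()`. The sketch below assumes a processed dataset saved as an .rData file; a toy data frame is saved to a temporary file so the example runs without the actual data, and `dims_of_rdata()` and `toy_file` are hypothetical names.

```r
# Hypothetical helper: load an .rData file into its own environment and
# return the dimensions (rows, columns) of each data frame it contains
dims_of_rdata <- function(path) {
  env <- new.env()
  load(path, envir = env)
  objs <- mget(ls(env), envir = env)
  dfs <- objs[vapply(objs, is.data.frame, logical(1))]
  lapply(dfs, dim)
}

# Toy example: save a data frame with 3 rows and 2 columns, then report its size
toy <- data.frame(id = 1:3, x = c(0.5, 1.2, 2.8))
toy_file <- tempfile(fileext = ".rData")
save(toy, file = toy_file)
toy_dims <- dims_of_rdata(toy_file)
print(toy_dims$toy)
```

Loading into a separate environment avoids overwriting objects of the same name in the workspace.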