Chapter 6 Assignment 6 - Streaming Analytics – Understanding Platform Popularity Across Age Groups

6.1 Introduction

I have been hired as a Data Analyst for The Streaming Analytics Division (SAD). I have been tasked with uncovering whether age group influences people’s preferred streaming platform.

The company wants to know if certain platforms (like Netflix, Hulu, Disney+, or Amazon) appeal more to specific age demographics. I am hoping my analysis will help guide targeted marketing, promotional strategies, and content investment decisions that align with audience preferences.

Using simulated survey data, I will conduct a reproducible R Markdown analysis to determine whether Platform Preference and Age Category are related. I will identify which age–platform combinations contribute most to any significant differences and assess the strength of the relationship using Cramer’s V.

6.2 Step 1: Data Preparation

I have been provided a dataset representing survey responses from people of three age groups: 18–25, 26–40, and 41+. Each respondent selected their preferred streaming platform from the following options: Netflix, Hulu, Disney+, Amazon, and Other.

## # A tibble: 1 × 1
##       n
##   <int>
## 1   300

Age group  Amazon  Disney+  Hulu  Netflix  Other
18–25           4       22    23       47      4
26–40          11       25    16       41      7
41+            39       14     7       23     17
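The original analysis was done in R Markdown and its code is not shown here. As a rough, independent cross-check, the table above can be stored in Python and its margins recomputed. This is only a sketch: the variable names (`platforms`, `observed`, `row_totals`, `col_totals`, `n`) are my own and do not come from the original code.

```python
# Observed counts copied from the table above
# (keys: age group, list order: platform). Names are illustrative only.
platforms = ["Amazon", "Disney+", "Hulu", "Netflix", "Other"]
observed = {
    "18–25": [4, 22, 23, 47, 4],
    "26–40": [11, 25, 16, 41, 7],
    "41+": [39, 14, 7, 23, 17],
}

# Row totals: number of respondents in each age group.
row_totals = {age: sum(counts) for age, counts in observed.items()}

# Column totals: number of respondents preferring each platform.
col_totals = {
    platform: sum(counts[j] for counts in observed.values())
    for j, platform in enumerate(platforms)
}

# Grand total: should match the n reported by the tibble output.
n = sum(row_totals.values())

print(row_totals)
print(col_totals)
print(n)
```

Each age group has the same number of respondents, which is why every row of the expected-count table later in the report is identical.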

6.3 Step 2: Visualization

6.3.1 Stacked Bar Chart & Clustered Bar Chart

The graph on the left (stacked bars) shows, for each age group, the proportion of respondents who prefer each streaming platform. The graph on the right (clustered bars) shows each age group's platform preferences side by side.

6.4 Step 3: Chi-Square Test of Independence

## 
##  Pearson's Chi-squared test
## 
## data:  c_table
## X-squared = 68.044, df = 8, p-value = 1.203e-11
  • The Chi-Square statistic (χ²) = 68.044

  • Degrees of freedom (df) = 8

  • The p-value = 1.203e-11 (statistically significant)

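Because the p-value is far below the conventional 0.05 threshold, we reject the null hypothesis that age group and platform preference are independent: platform preference differs significantly across age groups.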
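The reported statistic, degrees of freedom, and p-value can be reproduced by hand from the observed counts. The sketch below is a Python cross-check, not the R code behind the output above. The helper `chi2_sf_even_df` is a hypothetical name of mine; it uses the closed-form upper-tail probability of the chi-square distribution, which holds only when the degrees of freedom are even (as they are here, df = 8).

```python
import math

# Observed counts (rows: 18–25, 26–40, 41+; columns: Amazon, Disney+, Hulu, Netflix, Other).
observed = [
    [4, 22, 23, 47, 4],
    [11, 25, 16, 41, 7],
    [39, 14, 7, 23, 17],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Pearson chi-square: sum over all cells of (observed - expected)^2 / expected,
# where expected = row total * column total / grand total.
chi_sq = 0.0
for i, row in enumerate(observed):
    for j, o in enumerate(row):
        e = row_totals[i] * col_totals[j] / n
        chi_sq += (o - e) ** 2 / e

# Degrees of freedom: (rows - 1) * (columns - 1).
df = (len(row_totals) - 1) * (len(col_totals) - 1)


def chi2_sf_even_df(x, df):
    """Upper-tail chi-square probability P(X > x); closed form for even df only."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form requires a positive, even df")
    half = x / 2
    return math.exp(-half) * sum(half**k / math.factorial(k) for k in range(df // 2))


p_value = chi2_sf_even_df(chi_sq, df)

print(round(chi_sq, 3), df, p_value)
```

For tables with odd degrees of freedom, a statistics library would be the safer choice; the closed form is used here only to keep the example free of dependencies.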

6.5 Step 4: Observed, Expected, and Residual Values

6.5.1 Observed counts

6.5.1.1 The actual frequencies in our dataset:

Age group  Amazon  Disney+  Hulu  Netflix  Other
18–25           4       22    23       47      4
26–40          11       25    16       41      7
41+            39       14     7       23     17

6.5.2 Expected counts

6.5.2.1 What we would expect if age and platform were independent:

Age group  Amazon   Disney+      Hulu  Netflix     Other
18–25          18  20.33333  15.33333       37  9.333333
26–40          18  20.33333  15.33333       37  9.333333
41+            18  20.33333  15.33333       37  9.333333
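Under independence, each expected count is the row total times the column total divided by the grand total. A minimal Python sketch of that calculation follows (again a cross-check with my own variable names, not the original R code). It also reports the smallest expected count, since a common rule of thumb, not stated in the original analysis, is that expected counts should be at least 5 for the chi-square approximation to be reasonable.

```python
ages = ["18–25", "26–40", "41+"]
platforms = ["Amazon", "Disney+", "Hulu", "Netflix", "Other"]
observed = [
    [4, 22, 23, 47, 4],
    [11, 25, 16, 41, 7],
    [39, 14, 7, 23, 17],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected count for each cell: row total * column total / grand total.
expected = [[r * c / n for c in col_totals] for r in row_totals]

for age, row in zip(ages, expected):
    print(age, [round(e, 2) for e in row])

# Rule-of-thumb check (an assumption added here): all expected counts >= 5.
smallest = min(min(row) for row in expected)
print("smallest expected count:", round(smallest, 2))
```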

6.5.3 Residuals

6.5.3.1 Pearson residuals: the difference between observed and expected counts, divided by the square root of the expected count:

Age group     Amazon     Disney+        Hulu     Netflix       Other
18–25      -3.299832   0.3696106   1.9578900   1.6439899  -1.7457431
26–40      -1.649916   1.0349098   0.1702513   0.6575959  -0.7637626
41+         4.949747  -1.4045204  -2.1281413  -2.3015858   2.5095057

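Respondents aged 41+ chose Amazon far more often than expected (residual 4.95) and chose Hulu (-2.13) and Netflix (-2.30) less often than expected. Respondents aged 18–25 show the opposite pattern: they chose Hulu (1.96) and Netflix (1.64) more often than expected and Amazon far less often (-3.30). Respondents aged 26–40 also chose Amazon less often than expected (-1.65), although that deviation is smaller.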
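The residuals in the table are consistent with Pearson residuals, (observed - expected) / sqrt(expected). The Python sketch below recomputes them and lists the cells whose residual exceeds 2 in absolute value. That cutoff is a common rule of thumb that I am assuming for illustration; it is not a threshold stated in the original analysis, and the variable names are my own.

```python
import math

ages = ["18–25", "26–40", "41+"]
platforms = ["Amazon", "Disney+", "Hulu", "Netflix", "Other"]
observed = [
    [4, 22, 23, 47, 4],
    [11, 25, 16, 41, 7],
    [39, 14, 7, 23, 17],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)
expected = [[r * c / n for c in col_totals] for r in row_totals]

# Pearson residual for each cell: (observed - expected) / sqrt(expected).
residuals = [
    [(o - e) / math.sqrt(e) for o, e in zip(obs_row, exp_row)]
    for obs_row, exp_row in zip(observed, expected)
]

# Cells that deviate notably from independence (|residual| > 2, an assumed cutoff).
flagged = [
    (ages[i], platforms[j], round(r, 2))
    for i, row in enumerate(residuals)
    for j, r in enumerate(row)
    if abs(r) > 2
]

for cell in flagged:
    print(cell)
```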

6.6 Step 5: Contributions to the Chi-Square Statistic

6.6.1 Contributions

Age group     Amazon   Disney+       Hulu    Netflix      Other
18–25      10.888889  0.136612  3.8333333  2.7027027  3.0476190
26–40       2.722222  1.071038  0.0289855  0.4324324  0.5833333
41+        24.500000  1.972678  4.5289855  5.2972973  6.2976190

6.6.2 Percent Contributions

6.6.2.1 Which age-platform pairs drive the overall result?

Age group     Amazon    Disney+       Hulu    Netflix      Other
18–25      16.002777  0.2007709  5.6336306  3.9720074  4.4789112
26–40       4.000694  1.5740436  0.0425983  0.6355212  0.8572916
41+        36.006248  2.8991313  6.6559907  7.7851346  9.2552502

6.6.2.2 Let’s visualize it:

The largest contribution to the chi-square statistic comes from the 41+/Amazon cell (about 36%), followed by the 18–25/Amazon cell (about 16%). These two cells contribute the most because far more respondents aged 41+ and far fewer respondents aged 18–25 preferred Amazon than expected under independence.
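Each cell's contribution is (observed - expected)^2 / expected, and its percent contribution is that value divided by the total chi-square statistic. The following Python sketch (a cross-check with my own names, not the original R code) recomputes both and ranks the cells.

```python
ages = ["18–25", "26–40", "41+"]
platforms = ["Amazon", "Disney+", "Hulu", "Netflix", "Other"]
observed = [
    [4, 22, 23, 47, 4],
    [11, 25, 16, 41, 7],
    [39, 14, 7, 23, 17],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Contribution of each cell to the chi-square statistic.
contributions = {}
for i, age in enumerate(ages):
    for j, platform in enumerate(platforms):
        e = row_totals[i] * col_totals[j] / n
        contributions[(age, platform)] = (observed[i][j] - e) ** 2 / e

# The contributions sum to the chi-square statistic itself.
chi_sq = sum(contributions.values())

# Percent contribution of each cell, ranked from largest to smallest.
percent = {cell: 100 * value / chi_sq for cell, value in contributions.items()}
ranked = sorted(percent.items(), key=lambda item: item[1], reverse=True)

for (age, platform), pct in ranked[:3]:
    print(f"{age} / {platform}: {pct:.1f}%")
```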

6.7 Step 6: Effect Size (Cramer’s V)

## Cramer V 
##   0.3368

Cramer’s V = 0.34 indicates a moderate association between Age Category and Platform: a person’s age group gives some indication of their preferred platform, but it does not reliably predict it. The percent contributions suggest that the association is driven largely by Amazon. In this sample, preference for Amazon rises with age category (4, 11, and 39 respondents in the 18–25, 26–40, and 41+ groups, respectively).
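Cramer's V rescales the chi-square statistic by the sample size and the smaller table dimension: V = sqrt(chi-square / (N * (min(rows, columns) - 1))). A short Python sketch of that formula, using my own variable names and the observed counts from this report rather than the original R code:

```python
import math

observed = [
    [4, 22, 23, 47, 4],
    [11, 25, 16, 41, 7],
    [39, 14, 7, 23, 17],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Pearson chi-square statistic from observed and expected counts.
chi_sq = sum(
    (o - row_totals[i] * col_totals[j] / n) ** 2 / (row_totals[i] * col_totals[j] / n)
    for i, row in enumerate(observed)
    for j, o in enumerate(row)
)

# Cramer's V: sqrt(chi-square / (N * (min(rows, columns) - 1))).
k = min(len(row_totals), len(col_totals)) - 1
cramers_v = math.sqrt(chi_sq / (n * k))

print(round(cramers_v, 4))
```

With three age groups and five platforms, the smaller dimension is 3, so the denominator is 300 * 2 = 600.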

6.8 Step 7: Final Interpretation

The chi-square test revealed a significant relationship between age and platform preference, χ²(8, N = 300) = 68.04, p < .001. The largest contributions came from the 41+/Amazon and 18–25/Amazon cells. Cramer’s V = 0.34 indicates a moderate association between Age Category and Platform. Respondents aged 18–25 chose Amazon far less often than expected and respondents aged 41+ chose it far more often; the youngest group instead leaned toward Netflix and Hulu.