First, check the data in esoph with head. I actually prefer to inspect it with the View function, but its result does not show up in the HTML output.
head(esoph)
## agegp alcgp tobgp ncases ncontrols
## 1 25-34 0-39g/day 0-9g/day 0 40
## 2 25-34 0-39g/day 10-19 0 10
## 3 25-34 0-39g/day 20-29 0 6
## 4 25-34 0-39g/day 30+ 0 5
## 5 25-34 40-79 0-9g/day 0 27
## 6 25-34 40-79 10-19 0 7
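Before fitting, it may help to see how the grouping variables are stored, since the first model below wraps tobgp and alcgp in unclass(). The following is a minimal sketch, assuming the standard esoph data set from base R's datasets package; the name tob_codes is only used here for illustration.

```r
# esoph ships with base R (package 'datasets'), so no extra packages are needed.
data(esoph)

# Show the column types: the three grouping variables are ordered factors.
str(esoph)
is.ordered(esoph$tobgp)
levels(esoph$tobgp)

# unclass() drops the factor class and leaves the underlying integer codes,
# so a term like unclass(tobgp) enters a model as a single numeric trend
# (1 degree of freedom) instead of one parameter per level.
tob_codes <- as.integer(unclass(esoph$tobgp))
table(esoph$tobgp, tob_codes)
```

This is why unclass(tobgp) and unclass(alcgp) each use 1 degree of freedom in the first table below, while tobgp and alcgp use 3 each in the second.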
cancer_tobacco_alcohol_sep <- glm(cbind(ncases, ncontrols) ~ agegp + unclass(tobgp) + unclass(alcgp), data = esoph, family = binomial())
anova(cancer_tobacco_alcohol_sep)
## Analysis of Deviance Table
##
## Model: binomial, link: logit
##
## Response: cbind(ncases, ncontrols)
##
## Terms added sequentially (first to last)
##
##
## Df Deviance Resid. Df Resid. Dev
## NULL 87 227.241
## agegp 5 88.128 82 139.112
## unclass(tobgp) 1 17.522 81 121.591
## unclass(alcgp) 1 62.314 80 59.277
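Looking at the Deviance column, we see that the most important factor is age, which reduces the deviance by 88.128 on 5 degrees of freedom. Alcohol usage comes next (62.314 on 1 degree of freedom), followed by tobacco usage (17.522 on 1 degree of freedom). Note that the terms are added sequentially, so these values depend on the order of the terms in the formula.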
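The deviance table above has no p-values, and its entries depend on the order of the terms. As a rough sketch of how this could be checked (the object names fit_sep, dev_tab and dev_per_df are my own, not from the analysis above), one can ask anova for chi-squared tests and use drop1 for an order-independent view:

```r
# Refit the same model as above under a shorter, hypothetical name.
fit_sep <- glm(cbind(ncases, ncontrols) ~ agegp + unclass(tobgp) + unclass(alcgp),
               data = esoph, family = binomial())

# Sequential analysis of deviance, now with a chi-squared p-value per term.
dev_tab <- anova(fit_sep, test = "Chisq")
print(dev_tab)

# Deviance per degree of freedom, a crude way to compare terms
# that use different numbers of parameters (the NULL row gives NA).
dev_per_df <- dev_tab$Deviance / dev_tab$Df
names(dev_per_df) <- rownames(dev_tab)
dev_per_df

# drop1() removes each term from the full model in turn,
# so its result does not depend on the order in the formula.
drop1(fit_sep, test = "Chisq")
```

Comparing deviance per degree of freedom is only a heuristic; the p-values from the chi-squared tests are the more usual basis for judging each term.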
cancer_tobacco_alcohol_comb <- glm(cbind(ncases, ncontrols) ~ agegp + tobgp * alcgp, data = esoph, family = binomial())
anova(cancer_tobacco_alcohol_comb)
## Analysis of Deviance Table
##
## Model: binomial, link: logit
##
## Response: cbind(ncases, ncontrols)
##
## Terms added sequentially (first to last)
##
##
## Df Deviance Resid. Df Resid. Dev
## NULL 87 227.241
## agegp 5 88.128 82 139.112
## tobgp 3 19.085 79 120.028
## alcgp 3 66.054 76 53.973
## tobgp:alcgp 9 6.489 67 47.484
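In this analysis, the tobgp:alcgp term represents the extra effect of using alcohol and tobacco at the same time. It reduces the deviance by only 6.489 on 9 degrees of freedom, which is less than the 9 expected by chance alone, so the combined usage adds little beyond the separate alcohol and tobacco effects. Age is still the most important factor (88.128 on 5 degrees of freedom), followed by alcohol (66.054 on 3) and tobacco (19.085 on 3).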
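One way to judge how much the tobgp:alcgp interaction contributes is to compare the model with and without it. The sketch below assumes the same esoph data; fit_main, fit_comb, cmp and p_from_table are illustrative names, and the value 6.489 is taken from the table above.

```r
# Main-effects model: age, tobacco and alcohol, no interaction.
fit_main <- glm(cbind(ncases, ncontrols) ~ agegp + tobgp + alcgp,
                data = esoph, family = binomial())

# Same model plus the tobacco-by-alcohol interaction
# (equivalent to agegp + tobgp * alcgp).
fit_comb <- update(fit_main, . ~ . + tobgp:alcgp)

# Likelihood-ratio (chi-squared) test of the nested models.
cmp <- anova(fit_main, fit_comb, test = "Chisq")
print(cmp)

# The same test computed by hand from the reported table:
# deviance 6.489 on 9 degrees of freedom.
p_from_table <- pchisq(6.489, df = 9, lower.tail = FALSE)
p_from_table

# AIC penalises the 9 extra parameters; lower is better.
AIC(fit_main, fit_comb)
```

A large p-value here would suggest that the simpler main-effects model describes the data about as well as the model with the interaction.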