“So why did they get such different results from so many earlier studies? In their response to Kripke, they offer a clear answer:
They adjusted for three hundred confounders.
This is a totally unreasonable number of confounders to adjust for. I’ve never seen any other study do anything even close. Most other papers in this area have adjusted for ten or twenty confounders. Kripke’s study adjusted for age, sex, ethnicity, marital status, BMI, alcohol use, smoking, and twelve diseases. Adjusting for nineteen things is impressive. It’s the sort of thing you do when you really want to cover your bases. Adjusting for 300 different confounders is totally above and beyond what anyone would normally consider.
Reading between the lines, one of the P&a co-authors was Robert Glynn, a Harvard professor of statistics who helped develop an algorithm that automatically identifies massive numbers of confounders to form a ‘propensity score’, then adjusts for it. The P&a study was one of the first applications of the algorithm on a controversial medical question. It looks like this study was partly intended to test it out. And it got the opposite result from almost every past study in this field.”—Scott Alexander, “More Confounders.” Slate Star Codex. June 24, 2019.
Open question: Are sleep aids bad for you?
Open question: Are confounders one of the central problems of reproducibility in science?
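To make "adjusting for a confounder" concrete, here is a minimal sketch with entirely made-up counts; none of these numbers come from either study. It assumes a single hypothetical confounder, baseline health, that makes people both more likely to take a sleep aid and more likely to die. This is plain stratification, not the propensity-score algorithm mentioned in the quote. A propensity score compresses many confounders into one estimated probability of treatment, but the comparison it enables follows the same idea: compare like with like.

```python
# Hypothetical counts: (health stratum, took sleep aid) -> (people, deaths).
# Within each stratum the death rate is identical for users and non-users,
# but sick people are far more likely to be users.
counts = {
    ("healthy", False): (1000, 10),
    ("healthy", True): (100, 1),
    ("sick", False): (100, 10),
    ("sick", True): (1000, 100),
}


def naive_risk_difference(counts):
    """Death risk of users minus non-users, ignoring the confounder."""
    totals = {True: [0, 0], False: [0, 0]}
    for (_, treated), (people, deaths) in counts.items():
        totals[treated][0] += people
        totals[treated][1] += deaths
    risk_treated = totals[True][1] / totals[True][0]
    risk_untreated = totals[False][1] / totals[False][0]
    return risk_treated - risk_untreated


def adjusted_risk_difference(counts):
    """Risk difference computed within each stratum, then averaged
    with weights proportional to stratum size."""
    strata = {stratum for stratum, _ in counts}
    total_people = sum(people for people, _ in counts.values())
    difference = 0.0
    for stratum in strata:
        n_treated, d_treated = counts[(stratum, True)]
        n_untreated, d_untreated = counts[(stratum, False)]
        weight = (n_treated + n_untreated) / total_people
        difference += weight * (d_treated / n_treated - d_untreated / n_untreated)
    return difference


naive = naive_risk_difference(counts)
adjusted = adjusted_risk_difference(counts)
print(f"naive risk difference:    {naive:.4f}")
print(f"adjusted risk difference: {adjusted:.4f}")
```

In this toy data the naive comparison makes the sleep aid look harmful, while the stratified comparison shows no difference at all. Real studies disagree precisely because nobody knows the full list of strata to compare within.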