Correlation is Not Causation

By Susan Goldhaber MPH — Oct 20, 2021
“Synthetic chemical in consumer products linked to early death, study finds.” “People with the highest levels of phthalates had a greater risk of death from any cause, especially cardiovascular mortality, according to a study published today in a peer-reviewed journal.” Let’s take a look behind the headlines, at the study itself, to see what it actually says. [1]
Image by jamesoladujoye from Pixabay

Before jumping in, let’s provide a bit of background.

Phthalates are a large group of chemicals that increase the flexibility and durability of plastic. They are present in thousands of everyday products, including food packaging, textiles, cosmetics, soaps, medical devices, and construction materials. As I discussed previously, phthalates have been reviewed by a number of government agencies and found to be safe. However, they are facing increasing scrutiny from government agencies, including the EPA, and have been targeted for bans by some states and by environmental groups claiming that phthalates cause a myriad of health effects in the population.

What we are all trying to understand is whether there is a causal relationship between exposure to phthalates and early deaths due to heart and other diseases.

What is the new study that drove the scary headlines?

Trasande and his colleagues examined the association of exposure to phthalates with deaths in the US, calculating the total costs of increased deaths and lost economic productivity. The “gold standard” remains the randomized controlled trial, in which an unexposed control group is compared with a group exposed to the treatment or, in this case, the chemical of concern. However, even a single randomized trial does not provide definitive proof; there need to be many more studies showing the same result, along with other confirmatory data.

This population-based study is considered secondary evidence and demonstrates correlation, not causation. Participants are grouped by exposure, and the difference in outcomes between the groups (in this instance, disease and deaths) is assessed. The result is presented as a hazard ratio (HR) between the higher and lower exposure groups. An HR greater than 1 suggests a possible correlation between the chemical exposure and disease or deaths, but again does not show that the chemical caused the disease or deaths.
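To make the idea of a ratio between exposure groups concrete, here is a minimal sketch in Python. It computes a crude death-rate ratio, which is only a rough stand-in for a hazard ratio (the two coincide only under the simplifying assumption of constant death rates over time; the study itself would have used a regression model). The function name and all of the numbers are hypothetical and are not taken from the study.

```python
def rate_ratio(deaths_high, person_years_high, deaths_low, person_years_low):
    """Crude death-rate ratio: rate in the high-exposure group / rate in the low-exposure group."""
    rate_high = deaths_high / person_years_high
    rate_low = deaths_low / person_years_low
    return rate_high / rate_low


# Hypothetical counts for illustration only (NOT from the study).
ratio = rate_ratio(
    deaths_high=60, person_years_high=10_000,
    deaths_low=50, person_years_low=10_000,
)
print(f"crude rate ratio = {ratio:.2f}")
```

A ratio above 1 here says only that deaths were more frequent in the high-exposure group. It says nothing about why.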

Eleven phthalate metabolites (breakdown products of phthalates measurable in urine) were measured in 5,303 adults in 2001-2002 or 2009-2010. Phthalate levels in urine were grouped into three categories: low, medium, or high. Phthalates were characterized as coming from personal care products and cosmetics or used in food packaging materials and flooring. Deaths among participants through 2015 were obtained from the National Death Index, a CDC database. Hazard ratios were calculated for all-cause, cardiovascular, and cancer deaths. The ratios were extrapolated to the US population and multiplied by the lifetime economic productivity loss to determine economic loss.

What did the authors find?

  • Increases in all-cause deaths correlated with exposure to phthalates used in food packaging and flooring.    
  • Increases in cardiovascular deaths correlated with exposure to only one of the eleven metabolites.
  • Phthalates had no impact on deaths from cancer.
  • Extrapolating to the US population (55- to 64-year-olds) resulted in roughly 100,000 deaths and an estimated $43 billion in lost productivity due to phthalate exposure.

Based upon these findings, the authors concluded that, although further studies were needed, regulatory action was urgently required. I respectfully disagree, and here is why.

A significant shortcoming of the study is the inclusion of participants with known cardiovascular disease or cancer at the beginning of the study. These participants would be more likely to die of these diseases than the other participants, skewing the study results. The authors apparently recognized this, because they reanalyzed their data excluding these individuals, which reduced the study to 3,951 participants. Interestingly, the results of this reanalysis appear only in the supplement, and they show no correlation between phthalates and cardiovascular deaths. They were not included in the discussion or in the extrapolation of economic costs, which makes the reported numbers significantly overstated.

Correlation does not imply causation.

Even if two factors are correlated, there is no way to tell from this type of study whether or not phthalate exposure actually caused increased all-cause deaths.

  • Instead of X causing Y, a third hidden variable (Z) might affect both, resulting in correlation – the third-cause fallacy. 

For example, as ice cream sales increase, the rate of drowning increases sharply. Therefore, eating ice cream causes drowning. This example fails to recognize that more ice cream is sold during hot summer months than in colder weather, and it is in these months that more people go swimming. The increased drowning deaths are caused by more exposure to water-based activities, such as swimming, not by ice cream.

  • Or two variables are not related at all, and the correlation appears by chance – a spurious correlation.

Per capita consumption of mozzarella cheese correlates with the number of civil engineering doctorates awarded. Not recognizing that this is a spurious correlation would lead you to conclude that eating a lot of cheese will get you a civil engineering Ph.D. You can find additional humorous spurious correlations here.
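Both failure modes above can be reproduced with made-up data. The following is a minimal sketch, not an analysis of any real dataset: the temperatures, sales, and drowning figures are simulated, and every coefficient and variable name is an assumption chosen purely for illustration. The first part lets temperature drive both ice cream sales and drownings, then checks what happens to the correlation when temperature is held roughly constant. The second part searches a pile of unrelated random series for the one that happens to line up best with a short target series.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


rng = random.Random(0)  # fixed seed so the run is repeatable

# 1) Third variable: temperature (Z) drives both ice cream sales (X)
#    and drownings (Y). X has no direct effect on Y in this simulation.
temps = [rng.uniform(0, 35) for _ in range(5000)]
ice_cream = [2.0 * t + rng.gauss(0, 10) for t in temps]
drownings = [0.5 * t + rng.gauss(0, 3) for t in temps]
r_overall = pearson(ice_cream, drownings)

# Hold Z roughly constant: keep only days between 20 and 22 degrees.
band = [i for i, t in enumerate(temps) if 20 <= t <= 22]
r_within_band = pearson(
    [ice_cream[i] for i in band],
    [drownings[i] for i in band],
)

# 2) Chance: compare one random 10-point series with 1,000 unrelated ones
#    and keep the strongest correlation found.
target = [rng.gauss(0, 1) for _ in range(10)]
candidates = [[rng.gauss(0, 1) for _ in range(10)] for _ in range(1000)]
r_best_by_chance = max(abs(pearson(target, c)) for c in candidates)

print(f"ice cream vs. drownings, all days:        r = {r_overall:.2f}")
print(f"ice cream vs. drownings, 20-22 degrees:   r = {r_within_band:.2f}")
print(f"best of 1,000 unrelated random series:  |r| = {r_best_by_chance:.2f}")
```

In this toy setup the overall correlation is strong, yet it largely disappears once temperature is held nearly fixed. The last line shows how searching through enough unrelated series turns up an impressive-looking match by luck alone.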

Conclusions

The study authors greatly overstate their results. Consider two significant limitations. First, urinary phthalate levels were measured at only one time point; because phthalates have a half-life of 1-3 days, a single measurement is not representative of long-term exposure. Second, death certificates can be inaccurate and may not reflect underlying conditions contributing to the deaths.
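To see why a single urine sample says little about long-term exposure, consider how quickly a dose disappears. The sketch below assumes simple exponential (first-order) elimination, which is a simplification I am introducing for illustration; the function name is hypothetical, and only the 1-3 day half-life range comes from the discussion above.

```python
def fraction_remaining(days_elapsed, half_life_days):
    """Fraction of a single dose still present, assuming simple exponential decay."""
    return 0.5 ** (days_elapsed / half_life_days)


# One week after a single exposure, at both ends of the 1-3 day half-life range.
for half_life in (1, 3):
    left = fraction_remaining(days_elapsed=7, half_life_days=half_life)
    print(f"half-life {half_life} day(s): {left:.1%} of the dose remains after 7 days")
```

Under this assumption, less than 1% of a dose remains after a week with a 1-day half-life, and roughly a fifth remains with a 3-day half-life. Either way, one measurement mostly reflects the previous few days, not years of exposure.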

At the very most, the data suggest a potential association between certain phthalates and increased mortality. It is more likely that this is a spurious correlation or a correlation caused by a third variable. The latest FDA guidance on the health effects of salt indicates that we consume too much salt. Could this be that hidden third variable? Salt tends to be higher in processed food than unprocessed food, and processed foods are more likely to be packaged than unprocessed food.  Thus, it could be the salt that is causing the increased deaths, not the phthalates.

Finally, the calculated excess deaths and loss of economic productivity in the US imply a degree of certainty that misleads well-intentioned journalists and misinforms the public. When studies such as this overreach with their conclusions, demanding regulatory action, I tend to be suspicious and wonder whether they were written to fulfill a particular agenda instead of a scientific one. This study is a perfect example of why it is essential to read the study itself, not just the abstract. The most important parts of a study are often hidden in the details that only become apparent after a complete reading.

[1] “Synthetic chemical in consumer products linked to early death, study finds,” CNN.