In business, we put a lot of faith in statistics to tell us about very large amounts of information or data in digestible figures. Practical knowledge of underlying statistical principles, therefore, makes you a much stronger participant in business decision-making—and helps you avoid the pitfalls of inaccurate conclusions that lead to business failure.
One of the first lessons people learn in statistics (a fundamental pillar of data science and analytics in general) is that “correlation does not imply causation.” It is also one of the most abused lessons in business statistics.
This lesson is used to end discussions about scientific findings prematurely, serves as a fallacious but enduring criticism of statistical reasoning, and ultimately stands in as a means to push away inconvenient data.
To make great business decisions and to use analytics responsibly, we cannot push away data when it’s inconvenient. Data scientists must interpret and incorporate data precisely when it doesn’t confirm our beliefs about the business.
Now, in one sense, the claim that correlation does not imply causation is certainly true, and a student who fails to recognize this will make many statistical errors and reason poorly. For example: did you know that there is a positive correlation between ice cream consumption and deaths by swimming pool drowning?
The lesson here is not that ice cream consumption causes drowning. The lesson is that variables can be correlated, but finding a correlation is not sufficient to declare a causal relationship. In the ice cream case, the likelier explanation is a common cause, warmer weather: people eat more ice cream in warmer weather and also swim more often.
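To make the ice cream case concrete, here is a minimal sketch in Python. The data is entirely synthetic, and the variable names, coefficients, and noise levels are assumptions chosen for illustration, not real figures. Temperature drives both series, and neither series influences the other. The sketch compares the raw correlation with a first-order partial correlation that controls for temperature.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


def partial_corr(xs, ys, zs):
    """Correlation of xs and ys after controlling for zs."""
    rxy, rxz, ryz = pearson(xs, ys), pearson(xs, zs), pearson(ys, zs)
    return (rxy - rxz * ryz) / math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))


# Synthetic data (assumed for illustration): temperature is the common cause.
rng = random.Random(0)
temperature = [rng.uniform(0, 35) for _ in range(5000)]
ice_cream = [2.0 * t + rng.gauss(0, 10) for t in temperature]
drownings = [0.5 * t + rng.gauss(0, 3) for t in temperature]

raw = pearson(ice_cream, drownings)
adjusted = partial_corr(ice_cream, drownings, temperature)

print(f"raw correlation:             {raw:.2f}")
print(f"controlling for temperature: {adjusted:.2f}")
```

In this setup the raw correlation comes out strongly positive, while the adjusted figure sits close to zero. The correlation was a real hint, but what it pointed to was the weather.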
The abuse of this notion occurs when it is taken to an extreme: when the subtle claim that “a correlation measures the interdependence of variables rather than describing a causal relationship” is turned into a proclamation that correlation has nothing to say about causation. This interpretation is patently false; it fails to appreciate one of the main reasons why we use correlations: to discern causal relationships between events.
We use correlation as a stepping stone to unearth and understand relationships and to positively impact the business world with new (and well-founded) opinions. Correlation is not the final word on a subject, but a starting point in the pursuit of causal relationships. Once we’ve aligned our thinking toward this end (a full embrace of our curiosity and of the scientific pursuit of causal relationships), we can enter into the finer points of statistical analysis as it concerns causation.
Say you’re selling outdoor equipment, and your seasoned Sales & Marketing executive presents you with the following business question: how does our pricing affect sales velocity? At first glance you see three variables (time, price, and sales), so you procure that perfect data set, dive right into your analysis, get results, and craft your response from there (since computing p-values and F-tests of overall significance can be done at the click of a button). However, when you share your results with a coworker before sending them up the chain, she suggests there may actually be a fourth factor, weather, that you failed to account for. You check, and find that it does muddy your previously ironclad three-factor findings. Adopting a causation-first mindset broadens your thinking and brings with it an openness to surprise at the hidden and the unknown.
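Here is a rough sketch of how an omitted factor like weather can muddy a pricing analysis. Everything in it is an assumption made for illustration: the data is simulated, the time dimension is left out for brevity, and the numbers (including a true price effect of -2 units of sales per unit of price) are invented. Good weather pushes up both prices and demand, so a regression of sales on price alone picks up the weather effect. The adjusted estimate first removes the part of price and sales explained by weather, then regresses what is left, which gives the same price coefficient as a multiple regression that includes weather.

```python
import random


def slope(xs, ys):
    """Slope of a simple least-squares regression of ys on xs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


def residuals(xs, ys):
    """What is left of ys after removing its linear relationship with xs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = slope(xs, ys)
    return [(y - my) - b * (x - mx) for x, y in zip(xs, ys)]


# Simulated data (assumed): weather raises both price and sales.
rng = random.Random(1)
weather = [rng.gauss(0, 1) for _ in range(2000)]
price = [50 + 5 * w + rng.gauss(0, 2) for w in weather]
sales = [200 - 2 * p + 30 * w + rng.gauss(0, 5) for p, w in zip(price, weather)]

# Three-factor view without weather: regress sales on price only.
naive = slope(price, sales)

# Account for weather: regress the weather-adjusted sales on weather-adjusted price.
adjusted = slope(residuals(weather, price), residuals(weather, sales))

print(f"price effect ignoring weather:    {naive:+.2f}")
print(f"price effect controlling weather: {adjusted:+.2f}")
```

In this simulation the naive estimate has the wrong sign and suggests that raising prices lifts sales, while the adjusted estimate lands near the assumed value of -2. Real data will rarely be this tidy, but the pattern is the one your coworker was pointing at.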
The next time you are working with correlations in your data, be curious about the real relationships, dig into the data and see if you can discern a deeper understanding of how A might influence B, then C, and maybe discover a D as well. This search for causation is where the real magic and insight in data analytics happens. In the words of the data visualization master Edward Tufte, “Correlation isn’t causation, but sure is a hint.”
This post was written in collaboration with data scientist Christo Lute.