Correlation, Causation, and Association: What Does It All Mean?
Let's clear something up: Correlation isn't causation, but it's important.
Posted Mar 30, 2010
In this instance the reader was mistaken, as I had specifically used the word "associated," but the comment made me think that maybe I should explain the differences between correlation, causation, and association. I'm a scientist studying addiction, and in the field, it's very important to be clear about what each of the words you use means.
Being clear about inferences in research
Correlation. When researchers find a correlation, which can also be called an association, what they are saying is that they found a relationship between two, or more, variables.
Correlations can be positive — so that as one variable (marijuana smoking) goes up, so does the other (relationship trouble); or they can be negative, which would mean that as one variable goes up (methamphetamine smoking) another goes down (grade point average).
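To make the positive and negative cases concrete, here is a minimal sketch that computes a Pearson correlation coefficient by hand. The numbers are made up purely for illustration (they are not data from any study), and the names `pearson_r`, `use_per_week`, `relationship_trouble`, and `gpa` are hypothetical.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations from the means (unnormalized covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (unnormalized variances).
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Invented illustrative values, NOT real data.
use_per_week = [0, 1, 2, 3, 4, 5]
relationship_trouble = [1, 2, 2, 4, 5, 6]  # tends to rise as use rises
gpa = [3.9, 3.6, 3.4, 3.0, 2.7, 2.5]       # tends to fall as use rises

r_pos = pearson_r(use_per_week, relationship_trouble)  # positive correlation
r_neg = pearson_r(use_per_week, gpa)                   # negative correlation

print(f"use vs. trouble: r = {r_pos:.2f}")
print(f"use vs. GPA:     r = {r_neg:.2f}")
```

The sign of `r` tells you the direction of the relationship, and its size (between -1 and 1) tells you how tightly the two variables move together. Neither one tells you why they move together.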
The trouble is that, unless they are properly controlled for, there could be other variables affecting this relationship that the researchers don't know about. For instance, education, gender, and mental health issues could be behind the marijuana-relationship association (these variables were all controlled for by the researchers in that study).
Researchers have at their disposal a number of sophisticated statistical tools to control for these, ranging from the relatively simple (like multiple regression) to the highly complex and involved (multi-level modeling and structural equation modeling). These methods allow researchers to separate the effect of one variable from others, thereby leaving them more confident in making assertions about the true nature of the relationships they found. Still, even under the best analysis circumstances, correlation is not the same as causation.
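Here is a rough sketch of what "controlling for" a variable can look like. It uses a partial correlation, which is a simpler cousin of multiple regression rather than the exact method any particular study used. The data are simulated under an assumption I chose for the demo: a hidden variable `z` drives both `x` and `y`, and `x` has no effect on `y` at all. All names here are hypothetical.

```python
import random
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def partial_r(xs, ys, zs):
    """Correlation between xs and ys after removing the linear influence of zs."""
    r_xy = pearson_r(xs, ys)
    r_xz = pearson_r(xs, zs)
    r_yz = pearson_r(ys, zs)
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


rng = random.Random(0)  # fixed seed so the demo is repeatable
n = 5000

# Simulated assumption: z influences both x and y; x does NOT influence y.
z = [rng.gauss(0, 1) for _ in range(n)]
x = [zi + rng.gauss(0, 1) for zi in z]
y = [zi + rng.gauss(0, 1) for zi in z]

raw = pearson_r(x, y)          # looks like a real x-y relationship
adjusted = partial_r(x, y, z)  # shrinks toward zero once z is accounted for

print(f"raw correlation:      {raw:.2f}")
print(f"controlling for z:    {adjusted:.2f}")
```

The catch is that this only works for variables the researchers thought to measure. A confounder nobody recorded can't be controlled for this way, which is one reason correlation still isn't causation.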
Causation. When an article says that causation was found, this means that the researchers found that changes in one variable they measured directly caused changes in the other.
An example would be research showing that jumping off a cliff directly causes great physical damage. To show this, researchers would need to randomly assign people to jump off a cliff (versus, let's say, jumping off of a 12-inch ledge) and measure the amount of physical damage caused. When they find that jumping off the cliff causes more damage, they can assert causality. (Good luck recruiting for that study!)
Most of the research you read about indicates a correlation between variables, not causation. You can usually tell which one you're dealing with by reading carefully for keywords. If the article says something like "men were found to have," or "women were more likely to," they're talking about associations, not causation.
Why the difference?
The reason is that in order to actually be able to claim causation, the researchers have to randomly split the participants into different groups and assign one group the behavior they want to study (like taking a new drug), while the other group goes without it.
This is in fact what happens in clinical trials of medication because the FDA requires proof that the medication actually makes people better (more so than a placebo). It's this random assignment to conditions that makes experiments suitable for the discovery of causality. Unlike in association studies, random assignment ensures (if everything is designed correctly) that it's the behavior being studied, and not some other outside factor, that is causing the outcome.
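A small simulation can show why random assignment matters. This is only a sketch under assumptions I picked for the demo: a hidden trait affects the outcome, and the behavior itself has no effect at all (`TRUE_EFFECT = 0.0`). In the "observational" version people with more of the hidden trait choose the behavior themselves; in the "randomized" version a coin flip decides. All names are hypothetical.

```python
import random

rng = random.Random(1)  # fixed seed so the demo is repeatable
N_PEOPLE = 10000
TRUE_EFFECT = 0.0       # assumption: the behavior does nothing to the outcome


def mean(values):
    return sum(values) / len(values)


def group_difference(assign):
    """Average outcome of the 'behavior' group minus the comparison group.

    `assign` takes a person's hidden trait and returns True if that person
    ends up in the 'behavior' group.
    """
    behavior_group, comparison_group = [], []
    for _ in range(N_PEOPLE):
        hidden_trait = rng.gauss(0, 1)
        in_behavior_group = assign(hidden_trait)
        # Outcome depends on the hidden trait, the (zero) true effect, and noise.
        outcome = hidden_trait + TRUE_EFFECT * in_behavior_group + rng.gauss(0, 1)
        if in_behavior_group:
            behavior_group.append(outcome)
        else:
            comparison_group.append(outcome)
    return mean(behavior_group) - mean(comparison_group)


# Self-selection: the hidden trait decides who engages in the behavior.
observational = group_difference(lambda trait: trait > 0)

# Random assignment: a coin flip decides, regardless of the hidden trait.
randomized = group_difference(lambda trait: rng.random() < 0.5)

print(f"observational difference: {observational:.2f}")
print(f"randomized difference:    {randomized:.2f}")
```

In the self-selected version the groups differ a lot even though the behavior does nothing, because the hidden trait is unevenly spread between them. The coin flip spreads that trait evenly across both groups, so the difference lands near the true effect of zero.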
Obviously, it is much more difficult to prove causation than it is to prove an association.
Should we just ignore associations?
No! Not at all! Not even close!
Correlations are crucial for research and still need to be looked at and studied, especially in some areas of research like addiction.
The reason is simple: We can't randomly give people drugs like methamphetamine as children and study their brain development to see how the stuff affects them — that would be unethical. So what we're left with is the study of what meth use (and use of other drugs) is associated with.
It's for this reason that researchers use special statistical methods to assess associations, making certain that they are also considering other things that may be interfering with their results.
In the case of the marijuana article, the researchers ruled out a number of other interfering variables known to affect relationships, like aggression, gender, education, closeness with other family members, etc. By doing so, they did their best to ensure that the association found between marijuana and relationship status was real. Obviously, other possibilities exist, but as more researchers assess this relationship in different ways, we'll learn more about its true nature.
This is how research works.
It's also how we found out that smoking causes cancer: through endlessly repeated findings showing an association. That turned out pretty well, I think.
© 2010 Adi Jaffe, All Rights Reserved.