Last week, a study was published that claimed to establish a link between casual marijuana use and abnormalities in the brains of recreational users. Intrigued by a claim made by one of the paper's authors in the wave of ensuing press coverage, UC Berkeley computational biologist Lior Pachter decided to take a closer look.
In reading the news last week I came across multiple reports claiming that even casually smoking marijuana can change your brain. I usually don't pay much attention to such articles; I've never smoked a joint in my life. In fact, I've never even smoked a cigarette. So even though as a scientist I've been interested in cannabis from the molecular biology point of view, and as a citizen from a legal point of view, the issues have not been personal. However, reading a USA Today article about the paper, I noticed that the principal investigator Hans Breiter was claiming to be a psychiatrist and mathematician. That is an unusual combination, so I decided to take a closer look.
I immediately found out the claim was a lie. In fact, the totality of math credentials of Hans Breiter consists of some logic/philosophy courses during a year abroad at St. Andrews while he was a pre-med student at Northwestern. Even being an undergraduate major in mathematics does not make one a mathematician, just as being an undergraduate major in biology does not make one a doctor. Thus, with his outlandish claim, Hans Breiter had succeeded in personally offending me! So, I decided to take a look at his paper underlying the multiple news reports:
- J.M. Gilman et al., Cannabis Use Is Quantitatively Associated with Nucleus Accumbens and Amygdala Abnormalities in Young Adult Recreational Users, Journal of Neuroscience (Neurobiology of Disease section), 34 (2014), 5529–5538.
This is quite possibly the worst paper I've read all year (as some of my previous blog posts show, I am saying something with this statement). Here is a breakdown of some of the issues with the paper:
1. STUDY DESIGN
First of all, the study has a very small sample size, with only 20 "cases" (marijuana users), a fact that is important to keep in mind in what follows.
The title uses the term "recreational users" to describe them, and in the press release accompanying the article Breiter says that "Some of these people only used marijuana to get high once or twice a week. People think a little recreational use shouldn't cause a problem, if someone is doing OK with work or school. Our data directly says this is not the case." In fact, the majority of users in the study were smoking more than 10 joints per week. There is even a person in the study smoking more than 30 joints per week (as disclosed above, I'm not an expert on this stuff but if 30 joints per week is "recreation" then it seems to me that person is having a lot of fun).
More importantly, Breiter's statement in the press release is a lie. There is no evidence in the paper whatsoever, not even a tiny shred, that the users who were getting high once or twice a week were having any problems. There are also other issues with the study design. For example, the paper claims the users are not "abusing" other drugs, but it is quite possible that they are getting high on cocaine, heroin, or ??? as well, an issue that could quite possibly affect the study. The experiment consisted of an MRI scan of each user/control, but only a single scan was done. Given the variability in MRI scans this also seems problematic.
2. MULTIPLE TESTING
The study looked at three aspects of brain morphometry in the study participants: gray matter density, volume and shape. Each of these morphometric analyses constituted multiple tests. In the case of gray matter density, estimates were based on small clusters of voxels, resulting in 123 tests (association of each voxel cluster with marijuana use). Volumes were estimated for four regions: left and right nucleus accumbens and amygdala. Shape was also tested in the same four regions. What the authors should have done is to correct the p-values computed for each of these tests by accounting for the total number of tests performed. Instead, (Bonferroni) corrections were performed separately for each type of analysis. For example, in the volume analysis p-values were required to be less than 0.0125 = 0.05/4.
In other words, the extent of testing was not properly accounted for. Even so, many of the results were not significant.
For example, the volume analysis showed no significant association for any of the four tested regions. The best case was the left nucleus accumbens (Figure 1C) with a corrected p-value of 0.015, which is over the authors' own stated required threshold of 0.0125 (see caption). They use the language "The association with drug use, after correcting for 4 comparisons, was determined to be a trend toward significance" to describe this non-effect.
It is worth noting that the removal of the outlier at a volume of over 800 mm³ would almost certainly flatten the line altogether and remove even the slight effect. It would have been nice to test this hypothesis, but the authors did not release any of their data.
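To make the threshold arithmetic in this section concrete, here is a minimal sketch in Python. The test counts and the p-value of 0.015 are the ones quoted above; treating the shape analysis as exactly four tests is my reading of "the same four regions", and the variable names are just for illustration. It compares the per-analysis Bonferroni thresholds the authors used with the single threshold that would result from correcting for all tests at once.

```python
ALPHA = 0.05

# Number of tests per analysis, as described in the text above.
# Assumption: "shape was tested in the same four regions" means 4 tests.
tests_per_analysis = {
    "gray matter density": 123,
    "volume": 4,
    "shape": 4,
}


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under a Bonferroni correction."""
    return alpha / n_tests


# What the paper did: a separate correction within each analysis.
per_analysis = {
    name: bonferroni_threshold(ALPHA, n) for name, n in tests_per_analysis.items()
}

# What accounting for every test would look like: one correction over the total.
total_tests = sum(tests_per_analysis.values())
overall = bonferroni_threshold(ALPHA, total_tests)

# The best volume result quoted above (left nucleus accumbens).
p_left_nac = 0.015

for name, threshold in per_analysis.items():
    print(f"{name}: per-analysis threshold = {threshold:.6f}")
print(f"all {total_tests} tests together: threshold = {overall:.6f}")
print("left NAc volume passes per-analysis threshold:", p_left_nac < per_analysis["volume"])
print("left NAc volume passes overall threshold:", p_left_nac < overall)
```

The point of the sketch is only the ordering: the overall threshold is far stricter than any per-analysis one, and the best volume result misses even the lenient threshold. It does not count the additional tests introduced by the multiple drug use measures.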
In the Fox News article about the paper, Breiter is quoted as saying the following:
"For the NAC [nucleus accumbens], all three measures were abnormal, and they were abnormal in a dose-dependent way, meaning the changes were greater with the amount of marijuana used," Breiter said. "The amygdala had abnormalities for shape and density, and only volume correlated with use. But if you looked at all three types of measures, it showed the relationships between them were quite abnormal in the marijuana users, compared to the normal controls."
The result above shows this to be a lie. Volume did not significantly correlate with use.
This is all very bad, but things get uglier the more one looks at the paper. In the tables reporting the p-values, the authors do something I have never seen before in a published paper. They report the uncorrected p-values, indicating those that are significant (prior to correction) in boldface, and then put an asterisk next to those that are significant after their (incomplete) correction.
I realize my own use of boldface is controversial… but what they are doing is truly insane. The fact that they put an asterisk next to the values significant after correction indicates they are aware that multiple testing is required. So why bother boldfacing p-values that they know are not significant? The overall effect is an impression that more tests are significant than is actually the case. See for yourself in their Table 4:
The fact that there are multiple columns is also problematic. Separate tests were performed for smoking occasions per day, joints per occasion, joints per week and smoking days per week. These measures are highly correlated, but even so multiply testing them requires multiple test correction. The authors simply didn't perform it. They say "We did not correct for the number of drug use measures because these measures tend not be independent of each other". In other words, they multiplied the number of tests by four, and chose to not worry about that. Unbelievable.
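Why correlated measures still need correction can be illustrated with a small simulation. This is a sketch under stated assumptions, not a reanalysis: the authors released no data, so the four test statistics here are hypothetical standard normal z-scores with a pairwise correlation of 0.8 (a number I picked to stand in for "highly correlated"), and every null hypothesis is true by construction. The function name `simulate_fwer` is made up for this example. It estimates how often at least one of the four tests comes out "significant" with and without a Bonferroni correction.

```python
import random
from statistics import NormalDist


def simulate_fwer(n_tests=4, rho=0.8, alpha=0.05, n_sims=20000, seed=0):
    """Estimate family-wise error rates when all null hypotheses are true.

    Returns (uncorrected_rate, bonferroni_rate): the fraction of simulated
    experiments in which at least one two-sided z-test is declared significant.
    """
    rng = random.Random(seed)
    norm = NormalDist()
    z_naive = norm.inv_cdf(1 - alpha / 2)              # each test at level alpha
    z_bonf = norm.inv_cdf(1 - alpha / (2 * n_tests))   # each test at alpha / n_tests
    naive_hits = 0
    bonf_hits = 0
    for _ in range(n_sims):
        shared = rng.gauss(0, 1)
        # Equicorrelated z-scores: unit variance, pairwise correlation rho.
        zs = [
            rho ** 0.5 * shared + (1 - rho) ** 0.5 * rng.gauss(0, 1)
            for _ in range(n_tests)
        ]
        biggest = max(abs(z) for z in zs)
        naive_hits += biggest > z_naive
        bonf_hits += biggest > z_bonf
    return naive_hits / n_sims, bonf_hits / n_sims


naive_fwer, bonferroni_fwer = simulate_fwer()
print(f"uncorrected family-wise error rate: {naive_fwer:.3f}")
print(f"Bonferroni family-wise error rate:  {bonferroni_fwer:.3f}")
```

In this setup the uncorrected rate should land above the nominal 0.05, while the Bonferroni rate stays at or below it. Correlation makes Bonferroni conservative, but it does not make the uncorrected tests valid.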
Then there is Table 5, where the authors did not report the p-values at all, only whether they were significant or not… without correction:
3. CORRELATION VS. CAUSATION
This issue is one of the oldest in the book. There is even a Wikipedia entry about it. Correlation does not imply causation. Yet despite the fact that every result in the paper is directed at testing for association, in the last sentence of the abstract they say "These data suggest that marijuana exposure, even in young recreational users, is associated with exposure-dependent alterations of the neural matrix of core reward structures and is consistent with animal studies of changes in dendritic arborization." At a minimum, such a result would require doing a longitudinal study. Breiter takes this language to an extreme in the press release accompanying the article. I repeat the statement he made that I quoted above where I boldface the causal claim: "Some of these people only used marijuana to get high once or twice a week. People think a little recreational use shouldn't cause a problem, if someone is doing OK with work or school. Our data directly says this is not the case." I believe that scientists should be sanctioned for making public statements that directly contradict the content of their papers, as appears to be the case here. There is precedent for this.