Tuesday, March 22, 2022

The importance of statistical power in experimental protocols

In 2016, the book Moral Brains: The Neuroscience of Morality, edited by bioethicist S.M. Liao, was published (book review here). The book summarized the then-current neuroscience of morality as described by leading researchers, whose work was critiqued by philosophers who understood both the science and the data analysis. 

One concern the philosophers raised was the risk of over-interpreting brain scan data. The sample size (number of people in the experiment) of most brain scan studies was too small, due in part to the high cost of such research, which opened the possibility that results could be misleading. Small sample sizes can leave a study with insufficient statistical power, leading to irreproducible and thus probably inaccurate results. 

The philosophers' concern appears to have been justified. The New York Times reports on a study that analyzed brain scan results from three huge brain scan studies. Based on results from those three, the researchers concluded that the sample size of typical past studies was usually too small for the conclusions to be reliable. This applies not just to studies of morality, but to essentially all brain scan research, and possibly to other areas of research, including cancer research. The NYT writes:
For two decades, researchers have used brain-imaging technology to try to identify how the structure and function of a person’s brain connects to a range of mental-health ailments, from anxiety and depression to suicidal tendencies.

But a new paper, published Wednesday in Nature, calls into question whether much of this research is actually yielding valid findings. Many such studies, the paper’s authors found, tend to include fewer than two dozen participants, far shy of the number needed to generate reliable results.

“You need thousands of individuals,” said Scott Marek, a psychiatric researcher at the Washington University School of Medicine in St. Louis and an author of the paper. He described the finding as a “gut punch” for the typical studies that use imaging to try to better understand mental health.

Studies that use magnetic-resonance imaging technology commonly temper their conclusions with a cautionary statement noting the small sample size. .... The median number of subjects in mental-health-related studies that use brain imaging is around 23, he added.

But the Nature paper demonstrates that the data drawn from just two dozen subjects is generally insufficient to be reliable and can in fact yield “massively inflated” findings, Dr. Dosenbach said.

The authors ran millions of calculations by using different sample sizes and the hundreds of brain regions explored in the various major studies. Time and again, the researchers found that subsets of data from fewer than several thousand people did not produce results consistent with those of the full data set.

Dr. Marek said that the paper’s findings “absolutely” applied beyond mental health. Other fields, like genomics and cancer research, have had their own reckonings with the limits of small sample sizes and have tried to correct course, he noted.

“My hunch is this is much more about population science than it is about any one of those fields,” he said.
Another source commented on this research:
Scientists rely on brain-wide association studies to measure brain structure and function, using MRI brain scans, and link them to complex characteristics such as personality, behavior, cognition, neurological conditions, and mental illness. But a study by researchers at Washington University School of Medicine in St. Louis and the University of Minnesota, published March 16 in Nature, shows that most published brain-wide association studies are performed with too few participants to yield reliable findings.

Such so-called underpowered studies are susceptible to uncovering strong but spurious associations by chance while missing real but weaker associations. Routinely underpowered brain-wide association studies result in a glut of astonishingly strong yet irreproducible findings that slow progress toward understanding how the brain works, the researchers said.
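The inflation pattern described above can be illustrated with a small simulation. As a rough sketch (not drawn from the Nature paper itself), assume a true but weak brain-behavior correlation of r = 0.1 and compare what small samples near the reported median size (n = 25) report against a large sample, keeping only the small-sample results that cross a conventional significance threshold:

```python
import numpy as np

rng = np.random.default_rng(0)
true_r = 0.1            # assumed weak but real brain-behavior association
n_small, n_large = 25, 2000
reps = 2000

def sample_r(n):
    # draw n bivariate-normal pairs whose population correlation is true_r
    cov = [[1.0, true_r], [true_r, 1.0]]
    x, y = rng.multivariate_normal([0.0, 0.0], cov, size=n).T
    return np.corrcoef(x, y)[0, 1]

small = np.array([sample_r(n_small) for _ in range(reps)])
large = np.array([sample_r(n_large) for _ in range(reps)])

# approximate two-tailed p < .05 critical value for Pearson r at n = 25
crit = 0.396
sig = small[np.abs(small) > crit]

print(f"spread of r estimates at n={n_small}: sd = {small.std():.3f}")
print(f"spread of r estimates at n={n_large}: sd = {large.std():.3f}")
print(f"mean |r| among 'significant' small-sample results: {np.abs(sig).mean():.2f}")
```

In a typical run, the small-sample estimates scatter widely around the true value of 0.1, and the subset that clears the significance bar overstates the true effect several-fold, while the large-sample estimates cluster tightly near 0.1. That is exactly the "strong but spurious" pattern the researchers describe.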

A 2011 article commented: Since its inception in 1990, fMRI has been used in an exceptionally large number of studies in the cognitive neurosciences, clinical psychiatry/psychology, and presurgical planning (between 100,000 and 250,000 entries in PubMed, depending on keywords).

Looks like it's going to take some more time to figure the brain out. Guess that's no surprise. Heck, it took ~32 years and several hundred thousand experiments just to figure out that sample sizes were too small. It's slow going when one is slogging through the great Grimpen Mire of Dartmoor in Devon, location of Baskerville Hall, in the middle of the night. 

Sigh. Gotta listen to them philosophers more carefully. 


The machine colors in areas of increased or decreased brain activity in response to a physical or mental task



