Revisions 2
Reviewer 1:
The authors have addressed my concerns.
Reviewer 2:
First round
The small sample size (n=15) really limits the generalizability and statistical power of the study. This is the largest weakness of this study and one that is difficult to overlook. With arguments being made in the literature that sample sizes of N>=100 are needed for many brain mapping investigations, a sample size of 15 really only represents a “proof-of-concept” level investigation. Many more subjects would be necessary before this would be a major contribution in the space of claustrum investigations using MRI (of any field strength).
The reviewer raises two concerns related to the sample size of 15: generalizability and statistical power. We would like to comment on each of these aspects separately.
- Generalizability. Generalizability of the parameter estimates (in our case, the magnitude of claustrum activity) is indeed a concern for smaller samples, as the estimates become less reliable with smaller N. However, we are mainly concerned with the presence vs. absence of an effect, rather than its magnitude (see a recent editorial in the Journal of Neuroscience for a discussion of this topic (Editorial, 2020)). Generalizability of parameter estimates is therefore a minor concern, at least at this stage. If the reviewer has replicability (rather than generalizability) in mind, we would like to point out that in addition to sample-level results we also report significant results at the level of individual subjects. When a basic effect is expected to be present in all healthy adult humans, reporting reliable effects at the individual-subject level is a viable alternative (Smith & Little, 2018).
- Statistical power. Statistical power is influenced not only by the sample size but also by the size of the underlying true effects. A priori power analysis in neuroimaging studies is not trivial, because the underlying true effects are often unknown, and previously reported ones are unreliable (Poldrack et al., 2017). Our sample size was carefully justified a priori during preregistration (https://aspredicted.org/hj4sr.pdf, point 7), which is best practice in cognitive neuroscience and psychology research (Nosek et al., 2022). The study was designed to detect large effects, and we agree that this could have prevented us from detecting medium- or small-sized effects in the auditory claustrum. However, the visual effects we found cannot be explained by our moderate sample size, as sample size does not affect the false-positive rate (which is conventionally kept at 0.05). Crucially, we are not aware of any simulations or empirical evidence claiming that at least 100 participants are necessary.
We do agree with the reviewer that our study can be viewed as a “proof-of-concept”, which is also its merit, as it shows, for the first time in humans, that it is possible to measure evoked sensory responses within the claustrum using fMRI. Future studies could focus on reliable estimation of the magnitude of visual claustrum responses. Overall, we are convinced that we did our best to strike a balance between responsible resource usage and good scientific practice during sample size planning. We have now added text to this effect to the discussion section of the manuscript.
Second round
- Thank you for addressing the majority of the comments.
- This really is a pilot study, however, and the fact that it was done at 7T is not a decent enough reason to suggest that it is not. The manuscript should indicate this “proof-of-concept” status in the Abstract, the Introduction, and the Conclusion section. The authors are commended for having preregistered their study.
- But N=17 is still a very small sample size by today’s standards, where dozens, if not hundreds, of subjects are used. That the authors are unfamiliar with papers suggesting that hundreds of subjects might be needed to find reliable and repeatable effects suggests that a review of the recent neuroimaging literature would be beneficial.
- Power, while dependent on a measure of effect size, is also a function of sample size. For a poorly known effect size, securing a robust sample size is the best way to ensure optimal statistical power (which they still do not attempt to compute). But their mention that power analyses might be needed for future studies of the claustrum is appreciated.
- Please do check those names in the reference section once again, notably the author list in the Torgerson study. The authors might also be interested in the following review of the claustrum: https://pubmed.ncbi.nlm.nih.gov/25426062/