Figure: How large should effect sizes be in neuroimaging to have sufficient power?

Specification of the alternative

In a brain map in an MNI template, with a smoothness of 3 times the voxel size, there are three similar active regions with voxelwise effect size D. The (spatial) extent of the regions is relatively small (<200 voxels). We want to know how large D should be in order to have 80% power to detect all three regions with voxelwise FWE thresholding based on Random Field Theory. Detecting a region means that the maximum in the activated area exceeds the significance threshold.

Strategy

  1. Compute the voxelwise threshold for the specified smoothness and volume
    • FweThres = 5.12
  2. Compute the power required per region, so that the omnibus power (detecting all three regions) is 80%
    • PowerPerRegion = 0.928
  3. How large should the maximum statistic in a (small) region be to exceed the voxelwise threshold with 0.928 power?
    • muMax = 4.47
  4. How does this voxel statistic translate to Cohen's D for a given sample size?
    • See Figure

In [1]:
%matplotlib inline
from __future__ import division
import os
import nibabel as nib
import numpy as np
from neuropower import peakdistribution
import scipy.integrate as integrate
import pandas as pd
import matplotlib.pyplot as plt
import palettable.colorbrewer as cb

1. What is the voxelwise threshold?


In [2]:
# From smoothness + mask to ReselCount

FWHM = 3
ReselSize = FWHM**3
MNI_mask = nib.load("MNI152_T1_2mm_brain_mask.nii.gz").get_data()
Volume = np.sum(MNI_mask)
ReselCount = Volume/ReselSize

print("ReselSize: "+str(ReselSize))
print("Volume: "+str(Volume))
print("ReselCount: "+str(ReselCount))
print("------------")

# From ReselCount to FWE threshold

FweThres_cmd = 'ptoz 0.05 -g %s' %ReselCount
FweThres = os.popen(FweThres_cmd).read()

print("FWE voxelwise GRF threshold: "+str(FweThres))


ReselSize: 27
Volume: 228483
ReselCount: 8462.33333333
------------
FWE voxelwise GRF threshold: 5.123062
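
If FSL's ptoz is not on the path, the threshold can be approximated in pure Python by inverting the expected Euler characteristic of a thresholded 3D Gaussian random field, keeping only the dominant 3-dimensional term (a sketch of the resel-based GRF computation, not FSL's exact implementation):

from scipy.optimize import brentq

def expected_ec_3d(z, resels):
    # 3D EC density of a Gaussian field (resel-based), times the resel count
    rho3 = (4*np.log(2))**1.5 * (2*np.pi)**-2 * (z**2 - 1) * np.exp(-z**2/2)
    return resels * rho3

# solve E[EC] = 0.05 for the threshold z
zGRF = brentq(lambda z: expected_ec_3d(z, ReselCount) - 0.05, 2, 10)
print(zGRF)  # ~5.12, matching the ptoz result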

2. Definition of alternative

Detect 1 region

We define a 'success' as a situation in which the maximum in the active field exceeds the threshold.


In [3]:
Power = 0.8

Detect 3 regions

What if there are three regions and we want to detect all three? Assuming the regions are independent, the probability of detecting all three is the per-region power cubed, so each region needs power 0.8^(1/3) ≈ 0.93 to reach 80% omnibus power.


In [4]:
regions = 3
PowerPerRegion = Power**(1/regions)
print("The power in each region should be: "+str(PowerPerRegion))


The power in each region should be: 0.928317766723
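
As a quick sanity check of the independence assumption behind this calculation, a small Monte Carlo simulation (hypothetical, not part of the analysis itself) confirms that detecting each of three independent regions with probability 0.928 yields all three about 80% of the time:

rng = np.random.RandomState(0)  # arbitrary seed
detected = rng.rand(100000, 3) < PowerPerRegion  # 3 independent regions per draw
print(np.mean(np.all(detected, axis=1)))  # ~0.80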

Detect 1 larger region

So far we assumed the activated field is small enough to contain only a single local maximum; simulations suggest this holds for regions smaller than about 200 voxels (roughly one peak per 200 voxels). What if an activated region is larger, so that we expect 3 peaks to appear in the field? Detecting the region then means that at least one of its peaks exceeds the threshold.

We treat the peaks as independent, which is reasonable; see this notebook for the simulated results.


In [5]:
PowerLarge = 1-(1-Power)**(1/3)
print("If we expect 3 peaks, then the power per peak should be: "+str(PowerLarge))


If we expect 3 peaks, then the power per peak should be: 0.415196452357
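
Conversely, a per-peak power of 0.415 brings the probability that at least one of three independent peaks exceeds the threshold back to the omnibus target:

print(1 - (1 - PowerLarge)**3)  # ~0.80: P(at least one of 3 peaks detected)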

3. How large should the statistic in a field be to exceed the threshold with the required power?

We quantify this using the distribution of local maxima in the field (a null field elevated by the value mu), following Cheng & Schwartzman, and compute for each mu the probability that a peak exceeds the threshold.


In [6]:
muRange = np.arange(1.8,5,0.01)
muLarge = []
muSingle = []
for muMax in muRange:
    # power to detect a peak: P(local maximum of the shifted null field > threshold)
    power = 1-integrate.quad(lambda x:peakdistribution.peakdens3D(x,1),-20,float(FweThres)-muMax)[0]
    if power>PowerLarge:
        muLarge.append(muMax)
    if power>Power:
        muSingle.append(muMax)
    if power>PowerPerRegion:
        # PowerPerRegion is the largest of the three targets, so we can stop here
        muUnion = muMax
        break
print("The power is sufficient in larger fields if mu equals: "+str(muLarge[0]))
print("The power is sufficient for one region if mu equals: "+str(muSingle[0]))
print("The power is sufficient for all regions if mu equals: "+str(muUnion))


The power is sufficient in larger fields if mu equals: 3.17
The power is sufficient for one region if mu equals: 4.0
The power is sufficient for all regions if mu equals: 4.47

4. From the required voxel statistic to Cohen's D for a given sample size

For a one-sample test, the expected voxel statistic is approximately D*sqrt(n), so the smallest effect size detectable with the required power at sample size n is D = mu/sqrt(n).


In [7]:
# Read in data
Data = pd.read_csv("tal_data.txt",sep=" ",header=None,names=['year','n'])
Data['source']='Tal'
David = pd.read_csv("david_data.txt",sep=" ",header=None,names=['year','n'])
David['source']='David'
Data = pd.concat([Data,David])

In [8]:
# add detectable effect
Data['deltaUnion']=muUnion/np.sqrt(Data['n'])
Data['deltaSingle']=muSingle[0]/np.sqrt(Data['n'])
Data['deltaLarge']=muLarge[0]/np.sqrt(Data['n'])
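
As a worked example (with a hypothetical target effect size, not taken from the data), inverting D = mu/sqrt(n) gives the sample size needed for a given effect under the strictest alternative:

D_target = 0.5  # hypothetical 'medium' effect size
n_needed = (muUnion / D_target)**2
print(n_needed)  # ~80 subjects to detect D = 0.5 in all three regions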

In [9]:
# add jitter for figure
stdev = 0.01*(max(Data.year)-min(Data.year))
Data['year_jitter'] = Data.year+np.random.randn(len(Data))*stdev

In [20]:
# Compute medians per year (for smoother)
Medians = pd.DataFrame({'year':
                        np.arange(start=np.min(Data.year),stop=np.max(Data.year)+1),
                        'TalMdSS':np.nan,
                        'DavidMdSS':np.nan,
                        'TalMdDSingle':np.nan,
                        'TalMdDUnion':np.nan,
                        'TalMdDLarge':np.nan,
                        'DavidMdDSingle':np.nan,
                        'DavidMdDUnion':np.nan,
                        'DavidMdDLarge':np.nan,
                        'MdSS':np.nan,
                        'DSingle':np.nan,
                        'DUnion':np.nan,
                        'DLarge':np.nan
                        })
Medians


Out[20]:
DLarge DSingle DUnion DavidMdDLarge DavidMdDSingle DavidMdDUnion DavidMdSS MdSS TalMdDLarge TalMdDSingle TalMdDUnion TalMdSS year
0 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN 1988.0
1 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN 1989.0
... (rows 2 through 26: all NaN) ...
27 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN 2015.0

In [21]:
# .loc assignment avoids the chained-assignment (SettingWithCopy) warnings
for yearInd in range(len(Medians)):
    # Compute medians for Tal's data
    yearBoolTal = np.array((Data.source=="Tal") & (Data.year==Medians.year[yearInd]))
    Medians.loc[yearInd,'TalMdSS'] = np.median(Data.n[yearBoolTal])
    Medians.loc[yearInd,'TalMdDSingle'] = np.median(Data.deltaSingle[yearBoolTal])
    Medians.loc[yearInd,'TalMdDUnion'] = np.median(Data.deltaUnion[yearBoolTal])
    Medians.loc[yearInd,'TalMdDLarge'] = np.median(Data.deltaLarge[yearBoolTal])
    # Compute medians for David's data
    yearBoolDavid = np.array((Data.source=="David") & (Data.year==Medians.year[yearInd]))
    Medians.loc[yearInd,'DavidMdSS'] = np.median(Data.n[yearBoolDavid])
    Medians.loc[yearInd,'DavidMdDSingle'] = np.median(Data.deltaSingle[yearBoolDavid])
    Medians.loc[yearInd,'DavidMdDUnion'] = np.median(Data.deltaUnion[yearBoolDavid])
    Medians.loc[yearInd,'DavidMdDLarge'] = np.median(Data.deltaLarge[yearBoolDavid])
    # Compute medians for all data
    yearBool = np.array(Data.year==Medians.year[yearInd])
    Medians.loc[yearInd,'MdSS'] = np.median(Data.n[yearBool])
    Medians.loc[yearInd,'DSingle'] = np.median(Data.deltaSingle[yearBool])
    Medians.loc[yearInd,'DUnion'] = np.median(Data.deltaUnion[yearBool])
    Medians.loc[yearInd,'DLarge'] = np.median(Data.deltaLarge[yearBool])
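
The same medians can also be computed more compactly with pandas groupby; a sketch of the equivalent computation (the results would still need to be reshaped into the wide Medians layout used below):

cols = ['n','deltaSingle','deltaUnion','deltaLarge']
overall_medians = Data.groupby('year')[cols].median()             # across both lists
per_source_medians = Data.groupby(['source','year'])[cols].median()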



In [22]:
Medians


Out[22]:
DLarge DSingle DUnion DavidMdDLarge DavidMdDSingle DavidMdDUnion DavidMdSS MdSS TalMdDLarge TalMdDSingle TalMdDUnion TalMdSS year
0 0.768838 0.970143 1.08413 0.768838 0.970143 1.08413 17 17 NaN NaN NaN NaN 1988.0
1 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN 1989.0
2 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN 1990.0
3 1.29415 1.63299 1.82487 1.29415 1.63299 1.82487 6 6 NaN NaN NaN NaN 1991.0
4 0.691246 0.872234 0.974722 0.691246 0.872234 0.974722 29 29 NaN NaN NaN NaN 1992.0
5 1.41767 1.78885 1.99904 1.41767 1.78885 1.99904 5 5 NaN NaN NaN NaN 1993.0
6 1.20746 1.5236 1.70263 1.20746 1.5236 1.70263 7 7 NaN NaN NaN NaN 1994.0
7 1.14829 1.44895 1.6192 1.14829 1.44895 1.6192 8 8 NaN NaN NaN NaN 1995.0
8 1.05667 1.33333 1.49 1.05667 1.33333 1.49 9 9 NaN NaN NaN NaN 1996.0
9 0.955791 1.20605 1.34776 0.955791 1.20605 1.34776 11 11 3.17 4 4.47 1 1997.0
10 1.02955 1.29912 1.45177 1.0616 1.33956 1.49696 9 9.5 0.985883 1.24402 1.39019 10.5 1998.0
11 1.12076 1.41421 1.58038 1.12076 1.41421 1.58038 8 8 1.00244 1.26491 1.41354 10 1999.0
12 1.05667 1.33333 1.49 1.00244 1.26491 1.41354 10 9 1.19815 1.51186 1.6895 7 2000.0
13 1.00244 1.26491 1.41354 1.00244 1.26491 1.41354 10 10 1.12076 1.41421 1.58038 8 2001.0
14 0.9151 1.1547 1.29038 0.9151 1.1547 1.29038 12 12 1.00244 1.26491 1.41354 10 2002.0
15 0.955791 1.20605 1.34776 0.955791 1.20605 1.34776 11 11 0.935446 1.18037 1.31907 11.5 2003.0
16 0.9151 1.1547 1.29038 0.9151 1.1547 1.29038 12 12 0.8792 1.1094 1.23975 13 2004.0
17 0.847218 1.06904 1.19466 0.847218 1.06904 1.19466 14 14 0.847218 1.06904 1.19466 14 2005.0
18 0.81849 1.0328 1.15415 0.832854 1.05092 1.1744 14.5 15 0.81849 1.0328 1.15415 15 2006.0
19 0.81849 1.0328 1.15415 0.7925 1 1.1175 16 15 0.847218 1.06904 1.19466 14 2007.0
20 0.847218 1.06904 1.19466 0.81849 1.0328 1.15415 15 14 0.847218 1.06904 1.19466 14 2008.0
21 0.768838 0.970143 1.08413 0.768838 0.970143 1.08413 17 17 0.768838 0.970143 1.08413 17 2009.0
22 0.727248 0.917663 1.02549 0.708834 0.894427 0.999522 20 19 0.737212 0.930236 1.03954 18.5 2010.0
23 0.747176 0.942809 1.05359 0.747176 0.942809 1.05359 18 18 0.747176 0.942809 1.05359 18 2011.0
24 0.708834 0.894427 0.999522 NaN NaN NaN NaN 20 0.708834 0.894427 0.999522 20 2012.0
25 0.708834 0.894427 0.999522 NaN NaN NaN NaN 20 0.708834 0.894427 0.999522 20 2013.0
26 0.675846 0.852803 0.953007 NaN NaN NaN NaN 22 0.675846 0.852803 0.953007 22 2014.0
27 0.647074 0.816497 0.912435 0.708834 0.894427 0.999522 20 24 0.647074 0.816497 0.912435 24 2015.0

In [23]:
# add log-scale columns (proper column assignment; attribute assignment would not create columns)
Medians['MdSSLog'] = np.log(Medians.MdSS.astype(float))
Data['nLog'] = np.log(Data.n)

The figure per list (Tal or David)


In [12]:
twocol = cb.qualitative.Paired_12.mpl_colors

fig,axs = plt.subplots(1,2,figsize=(12,5))
fig.subplots_adjust(hspace=.5,wspace=.3)
axs=axs.ravel()
axs[0].plot(Data.year_jitter[Data.source=="Tal"],Data.n[Data.source=="Tal"],"r.",color=twocol[0],alpha=0.5,label="")
axs[0].plot(Data.year_jitter[Data.source=="David"],Data.n[Data.source=="David"],"r.",color=twocol[2],alpha=0.5,label="")
axs[0].plot(Medians.year,Medians.TalMdSS,color=twocol[1],lw=3,label="Neurosynth")
axs[0].plot(Medians.year,Medians.DavidMdSS,color=twocol[3],lw=3,label="David et al.")
axs[0].set_xlim([1993,2016])
axs[0].set_ylim([0,200])
axs[0].set_xlabel("Year")
axs[0].set_ylabel("Median Sample Size")
axs[0].legend(loc="upper left",frameon=False)

axs[1].plot(Data.year_jitter[Data.source=="Tal"],Data.deltaUnion[Data.source=="Tal"],"r.",color=twocol[0],alpha=0.5,label="")
axs[1].plot(Data.year_jitter[Data.source=="David"],Data.deltaUnion[Data.source=="David"],"r.",color=twocol[2],alpha=0.5,label="")
axs[1].plot(Medians.year,Medians.TalMdDUnion,color=twocol[1],lw=3,label="Neurosynth")
axs[1].plot(Medians.year,Medians.DavidMdDUnion,color=twocol[3],lw=3,label="David et al.")
axs[1].set_xlim([1993,2016])
axs[1].set_ylim([0,3])
axs[1].set_xlabel("Year")
axs[1].set_ylabel("Effect Size with 80% power")
axs[1].legend(loc="upper right",frameon=False)
plt.show()


The figure for the different alternatives


In [17]:
twocol = cb.qualitative.Paired_12.mpl_colors

fig,axs = plt.subplots(1,2,figsize=(12,5))
fig.subplots_adjust(hspace=.5,wspace=.3)
axs=axs.ravel()
axs[0].plot(Data.year_jitter,Data.n,"r.",color=twocol[0],alpha=0.5,label="")
axs[0].plot(Medians.year,Medians.MdSS,color=twocol[1],lw=3,label="Median Sample Size")
axs[0].set_xlim([1993,2016])
axs[0].set_ylim([0,200])
axs[0].set_xlabel("Year")
axs[0].set_ylabel("Sample Size")
axs[0].legend(loc="upper left",frameon=False)

axs[1].plot(Data.year_jitter,Data.deltaUnion,"r.",color=twocol[0],alpha=0.5,label="")
axs[1].plot(Data.year_jitter,Data.deltaSingle,"r.",color=twocol[2],alpha=0.5,label="")
axs[1].plot(Data.year_jitter,Data.deltaLarge,"r.",color=twocol[4],alpha=0.5,label="")
axs[1].plot(Medians.year,Medians.DUnion,color=twocol[1],lw=3,label="Alternative: detect 3 regions")
axs[1].plot(Medians.year,Medians.DSingle,color=twocol[3],lw=3,label="Alternative: detect 1 region")
axs[1].plot(Medians.year,Medians.DLarge,color=twocol[5],lw=3,label="Alternative: detect 1 (larger) region")
axs[1].set_xlim([1993,2016])
axs[1].set_ylim([0,3])
axs[1].set_xlabel("Year")
axs[1].set_ylabel("Effect Size detectable with 80% power")
axs[1].legend(loc="upper right",frameon=False)
#plt.savefig("ReqEffSize.pdf")
plt.show()



Figure with log scale and only one condition


In [27]:
twocol = cb.qualitative.Paired_12.mpl_colors

fig,axs = plt.subplots(1,2,figsize=(12,5))
fig.subplots_adjust(hspace=.5,wspace=.3)
axs=axs.ravel()
# plot the log-transformed columns computed above, so the data match the axis labels
axs[0].plot(Data.year_jitter,Data.nLog,"r.",color=twocol[0],alpha=0.5,label="")
axs[0].plot(Medians.year,Medians.MdSSLog,color=twocol[1],lw=3,label="log(Median Sample Size)")
axs[0].set_xlim([1993,2016])
axs[0].set_ylim([0,np.log(200)])
axs[0].set_xlabel("Year")
axs[0].set_ylabel("log(Sample Size)")
axs[0].legend(loc="upper left",frameon=False)

#axs[1].plot(Data.year_jitter,Data.deltaUnion,"r.",color=twocol[0],alpha=0.5,label="")
axs[1].plot(Data.year_jitter,Data.deltaSingle,"r.",color=twocol[2],alpha=0.5,label="")
#axs[1].plot(Data.year_jitter,Data.deltaLarge,"r.",color=twocol[4],alpha=0.5,label="")
#axs[1].plot(Medians.year,Medians.DUnion,color=twocol[1],lw=3,label="Alternative: detect 3 regions")
axs[1].plot(Medians.year,Medians.DSingle,color=twocol[3],lw=3,label="Median detectable effect size")
#axs[1].plot(Medians.year,Medians.DLarge,color=twocol[5],lw=3,label="Alternative: detect 1 (larger) region")
axs[1].set_xlim([1993,2016])
axs[1].set_ylim([0,3])
axs[1].set_xlabel("Year")
axs[1].set_ylabel("Effect Size detectable with 80% power")
axs[1].legend(loc="upper right",frameon=False)
plt.savefig("ReqEffSize.jpg")