In [1]:
import os.path as op
import MRS.data as mrd
import MRS.utils as mru
import urllib as url
import matplotlib
import matplotlib.pyplot as plt
import numpy as np
import scipy.stats as stats
import nitime.timeseries as nts
import nitime.viz as viz
%matplotlib inline
We start by downloading some MRS data from the Stanford Data Repository:
In [2]:
mrd.fetch_from_sdr(data='example')
In [3]:
fname = '12_1_PROBE_MEGA_L_Occ.nii.gz'
fpath = op.join(mrd.data_folder, fname)
Let's examine the structure of the file. This is a standard nifti file, so it can be read using nibabel.
In [4]:
import nibabel as nib
In [5]:
mrs_nifti = nib.load(fpath)
In [6]:
print(mrs_nifti.shape)
These are according to their their order:
If your data is not in a nifti file, or not organized according to these conventions, you will need to do some preprocessing to get it into this form!
The data has a 'complex' dtype:
In [7]:
mrs_data = mrs_nifti.get_data()
mrs_data.dtype
Out[7]:
That is, each number is a complex number, because the measurement is done in quadrature (see also this other notebook for more details).
Using nitime, we can plot the phase and the amplitude of the measurement. We select the 5th transient, which is the first water-suppressed transient in the experiment
In [8]:
ts_abs = nts.TimeSeries(np.abs(mrs_data[:,0,0,4,0,0]), sampling_rate=5000., time_unit='ms')
ts_angle = nts.TimeSeries(np.angle(mrs_data[:,0,0,4,0,0]), sampling_rate=5000., time_unit='ms')
In [9]:
fig = viz.plot_tseries(ts_abs)
fig = viz.plot_tseries(ts_angle)
The magnitude of the signal decays quickly. The is the FID curve. The phase precesses more-or-less systematically, as long as the FID is above the noise, but then the phase becomes incoherent.
Next, let's use the SMAL API to process the data and analyze it
In [10]:
import MRS.api as mrs
This API can accept as input either a full path to a nifti file, or an array that has the data organized as above.
In [11]:
G = mrs.GABA(fpath)
Upon initialization, the GABA object performed several stages of analysis and processing, that are required before fitting a model of GABA and quantification can be done:
Data is combined across coils, using the signal/noise$^2$ rule [Hall2013]. The data from each coil is also zero-order phase corrected, so that they all have similar phases.
On-resonance frequency shifts are corrected, by fitting a Lorentzian line-shape to the water peak in the non-water-suppressed transients (per-default, the first transient is discarded, to mitigate onset effects).
The spectrum of each transient/echo is derived.
The sum and differences between the echos for each transient are calculated.
Because the original time-series data were complex, the derived spectra are also complex. Following the previous practice in the field, we use the real part of the spectrum for our analysis. Here are the sum spectra and diff spectra (64 repetitions of each)
In [12]:
fig, (ax1, ax2) = plt.subplots(2)
for ax in [ax1, ax2]:
ax.set_ylabel('Transient (#)')
ax.invert_xaxis()
ax1.imshow(np.real(G.sum_spectra[:, G.idx]), interpolation='none', cmap=matplotlib.cm.hot,
extent=[mru.ppm_to_freq(4.0), mru.ppm_to_freq(0), 64, 0])
ax2.imshow(np.real(G.diff_spectra[:, G.idx]), interpolation='none', cmap=matplotlib.cm.hot,
extent=[mru.ppm_to_freq(4.0), mru.ppm_to_freq(0), 64, 0])
ax2.set_xticks(mru.ppm_to_freq(np.array([1,2,3,4])))
ax2.set_xticklabels([1,2,3,4])
ax1.set_xticks([])
ax2.set_xlabel('Frequency (ppm)')
fig.set_size_inches([10, 3])
plt.tight_layout()
Note that the object already has the slicing attribute idx, which is used to help you focus on the part of the spectrum that you might be interested in. Per default, this includes the range from 0.7 to 4.3 ppm. For 3T, at body temperature, this corresponds to approximately 50 Hz to 700 Hz. It does not contain the large water peak at 0 Hz. Most of the molecular species of interest in the analysis (GABA, Creatine, Glutamate and Glutamine) have resonance peaks between 2 and 4 ppm.
And here are the traces averaged over all transients for each one of these:
In [13]:
fig, (ax1, ax2) = plt.subplots(2)
ax1.plot(G.f_ppm[G.idx], np.mean(np.real(G.sum_spectra[:, G.idx]), 0))
ax2.plot(G.f_ppm[G.idx], np.mean(np.real(G.diff_spectra[:, G.idx]), 0))
for ax in (ax1, ax2):
ax.invert_xaxis()
ax2.set_xlabel('ppm')
Out[13]:
At this point, we are ready to fit the models to the various peaks in the spectrum. The GABA object contains methods for fitting a Lorentzian linewidth to the Creatine and Choline peaks around 3 PPM. Once the Creatine model is fit, we can use the phase shift in the Creatine peak to phase-correct the data in the GABA data as well.
Once this correction is done, a Gaussian model composed of either one or two Gaussian line-shapes is fit to the GABA data at 3 PPM. A cross-validation procedure is used to select the better model (single or double Gaussian): each model is fit to half of the data (even transients) and model prediction errors are assessed on the other half (odd transients). The same procedure is repeated fitting to the other half (odd transients) and checking on the first half (even transients). The model with smaller prediction error is chosen and fit to all of the transients.
In [14]:
G.fit_gaba()
Once the fit_gaba method has been called, we can plot the GABA model with the difference spectrum from which it was calculated:
In [15]:
fig, ax = plt.subplots(1)
ax.plot(G.f_ppm[G.idx], stats.nanmean(G.diff_spectra[G._gaba_transients, G.idx], 1).squeeze())
ax.plot(G.f_ppm[G.idx], stats.nanmean(G.sum_spectra[G._gaba_transients, G.idx], 1).squeeze())
ax.plot(G.f_ppm[G.gaba_idx], stats.nanmean(G.gaba_model, 0), 'r')
ax.invert_xaxis()
ax.set_xlabel('ppm')
fig.set_size_inches([10,6])
The areas under the curves for both the creatine and the GABA are calculated from the model parameters for the full function domain, and can be used to calculate their relative abundances:
In [16]:
print(stats.nanmean(G.gaba_auc)/stats.nanmean(G.creatine_auc))
Another option is to use the water as a reference:
In [17]:
G.fit_water()
print(stats.nanmean(G.gaba_auc)/stats.nanmean(G.water_auc))
We can demonstrate the cross-validation procedure, by considering the model fit on even/odd trials, compared the measurement. Model fit is relative to the noise in the data, which is demonstrated in the third panel below as the test-retest reliability of the measurement
In [19]:
fig, ax = plt.subplots(3)
ax[0].plot(G.f_ppm[G.gaba_idx], np.nanmean(G.gaba_model[::2], 0), label='Model | even')
ax[0].plot(G.f_ppm[G.gaba_idx], np.nanmean(G.gaba_signal[1::2], 0), label='Data | odd')
ax[0].legend()
ax[1].plot(G.f_ppm[G.gaba_idx], np.nanmean(G.gaba_model[1::2], 0), label='Model | odd')
ax[1].plot(G.f_ppm[G.gaba_idx], np.nanmean(G.gaba_signal[::2], 0), label='Data | even')
ax[1].legend()
ax[2].plot(G.f_ppm[G.gaba_idx], np.nanmean(G.gaba_signal[::2], 0), label='Data | even')
ax[2].plot(G.f_ppm[G.gaba_idx], np.nanmean(G.gaba_signal[1::2], 0), label='Data | odd')
ax[2].legend()
fig.set_size_inches([10, 6])