Last week, I was able to generate 3D TIFF slices and image classifiers on the Fear199 downsampled data. However, my problems were:
1) The TIFF slices were odd, cigar-shaped tubes.
2) I was unable to generate a significant classifier using the existing data because of the weird image layout.
3) I had trouble loading in the TIFF stack despite having generated one via ImageJ.
What I did this week was:
1) Figure out why my original data came out as the odd cigar-shaped tubes.
2) Correctly generate a subset of TIFF slices for Fear199.
3) Generate a pixel-based object classifier.
What I need help with/still need to learn:
1) How to interpret/better validate my classifier results (currently have hdf5/TIFF output, how can I validate this?)
2) How to apply this to density mapping
In [ ]:
## Script used to download the .nii volume (run on Docker)
from ndreg import *
import matplotlib
import ndio.remote.neurodata as neurodata
import nibabel as nb
inToken = "Fear199"
nd = neurodata()
print(nd.get_metadata(inToken)['dataset']['voxelres'].keys())
inImg = imgDownload(inToken, resolution=5)
imgWrite(inImg, "./Fear199.nii")
In [ ]:
## Method 1: load the downloaded .nii with nibabel
import os
import numpy as np
from PIL import Image
import nibabel as nib
import scipy.misc
TokenName = 'Fear199.nii'
img = nib.load(TokenName)
## Convert into np array (or memmap in this case)
data = img.get_data()
print(data.shape)
print(type(data))
In [ ]:
## Method 2: pull the array straight from the SimpleITK image returned by imgDownload
import SimpleITK as sitk

rawData = sitk.GetArrayFromImage(inImg)  ## convert the SimpleITK image to a normal numpy ndarray
print(type(rawData))
In a nutshell, method 1 generates an array with shape (x, y, z) -- specifically, (540, 717, 1358). Method 2 generates a numpy array with shape (z, y, x) -- specifically, (1358, 717, 540). Since we want the first axis to be z slices, the original method was giving me x-slices (hence the cigar-tube dimensions).
To interconvert, we can either use the rawData approach directly after downloading from ndstore, or take the numpy array loaded via nibabel and use numpy's swapaxes method to swap two of the dimensions (shown below).
In [ ]:
## if we have (i, j, k), we want (k, j, i) (converts nibabel format to sitk format)
new_im = data.swapaxes(0, 2)  # swap axes 0 and 2 (i and k); data is the nibabel array from Method 1
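# Sanity check (assumes rawData from Method 2 is still in memory): after the swap,
# the nibabel-derived array should have the same (z, y, x) shape as the SimpleITK array.
print(new_im.shape)   # expect (1358, 717, 540)
print(rawData.shape)  # expect (1358, 717, 540)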
In [ ]:
for plane in (0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 100, 101, 102, 103, 104):
    output = np.asarray(rawData[plane])
    ## Save as TIFF for Ilastik
    scipy.misc.toimage(output).save('RAWoutfile' + TokenName + 'ITK' + str(plane) + '.tiff')
As shown above, I generated data for the first 13 planes, plus a subset of planes from z = 100 to z = 104. I then trained the classifier on planes 0 through 12 and used slices 100 through 104 to validate my results. Below I've included the raw and histogram-equalized images at one specific slice to show the data I was working with.
Raw Fear 199 at Z = 100:
Histogram Equalized Fear 199 at Z = 100:
In order to generate the histogram equalized TIFF slices, I wrote a Jupyter notebook here: https://github.com/NeuroDataDesign/seelviz/blob/gh-pages/Tony/ipynb/generate%2BhistEQ.ipynb
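The linked notebook has the actual code; a minimal sketch of the idea, using skimage's equalize_hist (the function choice and output filenames here are my assumptions, not necessarily what the notebook does), looks like this:
In [ ]:
## Minimal histogram-equalization sketch (assumed output filenames; see linked notebook for actual code)
import numpy as np
import scipy.misc
from skimage import exposure

for plane in (0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 100, 101, 102, 103, 104):
    img = rawData[plane].astype(np.float64)
    eq = exposure.equalize_hist(img)  # spreads intensities across the full range
    scipy.misc.toimage(eq).save('HISTEQoutfile' + TokenName + 'ITK' + str(plane) + '.tiff')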
From there, I generated a pixel-based object classifier for both the raw and the histogram-equalized data. Shown first are images of the classifier being trained on the raw data, followed by images of the classifier being trained on the histogram-equalized data.
First, the raw data classifier. I load in the TIFF slices for planes 0 through 12 and train the classifier to select for borders, background, and individual bright points:
Loading Raw Training Data
Selecting Borders in Raw Training Data
Selecting Bright Points in Raw Training Data
Closeup of Feature Selection in Raw Training Data
Object Output for Raw Training Data
In order to run Ilastik headlessly on 3D classified data, use globstring syntax to tell Ilastik which images to combine into each volume (a rough sketch of such a call is shown below).
To get this to work, I made a folder called "runtime" where each file was named WITHOUT the underscores. Make sure to include the * as well.
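The exact command I ran isn't reproduced here, but a minimal sketch of a headless call with a globstring, invoked from Python, looks roughly like the following; the project filename, output format, and file pattern are assumptions, not my actual paths:
In [ ]:
## Rough sketch of a headless Ilastik call with a globstring (assumed paths/names)
import subprocess

subprocess.check_call([
    "./run_ilastik.sh", "--headless",
    "--project=raw_pixel_classifier.ilp",   # assumed project name
    "--output_format=tiff",
    # Passing the globstring unexpanded (no shell expansion) lets Ilastik
    # stack the matching 2D TIFFs into one 3D volume before classifying.
    "runtime/RAWoutfile*.tiff",
])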
Repeating the process for the histogram equalized brains, we get:
Loading HistEq Training Data
Selecting Borders and Background in HistEq Training Data
Selecting Bright Points in HistEq Training Data
Object Output for HistEq Training Data
Again running our classifier in headless mode (same form of command as above), we can generate a TIFF probability mapping.
However, upon analyzing the object probabilities, I discovered that both are just black, blank TIFFs. Apparently, these stacks each had 0 objects.
Object Output for New Data (both)
As I mentioned briefly above, I spent a lot of time fiddling with the inputs/outputs so that I could run the classifier on new data. After running the object classifier on the new TIFF slices in both cases, the TIFF version of the probability map is completely black. This indicates one of two problems:
1) I have not generated an object classifier (unlikely, given that I can see objects showing up on the predictions in Ilastik)
2) I am improperly using headless mode to call my classifier (likely, given the fiddling I went through to get that step to work).
An obvious workaround would be loading the new data through the batch prediction menu built into Ilastik. I spent time trying to do so, but ran into some odd challenges. When I loaded the individual TIFF slices, they were the wrong dimensions (each is x-y-greyscale, while the classifier expects z-y-x-greyscale). When I created a TIFF stack, Ilastik threw a completely mysterious error that, after some Googling, seems to be resolved by downgrading to an older version of Ilastik. I tried that, but it didn't work; the issue is still open on their GitHub page.
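One quick sanity check before debugging further is to confirm programmatically that the exported probability TIFFs really are all zeros, rather than just rendering dark in a viewer. A minimal check (the filename below is an assumption; point it at one of the headless outputs):
In [ ]:
## Check whether an exported probability TIFF is actually all zeros
import numpy as np
from PIL import Image

prob = np.array(Image.open('runtime/raw_probabilities.tiff'))  # assumed output filename
print(prob.shape, prob.dtype)
print(prob.min(), prob.max())  # a max of 0 means the map really is blank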
As per my deliverable, I'm supposed to provide reasoning for my sample sizes and trials. I chose 12 samples because:
1) There are over 1300 total Z slices, and I need a manageable subset
2) When I started with 6 and 10 slices, the classifier didn't find any objects (see last step)
Twelve happened to be the minimum number of slices needed for any raw-data objects to show up in the classifier, which is why I used 12 for the raw data.
As for the HistEq data, I used the same number that worked for raw, for consistency.
Eventually, we want our methodology to do something similar to this: http://ilastik.org/documentation/counting/counting. We want to be able to use our pixel-based classifier to try to find the number of neurons in a given 3D image.
However, much to my chagrin, the counting workflow currently does NOT support 3D data -- it only supports 2D data -- so some tweaking will be needed to make this work. A rough interim idea is sketched below.
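In the meantime, one crude, non-Ilastik way to get a first-pass count out of the 3D object/probability output is connected-components labeling with scipy.ndimage; the HDF5 filename, dataset name, and threshold below are assumptions, and this is not a replacement for the density-counting workflow:
In [ ]:
## Crude first-pass object count via connected components (not the ilastik counting workflow)
import numpy as np
import h5py
from scipy import ndimage

with h5py.File('raw_object_predictions.h5', 'r') as f:  # assumed filename and dataset name
    prob = f['exported_data'][...]

mask = np.squeeze(prob) > 0.5              # assumed probability threshold
labels, num_objects = ndimage.label(mask)  # label 3D connected components
print(num_objects)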
In [ ]: