The purpose of this notebook is to determine if implimenting Sparse Arrays will speed up the Connected Components portion of our pipeline.
** Initialization:** (inputs: image)
Give the Sparse Array a data member called "elements"
Run Connected Components on the input image
Run clusterComponents on input image
Threshold the clusterComponent by volume
For each index in this thresholded clusterComponent, add its value to elements
Give the Sparse Array a data member called "clusterList by using the genClusters function
**genClusters:** (input: image)
create a variable called clusterList that will contain the Clusters
for each unique label in the input image:
find which indices in the Sparse Array are equal to that label
input these indices into the Cluster class and append this Cluster to clusterList
return clusterList
**addValue:** (inputs: tuple, value)
if value is greater than 0:
set self.elements at tuple equal to value
else if value equals 0 and there already exists a value at tuple:
set delete self.elemnts at tuple
**readValue:** (inputs: tuple)
try:
value gets self.elements at tuple
catch KeyError:
value = 0 (a.k.a. if there exists no element with that tuple, return 0)
return value
In [135]:
%matplotlib inline
import matplotlib.pyplot as plt
import sys
sys.path.insert(0,'../code/functions/')
import tiffIO as tIO
import connectLib as cLib
import plosLib as pLib
import time
import scipy.ndimage as ndimage
import numpy as np
import time
In [123]:
import clusterComponents
class SparseArray:
def genClusters(self, image):
clusterList = []
for label in range(np.max(image)):
memberList = []
for z in range(len(image)):
for y in range(len(image[z])):
for x in range(len(image[z][y])):
if (self.readValue((z, y, x)) == label):
memberList.append((z, y, x))
self.addValue((z, y, x), 0)
clusterList.append(Cluster(memberList))
return clusterList
def __init__(self, input, threshold = 200):
self.elements = {}
imCC = clusterComponents.ClusterComponent(input)
imCC.volumeThreshold(threshold)
ccThresh = imCC.labeledIm
for z in range(len(ccThresh)):
for y in range(len(ccThresh[z])):
for x in range(len(ccThresh[z][y])):
self.addValue((z, y, x), ccThresh[z][y][x])
self.clusterList = self.genClusters(ccThresh)
def addValue(self, tuple, value):
if value > 0:
self.elements[tuple] = value
elif value == 0 and not(self.readValue(tuple) == 0):
del self.elements[tuple]
def readValue(self, tuple):
try:
value = self.elements[tuple]
except KeyError:
# could also be 0.0 if using floats...
value = 0
return value
In [77]:
import numpy as np
import math
class Cluster:
def __init__(self, members):
self.members = members
self.volume = self.getVolume()
def getVolume(self):
return len(self.members)
def getCentroid(self):
unzipList = zip(*self.members)
listZ = unzipList[0]
listY = unzipList[1]
listX = unzipList[2]
return [np.average(listZ), np.average(listY), np.average(listX)]
def getMembers(self):
return self.members
In [4]:
dataSubset = tIO.unzipChannels(tIO.loadTiff('../data/SEP-GluA1-KI_tp1.tif'))[0][0:5]
plt.imshow(dataSubset[0], cmap="gray")
plt.axis('off')
plt.title('Raw Data Slice at z=0')
plt.show()
In [9]:
#finding the clusters after plosPipeline
plosOutSub = pLib.pipeline(dataSubset)
In [10]:
#binarize output of plos lib
bianOutSub = cLib.otsuVox(plosOutSub)
In [11]:
#dilate the output based on neigborhood size
bianOutSubDil = ndimage.morphology.binary_dilation(bianOutSub).astype(int)
In [132]:
sparse = SparseArray(bianOutSubDil)
In [108]:
intensities = []
index = 0
for z in range(len(bianOutSubThresh)):
for y in range(len(bianOutSubThresh[z])):
for x in range(len(bianOutSubThresh[z][y])):
val = sparse.readValue((z, y, x))
if val > 0:
intensities.append(val)
index = index + 1
In [134]:
print 'Percentage of Volume\n\tActual: ' + str(1.0*np.max(index)/(1024*1024*3)) + '\tExpected: .015'
print 'Number of Clusters\n\tActual: ' + str(np.max(intensities)) + '\tExpected: 1500 - 2000'
In [114]:
displayIm = np.zeros_like(bianOutSubDil)
for z in range(len(bianOutSubDil)):
for y in range(len(bianOutSubDil[z])):
for x in range(len(bianOutSubDil[z][y])):
displayIm[z][y][x] = sparse.readValue((z, y, x))
In [125]:
plt.imshow(displayIm[3])
plt.show()
Results:
After Storing the Data as a sparse array and counting the non-zero values, I found that the non-zero values (a.k.a. the detected Synapses) represented about 1.5% of the data. It also detected 1754 Synapses, which is also around how many were detected without using Sparse Arrays. Both of these signs indicate that the Sparse Array is storing values correctly.
The display image also looks as we would expect it to.
Now, let's determine if it helps the Connected Components speed:
In [136]:
start_time = time.time()
sparse = SparseArray(bianOutSubDil)
print time.time() - start_time