Running source extraction algorithms

In this notebook, we use simulated data to introduce source extraction algorithms.

Setup plotting



In [1]:

    
%matplotlib inline
from thunder import Colorize
image = Colorize.image
import seaborn as sns
figsize = 12

Create data with ground truth



In [2]:

    
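# tsc is the ThunderContext; it is assumed to already exist in this session (the Thunder shell defines it at startup)
data, ts, truth = tsc.makeExample('sources', centers=10, noise=1.0, returnParams=True)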

The data set is an Images object, in this case 100 images, each of size 100 x 200.



In [3]:

    
data









    Out[3]:





Images
nrecords: 100
dtype: float
dims: min=(0, 0), max=(99, 199), count=(100, 200)

Look at the mean image



In [4]:

    
im = data.mean()



In [5]:

    
image(im, size=11)

Confirm that the ground truth exactly matches the sources by using the masks method to draw the source outlines.



In [6]:

    
image(truth.masks((100,200), base=im, outline=True), size=figsize)

Run an algorithm

Many common algorithms with sensible defaults are available directly through the top-level SourceExtraction class and can be called by name.

We'll start by using the localmax method, which is an example of a "feature method". These methods compute simple statistics on the data and then use image-based operations to identify likely sources. The local max algorithm simply identifies sources around local peaks in the mean image.
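
To make the idea of "local peaks in the mean image" concrete, here is a minimal NumPy sketch. It is not Thunder's implementation: the helper `local_peaks`, its `min_distance` and `threshold` parameters, and the toy data are all hypothetical names chosen for illustration.

```python
import numpy as np

def local_peaks(mean_image, min_distance=5, threshold=0.1):
    """Return (row, col) pixels that are the maximum of their neighborhood.

    Hypothetical helper for illustration; not part of Thunder.
    """
    rows, cols = mean_image.shape
    peaks = []
    for r in range(rows):
        for c in range(cols):
            value = mean_image[r, c]
            if value <= threshold:
                continue  # ignore dim background pixels
            # Square window around the pixel, clipped at the image border.
            r0, r1 = max(r - min_distance, 0), min(r + min_distance + 1, rows)
            c0, c1 = max(c - min_distance, 0), min(c + min_distance + 1, cols)
            if value == mean_image[r0:r1, c0:c1].max():
                peaks.append((r, c))
    return peaks

# Toy movie: 20 frames with two Gaussian blobs at fixed positions.
yy, xx = np.mgrid[0:40, 0:60]

def blob(cy, cx, sigma=3.0):
    return np.exp(-((yy - cy) ** 2 + (xx - cx) ** 2) / (2 * sigma ** 2))

frames = np.stack([(1 + 0.1 * t) * blob(10, 15) + blob(28, 45) for t in range(20)])
mean_image = frames.mean(axis=0)   # the "simple statistic"
peaks = local_peaks(mean_image)    # the "image-based operation"
```

A real feature method would then grow a region around each peak to form the source mask; that step is omitted here.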



In [7]:

    
from thunder import SourceExtraction
model = SourceExtraction('localmax')

Fitting the model to the data yields a SourceModel.



In [8]:

    
sources = model.fit(data)



In [9]:

    
image(sources.masks((100, 200), base=im, outline=True), size=figsize)

Some methods have parameters, such as the maxSources parameter for this method, which will likely improve the result by eliminating false positives.



In [10]:

    
sources = SourceExtraction('localmax', maxSources=10).fit(data)
image(sources.masks((100, 200), base=im, outline=True), size=figsize)
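
One plausible way a cap on the number of sources could remove false positives is to rank candidate peaks and keep only the strongest ones. The sketch below assumes ranking by peak intensity; how Thunder actually orders candidates is not stated here, and `keep_brightest` is a hypothetical helper.

```python
def keep_brightest(candidates, max_sources):
    """Keep the max_sources candidates with the largest peak value.

    candidates: list of (row, col, value) tuples.
    Hypothetical helper; ranking by intensity is an assumption.
    """
    ranked = sorted(candidates, key=lambda p: p[2], reverse=True)
    return ranked[:max_sources]

# Two bright true sources plus two dim, noise-driven candidates.
candidates = [(10, 15, 1.95), (3, 50, 0.21), (28, 45, 1.00), (35, 5, 0.18)]
kept = keep_brightest(candidates, max_sources=2)
```

With the cap set to the known number of true sources, the dim candidates are the ones dropped.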

Run a block algorithm

Another class of algorithms performs operations on spatiotemporal blocks to identify sources locally, and then merges sources across blocks. A variety of local operations are possible; many are based on matrix factorization, for example, non-negative matrix factorization.
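
As a rough illustration of the local operation, the sketch below factorizes a single block with non-negative matrix factorization, using the standard multiplicative update rules for the squared-error objective. It is not Thunder's code: the function `nmf`, its parameters, and the toy block are assumptions made for this example.

```python
import numpy as np

def nmf(V, k, n_iter=200, seed=0, eps=1e-9):
    """Factor a non-negative matrix V (pixels x time) as W (pixels x k) times H (k x time).

    Multiplicative updates for the squared-error objective.
    Illustrative sketch only; not Thunder's implementation.
    """
    rng = np.random.RandomState(seed)
    n, m = V.shape
    W = rng.rand(n, k) + eps
    H = rng.rand(k, m) + eps
    for _ in range(n_iter):
        # Each update rescales the factor elementwise, so it stays non-negative.
        H *= W.T.dot(V) / (W.T.dot(W).dot(H) + eps)
        W *= V.dot(H.T) / (W.dot(H).dot(H.T) + eps)
    return W, H

# Toy block: two sources with disjoint footprints on a 6 x 6 block, 50 time points.
footprints = np.zeros((2, 6, 6))
footprints[0, 0:3, 0:3] = 1.0
footprints[1, 3:6, 3:6] = 1.0
t = np.arange(50)
traces = np.stack([1 + np.sin(t / 5.0), 1 + np.cos(t / 3.0)])  # non-negative time courses
V = footprints.reshape(2, -1).T.dot(traces)                    # (36 pixels, 50 time points)

W, H = nmf(V, k=2)
rel_error = np.linalg.norm(V - W.dot(H)) / np.linalg.norm(V)
```

Each column of `W`, reshaped to the block's spatial dimensions, is a candidate source footprint, and the matching row of `H` is its time course.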



In [11]:

    
model = SourceExtraction('nmf')

When fitting this model, we need to specify the size of the block as an argument.



In [12]:

    
sources = model.fit(data, size=(25,25))



In [13]:

    
sources









    Out[13]:





SourceModel
12 sources

Look at the result; it's likely not very good due to artifacts at the block boundaries



In [14]:

    
image(sources.masks((100, 200), base=im, outline=True, color='random'), size=figsize)

We can improve the result by padding the blocks, using the extra padding argument during fitting.



In [15]:

    
sources = model.fit(data, size=(25,25), padding=7)
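
To see what the block size and padding imply geometrically, the sketch below tiles a 100 x 200 image into 25 x 25 blocks and grows each block by 7 pixels per side. It assumes that padding means extra pixels on every side, clipped at the image border; `block_slices` is a hypothetical helper, not a Thunder function.

```python
def block_slices(shape, size, padding=0):
    """Return a list of (core, padded) slice pairs that tile a 2D image.

    core   -- the block itself
    padded -- the block grown by `padding` pixels per side, clipped to the image
    Hypothetical helper; the padding semantics are an assumption.
    """
    rows, cols = shape
    out = []
    for r in range(0, rows, size[0]):
        for c in range(0, cols, size[1]):
            r1, c1 = min(r + size[0], rows), min(c + size[1], cols)
            core = (slice(r, r1), slice(c, c1))
            padded = (slice(max(r - padding, 0), min(r1 + padding, rows)),
                      slice(max(c - padding, 0), min(c1 + padding, cols)))
            out.append((core, padded))
    return out

blocks = block_slices((100, 200), size=(25, 25), padding=7)
n_blocks = len(blocks)       # 4 block rows x 8 block columns
core, padded = blocks[9]     # an interior block (second row, second column)
```

Under this assumption, the padded regions of neighboring blocks cover the same pixels, so a source near a boundary is seen whole by at least one block but may be detected more than once.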



In [16]:

    
sources









    Out[16]:





SourceModel
43 sources

You'll find that all sources were found, but many now overlap, because the padded regions of adjacent blocks cover the same pixels.



In [17]:

    
image(sources.masks((100, 200), base=im, outline=True, color='random'), size=figsize)

This can be improved through the use of a custom merger, for example, the OverlapBlockMerger, which merges sources from each block with those in adjacent blocks as long as they overlap by a certain fraction.
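
The merging rule can be sketched with sources represented as sets of pixel coordinates. This is a simplified illustration, not the OverlapBlockMerger itself: it defines the overlap fraction as shared pixels divided by the size of the smaller source, and it merges in a single greedy pass over all sources rather than only between adjacent blocks. Both choices, and the helper names, are assumptions.

```python
def overlap_fraction(a, b):
    """Shared pixels divided by the size of the smaller source (an assumed definition)."""
    return len(a & b) / float(min(len(a), len(b)))

def merge_overlapping(sources, threshold):
    """Greedily merge sources (sets of (row, col) pixels) that overlap enough.

    Single pass: each new source absorbs every previously merged source
    whose overlap fraction with it reaches the threshold.
    """
    merged = []
    for src in sources:
        src = set(src)
        keep = []
        for other in merged:
            if overlap_fraction(src, other) >= threshold:
                src |= other          # absorb the overlapping source
            else:
                keep.append(other)
        keep.append(src)
        merged = keep
    return merged

def square(r0, c0, side):
    """Set of pixels in a side x side square with top-left corner (r0, c0)."""
    return {(r, c) for r in range(r0, r0 + side) for c in range(c0, c0 + side)}

a = square(0, 0, 4)      # a source as seen from one block
b = square(0, 2, 4)      # the same source seen from a neighbor, shifted by 2 columns
c = square(10, 10, 4)    # an unrelated source
result = merge_overlapping([a, b, c], threshold=0.25)
```

Here `a` and `b` share half of their pixels, so they are merged into one source, while `c` is left alone.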



In [18]:

    
from thunder.extraction import OverlapBlockMerger



In [19]:

    
model = SourceExtraction('nmf', merger=OverlapBlockMerger(0.25), minArea=100)



In [20]:

    
sources = model.fit(data, size=(25,25), padding=7)



In [21]:

    
sources









    Out[21]:





SourceModel
9 sources



In [22]:

    
image(sources.masks((100, 200), base=im, outline=True, color='random'), size=figsize)