To state the classification problem as a binary classification problem, we need to be able to find potential hosts in an infrared image. Potential hosts are galaxies that may or may not contain the radio source we see in the associated radio image.
Previously, I have found potential hosts by looking for local maxima in the IR image. This finds too many potential hosts, mostly due to background noise. Julie suggested fitting Gaussians to the image, and scikit-image provides a few blob detection functions that do this. In this notebook, I will try skimage's blob detection for detecting potential hosts.
In [47]:
import collections
import itertools
import logging
import pprint
import sys
import warnings
import matplotlib.pyplot
import numpy
import skimage.feature
sys.path.insert(1, '..')
import crowdastro.data
import crowdastro.rgz_analysis.consensus
import crowdastro.show
%matplotlib inline
warnings.simplefilter('ignore', UserWarning) # astropy always raises warnings on Windows.
Let's start by finding a test subject.
In [48]:
# subject = crowdastro.data.get_random_subject()
subject = crowdastro.data.db.radio_subjects.find_one({'zooniverse_id': 'ARG0003rq4'})
crowdastro.show.ir(subject)
matplotlib.pyplot.show()
ir = crowdastro.data.get_ir(subject)[:200, :200] # This image is, for some reason, 1 px too big.
The most accurate blob detection skimage offers is Laplacian of Gaussian, so I will try using this. It's hard to tell what parameters to use, so I'll have to experiment a bit.
In [49]:
blobs = skimage.feature.blob_log(ir/255, min_sigma=5, max_sigma=15, num_sigma=10, threshold=0.00002, overlap=0.75, log_scale=False)
blobs[:, 2] = blobs[:, 2] * numpy.sqrt(2) # Convert sigma to radii.
print('Found {} blobs.'.format(blobs.shape[0]))
crowdastro.show.ir(subject)
ax = matplotlib.pyplot.gca()
for y, x, r in blobs:
c = matplotlib.pyplot.Circle((x, y), r, color='green', linewidth=1, fill=False)
ax.add_patch(c)
matplotlib.pyplot.show()
That seems good enough for now. Let's try it on some random examples.
In [50]:
def plot_blobs(subject):
ir = crowdastro.data.get_ir(subject)[:200, :200]
blobs = skimage.feature.blob_log(ir/255, min_sigma=5, max_sigma=15, num_sigma=10, threshold=0.00002, overlap=0.75,
log_scale=False)
blobs[:, 2] = blobs[:, 2] * numpy.sqrt(2) # Convert sigma to radii.
print('Found {} blobs.'.format(blobs.shape[0]))
crowdastro.show.ir(subject)
ax = matplotlib.pyplot.gca()
for y, x, r in blobs:
c = matplotlib.pyplot.Circle((x, y), r, color='green', linewidth=1, fill=False)
ax.add_patch(c)
matplotlib.pyplot.show()
for _ in range(10):
plot_blobs(crowdastro.data.get_random_subject())
In [ ]: