Whole-slide images often contain artifacts, such as marker ink or acellular regions, that should be excluded from analysis. This example shows how HistomicsTK can be used to develop saliency detection algorithms that segment the slide at low magnification, producing a map that guides higher-magnification analyses. It shows how colorspace analysis can detect elements such as inking or blood, as well as dense cellular regions, to improve the quality of subsequent image analysis tasks.
This example uses a pipeline based on thresholding and stain unmixing to detect highly cellular regions in a slide. The run() method of the CDT_single_tissue_piece() class contains the key steps of the pipeline.
Additional functionality includes contour extraction, which yields the final segmentation boundaries so they can be visualized in DSA using your preferred styles.
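To give a feel for what colorspace thresholding means here, the following is a minimal sketch that classifies individual RGB pixels by hue, saturation, and brightness using only the Python standard library. The helper name classify_pixel, the class names, and every threshold are illustrative assumptions; they are not HistomicsTK's implementation or its defaults, and the real pipeline operates on whole low-magnification images and also uses stain unmixing.

```python
import colorsys


def classify_pixel(rgb, sat_min=0.25, white_val_min=0.85):
    """Roughly label one RGB pixel (0-255 per channel) by HSV thresholds.

    Hypothetical helper: thresholds are illustrative, not library defaults.
    """
    r, g, b = (c / 255.0 for c in rgb)
    h, s, v = colorsys.rgb_to_hsv(r, g, b)
    hue_deg = h * 360.0
    # Bright and nearly colorless: glass / slide background
    if s < sat_min and v >= white_val_min:
        return 'whitespace'
    # Saturated green-to-blue hues: assumed marker ink
    if s >= sat_min and 90.0 <= hue_deg <= 260.0:
        return 'ink'
    # Strongly saturated red hues: assumed blood
    if s >= 0.6 and (hue_deg <= 15.0 or hue_deg >= 345.0):
        return 'blood'
    # Everything else is treated as candidate tissue
    return 'tissue'


# Hand-picked example colors (assumed values, not measured from a slide)
pixels = {
    'slide background': (240, 240, 238),
    'blue marker': (30, 60, 200),
    'red blood cells': (200, 30, 30),
    'H&E tissue': (180, 120, 190),
}
labels = {name: classify_pixel(rgb) for name, rgb in pixels.items()}
for name, label in labels.items():
    print(f'{name}: {label}')
```

In practice such rules would be applied to every pixel of a downsampled slide image at once, and the thresholds would need tuning for each staining protocol and scanner.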
Here are some sample results:
Where to look?
|_ histomicstk/
   |_ saliency/
      |_ cellularity_detection_thresholding.py
      |_ tests/
         |_ test_saliency.py
In [1]:
import tempfile
import girder_client
import numpy as np
from pandas import read_csv
from histomicstk.annotations_and_masks.annotation_and_mask_utils import (
    delete_annotations_in_slide)
from histomicstk.saliency.cellularity_detection_thresholding import (
    Cellularity_detector_thresholding)
import matplotlib.pyplot as plt
from matplotlib.colors import ListedColormap
%matplotlib inline
In [2]:
APIURL = 'http://candygram.neurology.emory.edu:8080/api/v1/'
SAMPLE_SLIDE_ID = "5d8c296cbd4404c6b1fa5572"
gc = girder_client.GirderClient(apiUrl=APIURL)
gc.authenticate(apiKey='kri19nTIGOkWH01TbzRqfohaaDWb6kPecRqGmemb')
# This is where the run logs will be saved
logging_savepath = tempfile.mkdtemp()
# read GT codes dataframe
GTcodes = read_csv('../../histomicstk/saliency/tests/saliency_GTcodes.csv')
In [3]:
# deleting existing annotations in target slide (if any)
delete_annotations_in_slide(gc, SAMPLE_SLIDE_ID)
In [4]:
GTcodes
Out[4]:
In [5]:
print(Cellularity_detector_thresholding.__doc__)
The only required arguments to initialize are gc, slide_id, and GTcodes.
Everything else is optional and assigned defaults, but you may want to read up on what each argument does to adjust to your specific needs. The default behavior is defined at the beginning of the __init__() method.
In [6]:
print(Cellularity_detector_thresholding.__init__.__doc__)
In [7]:
# init cellularity detector
cdt = Cellularity_detector_thresholding(
    gc, slide_id=SAMPLE_SLIDE_ID, GTcodes=GTcodes,
    verbose=2, monitorPrefix='test',
    logging_savepath=logging_savepath)
By default, color normalization is performed using the Macenko method, standardizing to a hematoxylin and eosin standard from the target image TCGA-A2-A3XS-DX1_xmin21421_ymin37486 from Amgad et al., 2019.
If you would prefer to use your own target image or a different color normalization method, use the set_color_normalization_target() method documented below.
In [8]:
print(cdt.set_color_normalization_target.__doc__)
In [9]:
tissue_pieces = cdt.run()
In [10]:
print(
    'Tissue piece 0: ',
    'xmin', tissue_pieces[0].xmin,
    'xmax', tissue_pieces[0].xmax,
    'ymin', tissue_pieces[0].ymin,
    'ymax', tissue_pieces[0].ymax,
)
In [11]:
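# color map: one color per label code (0-255)
tmp = tissue_pieces[0].labeled.copy()
# write 0..255 into the first row so that imshow's autoscaling spans the
# full range and label code k maps to vals[k]
tmp[0, :256] = np.arange(256)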
vals = ['black'] * 256
vals[6] = 'cyan' # sharpie / ink
vals[7] = 'yellow' # blood
vals[8] = 'grey' # whitespace
vals[9] = 'indigo' # maybe cellular
vals[10] = 'green' # salient / top cellular
cMap = ListedColormap(vals)
plt.figure(figsize=(10,10))
plt.imshow(tmp, cmap=cMap)
plt.show()
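Besides plotting, it can be handy to summarize how much of a tissue piece falls into each class. The sketch below assumes the label codes listed in the colormap comments above (6 to 10) and uses a small synthetic array in place of tissue_pieces[0].labeled. The helper label_fractions and the LABEL_NAMES dictionary are hypothetical names introduced for illustration, not part of HistomicsTK.

```python
import numpy as np

# Label codes taken from the colormap comments in this example
LABEL_NAMES = {
    6: 'sharpie / ink',
    7: 'blood',
    8: 'whitespace',
    9: 'maybe cellular',
    10: 'salient / top cellular',
}


def label_fractions(labeled, names=LABEL_NAMES):
    """Return {class name: fraction of pixels} for a labeled mask."""
    codes, counts = np.unique(labeled, return_counts=True)
    total = labeled.size
    return {
        names.get(int(code), f'other ({int(code)})'): float(count) / total
        for code, count in zip(codes, counts)
    }


# Synthetic 4x5 mask standing in for tissue_pieces[0].labeled
demo = np.array([
    [8, 8, 8, 8, 8],
    [8, 9, 9, 10, 10],
    [8, 9, 10, 10, 10],
    [6, 6, 7, 9, 9],
])
fractions = label_fractions(demo)
for name, frac in fractions.items():
    print(f'{name}: {frac:.0%}')
```

With a real run, passing tissue_pieces[0].labeled instead of demo would give the per-class area fractions for that tissue piece; any code outside 6 to 10 is reported under an "other" key rather than silently dropped.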