Analyzing remotely hosted images with the girder client

This example describes how to run image analysis workflows on collections of slides hosted on a DSA server. The API is used to retrieve pixel data from the server, the analysis is performed locally, and the results are pushed back to the server as annotations for visualization. Note that this is not the same approach as using girder tasks to run jobs remotely as a user through HistomicsUI. This utility is intended for developers of image analysis and machine learning algorithms.

In this example, we run a cellularity detection workflow on slides in a source girder folder and post the results to a separate results girder folder. The two folders are identified below by SAMPLE_SOURCE_FOLDER_ID and SAMPLE_DESTINATION_FOLDER_ID.

Where to look?

|_ histomicstk/
   |_workflows/
      |_workflow_runner.py 
      |_specific_workflows.py 
      |_tests/
         |_test_workflow_runner.py

In [1]:
import shutil
import tempfile
import girder_client
# import numpy as np
from pandas import read_csv
# from histomicstk.saliency.cellularity_detection import (
#     Cellularity_detector_superpixels)
from histomicstk.saliency.cellularity_detection_thresholding import (
    Cellularity_detector_thresholding)
from histomicstk.workflows.workflow_runner import (
    Workflow_runner, Slide_iterator)
from histomicstk.workflows.specific_workflows import (
    cellularity_detection_workflow)

Connect girder client and set analysis parameters


In [2]:
APIURL = 'http://candygram.neurology.emory.edu:8080/api/v1/'
SAMPLE_SOURCE_FOLDER_ID = "5d5c28c6bd4404c6b1f3d598"
SAMPLE_DESTINATION_FOLDER_ID = "5d9246f6bd4404c6b1faaa89"

# girder client
gc = girder_client.GirderClient(apiUrl=APIURL)
# gc.authenticate(interactive=True)
gc.authenticate(apiKey='kri19nTIGOkWH01TbzRqfohaaDWb6kPecRqGmemb')

# This is where the run logs will be saved
logging_savepath = tempfile.mkdtemp()

# params for cellularity thresholding
cdt_params = {
    'gc': gc,
    'slide_id': '',  # this will be handled by the slide iterator
    'GTcodes': read_csv('../../histomicstk/saliency/tests/saliency_GTcodes.csv'),
    'MAG': 3.0,
    'visualize': True,
    'verbose': 2,
    'logging_savepath': logging_savepath,
}

Explore the docs

Workflow_runner

You will need to use the Workflow_runner() object.


In [3]:
print(Workflow_runner.__init__.__doc__)


Init Workflow_runner object.

        Arguments
        -----------
        slide_iterator : object
            Slide_iterator object
        workflow : method
            method whose parameters include slide_id and monitorPrefix,
            which is called for each slide
        workflow_kwargs : dict
            keyword arguments for the workflow method
        kwargs : key-value pairs
            The following are already assigned defaults by Base_HTK_Class
            but can be passed here to override defaults
            [verbose, monitorPrefix, logging_savepath, suppress_warnings]

        

Note how this requires:

  • Slide_iterator instance - which yields information about the slides you want to run the workflow on.
  • workflow - a method that you define, which runs on a single slide.
  • workflow_kwargs - parameters for your defined method.

In this example, we will be using cellularity_detection_workflow() as our workflow to run, which is defined in the histomicstk.workflows.specific_workflows module.
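
To make the contract between the runner and a workflow concrete, here is a minimal sketch of a user-defined workflow and a simplified stand-in for the per-slide loop. Everything in it is illustrative: count_name_characters_workflow and run_for_each_slide are hypothetical names, the slide records are made-up dictionaries, and the real Workflow_runner may build its monitorPrefix strings and handle errors differently. The only detail taken from the docstring above is that the workflow accepts slide_id and monitorPrefix and is called once per slide with the extra keyword arguments.

```python
def count_name_characters_workflow(slide_id, monitorPrefix, results, slide_names):
    """Toy per-slide workflow: record the length of the slide name.

    A real workflow would fetch pixels with the girder client, analyze
    them, and post annotations. This only shows the required signature.
    """
    name = slide_names[slide_id]
    print("%s: processing %s" % (monitorPrefix, name))
    results[slide_id] = len(name)


def run_for_each_slide(slides, workflow, workflow_kwargs, monitorPrefix=""):
    """Simplified stand-in for a runner loop (not the HistomicsTK code)."""
    n_slides = len(slides)
    for idx, slide in enumerate(slides, start=1):
        # Build a progress prefix for this slide, then call the workflow.
        prefix = "%s: slide %d of %d (%s)" % (
            monitorPrefix, idx, n_slides, slide["name"])
        workflow(slide_id=slide["_id"], monitorPrefix=prefix,
                 **workflow_kwargs)


# Made-up slide records, used only to exercise the sketch offline.
slides = [
    {"_id": "a1", "name": "slide_A.svs"},
    {"_id": "b2", "name": "slide_B_long.svs"},
]
results = {}
run_for_each_slide(
    slides,
    workflow=count_name_characters_workflow,
    workflow_kwargs={
        "results": results,
        "slide_names": {s["_id"]: s["name"] for s in slides},
    },
    monitorPrefix="demo",
)
```

Keeping the per-slide logic in a plain function with keyword arguments means the same function can be tested on one slide before it is handed to a runner.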

Slide_iterator


In [4]:
print(Slide_iterator.__init__.__doc__)


Init Slide_iterator object.

        Arguments
        -----------
        gc : object
            girder client object
        source_folder_id : str
            girder ID of folder in which slides are located
        keep_slides : list
            List of slide names to keep. If None, all are kept.
        discard_slides : list
            List of slide names to discard.
        kwargs : key-value pairs
            The following are already assigned defaults by Base_HTK_Class
            but can be passed here to override defaults
            [verbose, monitorPrefix, logger, logging_savepath,
            suppress_warnings]
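
The keep_slides and discard_slides arguments can be pictured as a simple name filter. The sketch below is an assumption about how such filtering might behave, not the Slide_iterator implementation: filter_slides is a hypothetical helper, and applying the discard list after the keep list is a choice made here for illustration.

```python
def filter_slides(slide_names, keep_slides=None, discard_slides=None):
    """Return the names to process, preserving the original order.

    keep_slides=None keeps everything; discard_slides removes names.
    (Hypothetical helper, not part of HistomicsTK.)
    """
    discard = set(discard_slides or [])
    if keep_slides is None:
        kept = list(slide_names)
    else:
        keep = set(keep_slides)
        kept = [name for name in slide_names if name in keep]
    return [name for name in kept if name not in discard]


names = ["A.svs", "B.svs", "C.svs"]
all_but_b = filter_slides(names, discard_slides=["B.svs"])
only_a = filter_slides(names, keep_slides=["A.svs", "Z.svs"])
```

Names in keep_slides that do not exist in the folder (here "Z.svs") are silently ignored in this sketch; whether the library warns in that case is not stated in the docstring.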

        

Specific workflow to run


In [5]:
print(cellularity_detection_workflow.__doc__)


Run cellularity detection for single slide.

    The cellularity detection algorithm can either be
    Cellularity_detector_superpixels or Cellularity_detector_thresholding.

    Arguments
    -----------
    gc : object
        girder client object
    cdo : object
        Cellularity_detector object instance. Can either be
        Cellularity_detector_superpixels() or
        Cellularity_detector_thresholding(). The thresholding-based workflow
        seems to be more robust, despite being simpler.
    slide_id : str
        girder id of slide on which workflow is done
    monitoPrefix : str
        this will set the cds monitorPrefix attribute
    destination_folder_id : str or None
        if not None, copy slide to this girder folder and post results
        there instead of original slide.
    keep_existing_annotations : bool
        keep existing annotations in slide when posting results?

    

Initialize the workflow runner


In [6]:
# Init specific workflow (Cellularity_detector_thresholding)
cdt = Cellularity_detector_thresholding(**cdt_params)

# Init workflow runner
workflow_runner = Workflow_runner(
    slide_iterator=Slide_iterator(
        gc, source_folder_id=SAMPLE_SOURCE_FOLDER_ID,
        # keep_slides=None),  # run all slides in girder directory
        keep_slides=[  # run specific slides only
            'TCGA-A1-A0SK-01Z-00-DX1_POST.svs', 
            'TCGA-A2-A04Q-01Z-00-DX1_POST.svs', 
        ]),
    workflow=cellularity_detection_workflow,
    workflow_kwargs={
        'gc': gc,
        'cdo': cdt,
        'destination_folder_id': SAMPLE_DESTINATION_FOLDER_ID,
        'keep_existing_annotations': False, },
    logging_savepath=cdt.logging_savepath,
    monitorPrefix='test')


Saving logs to: /tmp/tmpz7t9bo3m/2019-10-27_18-53.log
Saving logs to: /tmp/tmpz7t9bo3m/2019-10-27_18-53.log

Run the detector


In [7]:
workflow_runner.run()


test: slide 1 of 2 (TCGA-A1-A0SK-01Z-00-DX1_POST.svs): copying slide to destination folder
test: slide 1 of 2 (TCGA-A1-A0SK-01Z-00-DX1_POST.svs): set_slide_info_and_get_tissue_mask()
test: slide 1 of 2 (TCGA-A1-A0SK-01Z-00-DX1_POST.svs): Tissue piece 1 of 1
test: slide 1 of 2 (TCGA-A1-A0SK-01Z-00-DX1_POST.svs): Tissue piece 1 of 1: set_tissue_rgb()
test: slide 1 of 2 (TCGA-A1-A0SK-01Z-00-DX1_POST.svs): Tissue piece 1 of 1: initialize_labeled_mask()
test: slide 1 of 2 (TCGA-A1-A0SK-01Z-00-DX1_POST.svs): Tissue piece 1 of 1: assign_components_by_thresholding()
test: slide 1 of 2 (TCGA-A1-A0SK-01Z-00-DX1_POST.svs): Tissue piece 1 of 1: -- get HSI and LAB images ...
test: slide 1 of 2 (TCGA-A1-A0SK-01Z-00-DX1_POST.svs): Tissue piece 1 of 1: -- thresholding blue_sharpie ...
test: slide 1 of 2 (TCGA-A1-A0SK-01Z-00-DX1_POST.svs): Tissue piece 1 of 1: -- thresholding blood ...
test: slide 1 of 2 (TCGA-A1-A0SK-01Z-00-DX1_POST.svs): Tissue piece 1 of 1: -- thresholding whitespace ...
test: slide 1 of 2 (TCGA-A1-A0SK-01Z-00-DX1_POST.svs): Tissue piece 1 of 1: color_normalize_unspecified_components()
test: slide 1 of 2 (TCGA-A1-A0SK-01Z-00-DX1_POST.svs): Tissue piece 1 of 1: -- macenko normalization ...
test: slide 1 of 2 (TCGA-A1-A0SK-01Z-00-DX1_POST.svs): Tissue piece 1 of 1: find_potentially_cellular_regions()
test: slide 1 of 2 (TCGA-A1-A0SK-01Z-00-DX1_POST.svs): Tissue piece 1 of 1: find_top_cellular_regions()
test: slide 1 of 2 (TCGA-A1-A0SK-01Z-00-DX1_POST.svs): Tissue piece 1 of 1: visualize_results()
test: slide 2 of 2 (TCGA-A2-A04Q-01Z-00-DX1_POST.svs): copying slide to destination folder
test: slide 2 of 2 (TCGA-A2-A04Q-01Z-00-DX1_POST.svs): set_slide_info_and_get_tissue_mask()
test: slide 2 of 2 (TCGA-A2-A04Q-01Z-00-DX1_POST.svs): Tissue piece 1 of 2
test: slide 2 of 2 (TCGA-A2-A04Q-01Z-00-DX1_POST.svs): Tissue piece 1 of 2: set_tissue_rgb()
test: slide 2 of 2 (TCGA-A2-A04Q-01Z-00-DX1_POST.svs): Tissue piece 1 of 2: initialize_labeled_mask()
test: slide 2 of 2 (TCGA-A2-A04Q-01Z-00-DX1_POST.svs): Tissue piece 1 of 2: assign_components_by_thresholding()
test: slide 2 of 2 (TCGA-A2-A04Q-01Z-00-DX1_POST.svs): Tissue piece 1 of 2: -- get HSI and LAB images ...
test: slide 2 of 2 (TCGA-A2-A04Q-01Z-00-DX1_POST.svs): Tissue piece 1 of 2: -- thresholding blue_sharpie ...
test: slide 2 of 2 (TCGA-A2-A04Q-01Z-00-DX1_POST.svs): Tissue piece 1 of 2: -- thresholding blood ...
test: slide 2 of 2 (TCGA-A2-A04Q-01Z-00-DX1_POST.svs): Tissue piece 1 of 2: -- thresholding whitespace ...
test: slide 2 of 2 (TCGA-A2-A04Q-01Z-00-DX1_POST.svs): Tissue piece 1 of 2: color_normalize_unspecified_components()
test: slide 2 of 2 (TCGA-A2-A04Q-01Z-00-DX1_POST.svs): Tissue piece 1 of 2: -- macenko normalization ...
test: slide 2 of 2 (TCGA-A2-A04Q-01Z-00-DX1_POST.svs): Tissue piece 1 of 2: find_potentially_cellular_regions()
test: slide 2 of 2 (TCGA-A2-A04Q-01Z-00-DX1_POST.svs): Tissue piece 1 of 2: find_top_cellular_regions()
test: slide 2 of 2 (TCGA-A2-A04Q-01Z-00-DX1_POST.svs): Tissue piece 1 of 2: visualize_results()
test: slide 2 of 2 (TCGA-A2-A04Q-01Z-00-DX1_POST.svs): Tissue piece 2 of 2
test: slide 2 of 2 (TCGA-A2-A04Q-01Z-00-DX1_POST.svs): Tissue piece 2 of 2: set_tissue_rgb()
test: slide 2 of 2 (TCGA-A2-A04Q-01Z-00-DX1_POST.svs): Tissue piece 2 of 2: initialize_labeled_mask()
test: slide 2 of 2 (TCGA-A2-A04Q-01Z-00-DX1_POST.svs): Tissue piece 2 of 2: assign_components_by_thresholding()
test: slide 2 of 2 (TCGA-A2-A04Q-01Z-00-DX1_POST.svs): Tissue piece 2 of 2: -- get HSI and LAB images ...
test: slide 2 of 2 (TCGA-A2-A04Q-01Z-00-DX1_POST.svs): Tissue piece 2 of 2: -- thresholding blue_sharpie ...
test: slide 2 of 2 (TCGA-A2-A04Q-01Z-00-DX1_POST.svs): Tissue piece 2 of 2: -- thresholding blood ...
test: slide 2 of 2 (TCGA-A2-A04Q-01Z-00-DX1_POST.svs): Tissue piece 2 of 2: -- thresholding whitespace ...
test: slide 2 of 2 (TCGA-A2-A04Q-01Z-00-DX1_POST.svs): Tissue piece 2 of 2: color_normalize_unspecified_components()
test: slide 2 of 2 (TCGA-A2-A04Q-01Z-00-DX1_POST.svs): Tissue piece 2 of 2: -- macenko normalization ...
test: slide 2 of 2 (TCGA-A2-A04Q-01Z-00-DX1_POST.svs): Tissue piece 2 of 2: find_potentially_cellular_regions()
test: slide 2 of 2 (TCGA-A2-A04Q-01Z-00-DX1_POST.svs): Tissue piece 2 of 2: find_top_cellular_regions()
test: slide 2 of 2 (TCGA-A2-A04Q-01Z-00-DX1_POST.svs): Tissue piece 2 of 2: visualize_results()

Check the DSA/HistomicsTK visualization

You can now open the Digital Slide Archive and inspect the posted results in the results girder folder.
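
The results can also be checked programmatically. The following is a minimal sketch that counts the annotation documents attached to each item in a folder. It assumes the server exposes the annotation/item/{id} endpoint (provided by the annotation plugin on DSA servers) and that the client offers listItem() and get() as girder_client.GirderClient does. count_annotations_per_slide and _FakeClient are hypothetical names, and the fake client exists only so the sketch runs without a server.

```python
def count_annotations_per_slide(gc, folder_id):
    """Map each item name in a folder to its number of annotation documents."""
    counts = {}
    for item in gc.listItem(folder_id):
        # Assumed endpoint: returns a list of annotation documents.
        annotations = gc.get("annotation/item/%s" % item["_id"])
        counts[item["name"]] = len(annotations)
    return counts


class _FakeClient:
    """Offline stand-in for a girder client (made-up data)."""

    def listItem(self, folder_id):
        return iter([
            {"_id": "x1", "name": "slide_1.svs"},
            {"_id": "x2", "name": "slide_2.svs"},
        ])

    def get(self, path):
        responses = {
            "annotation/item/x1": [{"_id": "ann1"}],
            "annotation/item/x2": [],
        }
        return responses[path]


demo_counts = count_annotations_per_slide(_FakeClient(), "folder-id")
```

With a real connection, the call would presumably look like count_annotations_per_slide(gc, SAMPLE_DESTINATION_FOLDER_ID), using the authenticated client created earlier.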