Local backup and SQL querying of annotation data

Overview:

Annotations represent a significant time investment for the users who generate them, and they should be backed up frequently. The simplest way to back up the annotations in a DSA database is to perform a mongodump operation. While frequent mongodump operations are always important to guard against failures, they have the following disadvantages:

You need to have access to the server where the annotations are hosted.
The entire Mongo database is backed up, not just the folder you care about.
You cannot query the database using SQL queries.

HistomicsTK has utility functions that allow the recursive backup of a girder database locally as a combination of .json files (most similar to the raw format), tabular files (.csv), and/or an SQLite database.

The SQLite database can easily be viewed using, for example, an offline sqlite viewer or even an online sqlite viewer.
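If no viewer is at hand, the same file can be inspected from Python. The following is a minimal sketch using only the standard-library sqlite3 module; the helper name list_tables and the in-memory stand-in database are illustrative assumptions, not part of HistomicsTK.

```python
import sqlite3


def list_tables(connection):
    """Return the names of all tables in an SQLite database, sorted."""
    rows = connection.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    return [name for (name,) in rows]


# Stand-in database so the sketch runs anywhere. In practice you would
# open the backup file instead, e.g. sqlite3.connect('/path/to/backup.sqlite')
# (hypothetical path).
con = sqlite3.connect(":memory:")
con.execute('CREATE TABLE folders ("_id" TEXT, "name" TEXT)')
con.execute('CREATE TABLE items ("_id" TEXT, "name" TEXT, "folderId" TEXT)')

tables = list_tables(con)
print(tables)
```

Listing the tables first is a quick way to confirm which of the optional outputs were actually written before running any queries.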

Where to look:

|_histomicstk/
   |_annotations_and_masks/
      |_annotation_database_parser.py
      |_annotation_and_mask_utils.py -> parse_slide_annotations_into_tables()
      |_tests/
         |_test_annotation_database_parser.py
         |_test_annotation_and_mask_utils.py -> test_parse_slide_annotations_into_table()



In [1]:

    
import os
import pandas as pd
import sqlalchemy as db

from histomicstk.utils.girder_convenience_utils import connect_to_api
from histomicstk.annotations_and_masks.annotation_database_parser import (
    dump_annotations_locally, parse_annotations_to_local_tables)

Connect and set parameters

We use an API key to connect to the remote server, set the girder ID of the folder we want to back up, and set the local path where the backup will be stored.



In [2]:

    
gc = connect_to_api(
    apiurl='http://candygram.neurology.emory.edu:8080/api/v1/',
    apikey='kri19nTIGOkWH01TbzRqfohaaDWb6kPecRqGmemb')

# This is the girder ID of the folder we would like to back up and parse locally
SAMPLE_FOLDER_ID = "5e24c20dddda5f8398695671"

# This is where the annotations and sqlite database will be dumped locally
savepath = '/home/mtageld/Desktop/tmp/concordance/'
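For your own server, you may prefer not to hard-code the API key in a notebook. The sketch below is one possible way to read the connection settings from environment variables; the helper load_connection_settings and the variable names DSA_API_URL and DSA_API_KEY are hypothetical and not defined by HistomicsTK. Only the keyword names apiurl and apikey come from the call above.

```python
import os


def load_connection_settings(environ=os.environ):
    """Build keyword arguments for connect_to_api() from a mapping.

    DSA_API_URL and DSA_API_KEY are hypothetical variable names chosen
    for this example.
    """
    required = ("DSA_API_URL", "DSA_API_KEY")
    missing = [key for key in required if not environ.get(key)]
    if missing:
        raise RuntimeError(
            "Missing environment variables: " + ", ".join(missing))
    return {"apiurl": environ["DSA_API_URL"], "apikey": environ["DSA_API_KEY"]}


# Demonstration with an explicit mapping instead of the real environment
settings = load_connection_settings({
    "DSA_API_URL": "http://localhost:8080/api/v1/",
    "DSA_API_KEY": "not-a-real-key",
})
print(sorted(settings))

# With histomicstk installed and a reachable server, this would become:
# gc = connect_to_api(**settings)
```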

Examine functions for pulling annotation data

This is the main function you will use to walk the folder and pull the annotations from the remote server.



In [3]:

    
print(dump_annotations_locally.__doc__)









    



Dump annotations of folder and subfolders locally recursively.

    This reproduces this tiered structure locally and (possibly) dumps
    annotations there. Adapted from Lee A.D. Cooper

    Parameters
    -----------
    gc : girder_client.GirderClient
        authenticated girder client instance

    folderid : str
        girder id of source (base) folder

    local : str
        local path to dump annotations

    save_json : bool
        whether to dump annotations as json file

    save_sqlite : bool
        whether to save the backup into an sqlite database

    dbcon : sqlalchemy.create_engine.connect() object
        IGNORE THIS PARAMETER!! This is used internally.

    callback : function
        function to call that CAN accept AT LEAST the following params
        - item: girder response with item information
        - annotations: loaded annotations
        - local: local directory
        - monitorPrefix: string
        - dbcon: sqlalchemy.create_engine.connect() object
        You can just add kwargs at the end of your callback definition
        for simplicity.

    callback_kwargs : dict
        kwargs to pass along to callback. DO NOT pass any of the parameters
        item, annotations, local, monitorPrefix, or dbcon as these will be
        internally passed. Just include any specific paremeters for the
        callback. See parse_annotations_to_local_tables() above for
        an example of a callback and the unir test of this function.

dump_annotations_locally() optionally calls the following function, as its callback, to parse annotations into tables that are added to an SQLite database.



In [4]:

    
print(parse_annotations_to_local_tables.__doc__)









    



Parse loaded annotations for slide into tables.

    Parameters
    ----------
    item : dict
        girder response with item information

    annotations : dict
        loaded annotations

    local : str
        local directory

    save_csv : bool
        whether to use histomicstk.annotations_and_masks.annotation_and_mask.
        parse_slide_annotations_into_tables() to get a tabular representation
        (including some simple calculations like bounding box) and save
        the output as two csv files, one representing the annotation documents
        and the other representing the actual annotation elements (polygons).

    save_sqlite : bool
        whether to save the backup into an sqlite database

    dbcon : sqlalchemy.create_engine.connect() object
        IGNORE THIS PARAMETER!! This is used internally.

    monitorPrefix : str
        text to prepend to printed statements
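The callback contract described in the docstring of dump_annotations_locally() can be illustrated with a small custom callback. This is only a sketch: count_elements_callback and its summary_suffix parameter are hypothetical, and it assumes the loaded annotations are a list of documents shaped like doc['annotation']['elements'], which you should verify against your own data. The trailing **kwargs absorbs internally passed parameters such as dbcon, as the docstring suggests.

```python
import json
import os
import tempfile


def count_elements_callback(
        item, annotations, local, monitorPrefix='',
        summary_suffix='_element_counts.json', **kwargs):
    """Hypothetical callback: save the element count of each document."""
    counts = {
        doc.get('_id', str(i)): len(doc.get('annotation', {}).get('elements', []))
        for i, doc in enumerate(annotations)
    }
    out_path = os.path.join(local, item['name'] + summary_suffix)
    with open(out_path, 'w') as f:
        json.dump(counts, f)
    print('%s: wrote %s' % (monitorPrefix, out_path))
    # The return value is only used by the demonstration below
    return counts


# Stand-in inputs so the sketch runs without a girder server
fake_item = {'_id': 'item-1', 'name': 'slide.svs'}
fake_annotations = [
    {'_id': 'doc-1', 'annotation': {'elements': [{'type': 'polyline'}]}},
    {'_id': 'doc-2', 'annotation': {'elements': []}},
]
with tempfile.TemporaryDirectory() as tmp:
    result = count_elements_callback(
        item=fake_item, annotations=fake_annotations, local=tmp,
        monitorPrefix='demo', dbcon=None)
```

Following the docstring, such a function would be passed as callback=count_elements_callback, with any extra parameters such as summary_suffix supplied through callback_kwargs.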

Case 1: Simple backup

The simplest case is to back up the information about the girder folders, items, and annotations as .json files, with the folder structure replicated locally as it is in the girder database. The user may also elect to save the folder and item/slide information (but not the annotations) as the following tables in an SQLite database:

folders: all girder folders contained within the folder that the user wants to back up. This includes a 'folder_path' convenience column holding the absolute girder path. The column '_id' is the unique girder ID.
items: all items (slides). The column '_id' is the unique girder ID, and each item is linked to the folders table by the 'folderId' column.

Here is the syntax:



In [5]:

    
# recursively save annotations -- JSONs + sqlite for folders/items
dump_annotations_locally(
    gc, folderid=SAMPLE_FOLDER_ID, local=savepath,
    save_json=True, save_sqlite=True)









    



: save folder info
Participant_1: save folder info
Participant_1: slide 1 of 5 (TCGA-A1-A0SK-01Z-00-DX1.A44D70FA-4D96-43F4-9DD7-A61535786297.svs): save item info
Participant_1: slide 1 of 5 (TCGA-A1-A0SK-01Z-00-DX1.A44D70FA-4D96-43F4-9DD7-A61535786297.svs): load annotations
Participant_1: slide 1 of 5 (TCGA-A1-A0SK-01Z-00-DX1.A44D70FA-4D96-43F4-9DD7-A61535786297.svs): save annotations
Participant_1: slide 2 of 5 (TCGA-A2-A0YM-01Z-00-DX1.A48B4C96-2CC5-464C-98B7-F0F92AE56533.svs): save item info
Participant_1: slide 2 of 5 (TCGA-A2-A0YM-01Z-00-DX1.A48B4C96-2CC5-464C-98B7-F0F92AE56533.svs): load annotations
Participant_1: slide 2 of 5 (TCGA-A2-A0YM-01Z-00-DX1.A48B4C96-2CC5-464C-98B7-F0F92AE56533.svs): save annotations
Participant_1: slide 3 of 5 (TCGA-A7-A0DA-01Z-00-DX1.5F087009-16E9-4A07-BA24-62340E108B17.svs): save item info
Participant_1: slide 3 of 5 (TCGA-A7-A0DA-01Z-00-DX1.5F087009-16E9-4A07-BA24-62340E108B17.svs): load annotations
Participant_1: slide 3 of 5 (TCGA-A7-A0DA-01Z-00-DX1.5F087009-16E9-4A07-BA24-62340E108B17.svs): save annotations
Participant_1: slide 4 of 5 (TCGA-AR-A1AY-01Z-00-DX1.6AC0BE3B-FFC5-4EDA-9E40-B18CAAC52B81.svs): save item info
Participant_1: slide 4 of 5 (TCGA-AR-A1AY-01Z-00-DX1.6AC0BE3B-FFC5-4EDA-9E40-B18CAAC52B81.svs): load annotations
Participant_1: slide 4 of 5 (TCGA-AR-A1AY-01Z-00-DX1.6AC0BE3B-FFC5-4EDA-9E40-B18CAAC52B81.svs): save annotations
Participant_1: slide 5 of 5 (TCGA-BH-A0BG-01Z-00-DX1.0838FB7F-8C85-4687-9F70-D136A1063383.svs): save item info
Participant_1: slide 5 of 5 (TCGA-BH-A0BG-01Z-00-DX1.0838FB7F-8C85-4687-9F70-D136A1063383.svs): load annotations
Participant_1: slide 5 of 5 (TCGA-BH-A0BG-01Z-00-DX1.0838FB7F-8C85-4687-9F70-D136A1063383.svs): save annotations
Participant_2: save folder info
Participant_2: slide 1 of 5 (TCGA-A1-A0SK-01Z-00-DX1.A44D70FA-4D96-43F4-9DD7-A61535786297.svs): save item info
Participant_2: slide 1 of 5 (TCGA-A1-A0SK-01Z-00-DX1.A44D70FA-4D96-43F4-9DD7-A61535786297.svs): load annotations
Participant_2: slide 1 of 5 (TCGA-A1-A0SK-01Z-00-DX1.A44D70FA-4D96-43F4-9DD7-A61535786297.svs): save annotations
Participant_2: slide 2 of 5 (TCGA-A2-A0YM-01Z-00-DX1.A48B4C96-2CC5-464C-98B7-F0F92AE56533.svs): save item info
Participant_2: slide 2 of 5 (TCGA-A2-A0YM-01Z-00-DX1.A48B4C96-2CC5-464C-98B7-F0F92AE56533.svs): load annotations
Participant_2: slide 2 of 5 (TCGA-A2-A0YM-01Z-00-DX1.A48B4C96-2CC5-464C-98B7-F0F92AE56533.svs): save annotations
Participant_2: slide 3 of 5 (TCGA-A7-A0DA-01Z-00-DX1.5F087009-16E9-4A07-BA24-62340E108B17.svs): save item info
Participant_2: slide 3 of 5 (TCGA-A7-A0DA-01Z-00-DX1.5F087009-16E9-4A07-BA24-62340E108B17.svs): load annotations
Participant_2: slide 3 of 5 (TCGA-A7-A0DA-01Z-00-DX1.5F087009-16E9-4A07-BA24-62340E108B17.svs): save annotations
Participant_2: slide 4 of 5 (TCGA-AR-A1AY-01Z-00-DX1.6AC0BE3B-FFC5-4EDA-9E40-B18CAAC52B81.svs): save item info
Participant_2: slide 4 of 5 (TCGA-AR-A1AY-01Z-00-DX1.6AC0BE3B-FFC5-4EDA-9E40-B18CAAC52B81.svs): load annotations
Participant_2: slide 4 of 5 (TCGA-AR-A1AY-01Z-00-DX1.6AC0BE3B-FFC5-4EDA-9E40-B18CAAC52B81.svs): save annotations
Participant_2: slide 5 of 5 (TCGA-BH-A0BG-01Z-00-DX1.0838FB7F-8C85-4687-9F70-D136A1063383.svs): save item info
Participant_2: slide 5 of 5 (TCGA-BH-A0BG-01Z-00-DX1.0838FB7F-8C85-4687-9F70-D136A1063383.svs): load annotations
Participant_2: slide 5 of 5 (TCGA-BH-A0BG-01Z-00-DX1.0838FB7F-8C85-4687-9F70-D136A1063383.svs): save annotations

Check the results



In [6]:

    
!tree '/home/mtageld/Desktop/tmp/concordance/'









    



/home/mtageld/Desktop/tmp/concordance/
├── Concordance.json
├── Concordance.sqlite
├── Participant_1
│   ├── Participant_1.json
│   ├── TCGA-A1-A0SK-01Z-00-DX1.A44D70FA-4D96-43F4-9DD7-A61535786297.svs_annotations.json
│   ├── TCGA-A1-A0SK-01Z-00-DX1.A44D70FA-4D96-43F4-9DD7-A61535786297.svs.json
│   ├── TCGA-A2-A0YM-01Z-00-DX1.A48B4C96-2CC5-464C-98B7-F0F92AE56533.svs_annotations.json
│   ├── TCGA-A2-A0YM-01Z-00-DX1.A48B4C96-2CC5-464C-98B7-F0F92AE56533.svs.json
│   ├── TCGA-A7-A0DA-01Z-00-DX1.5F087009-16E9-4A07-BA24-62340E108B17.svs_annotations.json
│   ├── TCGA-A7-A0DA-01Z-00-DX1.5F087009-16E9-4A07-BA24-62340E108B17.svs.json
│   ├── TCGA-AR-A1AY-01Z-00-DX1.6AC0BE3B-FFC5-4EDA-9E40-B18CAAC52B81.svs_annotations.json
│   ├── TCGA-AR-A1AY-01Z-00-DX1.6AC0BE3B-FFC5-4EDA-9E40-B18CAAC52B81.svs.json
│   ├── TCGA-BH-A0BG-01Z-00-DX1.0838FB7F-8C85-4687-9F70-D136A1063383.svs_annotations.json
│   └── TCGA-BH-A0BG-01Z-00-DX1.0838FB7F-8C85-4687-9F70-D136A1063383.svs.json
└── Participant_2
    ├── Participant_2.json
    ├── TCGA-A1-A0SK-01Z-00-DX1.A44D70FA-4D96-43F4-9DD7-A61535786297.svs_annotations.json
    ├── TCGA-A1-A0SK-01Z-00-DX1.A44D70FA-4D96-43F4-9DD7-A61535786297.svs.json
    ├── TCGA-A2-A0YM-01Z-00-DX1.A48B4C96-2CC5-464C-98B7-F0F92AE56533.svs_annotations.json
    ├── TCGA-A2-A0YM-01Z-00-DX1.A48B4C96-2CC5-464C-98B7-F0F92AE56533.svs.json
    ├── TCGA-A7-A0DA-01Z-00-DX1.5F087009-16E9-4A07-BA24-62340E108B17.svs_annotations.json
    ├── TCGA-A7-A0DA-01Z-00-DX1.5F087009-16E9-4A07-BA24-62340E108B17.svs.json
    ├── TCGA-AR-A1AY-01Z-00-DX1.6AC0BE3B-FFC5-4EDA-9E40-B18CAAC52B81.svs_annotations.json
    ├── TCGA-AR-A1AY-01Z-00-DX1.6AC0BE3B-FFC5-4EDA-9E40-B18CAAC52B81.svs.json
    ├── TCGA-BH-A0BG-01Z-00-DX1.0838FB7F-8C85-4687-9F70-D136A1063383.svs_annotations.json
    └── TCGA-BH-A0BG-01Z-00-DX1.0838FB7F-8C85-4687-9F70-D136A1063383.svs.json

2 directories, 24 files
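On systems without the tree command, the backup can also be checked from Python. The sketch below collects the annotation files by the '_annotations.json' suffix seen in the listing above; the helper find_annotation_files and the stand-in directory are illustrative, and the assumption that each annotation file holds a JSON list of documents should be checked against a real dump.

```python
import json
import tempfile
from pathlib import Path


def find_annotation_files(backup_root):
    """Return all *_annotations.json files below backup_root, sorted."""
    return sorted(Path(backup_root).rglob('*_annotations.json'))


with tempfile.TemporaryDirectory() as tmp:
    # Build a tiny stand-in for the backup tree
    folder = Path(tmp) / 'Participant_1'
    folder.mkdir()
    (folder / 'Participant_1.json').write_text('{}')
    (folder / 'slide_a.svs.json').write_text('{}')
    (folder / 'slide_a.svs_annotations.json').write_text(
        json.dumps([{'_id': 'doc-1'}]))

    found = find_annotation_files(tmp)
    names = [path.name for path in found]
    loaded = json.loads(found[0].read_text())

print(names)
```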

Query the database



In [7]:

    
# Connect to the database
sql_engine = db.create_engine(
    'sqlite:///%s/Concordance.sqlite' % savepath)
dbcon = sql_engine.connect()



In [8]:

    
# folders table
folders_df = pd.read_sql_query(
    """
    SELECT "_id", "name", "folder_path"
    FROM "folders"
    ;""", dbcon)

folders_df









    Out[8]:







  
    
      
   _id                       name           folder_path
0  5e24c20dddda5f8398695671  Concordance    UncrossPolygonTest/Concordance/
1  5e24c0dfddda5f839869556c  Participant_1  UncrossPolygonTest/Concordance/Participant_1/
2  5e24c0d3ddda5f8398694f06  Participant_2  UncrossPolygonTest/Concordance/Participant_2/



In [9]:

    
# items table
items_df = pd.read_sql_query(
    """
    SELECT "_id", "name", "folderId"
    FROM "items"
    ;""", dbcon)

items_df









    Out[9]:







  
    
      
   _id                       name                                               folderId
0  5e24c0dfddda5f8398695571  TCGA-A1-A0SK-01Z-00-DX1.A44D70FA-4D96-43F4-9DD...  5e24c0dfddda5f839869556c
1  5e24c0dfddda5f8398695586  TCGA-A2-A0YM-01Z-00-DX1.A48B4C96-2CC5-464C-98B...  5e24c0dfddda5f839869556c
2  5e24c0dfddda5f83986955b1  TCGA-A7-A0DA-01Z-00-DX1.5F087009-16E9-4A07-BA2...  5e24c0dfddda5f839869556c
3  5e24c0dfddda5f83986955c1  TCGA-AR-A1AY-01Z-00-DX1.6AC0BE3B-FFC5-4EDA-9E4...  5e24c0dfddda5f839869556c
4  5e24c0e0ddda5f83986955d8  TCGA-BH-A0BG-01Z-00-DX1.0838FB7F-8C85-4687-9F7...  5e24c0dfddda5f839869556c
5  5e24c0dbddda5f839869531a  TCGA-A1-A0SK-01Z-00-DX1.A44D70FA-4D96-43F4-9DD...  5e24c0d3ddda5f8398694f06
6  5e24c0dbddda5f8398695342  TCGA-A2-A0YM-01Z-00-DX1.A48B4C96-2CC5-464C-98B...  5e24c0d3ddda5f8398694f06
7  5e24c0dbddda5f8398695372  TCGA-A7-A0DA-01Z-00-DX1.5F087009-16E9-4A07-BA2...  5e24c0d3ddda5f8398694f06
8  5e24c0dcddda5f8398695387  TCGA-AR-A1AY-01Z-00-DX1.6AC0BE3B-FFC5-4EDA-9E4...  5e24c0d3ddda5f8398694f06
9  5e24c0dcddda5f83986953aa  TCGA-BH-A0BG-01Z-00-DX1.0838FB7F-8C85-4687-9F7...  5e24c0d3ddda5f8398694f06
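Because the items table is linked to the folders table through the 'folderId' column, the two can be joined in a single query. The sketch below demonstrates this on an in-memory stand-in database built with the standard-library sqlite3 module; the rows are made up for illustration, and only the table and column names are taken from the queries above.

```python
import sqlite3

# In-memory stand-in that mimics the 'folders' and 'items' tables
con = sqlite3.connect(':memory:')
con.execute(
    'CREATE TABLE folders ("_id" TEXT, "name" TEXT, "folder_path" TEXT)')
con.execute('CREATE TABLE items ("_id" TEXT, "name" TEXT, "folderId" TEXT)')
con.executemany('INSERT INTO folders VALUES (?, ?, ?)', [
    ('f1', 'Participant_1', 'Concordance/Participant_1/'),
    ('f2', 'Participant_2', 'Concordance/Participant_2/'),
])
con.executemany('INSERT INTO items VALUES (?, ?, ?)', [
    ('i1', 'slide_a.svs', 'f1'),
    ('i2', 'slide_b.svs', 'f1'),
    ('i3', 'slide_a.svs', 'f2'),
])

# Count the slides in each folder by joining items to folders
rows = con.execute(
    '''
    SELECT f."folder_path", COUNT(*) AS n_items
    FROM "items" AS i
    JOIN "folders" AS f ON i."folderId" = f."_id"
    GROUP BY f."folder_path"
    ORDER BY f."folder_path"
    ''').fetchall()
print(rows)
```

The same SELECT statement should work with pd.read_sql_query() and the dbcon connection used in this notebook, provided the backup contains both tables.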



In [10]:

    
# cleanup: delete the backup directory, then recreate it empty for Case 2
import shutil
shutil.rmtree(savepath)
os.mkdir(savepath)

Case 2: Parse annotations to tables

In addition to everything outlined above, we can also parse the annotations into tables in the SQLite database rather than only saving the raw JSON files. This is a little slower because it loops through each annotation element. Besides the tables above, the following extra tables are saved into the SQLite database:

annotation_docs: Information about all the annotation documents (one document is a collection of elements such as polygons, rectangles, etc.). The column 'annotation_girder_id' is the unique girder ID, and each document is linked to the 'items' table by the 'itemId' column.
annotation_elements: Information about the annotation elements (polygons, rectangles, points, etc.). The column 'element_girder_id' is the unique girder ID, and each element is linked to the 'annotation_docs' table by the 'annotation_girder_id' column.

Here's the syntax:



In [11]:

    
# recursively save annotations -- parse sqlite
dump_annotations_locally(
    gc, folderid=SAMPLE_FOLDER_ID, local=savepath,
    save_json=False, save_sqlite=True,
    callback=parse_annotations_to_local_tables,
    callback_kwargs={
        'save_csv': False,
        'save_sqlite': True,
    }
)









    



Participant_1: slide 1 of 5 (TCGA-A1-A0SK-01Z-00-DX1.A44D70FA-4D96-43F4-9DD7-A61535786297.svs): load annotations
Participant_1: slide 1 of 5 (TCGA-A1-A0SK-01Z-00-DX1.A44D70FA-4D96-43F4-9DD7-A61535786297.svs): run callback
Participant_1: slide 1 of 5 (TCGA-A1-A0SK-01Z-00-DX1.A44D70FA-4D96-43F4-9DD7-A61535786297.svs): parse to tables
Participant_1: slide 2 of 5 (TCGA-A2-A0YM-01Z-00-DX1.A48B4C96-2CC5-464C-98B7-F0F92AE56533.svs): load annotations
Participant_1: slide 2 of 5 (TCGA-A2-A0YM-01Z-00-DX1.A48B4C96-2CC5-464C-98B7-F0F92AE56533.svs): run callback
Participant_1: slide 2 of 5 (TCGA-A2-A0YM-01Z-00-DX1.A48B4C96-2CC5-464C-98B7-F0F92AE56533.svs): parse to tables
Participant_1: slide 3 of 5 (TCGA-A7-A0DA-01Z-00-DX1.5F087009-16E9-4A07-BA24-62340E108B17.svs): load annotations
Participant_1: slide 3 of 5 (TCGA-A7-A0DA-01Z-00-DX1.5F087009-16E9-4A07-BA24-62340E108B17.svs): run callback
Participant_1: slide 3 of 5 (TCGA-A7-A0DA-01Z-00-DX1.5F087009-16E9-4A07-BA24-62340E108B17.svs): parse to tables
Participant_1: slide 4 of 5 (TCGA-AR-A1AY-01Z-00-DX1.6AC0BE3B-FFC5-4EDA-9E40-B18CAAC52B81.svs): load annotations
Participant_1: slide 4 of 5 (TCGA-AR-A1AY-01Z-00-DX1.6AC0BE3B-FFC5-4EDA-9E40-B18CAAC52B81.svs): run callback
Participant_1: slide 4 of 5 (TCGA-AR-A1AY-01Z-00-DX1.6AC0BE3B-FFC5-4EDA-9E40-B18CAAC52B81.svs): parse to tables
Participant_1: slide 5 of 5 (TCGA-BH-A0BG-01Z-00-DX1.0838FB7F-8C85-4687-9F70-D136A1063383.svs): load annotations
Participant_1: slide 5 of 5 (TCGA-BH-A0BG-01Z-00-DX1.0838FB7F-8C85-4687-9F70-D136A1063383.svs): run callback
Participant_1: slide 5 of 5 (TCGA-BH-A0BG-01Z-00-DX1.0838FB7F-8C85-4687-9F70-D136A1063383.svs): parse to tables
Participant_2: slide 1 of 5 (TCGA-A1-A0SK-01Z-00-DX1.A44D70FA-4D96-43F4-9DD7-A61535786297.svs): load annotations
Participant_2: slide 1 of 5 (TCGA-A1-A0SK-01Z-00-DX1.A44D70FA-4D96-43F4-9DD7-A61535786297.svs): run callback
Participant_2: slide 1 of 5 (TCGA-A1-A0SK-01Z-00-DX1.A44D70FA-4D96-43F4-9DD7-A61535786297.svs): parse to tables
Participant_2: slide 2 of 5 (TCGA-A2-A0YM-01Z-00-DX1.A48B4C96-2CC5-464C-98B7-F0F92AE56533.svs): load annotations
Participant_2: slide 2 of 5 (TCGA-A2-A0YM-01Z-00-DX1.A48B4C96-2CC5-464C-98B7-F0F92AE56533.svs): run callback
Participant_2: slide 2 of 5 (TCGA-A2-A0YM-01Z-00-DX1.A48B4C96-2CC5-464C-98B7-F0F92AE56533.svs): parse to tables
Participant_2: slide 3 of 5 (TCGA-A7-A0DA-01Z-00-DX1.5F087009-16E9-4A07-BA24-62340E108B17.svs): load annotations
Participant_2: slide 3 of 5 (TCGA-A7-A0DA-01Z-00-DX1.5F087009-16E9-4A07-BA24-62340E108B17.svs): run callback
Participant_2: slide 3 of 5 (TCGA-A7-A0DA-01Z-00-DX1.5F087009-16E9-4A07-BA24-62340E108B17.svs): parse to tables
Participant_2: slide 4 of 5 (TCGA-AR-A1AY-01Z-00-DX1.6AC0BE3B-FFC5-4EDA-9E40-B18CAAC52B81.svs): load annotations
Participant_2: slide 4 of 5 (TCGA-AR-A1AY-01Z-00-DX1.6AC0BE3B-FFC5-4EDA-9E40-B18CAAC52B81.svs): run callback
Participant_2: slide 4 of 5 (TCGA-AR-A1AY-01Z-00-DX1.6AC0BE3B-FFC5-4EDA-9E40-B18CAAC52B81.svs): parse to tables
Participant_2: slide 5 of 5 (TCGA-BH-A0BG-01Z-00-DX1.0838FB7F-8C85-4687-9F70-D136A1063383.svs): load annotations
Participant_2: slide 5 of 5 (TCGA-BH-A0BG-01Z-00-DX1.0838FB7F-8C85-4687-9F70-D136A1063383.svs): run callback
Participant_2: slide 5 of 5 (TCGA-BH-A0BG-01Z-00-DX1.0838FB7F-8C85-4687-9F70-D136A1063383.svs): parse to tables

Check the result



In [12]:

    
!tree '/home/mtageld/Desktop/tmp/concordance/'









    



/home/mtageld/Desktop/tmp/concordance/
├── Concordance.sqlite
├── Participant_1
└── Participant_2

2 directories, 1 file

Query the database



In [13]:

    
# Connect to the database
sql_engine = db.create_engine(
    'sqlite:///%s/Concordance.sqlite' % savepath)
dbcon = sql_engine.connect()



In [14]:

    
# annotation documents
docs_df = pd.read_sql_query(
    """
    SELECT "annotation_girder_id", "itemId", "item_name", "element_count"
    FROM 'annotation_docs'
    ;""", dbcon)
docs_df.head()









    Out[14]:







  
    
      
   annotation_girder_id      itemId                    item_name                                          element_count
0  5e24c0dfddda5f8398695573  5e24c0dfddda5f8398695571  TCGA-A1-A0SK-01Z-00-DX1.A44D70FA-4D96-43F4-9DD...  1
1  5e24c0dfddda5f8398695575  5e24c0dfddda5f8398695571  TCGA-A1-A0SK-01Z-00-DX1.A44D70FA-4D96-43F4-9DD...  4
2  5e24c0dfddda5f839869557a  5e24c0dfddda5f8398695571  TCGA-A1-A0SK-01Z-00-DX1.A44D70FA-4D96-43F4-9DD...  5
3  5e24c0dfddda5f8398695580  5e24c0dfddda5f8398695571  TCGA-A1-A0SK-01Z-00-DX1.A44D70FA-4D96-43F4-9DD...  1
4  5e24c0dfddda5f8398695582  5e24c0dfddda5f8398695571  TCGA-A1-A0SK-01Z-00-DX1.A44D70FA-4D96-43F4-9DD...  1



In [15]:

    
# annotation elements
elements_summary = pd.read_sql_query(
    """
    SELECT "group", count(*)
    FROM 'annotation_elements'
    GROUP BY "group"
    ;""", dbcon)
elements_summary









    Out[15]:







  
    
      
    group                          count(*)
 0  Necrosis_or_Debris             6
 1  Mostly_Blood                   3
 2  Mostly_Tumor                   10
 3  Arteriole_or_Veinule           6
 4  Evaluation                     10
 5  Exclude                        20
 6  Exclude                        23
 7  Mostly_Blood                   3
 8  Mostly_Fat                     9
 9  Mostly_Lymph                   2
10  Mostly_Lymphocytic_Infiltrate  36
11  Mostly_PlasmaCells             9
12  Mostly_Tumor                   83
13  Necrosis_or_Debris             10
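Since annotation_elements is linked to annotation_docs by 'annotation_girder_id', element counts can also be broken down per slide. The following sketch shows such a join on an in-memory stand-in database using the standard-library sqlite3 module; the rows are invented for illustration and the stand-in tables contain only a subset of columns, all of which are named in the text and queries above.

```python
import sqlite3

# In-memory stand-in that mimics the two annotation tables
con = sqlite3.connect(':memory:')
con.execute(
    'CREATE TABLE annotation_docs '
    '("annotation_girder_id" TEXT, "itemId" TEXT, "item_name" TEXT)')
con.execute(
    'CREATE TABLE annotation_elements '
    '("element_girder_id" TEXT, "annotation_girder_id" TEXT, "group" TEXT)')
con.executemany('INSERT INTO annotation_docs VALUES (?, ?, ?)', [
    ('d1', 'i1', 'slide_a.svs'),
    ('d2', 'i2', 'slide_b.svs'),
])
con.executemany('INSERT INTO annotation_elements VALUES (?, ?, ?)', [
    ('e1', 'd1', 'Mostly_Tumor'),
    ('e2', 'd1', 'Mostly_Tumor'),
    ('e3', 'd1', 'Exclude'),
    ('e4', 'd2', 'Mostly_Tumor'),
])

# Count elements per slide and group; "group" is quoted because it is
# also an SQL keyword
rows = con.execute(
    '''
    SELECT d."item_name", e."group", COUNT(*) AS n_elements
    FROM "annotation_elements" AS e
    JOIN "annotation_docs" AS d
      ON e."annotation_girder_id" = d."annotation_girder_id"
    GROUP BY d."item_name", e."group"
    ORDER BY d."item_name", e."group"
    ''').fetchall()
print(rows)
```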



Sample screenshots