Create Orphan Assignments for Forgotten Border Synapses

Ting generated a list of synapse-containing bodies smaller than 100,000 voxels that touch, but do not cross, a substack boundary. These bodies were not traced previously but should have been.

Input

  • Ting's list of unexamined boundaries (the first body in each listed edge is the unexamined body); the JSON structure the code assumes is sketched after these lists
  • The number of orphans desired for each proofreader task
  • The desired output directory

Output

  • Print summary counts (number of assignments and total orphans)
  • Output the orphan assignments in the desired location
  • Assume that the unextended stack will be used
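
For reference, here is a minimal sketch of the input JSON the function below expects, inferred from the fields the code reads; all ids, coordinates, and sizes are hypothetical, and markers are assumed to be [x, y, z] point coordinates.

# hypothetical input structure -- field names match what the function reads;
# the values are made up for illustration
example_input = {
    "Bodies": [
        {"id": 101, "marker": [1024, 2048, 300],
         "cross_substack": True, "num_synapses": 3, "size": 250000},
        {"id": 102, "marker": [980, 2100, 305],
         "cross_substack": False, "num_synapses": 0, "size": 42000},
    ],
    "Overlap": [
        # id1 is the candidate orphan; id2 is the body it touches
        {"id1": 102, "id2": 101},
    ],
}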

In [5]:
# function that generates orphan assignments
import os
import json
import random

def create_orphan_assignments(assignment_size, orphan_file, output_dir):
    # create the output directory if it does not already exist
    try:
        os.makedirs(output_dir)
    except OSError:
        pass

    with open(orphan_file) as fin:
        data = json.load(fin)

    goodbodies = set()
    body_locations = {}

    # find all the good bodies (a body over 100,000 voxels might not cross a
    # substack boundary and so be missed here, but that redundancy is okay)
    for body in data["Bodies"]:
        body_locations[body["id"]] = body["marker"]
        if body["cross_substack"]:
            if body["num_synapses"] > 0 or body["size"] >= 100000:
                goodbodies.add(body["id"])

    body_list = []

    # go through all edges and grab orphans.  Duplicate edges are okay since a
    # candidate edge might not be correct.  If the edge wasn't examined before
    # by focused stitch (or bookmarked if given a bad site), add the first
    # body's marker to the orphan list.
    for edge in data["Overlap"]:
        id1 = edge["id1"]
        id2 = edge["id2"]
        if id2 not in goodbodies:
            body_list.append(body_locations[id1])

    random.shuffle(body_list)

    # boilerplate for orphan json output
    json_out = {}
    json_out["metadata"] = {"description": "point list", "file version": 2}
    json_out["data"] = {}
    json_out["data"]["threshold"] = 100000
    json_out["data"]["threshold comparison"] = "target"

    # break the body list into several assignments; ceiling division avoids
    # writing an extra, empty assignment when the list divides evenly
    num_assignments = (len(body_list) + assignment_size - 1) // assignment_size
    for assign in range(num_assignments):
        start = assign * assignment_size
        finish = min(start + assignment_size, len(body_list))

        json_out["data"]["point list"] = body_list[start:finish]

        with open(output_dir + "/%d.json" % assign, 'w') as fout:
            fout.write(json.dumps(json_out, indent=4))

    print "Assignment size: ", assignment_size
    print "Orphan file used: ", orphan_file
    print "Output directory: ", output_dir
    print "Number of assignments: ", num_assignments
    print "Number of total orphans: ", len(body_list)

In [3]:
# common prefix for data files
prefix="/groups/flyem/data/medulla-FIB-Z1211-25-production/align2/base_stacks/shinya1-13_20140516"

In [4]:
# create random orphan assignments with 200 per file
create_orphan_assignments(200, prefix + "/focused-debug/input/input.json", prefix + "/orphan-assignments")


Assignment size:  200
Orphan file used:  /groups/flyem/data/medulla-FIB-Z1211-25-production/align2/base_stacks/shinya1-13_20140516/focused-debug/input/input.json
Output directory:  /groups/flyem/data/medulla-FIB-Z1211-25-production/align2/base_stacks/shinya1-13_20140516/orphan-assignments
Number of assignments:  73
Number of total orphans:  14544
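
As a quick, optional sanity check (not part of the original run), an assignment can be loaded back to confirm its metadata and size:

# hypothetical check: reload the first assignment written above
import json
with open(prefix + "/orphan-assignments/0.json") as fin:
    assignment = json.load(fin)
print "File version: ", assignment["metadata"]["file version"]
print "Points in assignment 0: ", len(assignment["data"]["point list"])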

In [6]:
# create random orphan assignments with 200 per file -- rerun with file version 2 instead of 1
create_orphan_assignments(200, prefix + "/focused-debug/input/input.json", prefix + "/orphan-assignments")


Assignment size:  200
Orphan file used:  /groups/flyem/data/medulla-FIB-Z1211-25-production/align2/base_stacks/shinya1-13_20140516/focused-debug/input/input.json
Output directory:  /groups/flyem/data/medulla-FIB-Z1211-25-production/align2/base_stacks/shinya1-13_20140516/orphan-assignments
Number of assignments:  73
Number of total orphans:  14544
