Handling of experimental metadata using odMLtables

This notebook presents the interaction with metadata from an experiment using odMLtables. Here we use the metadata of a complex electrophysiology experiment published in Brochier T. et al. Massively parallel recordings in macaque motor cortex during an instructed delayed reach-to-grasp task. Sci. Data 5:180055. In the first step, we show how to extract a subset of information from the complete metadata collection for visualization. In the second step, we compare numerous similar values available with in one metadata file and generate an overview sheet, eg. for addition into a lab book.

Download of published metadata

The published datasets are hosted on GIN and can be accessed directly via the webinterface or the local gin client. Here we download the metadata files of the two datasets directly.



In [ ]:

    
# helper function by Ali Faki, https://stackoverflow.com/questions/7243750/download-file-from-web-in-python-3
from requests import get
def download(url, file_name):
    # open in binary mode
    with open(file_name, "wb") as file:
        # get request
        response = get(url)
        # write to file
        file.write(response.content)

# location of the repository and metadata files on GIN
gin_repo = "https://web.gin.g-node.org/INT/multielectrode_grasp/raw/master/"
filenames = ["i140703-001.odml", "l101210-001.odml"]
filepaths = ["datasets/" + f for f in filenames]

# download metadata files from GIN
for filepath, filename in zip(filepaths, filenames):
    download(gin_repo + filepath, filename)

Extraction of a subset of metadata

The metadata collections of the electrophysiological experiment contain due to the complexity of the experiment thousands of entries organized in a hierarchical odML structure. This is of advantage for the storage of the metadata in an organized fashion, but for human interaction and visualization the amout of data needs to be reduced to a subset relevant for the current question. This can be achieved using the filter functionality of odMLtables. Here we extract all metadata related to the subject performing the task, which includes information such as the species, name, birthdate and handedness.

Since the publication of the datasets odML was developed further, so an updated odML version exists. In a first step we are converting the metadata files from odML file version 1.0 to file version 1.1 using the version conversion tool provided by odML. In a second step we use odMLtables to select only information related to the subject of the experiment and store this in a separate metadata file.



In [ ]:

    
# inplace conversion of odMl files to latest odML version
from odml.tools.version_converter import VersionConverter
for filename in filenames:
    converter = VersionConverter(filename)
    converter.convert()
    converter.write_to_file(filename)


# load odml file using odmltables and extract subset of information
import odmltables as odt

# extract only directly related subject information
for filename in filenames:
    odmlfile = odt.OdmlTable(filename)
    odmlfile.filter(SectionName='Subject')
    new_filename = filename.split('.')[0] + '_filtered.odml'
    odmlfile.write2odml(new_filename)

The generated odML files, which contain only a subset of the metadata can now be visualized in a browser using the odML style sheet. For this just place the style sheet in the same folder as your odML file and open the odML file in your browser. Here we use a helper function to display the odML content using the odML style sheet.



In [ ]:

    
# This is utility code for displaying the odML file as html representation here.
# You can also just open the odML file in your browser having the style sheet in the same location as your odML file and
# will get the same result
from IPython.display import display, HTML
import lxml.etree as ET

def display_odML_as_html(odML_file, xsl_file='odml.xsl'):
    # generate html representation from odML file and style sheet
    dom = ET.parse(odML_file)
    xslt = ET.parse(xsl_file)
    transform = ET.XSLT(xslt)
    newdom = transform(dom)
    # display html
    display(HTML(ET.tostring(newdom, pretty_print=True).decode()))



In [ ]:

    
# display the extracted subsection for one of the datasets
display_odML_as_html(filenames[0].split('.')[0] + '_filtered.odml')

Comparison of metadata

In many cases metadata collections contain repeating structures, containing similar information for different contexts. Comparing these information located in different parts of the odML structure in a single table helps gaining an overview of the metadata and is sometimes required to be tracked in lab book tables. odMLtables offers a comparison function within an odML document, which generates such an overview table for a selected number of subsections. The published metadata collection contains information about 96 active recording electrodes, characterizing each in a separate odML section. The properties in these sections are always the same and only the structure of the subsections differs depending on the preprocessing (spikesorting) of the data. The basic electrode properties are a variety of different IDs associated with each electrode, it's impedance and length.

The odMLtables comparison generates a xls or csv table listing the shared properties against the sections to compare.



In [ ]:

    
for filename in filenames:
    # create an odMLtables table for comparison with functionality to write to xls
    comparetable = odt.CompareSectionXlsTable()
    comparetable.load_from_file(filename)
    # specify which section to compare, here the first 5 of 97 electrodes
    comparetable.choose_sections(*['Electrode_{:03d}'.format(i) for i in range(1,6)])
    # write comparison table to xls format
    comparetable.write2file(filename.split('.')[0] + '_electrode_comparison.xls')

The resulting overview table looks like shown below, listing the shared properties in different rows and the selected electrodes in columns. Comparison tables only contain a subset of the metadata from the original document and are lacking the corresponding metadata for reconstruction of the original odML file. Therefore these are only for visualization and documentation purposes, but not for further enrichment or storage of metadata.