Building communities

micom constructs communities from a specification given as a pandas DataFrame. The DataFrame needs at least two columns: "id" and "file", which specify the ID of the organism/tissue and a file containing the individual model, respectively.

To make this concrete, let's look at a small example. micom comes with a function that generates a simple example community specification consisting of five copies of a small E. coli model containing only the central carbon metabolism.


In [1]:
from micom.data import test_taxonomy

taxonomy = test_taxonomy()
taxonomy


Out[1]:
id genus species reactions metabolites file
0 Escherichia_coli_1 Escherichia Eschericia coli 95 72 /home/cdiener/code/micom/micom/data/e_coli_cor...
1 Escherichia_coli_2 Escherichia Eschericia coli 95 72 /home/cdiener/code/micom/micom/data/e_coli_cor...
2 Escherichia_coli_3 Escherichia Eschericia coli 95 72 /home/cdiener/code/micom/micom/data/e_coli_cor...
3 Escherichia_coli_4 Escherichia Eschericia coli 95 72 /home/cdiener/code/micom/micom/data/e_coli_cor...
4 Escherichia_coli_5 Escherichia Eschericia coli 95 72 /home/cdiener/code/micom/micom/data/e_coli_cor...

As we can see, this specification contains the required fields and some additional information. In fact, the specification may contain any number of additional columns, which will be saved along with the community model. One special example is "abundance", which we will get to know soon :)
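A specification can also be assembled by hand. The following is a minimal sketch using only pandas; the IDs, file paths, and the extra "notes" column are hypothetical placeholders, and the check for the required columns is our own illustration rather than a micom function.

```python
import pandas as pd

# Hypothetical organisms and model files; replace with your own.
spec = pd.DataFrame({
    "id": ["organism_a", "organism_b"],
    "file": ["models/organism_a.xml", "models/organism_b.json"],
    "notes": ["isolated from sample 1", "isolated from sample 2"],  # optional extra column
})

# Illustrative check (not part of micom): both required columns must be present.
required = {"id", "file"}
missing = required - set(spec.columns)
if missing:
    raise ValueError(f"specification is missing columns: {sorted(missing)}")
```

Once the files exist on disk, a DataFrame like this could be used in place of the example taxonomy.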

To convert the specification into a community model we use the Community class from micom, which derives from the cobrapy Model class.


In [2]:
from micom import Community

com = Community(taxonomy)
print("Built a community with a total of {} reactions.".format(len(com.reactions)))


Built a community with a total of 495 reactions.

Constructing the community also sets up the correctly scaled exchange reactions with the internal medium and initializes the external imports to the maximum found across all models. The original taxonomy is kept in the com.taxonomy attribute.

Note that micom figures out how to read a variety of file types based on the extension. It currently supports:

  • .pickle for pickled models
  • .xml or .gz for XML models
  • .json for JSON models
  • .mat for COBRAtoolbox models
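The idea of choosing a reader from the file extension can be sketched as a simple lookup. This is only an illustration of the list above, not micom's own code; the FORMATS table and the guess_format helper are hypothetical names.

```python
from pathlib import Path

# Hypothetical lookup table mirroring the list above; not micom's own code.
FORMATS = {
    ".pickle": "pickle",
    ".xml": "xml",
    ".gz": "xml",
    ".json": "json",
    ".mat": "mat",
}


def guess_format(filename):
    """Return the format name for a model file based on its final extension."""
    extension = Path(filename).suffix.lower()
    if extension not in FORMATS:
        raise ValueError(f"unsupported model file extension: {extension!r}")
    return FORMATS[extension]
```

Only the final extension is inspected here, so a compressed file such as "model.xml.gz" is treated as an XML model, matching the ".xml or .gz" entry in the list.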

In [3]:
com.taxonomy


Out[3]:
id genus species reactions metabolites file abundance
id
Escherichia_coli_1 Escherichia_coli_1 Escherichia Eschericia coli 95 72 /home/cdiener/code/micom/micom/data/e_coli_cor... 0.2
Escherichia_coli_2 Escherichia_coli_2 Escherichia Eschericia coli 95 72 /home/cdiener/code/micom/micom/data/e_coli_cor... 0.2
Escherichia_coli_3 Escherichia_coli_3 Escherichia Eschericia coli 95 72 /home/cdiener/code/micom/micom/data/e_coli_cor... 0.2
Escherichia_coli_4 Escherichia_coli_4 Escherichia Eschericia coli 95 72 /home/cdiener/code/micom/micom/data/e_coli_cor... 0.2
Escherichia_coli_5 Escherichia_coli_5 Escherichia Eschericia coli 95 72 /home/cdiener/code/micom/micom/data/e_coli_cor... 0.2

As you can see, we have gained a new column called abundance. This column quantifies the relative quantity of each individual in the community. Since we did not specify it in the original taxonomy, micom assumes that all individuals are present in the same quantity (here 1/5 = 0.2 each).
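To set the quantities yourself, you can add an "abundance" column to the specification before building the community. Below is a minimal sketch with pandas only, using made-up IDs, file names, and counts. The explicit normalization to relative abundances is shown for illustration; it is an assumption about the values you may want to pass in, not a description of micom internals.

```python
import pandas as pd

spec = pd.DataFrame({
    "id": ["organism_a", "organism_b", "organism_c"],
    "file": ["a.xml", "b.xml", "c.xml"],
    "abundance": [30.0, 10.0, 10.0],  # hypothetical read counts
})

# Convert the raw counts to relative abundances that sum to 1.
spec["abundance"] = spec["abundance"] / spec["abundance"].sum()

# Without abundance information, equal shares are 1 / n,
# which gives 0.2 for the five copies in the example above.
equal_share = 1.0 / 5
```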

The community presented here is pretty simplistic. For microbial communities, micom includes a larger taxonomy for 773 microbial species from the AGORA paper. Here the "file" column contains only the base names of the files, but you can easily prepend any path, as demonstrated in the following:


In [4]:
from micom.data import agora

tax = agora.copy()
tax.file = "models/" + tax.file  # assuming you have downloaded the AGORA models to the "models" folder
tax.head()


Out[4]:
organism id kingdom phylum class order family genus species oxygen_status ... draft_created platform kbase_genome_id pubseed_id ncbi_id genome_size genes reactions metabolites file
0 Abiotrophia defectiva ATCC 49176 Abiotrophia_defectiva_ATCC_49176 Bacteria Firmicutes Bacilli Lactobacillales Aerococcaceae Abiotrophia Abiotrophia defectiva Facultative anaerobe ... 07/01/14 ModelSEED NaN Abiotrophia defectiva ATCC 49176 (592010.4) 592010.0 2041839 598 1069 840 models/Abiotrophia_defectiva_ATCC_49176.xml
1 Acidaminococcus fermentans DSM 20731 Acidaminococcus_fermentans_DSM_20731 Bacteria Firmicutes Negativicutes Acidaminococcales Acidiaminococcaceae Acidaminococcus Acidaminococcus fermentans Obligate anaerobe ... 04/17/16 Kbase kb|g.2555 Acidaminococcus fermentans DSM 20731 (591001.3) 591001.0 2329769 646 1090 903 models/Acidaminococcus_fermentans_DSM_20731.xml
2 Acidaminococcus intestini RyC-MR95 Acidaminococcus_intestini_RyC_MR95 Bacteria Firmicutes Negativicutes Selenomonadales Acidaminococcaceae Acidaminococcus Acidaminococcus intestini Obligate anaerobe ... 08/03/14 ModelSEED NaN Acidaminococcus intestini RyC-MR95 (568816.4) 568816.0 2487765 599 994 827 models/Acidaminococcus_intestini_RyC_MR95.xml
3 Acidaminococcus sp. D21 Acidaminococcus_sp_D21 Bacteria Firmicutes Negativicutes Selenomonadales Acidaminococcaceae Acidaminococcus unclassified Acidaminococcus Obligate anaerobe ... 06/29/12 ModelSEED NaN Acidaminococcus sp. D21 (563191.3) 563191.0 2238973 598 851 768 models/Acidaminococcus_sp_D21.xml
4 Acinetobacter calcoaceticus PHEA-2 Acinetobacter_calcoaceticus_PHEA_2 Bacteria Proteobacteria Gammaproteobacteria Pseudomonadales Moraxellaceae Acinetobacter Acinetobacter calcoaceticus Aerobe ... 04/18/16 Kbase kb|g.3519 Acinetobacter calcoaceticus PHEA-2 (871585.3) 871585.0 3862530 1026 1561 1165 models/Acinetobacter_calcoaceticus_PHEA_2.xml

5 rows × 24 columns
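Plain string concatenation works as long as the folder name ends with a path separator. As an alternative sketch, pathlib can join the folder and the base names. The tiny DataFrame and the "models" folder below are stand-ins for the AGORA table, not data loaded from micom.

```python
from pathlib import Path

import pandas as pd

# Stand-in for the AGORA table, with base names only.
tax = pd.DataFrame({
    "id": ["organism_a", "organism_b"],
    "file": ["organism_a.xml", "organism_b.xml"],
})

model_dir = Path("models")  # hypothetical download folder
tax["file"] = [str(model_dir / name) for name in tax["file"]]

# Optional: collect the entries whose model file is not on disk yet.
missing = tax[~tax["file"].map(lambda f: Path(f).exists())]
```

Checking for missing files before building a community can save a failed run when only part of the model collection has been downloaded.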