In order to start a reconstruction CORDA requires you to assign a confidence score to each reaction in your base model. This can be done by a variety of methods, and even by hand, but the most common way is to assign confidence based on proteome or gene expression data.
CORDA manages a total of 5 confidence levels:
The most tedious step here is usaully mapping the confidence for genes or proteins to the distinct reactions. Many of the larger models come with gene-reaction rules in the form
gene1 and gene2 or (gene3 and gene4)
and the individual confidence values have to be mapped from the gene confidence levels. Here "and" is evaluated by the minimum confidence and "or" by the maximum confidence. The Python package includes a handy function to do this for you automatically in a safe manner. For that you will require the gene-reaction rule (Recon 1 and 2 include them in their model for instance) and a dictionary mapping genes/proteins to their confidence values. For examples:
In [1]:
from corda import reaction_confidence
gene_conf = {"gene1": 1, "gene2": 3, "gene4": -1} # missing entries are automatically assigned zeroes
rule = "gene1 and gene2 or (gene3 and gene4)"
reaction_confidence(rule, gene_conf)
Out[1]:
Now we will perform a small reconstruction. The package includes a small model in SBML format describing the general central carbon metabolism.
In [2]:
from corda import test_model
mod = test_model()
len(mod.reactions)
Out[2]:
Here the last reaction is a biomass reaction.
In [3]:
mod.reactions[59].reaction
Out[3]:
We can now reconstruct the smallest model that still allows for the production of biomass. Let's start by setting the biomass reaction as high confidence and all other reactions as "not include".
In [4]:
conf = {}
for r in mod.reactions: conf[r.id] = -1
conf["r60"] = 3
Now we reconstruct the model.
In [5]:
from corda import CORDA
opt = CORDA(mod, conf)
opt.build()
print(opt)
In order to see the list of reactions in the reconstructed model without specifically creating the new model you can use the following snippet.
In [6]:
print([opt.model.reactions.get_by_id(k).reaction for k, used in opt.included.items() if used])
This gives a reconstruction that looks like this (orange denotes included reactions):
We can also define additional metabolic functions that should be possible in the reconstruction. Let's assume we want to be able to produce pep.
In [7]:
opt = CORDA(mod, conf, met_prod="pep")
opt.build()
print(opt)
This time we included the production of pep:
In [8]:
rec = opt.cobra_model("plus_pep")
use = rec.metabolites.pep.reactions
print("# of redundant pathway for pep =", opt.redundancies["EX_CORDA_0"])
for r in use: print(r.reaction)
By default CORDA uses redundancy. This means, in case there are several minimal pathways to reach your objective, CORDA will include all of those (which is good since it gives your model some robustness). In this case this did happen and yields two different forms to produce pep. If we do not want that feature we can modify the parameter n in the CORDA initializer which denotes the maximum number of redundant pathways to include.
In [9]:
opt = CORDA(mod, conf, met_prod="pep", n=1)
opt.build()
rec_min = opt.cobra_model("plus_pep_nored")
print("used", len(rec_min.reactions), "reactions")
print("# of redundant pathway for pep =", opt.redundancies["EX_CORDA_0"])
use = rec_min.metabolites.pep.reactions
for r in use: print(r.reaction)
Which now includes only a single pathway to activate each high confidence flux.