Getting Started

Loading a model and inspecting it

To begin with, cobrapy comes with bundled models for Salmonella and E. coli, as well as a "textbook" model of E. coli core metabolism. To load a test model, type



In [1]:

    
from __future__ import print_function

import cobra
import cobra.test

# "ecoli" and "salmonella" are also valid arguments
model = cobra.test.create_test_model("textbook")

The reactions, metabolites, and genes attributes of the cobrapy model are a special type of list called a cobra.DictList, and each one is made up of cobra.Reaction, cobra.Metabolite and cobra.Gene objects respectively.



In [2]:

    
print(len(model.reactions))
print(len(model.metabolites))
print(len(model.genes))

When using Jupyter notebook this type of information is rendered as a table.



In [3]:

    
model









    Out[3]:





        
            
                Name
                e_coli_core
            

                Memory address
                0x01116ea9e8
            

                Number of metabolites
                72
            

                Number of reactions
                95
            

                Objective expression
                -1.0*Biomass_Ecoli_core_reverse_2cdba + 1.0*Biomass_Ecoli_core
            

                Compartments
                cytosol, extracellular

Just like a regular list, objects in the DictList can be retrieved by index. For example, to get the 30th reaction in the model (at index 29 because of 0-indexing):



In [4]:

    
model.reactions[29]









    Out[4]:





        
            
                Reaction identifier EX_glu__L_e
            

                Name L-Glutamate exchange
            

                Memory address
                0x011b8643c8
            

                Stoichiometry
                
                    glu__L_e --> 
                    L-Glutamate --> 
                
            

                GPR 
            

                Lower bound 0.0
            

                Upper bound 1000.0

Additionally, items can be retrieved by their id using the DictList.get_by_id() function. For example, to get the cytosolic atp metabolite object (the id is "atp_c"), we can do the following:



In [5]:

    
model.metabolites.get_by_id("atp_c")









    Out[5]:





        
            
                Metabolite identifier atp_c
            

                Name ATP
            

                Memory address
                0x011b7f82b0
            

                Formula C10H12N5O13P3
            

                Compartment c
            

                In 13 reaction(s) 
                    PYK, GLNS, ATPS4r, SUCOAS, PPCK, GLNabc, ATPM, ACKr, Biomass_Ecoli_core, ADK1, PPS, PFK, PGK

As an added bonus, users with an interactive shell such as IPython will be able to tab-complete to list elements inside a list. While this is not recommended behavior for most code because of the possibility for characters like "-" inside ids, this is very useful while in an interactive prompt:



In [6]:

    
model.reactions.EX_glc__D_e.bounds









    Out[6]:





(-10.0, 1000.0)

Reactions

We will consider the reaction glucose 6-phosphate isomerase, which interconverts glucose 6-phosphate and fructose 6-phosphate. The reaction id for this reaction in our test model is PGI.



In [7]:

    
pgi = model.reactions.get_by_id("PGI")
pgi









    Out[7]:





        
            
                Reaction identifier PGI
            

                Name glucose-6-phosphate isomerase
            

                Memory address
                0x011b886a90
            

                Stoichiometry
                
                    g6p_c <=> f6p_c
                    D-Glucose 6-phosphate <=> D-Fructose 6-phosphate
                
            

                GPR b4025
            

                Lower bound -1000.0
            

                Upper bound 1000.0

We can view the full name and reaction catalyzed as strings



In [8]:

    
print(pgi.name)
print(pgi.reaction)









    



glucose-6-phosphate isomerase
g6p_c <=> f6p_c

We can also view reaction upper and lower bounds. Because the pgi.lower_bound < 0, and pgi.upper_bound > 0, pgi is reversible.



In [9]:

    
print(pgi.lower_bound, "< pgi <", pgi.upper_bound)
print(pgi.reversibility)









    



-1000.0 < pgi < 1000.0
True

We can also ensure the reaction is mass balanced. This function will return elements which violate mass balance. If it comes back empty, then the reaction is mass balanced.



In [10]:

    
pgi.check_mass_balance()









    Out[10]:





{}

In order to add a metabolite, we pass in a dict with the metabolite object and its coefficient



In [11]:

    
pgi.add_metabolites({model.metabolites.get_by_id("h_c"): -1})
pgi.reaction









    Out[11]:





'g6p_c + h_c <=> f6p_c'

The reaction is no longer mass balanced



In [11]:

    
pgi.check_mass_balance()









    Out[11]:





{'H': -1.0, 'charge': -1.0}

We can remove the metabolite, and the reaction will be balanced once again.



In [12]:

    
pgi.subtract_metabolites({model.metabolites.get_by_id("h_c"): -1})
print(pgi.reaction)
print(pgi.check_mass_balance())









    



g6p_c <=> f6p_c
{}

It is also possible to build the reaction from a string. However, care must be taken when doing this to ensure reaction id's match those in the model. The direction of the arrow is also used to update the upper and lower bounds.



In [13]:

    
pgi.reaction = "g6p_c --> f6p_c + h_c + green_eggs + ham"









    



unknown metabolite 'green_eggs' created
unknown metabolite 'ham' created



In [14]:

    
pgi.reaction









    Out[14]:





'g6p_c --> f6p_c + green_eggs + h_c + ham'



In [15]:

    
pgi.reaction = "g6p_c <=> f6p_c"
pgi.reaction









    Out[15]:





'g6p_c <=> f6p_c'

Metabolites

We will consider cytosolic atp as our metabolite, which has the id "atp_c" in our test model.



In [16]:

    
atp = model.metabolites.get_by_id("atp_c")
atp









    Out[16]:





        
            
                Metabolite identifier atp_c
            

                Name ATP
            

                Memory address
                0x011b7f82b0
            

                Formula C10H12N5O13P3
            

                Compartment c
            

                In 13 reaction(s) 
                    PYK, GLNS, ATPS4r, SUCOAS, PPCK, GLNabc, ATPM, ACKr, Biomass_Ecoli_core, ADK1, PPS, PFK, PGK

We can print out the metabolite name and compartment (cytosol in this case) directly as string.



In [17]:

    
print(atp.name)
print(atp.compartment)









    



ATP
c

We can see that ATP is a charged molecule in our model.



In [18]:

    
atp.charge









    Out[18]:





-4

We can see the chemical formula for the metabolite as well.



In [19]:

    
print(atp.formula)









    



C10H12N5O13P3

The reactions attribute gives a frozenset of all reactions using the given metabolite. We can use this to count the number of reactions which use atp.



In [20]:

    
len(atp.reactions)









    Out[20]:





13

A metabolite like glucose 6-phosphate will participate in fewer reactions.



In [21]:

    
model.metabolites.get_by_id("g6p_c").reactions









    Out[21]:





frozenset({<Reaction G6PDH2r at 0x11b870c88>,
           <Reaction GLCpts at 0x11b870f98>,
           <Reaction PGI at 0x11b886a90>,
           <Reaction Biomass_Ecoli_core at 0x11b85a5f8>})

Genes

The gene_reaction_rule is a boolean representation of the gene requirements for this reaction to be active as described in Schellenberger et al 2011 Nature Protocols 6(9):1290-307.

The GPR is stored as the gene_reaction_rule for a Reaction object as a string.



In [22]:

    
gpr = pgi.gene_reaction_rule
gpr









    Out[22]:





'b4025'

Corresponding gene objects also exist. These objects are tracked by the reactions itself, as well as by the model



In [23]:

    
pgi.genes









    Out[23]:





frozenset({<Gene b4025 at 0x11b844cc0>})



In [24]:

    
pgi_gene = model.genes.get_by_id("b4025")
pgi_gene









    Out[24]:





        
            
                Gene identifier b4025
            

                Name pgi
            

                Memory address
                0x011b844cc0
            

                Functional True
            

                In 1 reaction(s) 
                    PGI

Each gene keeps track of the reactions it catalyzes



In [25]:

    
pgi_gene.reactions









    Out[25]:





frozenset({<Reaction PGI at 0x11b886a90>})

Altering the gene_reaction_rule will create new gene objects if necessary and update all relationships.



In [26]:

    
pgi.gene_reaction_rule = "(spam or eggs)"
pgi.genes









    Out[26]:





frozenset({<Gene spam at 0x11b850908>, <Gene eggs at 0x11b850eb8>})



In [27]:

    
pgi_gene.reactions









    Out[27]:





frozenset()

Newly created genes are also added to the model



In [28]:

    
model.genes.get_by_id("spam")









    Out[28]:





        
            
                Gene identifier spam
            

                Name 
            

                Memory address
                0x011b850908
            

                Functional True
            

                In 1 reaction(s) 
                    PGI

The delete_model_genes function will evaluate the GPR and set the upper and lower bounds to 0 if the reaction is knocked out. This function can preserve existing deletions or reset them using the cumulative_deletions flag.



In [29]:

    
cobra.manipulation.delete_model_genes(
    model, ["spam"], cumulative_deletions=True)
print("after 1 KO: %4d < flux_PGI < %4d" % (pgi.lower_bound, pgi.upper_bound))

cobra.manipulation.delete_model_genes(
    model, ["eggs"], cumulative_deletions=True)
print("after 2 KO:  %4d < flux_PGI < %4d" % (pgi.lower_bound, pgi.upper_bound))









    



after 1 KO: -1000 < flux_PGI < 1000
after 2 KO:     0 < flux_PGI <    0

The undelete_model_genes can be used to reset a gene deletion



In [30]:

    
cobra.manipulation.undelete_model_genes(model)
print(pgi.lower_bound, "< pgi <", pgi.upper_bound)









    



-1000 < pgi < 1000

Making changes reversibly using models as contexts

Quite often, one wants to make small changes to a model and evaluate the impacts of these. For example, we may want to knock-out all reactions sequentially, and see what the impact of this is on the objective function. One way of doing this would be to create a new copy of the model before each knock-out with model.copy(). However, even with small models, this is a very slow approach as models are quite complex objects. Better then would be to do the knock-out, optimizing and then manually resetting the reaction bounds before proceeding with the next reaction. Since this is such a common scenario however, cobrapy allows us to use the model as a context, to have changes reverted automatically.



In [31]:

    
model = cobra.test.create_test_model('textbook')
for reaction in model.reactions[:5]:
    with model as model:
        reaction.knock_out()
        model.optimize()
        print('%s blocked (bounds: %s), new growth rate %f' %
              (reaction.id, str(reaction.bounds), model.objective.value))









    



ACALD blocked (bounds: (0, 0)), new growth rate 0.873922
ACALDt blocked (bounds: (0, 0)), new growth rate 0.873922
ACKr blocked (bounds: (0, 0)), new growth rate 0.873922
ACONTa blocked (bounds: (0, 0)), new growth rate -0.000000
ACONTb blocked (bounds: (0, 0)), new growth rate -0.000000

If we look at those knocked reactions, see that their bounds have all been reverted.



In [32]:

    
[reaction.bounds for reaction in model.reactions[:5]]









    Out[32]:





[(-1000.0, 1000.0),
 (-1000.0, 1000.0),
 (-1000.0, 1000.0),
 (-1000.0, 1000.0),
 (-1000.0, 1000.0)]

Nested contexts are also supported



In [33]:

    
print('original objective: ', model.objective.expression)
with model:
    model.objective = 'ATPM'
    print('print objective in first context:', model.objective.expression)
    with model:
        model.objective = 'ACALD'
        print('print objective in second context:', model.objective.expression)
    print('objective after exiting second context:',
          model.objective.expression)
print('back to original objective:', model.objective.expression)









    



original objective:  -1.0*Biomass_Ecoli_core_reverse_2cdba + 1.0*Biomass_Ecoli_core
print objective in first context: -1.0*ATPM_reverse_5b752 + 1.0*ATPM
print objective in second context: 1.0*ACALD - 1.0*ACALD_reverse_fda2b
objective after exiting second context: -1.0*ATPM_reverse_5b752 + 1.0*ATPM
back to original objective: -1.0*Biomass_Ecoli_core_reverse_2cdba + 1.0*Biomass_Ecoli_core

Most methods that modify the model are supported like this including adding and removing reactions and metabolites and setting the objective. Supported methods and functions mention this in the corresponding documentation.

While it does not have any actual effect, for syntactic convenience it is also possible to refer to the model by a different name than outside the context. Such as



In [34]:

    
with model as inner:
    inner.reactions.PFK.knock_out

Reaction identifier	EX_glu__L_e
Name	L-Glutamate exchange
Memory address	0x011b8643c8
Stoichiometry	glu__L_e --> L-Glutamate -->
GPR
Lower bound	0.0
Upper bound	1000.0

Metabolite identifier	atp_c
Name	ATP
Memory address	0x011b7f82b0
Formula	C10H12N5O13P3
Compartment	c
In 13 reaction(s)	PYK, GLNS, ATPS4r, SUCOAS, PPCK, GLNabc, ATPM, ACKr, Biomass_Ecoli_core, ADK1, PPS, PFK, PGK

Metabolite identifier	atp_c
Name	ATP
Memory address	0x011b7f82b0
Formula	C10H12N5O13P3
Compartment	c
In 13 reaction(s)	PYK, GLNS, ATPS4r, SUCOAS, PPCK, GLNabc, ATPM, ACKr, Biomass_Ecoli_core, ADK1, PPS, PFK, PGK

Gene identifier	spam
Name
Memory address	0x011b850908
Functional	True
In 1 reaction(s)	PGI

Name	e_coli_core
Memory address	0x01116ea9e8
Number of metabolites	72
Number of reactions	95
Objective expression	-1.0Biomass_Ecoli_core_reverse_2cdba + 1.0Biomass_Ecoli_core
Compartments	cytosol, extracellular

Reaction identifier	PGI
Name	glucose-6-phosphate isomerase
Memory address	0x011b886a90
Stoichiometry	g6p_c <=> f6p_c D-Glucose 6-phosphate <=> D-Fructose 6-phosphate
GPR	b4025
Lower bound	-1000.0
Upper bound	1000.0