Gapfillling

Model gap filling is the task of figuring out which reactions have to be added to a model to make it feasible. Several such algorithms have been reported e.g. Kumar et al. 2009 and Reed et al. 2006. Cobrapy has a gap filling implementation that is very similar to that of Reed et al. where we use a mixed-integer linear program to figure out the smallest number of reactions that need to be added for a user-defined collection of reactions, i.e. a universal model. Briefly, the problem that we try to solve is

Minimize: $$\sum_i c_i * z_i$$

subject to

$$Sv = 0$$$$v^\star \geq t$$$$l_i\leq v_i \leq u_i$$$$v_i = 0 \textrm{ if } z_i = 0$$

Where l, u are lower and upper bounds for reaction i and z is an indicator variable that is zero if the reaction is not used and otherwise 1, c is a user-defined cost associated with using the ith reaction, $v^\star$ is the flux of the objective and t a lower bound for that objective. To demonstrate, let's take a model and remove some essential reactions from it.


In [1]:
import cobra.test
from cobra.flux_analysis import gapfill
model = cobra.test.create_test_model("salmonella")

In this model D-Fructose-6-phosphate is an essential metabolite. We will remove all the reactions using it, and at them to a separate model.


In [2]:
universal = cobra.Model("universal_reactions")
for i in [i.id for i in model.metabolites.f6p_c.reactions]:
    reaction = model.reactions.get_by_id(i)
    universal.add_reaction(reaction.copy())
    model.remove_reactions([reaction])

Now, because of these gaps, the model won't grow.


In [3]:
model.optimize().objective_value


Out[3]:
0.0

We will use can use the model's original objective, growth, to figure out which of the removed reactions are required for the model be feasible again. This is very similar to making the 'no-growth but growth (NGG)' predictions of Kumar et al. 2009.


In [4]:
solution = gapfill(model, universal, demand_reactions=False)
for reaction in solution[0]:
    print(reaction.id)


GF6PTA
F6PP
TKT2
FBP
MAN6PI

We can obtain multiple possible reaction sets by having the algorithm go through multiple iterations.


In [5]:
result = gapfill(model, universal, demand_reactions=False, iterations=4)
for i, entries in enumerate(result):
    print("---- Run %d ----" % (i + 1))
    for e in entries:
        print(e.id)


---- Run 1 ----
GF6PTA
F6PP
TKT2
FBP
MAN6PI
---- Run 2 ----
GF6PTA
TALA
PGI
F6PA
MAN6PI
---- Run 3 ----
GF6PTA
F6PP
TKT2
FBP
MAN6PI
---- Run 4 ----
GF6PTA
TALA
PGI
F6PA
MAN6PI

We can also instead of using the original objective, specify a given metabolite that we want the model to be able to produce.


In [6]:
with model:
    model.objective = model.add_boundary(model.metabolites.f6p_c, type='demand')
    solution = gapfill(model, universal)
    for reaction in solution[0]:
        print(reaction.id)


FBP

Finally, note that using mixed-integer linear programming is computationally quite expensive and for larger models you may want to consider alternative gap filling methods and reconstruction methods.