Model gap filling is the task of figuring out which reactions have to be added to a model to make it feasible. Several such algorithms have been reported e.g. Kumar et al. 2009 and Reed et al. 2006. Cobrapy has a gap filling implementation that is very similar to that of Reed et al. where we use a mixed-integer linear program to figure out the smallest number of reactions that need to be added for a user-defined collection of reactions, i.e. a universal model. Briefly, the problem that we try to solve is
Minimize: $$\sum_i c_i * z_i$$
subject to
Where l, u are lower and upper bounds for reaction i and z is an indicator variable that is zero if the reaction is not used and otherwise 1, c is a user-defined cost associated with using the ith reaction, $v^\star$ is the flux of the objective and t a lower bound for that objective. To demonstrate, let's take a model and remove some essential reactions from it.
In [1]:
import cobra.test
from cobra.flux_analysis import gapfill
model = cobra.test.create_test_model("salmonella")
In this model D-Fructose-6-phosphate is an essential metabolite. We will remove all the reactions using it, and at them to a separate model.
In [2]:
universal = cobra.Model("universal_reactions")
for i in [i.id for i in model.metabolites.f6p_c.reactions]:
reaction = model.reactions.get_by_id(i)
universal.add_reaction(reaction.copy())
model.remove_reactions([reaction])
Now, because of these gaps, the model won't grow.
In [3]:
model.optimize().objective_value
Out[3]:
We will use can use the model's original objective, growth, to figure out which of the removed reactions are required for the model be feasible again. This is very similar to making the 'no-growth but growth (NGG)' predictions of Kumar et al. 2009.
In [4]:
solution = gapfill(model, universal, demand_reactions=False)
for reaction in solution[0]:
print(reaction.id)
We can obtain multiple possible reaction sets by having the algorithm go through multiple iterations.
In [5]:
result = gapfill(model, universal, demand_reactions=False, iterations=4)
for i, entries in enumerate(result):
print("---- Run %d ----" % (i + 1))
for e in entries:
print(e.id)
We can also instead of using the original objective, specify a given metabolite that we want the model to be able to produce.
In [6]:
with model:
model.objective = model.add_boundary(model.metabolites.f6p_c, type='demand')
solution = gapfill(model, universal)
for reaction in solution[0]:
print(reaction.id)
Finally, note that using mixed-integer linear programming is computationally quite expensive and for larger models you may want to consider alternative gap filling methods and reconstruction methods.