In [1]:
from cameo import config
config.default_view = config.SequentialView()
from pandas import options
options.display.max_rows = 8
Cameo uses evolutionary algorithms to find near-optimal strain designs quickly. Linear programming is used to simulate flux distributions, and an evolutionary algorithm searches for combinations of modifications that improve an objective, for example the yield of a desired product. The OptGene [1] method uses this approach.
Evolutionary algorithms are iterative. In each iteration, multiple candidate solutions are evaluated and assigned a fitness value. The solutions that improve the objective are kept, and some of them are altered (for example by mutation or crossover) and passed on to the next iteration. After some rounds the objective improves; the longer the algorithm runs, the more of the solution space it explores, although finding the global optimum is not guaranteed.
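The loop described above can be sketched in a few lines of plain Python. This is a toy illustration only, not cameo's or inspyred's implementation: the scores in `TOY_SCORES`, the function names and the simple "keep the better half, mutate the survivors" scheme are all assumptions made for the example.

```python
import random

# Hypothetical per-target scores standing in for a simulated product yield.
TOY_SCORES = {"g1": 0.1, "g2": 0.4, "g3": 0.2, "g4": 0.9, "g5": 0.3, "g6": 0.6}


def toy_fitness(knockouts):
    """Toy fitness: summed scores minus a small cost per knockout."""
    return sum(TOY_SCORES[g] for g in knockouts) - 0.05 * len(knockouts)


def evolve(fitness, candidates, generations=30, pop_size=20, max_knockouts=3, seed=0):
    """Toy evolutionary loop over sets of knockout targets."""
    rng = random.Random(seed)
    # Start from random knockout sets with 1..max_knockouts targets.
    population = [frozenset(rng.sample(candidates, rng.randint(1, max_knockouts)))
                  for _ in range(pop_size)]
    best = max(population, key=fitness)
    for _ in range(generations):
        # Selection: keep the better half of the population.
        survivors = sorted(population, key=fitness, reverse=True)[:pop_size // 2]
        # Variation: swap one target for a randomly chosen one
        # (the set shrinks if the new target is already present).
        offspring = []
        for parent in survivors:
            child = set(parent)
            child.discard(rng.choice(sorted(child)))
            child.add(rng.choice(candidates))
            offspring.append(frozenset(child))
        population = survivors + offspring
        # Remember the best solution seen so far.
        best = max(population + [best], key=fitness)
    return best


best = evolve(toy_fitness, sorted(TOY_SCORES))
print(sorted(best), round(toy_fitness(best), 3))
```

In cameo the fitness evaluation is a flux simulation of the modified model rather than a lookup table, which is why the number of evaluations dominates the run time.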
Thanks to the inspyred library, we implemented a low-level interface that allows the implementation of more elaborate strategies than OptGene.
In [4]:
from cameo import models
iJO1366 = models.bigg.iJO1366
In [5]:
from cameo.strain_design.heuristic.evolutionary.objective_functions import biomass_product_coupled_yield
of = biomass_product_coupled_yield(iJO1366.reactions.BIOMASS_Ec_iJO1366_core_53p95M,
                                   iJO1366.reactions.EX_ac_e,
                                   iJO1366.reactions.EX_glc__D_e)
of
Out[5]:
Other fitness functions such as yield or number of knockouts are also available in cameo.
In [6]:
from cameo.strain_design.heuristic.evolutionary.objective_functions import product_yield, number_of_knockouts
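To make the three fitness functions concrete, here is a minimal sketch of what they measure, assuming the commonly used definitions: biomass-product coupled yield as (biomass flux × product flux) / substrate uptake, and yield as product flux / substrate uptake. The helper names and the flux values are hypothetical, and cameo's own implementations may differ in details such as normalization.

```python
def toy_bpcy(fluxes, biomass, product, substrate):
    """Biomass-product coupled yield: (biomass * product) / |substrate uptake|."""
    uptake = abs(fluxes[substrate])
    if uptake == 0:
        return 0.0
    return fluxes[biomass] * fluxes[product] / uptake


def toy_yield(fluxes, product, substrate):
    """Product yield: product flux per unit of substrate taken up."""
    uptake = abs(fluxes[substrate])
    return fluxes[product] / uptake if uptake else 0.0


def toy_number_of_knockouts(knockouts):
    """Number of modifications in a candidate design."""
    return len(knockouts)


# Hypothetical flux values; uptake is negative by the usual exchange convention.
fluxes = {"BIOMASS": 0.5, "EX_ac_e": 4.0, "EX_glc__D_e": -10.0}

bpcy_value = toy_bpcy(fluxes, "BIOMASS", "EX_ac_e", "EX_glc__D_e")
yield_value = toy_yield(fluxes, "EX_ac_e", "EX_glc__D_e")
n_knockouts = toy_number_of_knockouts(["b0001", "b0002"])
print(bpcy_value, yield_value, n_knockouts)
```

Coupling the product flux to the biomass flux rewards designs that still grow, whereas the plain yield can favor designs with little or no growth.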
Customized objectives can be implemented by extending the base class ObjectiveFunction.
During the evaluation phase, all objective functions are called with the following parameters: of(model, flux_distribution, decoded_solution).
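As an illustration of that calling convention, the sketch below defines a standalone callable with the same three parameters. It deliberately does not subclass cameo's ObjectiveFunction so that it runs without cameo installed; the class name, the `penalty` parameter, and the use of a plain dict and list for the flux distribution and decoded solution are assumptions made for the example.

```python
class PenalizedProductFlux:
    """Hypothetical objective: product flux minus a penalty per knockout."""

    def __init__(self, product_id, penalty=0.1):
        self.product_id = product_id
        self.penalty = penalty

    def __call__(self, model, flux_distribution, decoded_solution):
        # `model` is unused here; it is accepted to match the call signature.
        flux = flux_distribution[self.product_id]
        return flux - self.penalty * len(decoded_solution)


objective = PenalizedProductFlux("EX_ac_e", penalty=0.1)

# Stand-ins for what the optimizer would pass in during evaluation.
toy_fluxes = {"EX_ac_e": 5.0, "EX_glc__D_e": -10.0}
toy_knockouts = ["b0001", "b0002"]

fitness = objective(None, toy_fluxes, toy_knockouts)
print(fitness)
```

A real custom objective would extend ObjectiveFunction as described above and follow whatever additional requirements that base class imposes.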
In this example, we look for gene knockouts that lead to biomass-coupled acetate production in E. coli using the iJO1366 model. This is very similar to OptGene.
There are multiple configurations for this strategy. The most basic configuration requires a model and an objective function. More parameters can be set, such as the method used to simulate the flux distributions (FBA is the default) and the evolutionary computation algorithm (the genetic algorithm inspyred.ec.GA is the default). The implementation removes essential genes from the search, as knocking them out does not yield viable solutions. More genes can be excluded from the search if specified (either because of biological knowledge or user strategy).
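The pruning of the search space can be pictured as follows. This is a sketch under stated assumptions, not cameo's code: a gene is treated as essential when its single knockout drops growth below a threshold, and the growth values, gene names and helper functions are hypothetical.

```python
def find_essential(genes, growth_after_knockout, threshold=1e-6):
    """Genes whose single knockout leaves growth below the threshold."""
    return {g for g in genes if growth_after_knockout(g) < threshold}


def build_search_space(genes, essential, excluded=()):
    """Candidate knockout targets: everything not essential or user-excluded."""
    blocked = set(essential) | set(excluded)
    return [g for g in genes if g not in blocked]


# Hypothetical growth rates after knocking out each gene on its own.
toy_growth = {"gA": 0.0, "gB": 0.8, "gC": 0.5, "gD": 0.0}
genes = sorted(toy_growth)

essential = find_essential(genes, toy_growth.get)
search_space = build_search_space(genes, essential, excluded=["gC"])
print(sorted(essential), search_space)
```

Removing these targets up front avoids spending evaluations on designs that cannot grow.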
In [8]:
from cameo.strain_design.heuristic.evolutionary import GeneKnockoutOptimization
ko = GeneKnockoutOptimization(model=iJO1366, objective_function=of)
In [9]:
res1 = ko.run(max_evaluations=15000, view=config.default_view)
In [10]:
res1
Out[10]:
In this example, we look for gene knockouts for biomass-coupled succinate production in S. cerevisiae using the iMM904 model. The number of mutations necessary to generate the predicted strain is the secondary objective. This allows searching for strategies that minimize the number of changes. The search uses the Pareto archived evolution strategy (PAES) [2] as implemented in inspyred.
In [6]:
iMM904 = models.bigg.iMM904
In [7]:
objective1 = biomass_product_coupled_yield(iMM904.reactions.BIOMASS_SC5_notrace,
                                           iMM904.reactions.EX_succ_e,
                                           iMM904.reactions.EX_glc__D_e)
objective2 = number_of_knockouts()
multi_objective = [objective1, objective2]
In [ ]:
import inspyred

ko = GeneKnockoutOptimization(model=iMM904,
                              objective_function=multi_objective,
                              heuristic_method=inspyred.ec.emo.PAES)
In [ ]:
res2 = ko.run(max_evaluations=15000)
In [ ]:
res2
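With two objectives there is usually no single best design but a set of trade-offs. The sketch below shows the underlying idea of Pareto dominance for (yield, number of knockouts) pairs. It is a simplified illustration with made-up numbers; PAES and the other multi-objective algorithms in inspyred maintain their archives in their own way.

```python
def dominates(a, b):
    """True if design a is at least as good as b in both objectives and better in one.

    Each design is a (coupled_yield, n_knockouts) pair: higher yield is better,
    fewer knockouts is better.
    """
    no_worse = a[0] >= b[0] and a[1] <= b[1]
    strictly_better = a[0] > b[0] or a[1] < b[1]
    return no_worse and strictly_better


def pareto_front(designs):
    """Designs that are not dominated by any other design."""
    return [d for d in designs if not any(dominates(other, d) for other in designs)]


# Hypothetical (yield, number of knockouts) results.
designs = [(0.10, 1), (0.25, 2), (0.20, 3), (0.40, 4), (0.40, 5)]
front = pareto_front(designs)
print(front)
```

Here (0.20, 3) is dropped because (0.25, 2) gives a higher yield with fewer knockouts, and (0.40, 5) is dropped because (0.40, 4) reaches the same yield with one knockout less.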
The implemented approach uses linear programming in the evaluation phase, which means that different methods can be used to compute the flux distributions.
All methods found in cameo.flux_analysis.simulation can be used as the simulation method.
Alternatively, users can supply any function as long as it follows the signature simulation_method(model, **kwargs).
The required keyword arguments can be preset on the optimization class. Besides those arguments, a ProblemCache will be passed as cache=cache_object for optimized performance.
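The calling convention can be illustrated with a toy stand-in. Everything in this sketch is hypothetical: `ToyModel`, `toy_simulation`, the `scale` argument and the `evaluate` helper are not part of cameo, and a plain dict is used in place of the ProblemCache, which is not a dictionary in cameo. The point is only to show preset keyword arguments and a cache being merged into a call of the form simulation_method(model, **kwargs).

```python
class ToyModel:
    """Stand-in for a metabolic model; it just holds fixed flux values."""

    def __init__(self, fluxes):
        self.fluxes = dict(fluxes)


def toy_simulation(model, cache=None, scale=1.0, **kwargs):
    """Hypothetical simulation method following simulation_method(model, **kwargs)."""
    key = ("scaled", scale)
    # Reuse a previous result if the cache already holds one.
    if cache is not None and key in cache:
        return cache[key]
    result = {rid: value * scale for rid, value in model.fluxes.items()}
    if cache is not None:
        cache[key] = result
    return result


def evaluate(simulation_method, model, preset_kwargs, cache):
    """Merge preset keyword arguments with the cache, then call the method."""
    kwargs = dict(preset_kwargs)
    kwargs["cache"] = cache
    return simulation_method(model, **kwargs)


toy_model = ToyModel({"EX_ac_e": 2.0, "EX_glc__D_e": -10.0})
toy_cache = {}
result = evaluate(toy_simulation, toy_model, {"scale": 0.5}, toy_cache)
print(result)
```

Accepting `cache` and `**kwargs` in the signature keeps a custom method compatible even when the optimizer passes arguments it does not use.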
In [8]:
from cameo.flux_analysis.simulation import lmoma
In [ ]:
ko.simulation_method = lmoma
In [ ]:
ko.simulation_kwargs
In [ ]:
res3 = ko.run(max_evaluations=15000)
In this example, we look for reaction knockouts for biomass-coupled succinate production in E. coli using the iJO1366 model.
In [ ]:
model = iJO1366.copy()
of = biomass_product_coupled_yield(model.reactions.BIOMASS_Ec_iJO1366_core_53p95M,
                                   model.reactions.EX_succ_e,
                                   model.reactions.EX_glc__D_e)
In [ ]:
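import inspyred
from cameo.strain_design.heuristic.evolutionary import ReactionKnockoutOptimization

ko = ReactionKnockoutOptimization(model=model,
                                  objective_function=of,
                                  heuristic_method=inspyred.ec.GA,
                                  essential_reactions=["ATPM"])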
Left unconstrained, the reaction knockout search will try to remove the ATP maintenance reaction (ATPM), which represents the non-growth-associated ATP cost. This reaction is not essential for growth, but it is needed to keep the model's predictions realistic. For that reason, it is added to the list of essential reactions.
In [ ]:
results_3 = ko.run(max_evaluations=5000, mutation_rate=0.15, indel_rate=0.185)
In [ ]:
results_3
In this example, we look for reaction knockouts for high-yield acetate production in E. coli using the iJO1366 model. As before, the number of knockouts necessary to construct the strain is a secondary objective.
In [ ]:
of1 = product_yield(model.reactions.EX_ac_e.id, model.reactions.EX_glc__D_e.id)
of2 = number_of_knockouts()
In [ ]:
import inspyred
from cameo.flux_analysis.simulation import fba

ko = ReactionKnockoutOptimization(model=model,
                                  objective_function=[of1, of2],
                                  simulation_method=fba,
                                  heuristic_method=inspyred.ec.emo.NSGA2)
In [ ]:
results_4 = ko.run(max_evaluations=5000, n=1, mutation_rate=0.3, populations_size=100, crossover_rate=0.2)
In [ ]:
results_4
[1] Patil, K. R., Rocha, I., Förster, J., & Nielsen, J. (2005). Evolutionary programming as a platform for in silico metabolic engineering. BMC Bioinformatics, 6, 308. doi:10.1186/1471-2105-6-308
[2] Knowles, J., & Corne, D. (1999). The Pareto archived evolution strategy: a new baseline algorithm for Pareto multiobjective optimisation. In Proceedings of the 1999 Congress on Evolutionary Computation-CEC99 (Cat. No. 99TH8406) (pp. 98–105). IEEE. doi:10.1109/CEC.1999.781913