A major advance in bioinformatics in the last decade is the
rapidity with which we can now create quantitative metabolic models from
sequenced genomes. In this and future
blog posts we will examine several applications of metabolic modeling. This post introduces metabolic modeling,
considers its use for validation of genome annotations, and proposes that construction
of metabolic models can form a routine part of the genome annotation process.
Introduction to Steady-State Metabolic Modeling
In these blog posts we describe steady-state metabolic
models (as opposed to kinetic models).
Steady-state models describe a cell whose metabolic machinery is operating at
steady state, churning out energy and the end products of biosynthesis
from nutrients that the cell takes in at constant rates. At steady state, the fluxes that produce each
cellular metabolite equal the fluxes that consume it. The fluxes are balanced, hence the name of a
major modeling technique in this field: flux-balance analysis.
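The balance condition can be made concrete with a small check. The sketch below is only an illustration: the three-reaction network, the reaction names, and the flux values are invented for this example and do not come from any real model. It sums, for each metabolite, the stoichiometric coefficient times the flux over all reactions, and calls the flux distribution steady when every sum is zero.

```python
# Hypothetical toy network: each reaction maps metabolite -> coefficient
# (negative = consumed, positive = produced). Names are illustrative only.
reactions = {
    "uptake_A": {"A": 1},            # nutrient enters the cell as A
    "A_to_B": {"A": -1, "B": 1},     # A is converted to B
    "export_B": {"B": -1},           # B leaves the cell
}

def net_production(reactions, fluxes):
    """Net production rate of each metabolite: sum of coefficient * flux."""
    net = {}
    for name, stoich in reactions.items():
        for metabolite, coeff in stoich.items():
            net[metabolite] = net.get(metabolite, 0.0) + coeff * fluxes[name]
    return net

def is_steady_state(reactions, fluxes, tol=1e-9):
    """True when production equals consumption for every metabolite."""
    return all(abs(rate) <= tol
               for rate in net_production(reactions, fluxes).values())

balanced = {"uptake_A": 2.0, "A_to_B": 2.0, "export_B": 2.0}
unbalanced = {"uptake_A": 2.0, "A_to_B": 1.0, "export_B": 1.0}

print(is_steady_state(reactions, balanced))
print(is_steady_state(reactions, unbalanced))
print(net_production(reactions, unbalanced))
```

In the unbalanced case metabolite A accumulates, because it is taken up faster than it is converted; a steady-state model excludes such flux distributions.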
Unlike kinetic models, the other main approach to metabolic modeling,
steady-state models do not predict how the metabolic state
of the cell changes over time. That
drawback is offset by the fact that steady-state models are orders of
magnitude easier to create than kinetic models, because they do
not require the large number of difficult-to-measure quantitative parameters
that kinetic models require. It is therefore
practical to create steady-state models at the genome scale, based on genome
annotations.
Steady-state metabolic flux models have five components. Examples of each can be found in our recent
paper on EcoCyc as an E. coli
metabolic model [1]:
1. The set of nutrients available as inputs to the
metabolism. These include one or more
sources of carbon, nitrogen, phosphorus, and sulfur. Our E.
coli model uses 14 nutrients.
2. The set of metabolites created as end products
of metabolism, called the biomass metabolites.
These include amino acids, nucleotides, lipids, polysaccharides, and
other cell constituents; our E. coli
model produces 83 biomass metabolites. The
relative molar amount of each biomass metabolite can be provided to model the
cell composition in detail.
3. The set of waste products secreted by the cellular
metabolism. Examples include carbon
dioxide, methane, hydrogen gas, excess water and protons, and fermentation
products such as acetate and ethanol.
4. The set of reactions constituting the metabolic
network. Our genome-scale model of E.
coli covers more than 2000 reactions.
5. An optional set of constraints on the fluxes
within the network, such as constraints on the uptake rates of different
nutrients, and limits on the flux rates of reactions within the network.
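One way to picture the five components is as a single record. The sketch below is a hypothetical container, not the MetaFlux input format: the class name, field names, and placeholder metabolite names are all assumptions made for illustration. It also includes a simple sanity check that reports any declared nutrient, biomass metabolite, or secretion that no reaction mentions.

```python
from dataclasses import dataclass, field

@dataclass
class SteadyStateModel:
    """Hypothetical container for the five components listed above."""
    nutrients: set      # 1. inputs to metabolism
    biomass: dict       # 2. biomass metabolite -> relative molar amount
    secretions: set     # 3. waste products
    reactions: dict     # 4. reaction name -> {metabolite: coefficient}
    flux_bounds: dict = field(default_factory=dict)  # 5. name -> (lower, upper)

    def metabolites(self):
        """All metabolites that appear in at least one reaction."""
        return {m for stoich in self.reactions.values() for m in stoich}

    def unconnected(self):
        """Declared inputs or outputs that no reaction mentions."""
        declared = self.nutrients | set(self.biomass) | self.secretions
        return declared - self.metabolites()

# Placeholder names only; this is not real biochemistry.
model = SteadyStateModel(
    nutrients={"C_source", "N_source"},
    biomass={"biomass_1": 1.0},
    secretions={"waste_1", "waste_2"},
    reactions={
        "R1": {"C_source": -1, "M1": 1, "waste_1": 1},
        "R2": {"M1": -1, "N_source": -1, "biomass_1": 1},
    },
    flux_bounds={"R1": (0.0, 10.0)},  # optional cap on R1's flux
)

print(model.unconnected())
```

Here `waste_2` is declared as a secretion but never produced by any reaction, so the check reports it. The flux bounds are optional, matching item 5.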
Given the preceding inputs, a steady-state metabolic model
predicts the steady-state specific flux rates (the number of moles of the
reaction products created per gram of cells per second) of every reaction
within the network. Many of those
reactions have a flux of zero because the cell does not use them during
growth on the specified set of nutrients.
For simulations of E. coli
growth on glucose under aerobic conditions, only about 20% of the reactions in
the model carry flux [1].
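Given a computed flux solution, the share of reactions that carry flux is easy to read off. The snippet below is a minimal sketch: the flux values are made up, the helper name is hypothetical, and the tolerance used to decide that a flux counts as zero is an assumption.

```python
def active_reactions(fluxes, tol=1e-9):
    """Return the reactions with non-zero flux and their share of the total."""
    active = sorted(name for name, v in fluxes.items() if abs(v) > tol)
    return active, len(active) / len(fluxes)

# Invented flux solution for five reactions; only R1 carries flux.
fluxes = {"R1": 3.2, "R2": 0.0, "R3": 0.0, "R4": 0.0, "R5": 0.0}

active, fraction = active_reactions(fluxes)
print(active, fraction)
```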
Validation of Genome Annotation and of Genome-Based Metabolic Reconstruction
Metabolic models can be developed to varying levels of
accuracy and validation. The simplest
level of validation underlies our use case here, namely verifying that the model
can produce all biomass metabolites from the input nutrients, or put another
way, demonstrating that “the model can grow.”
In our experience, metabolic models never exhibit growth the first time
they are run, just as computer programs rarely work the first time they are
run. The most frequent reason that
models fail to grow is that they are incomplete: they lack one or more
critical metabolic reactions. That
incompleteness is typically due to incompleteness in the genome annotation. Genes whose functions were not predicted at all,
or were predicted incorrectly, during sequence analysis lead to missing
reactions in the metabolic network, referred to as network gaps.
Not all gaps prevent model growth, because the cell can
circumvent some gaps using other routes through the metabolic network. But a gap that prevents the
network from producing any one biomass metabolite will prevent model growth. Thus, model growth becomes a test of the
validity of the genome annotation, and of the metabolic reconstruction (the
metabolic reaction set) derived from the genome annotation by software such as
SRI’s Pathway Tools.
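A simplified way to see how a single gap blocks growth is a reachability test. The sketch below is not flux-balance analysis and not what MetaFlux does; it is a qualitative stand-in that ignores stoichiometry and flux constraints. Starting from the nutrients, it repeatedly fires every reaction whose substrates are all available, and reports the biomass metabolites that never become available. All metabolite and reaction names are placeholders.

```python
def producible(nutrients, reactions):
    """Metabolites reachable from the nutrients.

    reactions maps a name to (set of substrates, set of products).
    """
    available = set(nutrients)
    changed = True
    while changed:
        changed = False
        for substrates, products in reactions.values():
            # A reaction can fire once all of its substrates are available.
            if substrates <= available and not products <= available:
                available |= products
                changed = True
    return available

def blocked_biomass(nutrients, reactions, biomass):
    """Biomass metabolites that cannot be produced from the nutrients."""
    return set(biomass) - producible(nutrients, reactions)

# Placeholder network: nothing produces M2, so B2 is unreachable.
nutrients = {"N1"}
biomass = {"B1", "B2"}
reactions = {
    "r1": ({"N1"}, {"M1"}),
    "r2": ({"M1"}, {"B1"}),
    "r3": ({"M2"}, {"B2"}),
}

print(blocked_biomass(nutrients, reactions, biomass))
```

Because B2 is blocked, this toy model "cannot grow", even though B1 is produced without difficulty.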
Identifying the missing reactions in a metabolic network is
difficult to do manually. Therefore, the MetaFlux modeling tool within
Pathway Tools provides a gap filler
module that automatically suggests which reactions are missing. More precisely, the gap filler computes a
minimal set of reactions from our MetaCyc database that, if added to the
organism’s metabolic model, will enable growth of the model (meaning production
of all biomass metabolites). Given the
suggested set of reaction gap fillers, you can use Pathway Tools’ Pathway Hole
Filler module to search for genes within the genome that may code for the
enzymes catalyzing those reactions, or you can use your own sequence-analysis
methodology to search for those enzymes.
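The idea of a minimal gap-filling set can be illustrated with a brute-force search. This is a toy stand-in under stated assumptions, not the algorithm MetaFlux uses, which this post does not describe. It uses a qualitative reachability test in place of a flux-based growth test, tries candidate subsets in order of increasing size, and returns the first subset that makes every biomass metabolite producible. The function names, the candidate pool, and all metabolite names are hypothetical; exhaustive search like this would not scale to a reaction database the size of MetaCyc.

```python
from itertools import combinations

def producible(nutrients, reactions):
    """Metabolites reachable from the nutrients (ignores stoichiometry)."""
    available = set(nutrients)
    changed = True
    while changed:
        changed = False
        for substrates, products in reactions.values():
            if substrates <= available and not products <= available:
                available |= products
                changed = True
    return available

def fill_gaps(nutrients, reactions, biomass, candidates):
    """Smallest set of candidate reaction names that enables 'growth'.

    Returns None if no combination of candidates is sufficient.
    """
    names = sorted(candidates)
    for size in range(len(names) + 1):          # smallest subsets first
        for subset in combinations(names, size):
            trial = dict(reactions)
            trial.update({n: candidates[n] for n in subset})
            if set(biomass) <= producible(nutrients, trial):
                return set(subset)
    return None

# Placeholder model with a gap: nothing produces M2, so B2 is blocked.
nutrients = {"N1"}
biomass = {"B1", "B2"}
reactions = {
    "r1": ({"N1"}, {"M1"}),
    "r2": ({"M1"}, {"B1"}),
    "r3": ({"M2"}, {"B2"}),
}
# Hypothetical candidate pool standing in for a reference database.
candidates = {
    "c1": ({"M1"}, {"M2"}),   # one-step fix
    "c2": ({"N1"}, {"M3"}),   # c2 + c3 is a two-step alternative
    "c3": ({"M3"}, {"M2"}),
    "c4": ({"M9"}, {"M8"}),   # irrelevant to the gap
}

print(fill_gaps(nutrients, reactions, biomass, candidates))
```

The search prefers the single reaction `c1` over the two-reaction route through `c2` and `c3`, which mirrors the "minimal set" criterion described above.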
In our experience, even highly curated genome annotations contain
network gaps that can be identified using metabolic modeling. A metabolic model that simply shows growth
under a given nutrient set is still at an early stage of development and
probably requires further development before it will produce accurate
quantitative predictions (such as predicting the growth rate accurately). However, this approach is likely to improve
the quality of genome annotations, and would set a new bar for publications on
completely sequenced genomes.
Learn More about MetaFlux
MetaFlux and the full Pathway Tools software are freely
available from SRI for academic use. You
can learn more about how to use MetaFlux by attending SRI’s metabolic modeling
tutorials (the next tutorial is scheduled for March 18-19, 2015 in Menlo Park,
CA), and by reading the Pathway Tools User’s Guide.
[1] Weaver DS, Keseler IM, Mackie A, Paulsen IT, Karp PD. A genome-scale metabolic flux model of Escherichia coli K-12 derived from the EcoCyc database. BMC Syst Biol. 2014 Jun 30;8:79. doi: 10.1186/1752-0509-8-79.