Tuesday, April 30, 2013

Metagenomics, PathoLogic and Pathway Abundance

Pathway abundance is a new parameter computed by PathoLogic, the Pathway Tools component that generates Pathway/Genome Databases (PGDBs). It is available starting with version 17.0 (March 2013) of Pathway Tools. Pathway abundances are computed from the gene abundances supplied in metagenomics datasets, and are useful for comparing the metabolic profiles of different microbial communities.

Gene abundances are specified in the annotated genome file. Only the PathoLogic file format supports gene abundances; the GenBank format does not. See the section “The PathoLogic File Format” in the Pathway Tools User Guide for more information about the PathoLogic format and how to specify the abundance attribute for a gene.

PathoLogic does no preprocessing of the gene abundances: all values are taken as specified, without any filtering such as outlier removal. We assume that any preprocessing has been done before the annotated genome file is generated. In particular, if some gene abundances are too low to count toward the pathway abundances, they should be omitted from the file.
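Since filtering has to happen before the annotated genome file is written, a preprocessing step along the following lines may be useful. This is only a minimal sketch: the function name `drop_low_abundance`, the dictionary representation of gene abundances, and the threshold value are all assumptions made for illustration, not part of Pathway Tools.

```python
from typing import Dict


def drop_low_abundance(gene_abundance: Dict[str, float],
                       min_abundance: float) -> Dict[str, float]:
    """Keep only genes whose abundance is at least min_abundance.

    Hypothetical helper: the cutoff is chosen by the user, and
    PathoLogic itself applies no such filter.
    """
    return {gene: value
            for gene, value in gene_abundance.items()
            if value >= min_abundance}


# Illustrative values only; gene names and the cutoff are made up.
raw = {"geneA": 4.0, "geneB": 0.1, "geneC": 6.0}
kept = drop_low_abundance(raw, min_abundance=1.0)
```

Only the genes in `kept` would then be given an abundance in the annotated genome file.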

The abundance of a pathway is computed from the abundances of the genes involved in the pathway. More precisely, let R be the set of reactions in pathway P for which gene abundances are specified, |R| the size of R, G the set of genes with specified abundances that catalyze the reactions in R, and g_a the given abundance of gene g. The abundance of pathway P is

$$\text{abundance}(P) = \frac{1}{|R|} \sum_{g \in G} g_a$$

That is, the abundance of a pathway is the sum of the abundances of the genes catalyzing the reactions of the pathway, divided by the number of reactions of the pathway for which gene abundances are given. Note that this formula takes into account all known isozymes catalyzing a reaction, and that spontaneous reactions do not take part in the computation.
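To make the computation described above concrete, here is a small Python sketch. It is not PathoLogic's code: the function name, the dictionary inputs, and the gene and reaction names are hypothetical. It also assumes that a gene catalyzing several reactions of the same pathway is counted once in the sum, which the description above does not spell out.

```python
from typing import Dict, List, Optional


def pathway_abundance(reaction_genes: Dict[str, List[str]],
                      gene_abundance: Dict[str, float]) -> Optional[float]:
    """Sum of gene abundances divided by the number of reactions
    that have at least one gene with a specified abundance.

    reaction_genes maps each reaction of the pathway to the genes
    catalyzing it (an empty list for a spontaneous reaction).
    Assumption: each gene contributes once, even if it catalyzes
    more than one reaction of the pathway.
    """
    counted_reactions = 0
    genes = set()
    for catalysts in reaction_genes.values():
        with_abundance = [g for g in catalysts if g in gene_abundance]
        if with_abundance:
            counted_reactions += 1          # reaction belongs to R
            genes.update(with_abundance)    # all isozymes are included
    if counted_reactions == 0:
        return None                         # no abundance data for this pathway
    return sum(gene_abundance[g] for g in genes) / counted_reactions


# Made-up pathway: rxn1 has two isozymes, rxn3 is spontaneous,
# and rxn4 is catalyzed by a gene with no specified abundance.
reactions = {
    "rxn1": ["geneA", "geneB"],
    "rxn2": ["geneC"],
    "rxn3": [],
    "rxn4": ["geneD"],
}
abundances = {"geneA": 4.0, "geneB": 2.0, "geneC": 6.0}
result = pathway_abundance(reactions, abundances)  # (4.0 + 2.0 + 6.0) / 2 = 6.0
```

In this example only rxn1 and rxn2 count toward |R|, so the three specified abundances are summed and divided by 2.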

Once PathoLogic has inferred the pathways from the annotated genome file, the computed pathway abundances can be found in the file pathways-report.txt, in the report subdirectory of your PGDB. This report file lists all pathways that were inferred to be present in the PGDB, along with various computed parameters (e.g., confidence factor), including the computed abundances.
