Like many problems in bioinformatics, accurate prediction of metabolic pathways depends on tight coordination between an algorithm and a database. The pathway-prediction algorithm used in the Pathway Tools software that powers BioCyc.org is called PathoLogic; the database is MetaCyc. Here we provide an overview of the latest version of PathoLogic pathway prediction; [1] describes an older version of the algorithm.
The input to PathoLogic is an annotated genome, meaning a genome for which gene locations and gene functions have already been predicted. The genome can be supplied as a GenBank file (.gbk or .gbff) or as a GFF3 file.
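As a small illustration of the two accepted input formats, the sketch below guesses the format of an annotation file from its extension before it is handed to a pipeline. It is not part of Pathway Tools. The function name `guess_annotation_format` is hypothetical, and the GFF3 extensions (.gff, .gff3) are an assumption, since only the GenBank extensions are stated above.

```python
from pathlib import Path

# GenBank extensions are the ones named in the text.
GENBANK_EXTENSIONS = {".gbk", ".gbff"}
# Assumption: common GFF3 extensions; the text does not list them.
GFF3_EXTENSIONS = {".gff", ".gff3"}


def guess_annotation_format(filename: str) -> str:
    """Return "genbank" or "gff3" based on the file extension (hypothetical helper)."""
    suffix = Path(filename).suffix.lower()
    if suffix in GENBANK_EXTENSIONS:
        return "genbank"
    if suffix in GFF3_EXTENSIONS:
        return "gff3"
    raise ValueError(f"Unrecognized annotation file extension: {suffix!r}")
```

A check like this only inspects the file name; it does not confirm that the file actually contains gene locations and predicted functions.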
The essence of the PathoLogic algorithm is to recognize known pathways from MetaCyc in the genome being analyzed. PathoLogic performs two steps: reactome inference and pathway inference.
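The two steps can be pictured with a deliberately simplified sketch. Everything in it is an assumption made for illustration: the function names, the toy lookup tables standing in for MetaCyc, and above all the rule that a pathway is predicted when at least half of its reactions are present. That threshold is a stand-in and not PathoLogic's actual scoring.

```python
from typing import Dict, List, Set

# Hypothetical stand-ins for MetaCyc content (not real MetaCyc identifiers).
FUNCTION_TO_REACTION: Dict[str, str] = {
    "toy enzyme A": "RXN-A",
    "toy enzyme B": "RXN-B",
    "toy enzyme C": "RXN-C",
    "toy enzyme D": "RXN-D",
    "toy enzyme E": "RXN-E",
}
PATHWAY_REACTIONS: Dict[str, List[str]] = {
    "toy-pathway-1": ["RXN-A", "RXN-B", "RXN-C"],
    "toy-pathway-2": ["RXN-D", "RXN-E"],
}


def infer_reactome(gene_functions: Dict[str, str]) -> Set[str]:
    """Step 1 (toy version): map annotated gene functions to reactions."""
    return {
        FUNCTION_TO_REACTION[function]
        for function in gene_functions.values()
        if function in FUNCTION_TO_REACTION
    }


def infer_pathways(reactome: Set[str], min_fraction: float = 0.5) -> List[str]:
    """Step 2 (toy version): keep pathways with enough reactions present.

    The min_fraction rule is an illustrative assumption only.
    """
    predicted = []
    for pathway, reactions in PATHWAY_REACTIONS.items():
        present = sum(1 for reaction in reactions if reaction in reactome)
        if present / len(reactions) >= min_fraction:
            predicted.append(pathway)
    return predicted


# Toy annotated genome: gene identifier -> predicted function.
genome = {"g1": "toy enzyme A", "g2": "toy enzyme B", "g3": "unknown protein"}
reactome = infer_reactome(genome)
pathways = infer_pathways(reactome)
```

The point of the sketch is the division of labor: the first step depends only on the gene annotations, and the second depends only on the inferred set of reactions and the pathway definitions in the database.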