Monday, October 22, 2018

A 7th EC Class for Translocases

Six enzyme classes have been recognized since the Enzyme classification and nomenclature list was first approved by the International Union of Biochemistry in 1961. These were based on the type of reaction catalyzed:

Oxidoreductases (EC 1)
Transferases (EC 2)
Hydrolases (EC 3)
Lyases (EC 4)
Isomerases (EC 5)
Ligases (EC 6)

However, it has become apparent that none of these could describe the important group of enzymes that catalyse the movement of ions or molecules across membranes or their separation within membranes. Several of these involve the hydrolysis of ATP and had been previously classified as ATPases (EC 3.6.3.-), although the hydrolytic reaction is not their primary function.

These enzymes have now been classified under a new EC class of Translocases (EC 7). The reactions catalyzed are designated as transfers from ‘side 1’ to ‘side 2’ because the designations ‘in’ and ‘out’ (or ‘cis’ and ‘trans’), which had been used previously, lack clarity and can be ambiguous. The comments associated with each entry then describe the specific translocations catalysed.

The subclasses designate the types of ion or molecule translocated:
  • EC 7.1 contains enzymes catalysing the translocation of hydrons (hydron being the general name for H+ in its natural abundance),
  • EC 7.2 contains those catalysing the translocation of inorganic cations and their chelates,
  • EC 7.3 contains those catalysing the translocation of inorganic anions,
  • EC 7.4 contains those catalysing the translocation of amino acids and peptides,
  • EC 7.5 contains those catalysing the translocation of carbohydrates and their derivatives,
  • EC 7.6 contains those catalysing the translocation of other compounds.
The sub-subclasses concern the reaction that provides the driving force for the translocation, where these are relevant:
  • EC 7.x.1 translocations linked to oxidoreductase reactions
  • EC 7.x.2 translocations linked to the hydrolysis of a nucleoside triphosphate
  • EC 7.x.3 translocations linked to the hydrolysis of a diphosphate
  • EC 7.x.4 translocations linked to a decarboxylation reaction
Exchange transporters that are not dependent on enzyme-catalyzed reactions, such as the exchange of ions across membranes, are not included. Pores that change conformation between open and closed states in response to phosphorylation or some other catalyzed reaction are classified under EC 5.6 (Macromolecular conformational isomerases).
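
The two-level scheme above can be read directly off an EC number. The following is a minimal Python sketch, assuming the subclass and sub-subclass meanings exactly as listed in this post; the helper name `describe_translocase` is hypothetical and is not part of any official tool.

```python
# Subclass and sub-subclass meanings copied from the lists in this post.
SUBCLASSES = {
    1: "hydrons",
    2: "inorganic cations and their chelates",
    3: "inorganic anions",
    4: "amino acids and peptides",
    5: "carbohydrates and their derivatives",
    6: "other compounds",
}
DRIVING_FORCES = {
    1: "an oxidoreductase reaction",
    2: "the hydrolysis of a nucleoside triphosphate",
    3: "the hydrolysis of a diphosphate",
    4: "a decarboxylation reaction",
}


def describe_translocase(ec_number):
    """Hypothetical helper: describe the first three fields of an EC 7 number."""
    parts = ec_number.replace("EC", "").strip().split(".")
    if len(parts) < 3 or parts[0] != "7":
        raise ValueError(f"not an EC 7 number with a sub-subclass: {ec_number!r}")
    subclass, sub_subclass = int(parts[1]), int(parts[2])
    if subclass not in SUBCLASSES or sub_subclass not in DRIVING_FORCES:
        raise ValueError(f"unknown subclass or sub-subclass in {ec_number!r}")
    return (
        f"translocation of {SUBCLASSES[subclass]} "
        f"linked to {DRIVING_FORCES[sub_subclass]}"
    )


print(describe_translocase("EC 7.1.2.-"))
```

The serial number in the fourth field is ignored here, since the specific translocation is described in the comments of each entry rather than encoded in the number.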

(prepared by Keith Tipton, originally published at the ExplorEnz enzyme database)

Friday, October 12, 2018

Pan-Genome PGDBs Unify Genomic and Metabolic Data across Related Strains

Pan-Genome PGDBs are a relatively new feature of BioCyc. These BioCyc databases combine in one place information about multiple sequenced genomes for a given species. For example, the Helicobacter pylori Pan-Genome database covers 158 sequenced strains.

The Pan-Genome PGDBs contain one gene object for each orthologous group of genes in the organism. They also contain the union of all metabolic pathways across all the strains. Thus, a Pan-Genome PGDB allows you to quickly assess the full set of gene functions and metabolic pathways across the known strains. For example, the gene page shows all orthologs across all strains in the Pan-Genome. The page for the ftsX gene illustrates this for a gene with relatively few orthologs.

Other gene pages list hundreds of orthologs and synonyms.

You can visit a page for the orthologous genes by following the links in the 'Relationship Links' area near the bottom of the page.

Pan-Genomes provide a way to visualize genes in the genome browser as well.   It's easier to understand this visualization if you know that Pan-Genomes are constructed starting with the PGDB of a base strain and adding other members of the collection of strains one-by-one.  For example, the H. pylori Pan-Genome was constructed by starting with strain 26695.  The visualization is based on dividing groups of orthologous genes into two sets: ortholog groups that include genes that occur in the base strain, and ortholog groups that only include genes in other strains.  

Ortholog groups that include genes from the base strain are collected on a 'chromosome' that preserves the location and direction of the genes from the base strain.  The remaining ortholog groups are mapped in arbitrary order onto an 'artificial replicon'.  The names in the artificial replicon display are based on the (arbitrary) order that PGDBs were added to the Pan-Genome.
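
The split described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the code BioCyc uses: the data layout (a dictionary from ortholog group to its member genes per strain), the group and gene names, and the function name `partition_ortholog_groups` are all assumptions made for the example.

```python
def partition_ortholog_groups(groups, base_strain):
    """Split ortholog groups by whether they contain a gene from the base strain.

    groups: dict mapping a group id to a dict of {strain name: gene id}
            (a hypothetical layout chosen for this sketch).
    Returns (groups placed on the 'chromosome', groups placed on the
    'artificial replicon'), each in the order the groups were given.
    """
    on_chromosome, on_artificial_replicon = [], []
    for group_id, members in groups.items():
        if base_strain in members:
            on_chromosome.append(group_id)
        else:
            on_artificial_replicon.append(group_id)
    return on_chromosome, on_artificial_replicon


# Made-up example: strain 26695 is the base strain; "strainB" is hypothetical.
example_groups = {
    "G1": {"26695": "gene_a", "strainB": "gene_b"},
    "G2": {"strainB": "gene_c"},
}
chromosome, artificial = partition_ortholog_groups(example_groups, "26695")
print(chromosome, artificial)
```

In the real display the 'chromosome' also keeps the location and direction of each base-strain gene, which this sketch does not model.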

You can search for genes either by name (e.g., 'abc') or by an identifier that combines a strain name and an ID joined with an underscore (e.g., HPHPP11_0013). Enter the name or identifier in the search box at the top right. Either quick search or gene search will take you to the page containing the gene and its orthologs in the Pan-Genome PGDB.
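
If you handle such identifiers in your own scripts, they can be split back into their two parts. The Python sketch below assumes that the ID part never contains an underscore, so the last underscore is the separator; that is an assumption for illustration, and the function name `split_gene_identifier` is hypothetical.

```python
def split_gene_identifier(identifier):
    """Split 'STRAIN_ID' at the last underscore (assumed to be the separator)."""
    strain, separator, gene_id = identifier.rpartition("_")
    if not separator or not strain or not gene_id:
        raise ValueError(f"expected 'strain_id', got {identifier!r}")
    return strain, gene_id


print(split_gene_identifier("HPHPP11_0013"))
```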

The cellular overview diagram provides a way to visualize reactions associated with genes shared by all members of the Pan-Genome. You can also visualize those reactions that are unique to a single organism within the Pan-Genome.  In the screen shot below, the reactions shared by all organisms in the H. pylori Pan-Genome are shown in red and those unique to a single organism are purple.

To create a diagram like this on the BioCyc web site, select a Pan-Genome PGDB and bring up the cellular overview. In the operations menu, choose Highlight Genes -> By Pan-Genome Core Genes. The core genes are the set of genes shared by all organism databases in the Pan-Genome. Choose Highlight Genes -> By Pan-Genome Unique Genes to highlight reactions associated with genes that are unique to a single organism database.
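
The two gene sets used for highlighting can be expressed as simple set operations. The following Python sketch assumes each ortholog group is represented by the set of strains it occurs in; the strain names, group names, and the function name `core_and_unique` are made up for the example and do not reflect how BioCyc stores its data.

```python
def core_and_unique(groups, all_strains):
    """Return (core groups, unique groups) for a pan-genome.

    groups: dict mapping a group id to the set of strains containing it
            (a hypothetical layout chosen for this sketch).
    Core groups occur in every strain; unique groups occur in exactly one.
    """
    strains = set(all_strains)
    core = {gid for gid, members in groups.items() if strains <= set(members)}
    unique = {gid for gid, members in groups.items() if len(set(members)) == 1}
    return core, unique


# Made-up example with three hypothetical strains.
example = {
    "G1": {"s1", "s2", "s3"},  # present everywhere: core
    "G2": {"s2"},              # present in one strain: unique
    "G3": {"s1", "s3"},        # neither core nor unique
}
core, unique = core_and_unique(example, ["s1", "s2", "s3"])
print(sorted(core), sorted(unique))
```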

In the desktop version, show the Cellular Overview, then choose Overviews -> Highlight -> Highlight the core genome, followed by Overviews -> Highlight the unique genes, to achieve the same highlighting.

More details on how Pan-Genome PGDBs are created and how to use them are provided here.

You can select a Pan-Genome PGDB by entering the phrase “pan-genome” in the change organism database dialogue. Here's our current list of Pan-Genome PGDBs and the number of strains that each one contains. Over time we will be adding Pan-Genome PGDBs for additional species, and regenerating existing Pan-Genome PGDBs to include additional strains.

Clostridioides difficile 10 strains

Escherichia coli 374 strains

Helicobacter pylori 158 strains

Listeria monocytogenes 35 strains

Mycobacterium tuberculosis 24 strains

Pseudomonas aeruginosa 24 strains

Salmonella enterica 113 strains

Shigella flexneri 9 strains

Vibrio cholerae 81 strains