Wednesday, March 27, 2019

Regulation Roundup

BioCyc includes a range of different kinds of regulation data, but because there are many different ways that biological processes are regulated, that data is distributed over different parts of the site and is accessed in different ways.  This post is an attempt to summarize all the different types of regulation we represent, to help you find the information you are seeking.

BioCyc contains the following types of regulatory information:
  • Regulation of enzymatic activity, usually by small molecules or metal ions.  This includes required enzyme cofactors, allosteric activation and inhibition, competitive inhibition, and other forms of enzyme modulation.
  • Regulation of gene transcription or translation.  This includes regulation of transcription by transcription factors and sigma factors, attenuation, and regulation of translation by small RNAs, riboswitches, and regulatory proteins.
  • Regulation by protein modification, such as phosphorylation or binding of a ligand, to determine whether or not a protein is in its active form.
On a gene page, the various regulatory influences on the gene and its product are summarized in the Regulation Summary Diagram at the top of the Summary tab.  If this diagram is not present, then no regulatory data is available for that organism.  Mousing over any element of the diagram displays a description of its particular mode of regulatory action.  More detailed information is available in the other tabs, as described below.

Regulation Summary Diagrams for three EcoCyc genes, illustrating a range of regulation types.
a) Transcription of trpD is inhibited by TrpR bound to tryptophan; translation of trpD is attenuated by trp-tRNA; the TrpDE enzyme requires Mg2+ as a cofactor, and its activity is inhibited by tryptophan.
b) Transcription of oppA is inhibited by several different transcription factors, and requires sigma factor σ28; translation of oppA is activated by a spermidine riboswitch and inhibited by small RNA GcvB with accessory protein Hfq.
c) Transcription of uvrY is inhibited by LexA; translation of uvrY is activated by regulatory protein DeaD; the protein UvrY is converted to its phosphorylated form by BarA-P.
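
As an illustration of how these regulation types sit side by side for a single gene, the trpD example in (a) can be written out as a small set of records.  This is only a sketch: the class and field names (Level, Regulation, group_by_level) are hypothetical and do not reflect BioCyc's actual schema or API.

```python
from collections import defaultdict
from dataclasses import dataclass
from enum import Enum


class Level(Enum):
    """Hypothetical labels for the kinds of regulation described above."""
    TRANSCRIPTION = "transcription"
    TRANSLATION = "translation"
    ENZYME_ACTIVITY = "enzyme activity"
    PROTEIN_MODIFICATION = "protein modification"


@dataclass(frozen=True)
class Regulation:
    target: str      # gene or protein being regulated
    level: Level     # which kind of regulation this is
    regulator: str   # molecule or protein exerting the effect
    effect: str      # e.g. "inhibition", "attenuation", "cofactor"


# The trpD example from caption (a), transcribed as records.
TRPD = [
    Regulation("trpD", Level.TRANSCRIPTION, "TrpR bound to tryptophan", "inhibition"),
    Regulation("trpD", Level.TRANSLATION, "trp-tRNA", "attenuation"),
    Regulation("TrpDE", Level.ENZYME_ACTIVITY, "Mg2+", "cofactor"),
    Regulation("TrpDE", Level.ENZYME_ACTIVITY, "tryptophan", "inhibition"),
]


def group_by_level(records):
    """Group regulation records by the level at which they act."""
    grouped = defaultdict(list)
    for record in records:
        grouped[record.level].append(record)
    return dict(grouped)


for level, records in group_by_level(TRPD).items():
    for r in records:
        print(f"{level.value}: {r.target} <- {r.regulator} ({r.effect})")
```

Grouping by level mirrors the way the information is split across the site: each kind of regulation is stored and displayed separately, and the summary diagram brings them together.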

Thursday, March 7, 2019

BioCyc Gene Essentiality Data

BioCyc contains gene essentiality datasets for 30 organisms (see listing at end).  These datasets were collected from individual publications and harvested in bulk from the OGEE database.  In some cases, multiple essentiality datasets are available for a given organism (e.g., for E. coli K-12).

BioCyc essentiality datasets are keyed to the conditions of growth (e.g., MOPS medium with 0.4% glucose), since different sets of genes will be essential under different conditions of growth.  If essentiality data is available for a given gene, it will be shown on the gene page, under the Essentiality tab.

The Essentiality tab for gene trpA in EcoCyc
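
To make the idea of keying essentiality to growth conditions concrete, here is a minimal sketch of a lookup structure.  Everything in it is a placeholder: the medium labels, gene names, and essential/non-essential calls are invented for illustration and are not BioCyc results, and the function name is hypothetical.

```python
# Placeholder data only: media, gene names, and calls are invented for
# illustration and do not come from any BioCyc dataset.
ESSENTIALITY = {
    "medium A (placeholder)": {"geneX": True, "geneY": False},
    "medium B (placeholder)": {"geneX": True, "geneY": True},
}


def essentiality_by_condition(gene, datasets):
    """Return {condition: is_essential} for each dataset that tested the gene."""
    return {
        condition: calls[gene]
        for condition, calls in datasets.items()
        if gene in calls
    }


print(essentiality_by_condition("geneY", ESSENTIALITY))
```

A gene that was not tested in any dataset yields an empty dictionary, which corresponds to a gene page with no essentiality data to show.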

In addition, BioCyc provides access to the full lists of essential and non-essential genes for a given dataset.  To obtain such lists:
  1. Select your organism of interest using "change organism database" in the upper right corner.
  2. Click Analysis menu → Summary Statistics
  3. The bottom table will contain an entry "Gene Essentiality Datasets" if any essentiality datasets are present. Click on that entry.
  4. In the resulting page, click on the entry in the "Growth Medium" column for the dataset of interest.
  5. The resulting pages will show lists of genes that are essential and/or non-essential for growth on this medium.  To manipulate those gene lists, click "Turn into a SmartTable".  SmartTable results can be downloaded as a file and explored interactively.
A section of the page for growth medium M9 medium with 1% glycerol in EcoCyc.
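
Once gene lists for two growth media have been downloaded, they can be compared offline.  The sketch below assumes the downloaded file is tab-delimited with a header row and a column named "Gene Name"; both the file layout and the column name are assumptions to adjust to your actual download, and the gene names shown are placeholders.

```python
import csv
import io


def read_gene_column(tsv_text, column="Gene Name"):
    """Collect the non-empty values of one column from tab-delimited text.

    The column name is an assumption; change it to match your file's header.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    genes = set()
    for row in reader:
        name = (row.get(column) or "").strip()
        if name:
            genes.add(name)
    return genes


def essential_only_in_first(essential_a, essential_b):
    """Genes essential under the first condition but not under the second."""
    return essential_a - essential_b


# Placeholder contents standing in for two downloaded essential-gene lists.
MEDIUM_A_TSV = "Gene Name\tAccession\ngeneX\tID-1\ngeneY\tID-2\n"
MEDIUM_B_TSV = "Gene Name\tAccession\ngeneX\tID-1\n"

essential_a = read_gene_column(MEDIUM_A_TSV)
essential_b = read_gene_column(MEDIUM_B_TSV)
print(sorted(essential_only_in_first(essential_a, essential_b)))
```

With real files, replace the placeholder strings with the file contents, for example by reading each file with open(path, encoding="utf-8").read().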

The BioCyc organisms that currently have gene essentiality data are listed below. To generate this list yourself:
  1. Click "change organism database" in the upper right corner of the page
  2. Click "By Organism Properties" tab
  3. In "Select Property" selector, select "# Genes with Essentiality Data"
  4. Change "has values" selector to "has any value"
  5. Click "Find Organisms"
The organism selector showing all databases with gene essentiality datasets

  • Agrobacterium fabrum C58
  • Bacillus subtilis subtilis 168
  • Brevundimonas subvibrioides ATCC 15264
  • Burkholderia cenocepacia J2315
  • Caulobacter crescentus NA1000
  • Escherichia coli CFT073
  • Escherichia coli K-12 substr. MG1655 (EcoCyc)
  • Escherichia coli O25b:H4-ST131
  • Francisella tularensis novicida U112
  • Haemophilus influenzae Rd KW20
  • Helicobacter pylori 26695
  • Homo sapiens
  • Mycobacterium tuberculosis H37Rv
  • Mycoplasma genitalium G-37
  • Mycoplasma pneumoniae M129
  • Mycoplasma pulmonis UAB CTIP
  • Porphyromonas gingivalis ATCC 33277
  • Pseudomonas aeruginosa PAO1
  • Pseudomonas aeruginosa UCBPP-PA14
  • Rhizobium leguminosarum bv. viciae 3841
  • Saccharomyces cerevisiae S288c
  • Salmonella enterica enterica CT18
  • Salmonella enterica enterica LT2
  • Salmonella enterica enterica SL1344
  • Salmonella enterica enterica Ty2
  • Shewanella oneidensis MR-1
  • Sphingomonas wittichii RW1
  • Streptococcus pneumoniae D39
  • Streptococcus pneumoniae TIGR4; ATCC BAA-334
  • Synechococcus elongatus PCC 7942