Thursday, April 28, 2011

Talkin' 'Bout My Regulation

The Regulation Summary Diagram for the bglG gene in EcoCyc


You may have seen the new regulation summary diagrams on our EcoCyc gene pages, or played with the regulatory overview. And you may have wished that you could get those visualizations for your own PGDBs. Unfortunately, there is as yet no equivalent of PathoLogic for regulation -- no tool that will infer regulatory relationships, transcription factor binding sites, etc. from the genome annotation. Much of the regulatory data in EcoCyc was painstakingly curated from the primary literature (either by us or by the folks at RegulonDB) and entered piece by piece using our curation tools. That makes it an extremely valuable resource, but also makes it difficult to replicate in other PGDBs without expending an equivalent amount of effort.

The Regulatory Overview Diagram for EcoCyc

Fortunately, there are other, faster ways to generate regulation data. High-throughput experiments and computational prediction programs can identify regulatory relationships and/or transcription factor binding sites en masse, and a number of groups have generated such data for their own organisms. The question that remains is how to bulk-load that data into a PGDB.