Wednesday, June 8, 2016

Bulk Updates to Your PGDB

We are frequently asked how to apply bulk updates to a PGDB. This situation can arise for several reasons:
  • When a group maintains and curates organism data on an ongoing basis using their own software or database environment, and then wants to update a PGDB with all their changes in a single batch operation.
  • When a revised annotation for an organism is made available, and a user wishes to update their PGDB with the new data without losing any existing curation.
  • When a user has some systematic change that they want to apply to a large number of objects, such as a change to the locus id format, the addition of a new set of synonyms, or the addition of links to a new external database.
  • When a user wants to import a large dataset obtained via a high-throughput experiment or computational prediction, such as for protein cellular location or transcription factor binding sites.
Because these are all common scenarios, it seems worthwhile to provide an overview of the various ways that Pathway Tools supports bulk updating of PGDBs.  Note that none of the features discussed here are particularly new, and all have been supported by Pathway Tools for several years.  All User Guide section numbers referenced below are for version 20.0.

Note first that Pathway Tools comes with a full suite of editing and curation tools, so if you have only a handful of changes to make, you should make them interactively with those tools. The techniques described in this article are intended for cases where there are so many updates that making them manually would be tedious.