OWWProject:Content Tagging

Description
MediaWiki has a feature, used occasionally in the past, that automatically searches text for regular patterns and creates links "on the fly" to external references for the matching tags.

I recently added support for BioBrick parts from the registry using the syntax "BBa LNNNN", where L is a letter and NNNN is a four-digit number. Any such string that is detected will be displayed to the user as a link to the registry page for that tag.

This same method can be applied to other data sets. In addition, a recently added OWW extension would allow a "pop-up" to be displayed when a user hovers over a link. The pop-up can be formatted to include a synopsis of the available information. Processing speed can be an issue: if a dozen references are present, as many as 12 network round-trips for data may be required, significantly increasing the time to render a document. To get around this, caching can be implemented using existing services to eliminate such delays.
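The effect of caching on the round-trip count can be sketched as follows. This is a simplified in-memory cache with a time-to-live, not the service-backed caching described above; the class name SynopsisCache, the fetch callback, and the one-hour default are assumptions chosen for the example.

```python
import time
from typing import Callable, Dict, Tuple


class SynopsisCache:
    """Remember fetched synopses so repeated tags cost one lookup."""

    def __init__(self, fetch: Callable[[str], str], ttl_seconds: float = 3600.0):
        self._fetch = fetch          # hypothetical function doing the network call
        self._ttl = ttl_seconds      # how long an entry stays valid
        self._entries: Dict[str, Tuple[float, str]] = {}

    def get(self, tag: str) -> str:
        now = time.monotonic()
        entry = self._entries.get(tag)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]          # cache hit: no network round-trip
        value = self._fetch(tag)     # cache miss: one round-trip
        self._entries[tag] = (now, value)
        return value


if __name__ == "__main__":
    calls = []

    def fake_fetch(tag: str) -> str:
        # Stand-in for a remote lookup; records each call.
        calls.append(tag)
        return f"synopsis for {tag}"

    cache = SynopsisCache(fake_fetch)
    for _ in range(12):
        cache.get("PMID 123")
    print(len(calls))
```

With this arrangement, a page that mentions the same tag a dozen times triggers a single lookup, and later page views within the time-to-live trigger none.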

Another option is to cache a set of data from the services so that output can be formatted for specific user requests. In other words, a user might be able to choose which data is added from a list of options.

Status
In progress.

Currently:

 * BioBrick tags are working.
 * PMID tags, using data found in PubMed and stored by Pubget, are working.
 * RFC (Internet RFC) tags were disabled, but we can enable RFC tags that point to the BioBrick RFC list.
 * GenBank support has been implemented but is currently disabled. It can be turned on upon request.
 * Restriction enzyme support was added but is disabled. It used data from the Rebase.org website.
 * ISBN references work, but only to display search pages. It would be possible to use the Library of Congress listings directly; this has not been implemented.
 * DOI tags have not been implemented, partly because of their format. The standard is to prepend "doi:" to a number. The extent of the expression that contains the full DOI is not specifically described as part of the standard and can therefore be complicated to isolate. We can use the same mechanism as for PMID and look up the information via Pubget.
 * ISSN tags can also be identified using an online catalog. These are not as prevalent as the ones above.
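The difficulty of isolating a DOI in running text can be illustrated with a heuristic. The sketch below assumes the "doi:" prefix mentioned above, assumes the identifier starts with "10." followed by digits and a slash, and treats trailing punctuation as not belonging to the DOI. These are assumptions made for the example, not rules taken from the standard, and a DOI that legitimately ends in one of the stripped characters would be truncated.

```python
import re
from typing import List

# Heuristic: "doi:", optional whitespace, then "10.<digits>/<non-space run>".
DOI_RE = re.compile(r"\bdoi:\s*(10\.\d{4,9}/\S+)", re.IGNORECASE)

# Characters assumed to be sentence punctuation rather than part of the DOI.
TRAILING_PUNCTUATION = ".,;:)]}>\"'"


def find_dois(text: str) -> List[str]:
    """Return DOI candidates found in text, with trailing punctuation removed."""
    return [m.group(1).rstrip(TRAILING_PUNCTUATION) for m in DOI_RE.finditer(text)]


if __name__ == "__main__":
    print(find_dois("See doi:10.1000/182, and (doi: 10.1234/abc.def)."))
```

Any candidate found this way would still need to be checked against a lookup service before being turned into a link, which is where testing is likely to expose format problems.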

One useful feature would be to collect all references found within a page and display them at the end of the page, using the same format as the current Biblio extension. In fact, all of the work to do this could be done by Biblio, which would then also support these additional formats.
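The collection step could look roughly like the sketch below. It does not use the Biblio extension or reproduce its output format; the tag syntaxes, the function names, and the numbered-list layout are assumptions chosen only to show the idea of gathering unique tags in order of first appearance.

```python
import re
from typing import List, Tuple

# Assumed tag syntaxes, for illustration only.
TAG_PATTERNS = {
    "PMID": re.compile(r"\bPMID:?\s*(\d+)\b"),
    "BioBrick": re.compile(r"\bBBa[ _]([A-Za-z]\d{4})\b"),
}


def collect_references(text: str) -> List[Tuple[str, str]]:
    """Return unique (kind, identifier) pairs in order of first appearance."""
    found = []
    for kind, pattern in TAG_PATTERNS.items():
        for match in pattern.finditer(text):
            found.append((match.start(), kind, match.group(1)))
    found.sort()  # order by position in the page
    seen = set()
    ordered = []
    for _, kind, ident in found:
        if (kind, ident) not in seen:
            seen.add((kind, ident))
            ordered.append((kind, ident))
    return ordered


def append_reference_list(text: str) -> str:
    """Append a numbered wiki list of the references found in text."""
    refs = collect_references(text)
    if not refs:
        return text
    lines = [f"# {kind} {ident}" for kind, ident in refs]
    return text.rstrip("\n") + "\n\nReferences\n" + "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(append_reference_list("PMID 123 and BBa B0034, then PMID 123 again."))
```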

Comments
There seems to be a way to allow labs to provide their own set of tags that would be used by their members. If the Biblio extension option were used, another possibility would be to create a lab-wide list of references.

The time needed to add support varies by service. Here are some estimates:

 * BioBrick: done
 * PMID via Pubget: done
 * RFC (Biobricks): 1 day
 * Genbank: 1 day
 * Restriction enzymes: 1 day
 * ISBN: 1 day
 * DOI: 1 day for a test. There may be problems related to specific formats; these will emerge in testing.
 * ISSN: 1 day

Issues
Some of the work depends on using Pubget. More discussion is needed to evaluate how Pubget can best be used, both now and over time.