User:Vincent Rouilly/Distributed Annotation System (DAS) for DNA Part Registries: Difference between revisions

(3 intermediate revisions by the same user not shown)
Line 20: Line 20:
** retrieve subparts from a given part
** retrieve superparts from a given part
* Statistics about parts
* Source code @ Git


==Software architecture==
Line 39: Line 41:
** Biodatabase, Bioentry, Biosequence, Bioentry_Qualifier_Value, Seqfeature, Location  


===Implement Dazzle plugin supporting BioSQL/datastore queries===
===Implement a Dazzle plugin to support BioSQL/datastore queries===
* Instructions on how to [http://www.biojava.org/wiki/Dazzle:writeplugin write a new Dazzle plugin] are available on the BioJava wiki.
* The new plugin implements the following methods:
** ...
** ...


===Process and Upload data from MIT Part Registry to Google App Engine (GAE)===
 
* The [http://partsregistry.org MIT Part Registry] implements a limited [http://partsregistry.org/Registry_API API] to access its data:
** limited FASTA description of parts ([http://partsregistry.org/fasta/parts/All_Parts part dump in FASTA])
** limited [http://partsregistry.org/DAS_-_Distributed_Annotation_System DAS description] of parts (no assembly information, for example)
* A Biopython script processed the FASTA dump file to generate GAE upload files. The following BioBrick information was processed:
** BioBrick Sequence
** BioBrick Author
** BioBrick Category
** BioBrick DNA Status
** BioBrick Short Description
** BioBrick Assembly information (subparts and superparts, derived from BLAST queries run within the Biopython script)


==Project resources==

Latest revision as of 03:58, 28 August 2009

Distributed Annotation System for DNA Part Registries

Vincent 05:55, 28 August 2009 (EDT): This is a work in progress. If you are interested in contributing, or would like more information, please feel free to contact me.

Overview

  • ...
  • ...

Objectives

  • ...
  • ...

DNA Part DAS Server

  • Server address:
  • Typical queries:
    • retrieve all parts
    • retrieve all supported annotation types
    • retrieve DNA from a given part
    • retrieve all annotation from a given part
    • retrieve subparts from a given part
    • retrieve superparts from a given part
  • Statistics about parts
  • Source code @ Git
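
The typical queries listed above could map onto DAS request URLs roughly as in the following sketch. It assumes the standard DAS 1.5 commands (entry_points, types, dna, features). The base URL and data source name are hypothetical, since the server address is not given on this page, and the "subpart" and "superpart" feature type names are also assumptions about how the assembly information might be exposed.

```python
from urllib.parse import quote

# Hypothetical values: the page leaves the server address blank.
BASE_URL = "http://example.org/das"
SOURCE_NAME = "parts"


def das_url(command, **params):
    """Build a DAS 1.5-style URL: <base>/<source>/<command>?key=value;key=value"""
    url = f"{BASE_URL}/{SOURCE_NAME}/{command}"
    if params:
        # The DAS 1.5 specification separates arguments with semicolons.
        url += "?" + ";".join(
            f"{quote(key)}={quote(str(value))}" for key, value in params.items()
        )
    return url


def typical_queries(part_id):
    """Return one request URL per query listed in the text, for a given part."""
    return {
        "all parts": das_url("entry_points"),
        "annotation types": das_url("types"),
        "DNA of a part": das_url("dna", segment=part_id),
        "annotations of a part": das_url("features", segment=part_id),
        # Assumption: assembly information is served as features whose type
        # is named "subpart" or "superpart"; the page does not specify this.
        "subparts of a part": das_url("features", segment=part_id, type="subpart"),
        "superparts of a part": das_url("features", segment=part_id, type="superpart"),
    }
```

Calling typical_queries("BBa_B0034"), with BBa_B0034 used purely as an example part name, returns a dictionary of six URLs that can be pasted into a browser or fetched with any HTTP client once a real server address is substituted.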

Software architecture

Implementation Steps

This section summarises the steps undertaken during this project.

Run Dazzle on the Google App Engine (GAE)

  • Dazzle is a Java application that usually runs on a Tomcat server. However, GAE supports Java applications, and no tweaking is necessary to run Dazzle on GAE.
  • Instructions@BioJava

Implement a BioSQL subset on top of the Google datastore

  • BioSQL is a popular relational database model to store DNA sequences and annotations.
  • The Biopython, BioJava, and BioPerl projects provide easy connectivity to the schema.
  • The Google datastore is not a relational database, so the BioSQL schema has to be reformatted into a more object-oriented data model.
  • Only a subset of BioSQL was considered for this project. The implemented BioSQL tables are listed below:
    • Ontology and Term
    • Biodatabase, Bioentry, Biosequence, Bioentry_Qualifier_Value, Seqfeature, Location
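
To illustrate what "a more object-oriented data model" might look like, here is a minimal sketch that folds the listed BioSQL tables into nested Python objects. It uses plain dataclasses rather than any datastore API, and it is not the project's actual model: the class layout, the choice to store Term names as strings, and the helper method are all assumptions. Field names such as start_pos, end_pos, strand, and alphabet follow the BioSQL column names.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Location:
    """One row of the BioSQL Location table, owned by a Seqfeature."""
    start_pos: int
    end_pos: int
    strand: int = 1


@dataclass
class Seqfeature:
    """A Seqfeature with its Locations embedded instead of joined."""
    type_term: str  # assumption: the Term name is stored directly as a string
    display_name: str
    locations: List[Location] = field(default_factory=list)


@dataclass
class QualifierValue:
    """One row of Bioentry_Qualifier_Value, e.g. term='author'."""
    term: str
    value: str


@dataclass
class Bioentry:
    """A part, with its Biosequence, qualifiers, and features folded in."""
    biodatabase: str  # name of the owning Biodatabase
    name: str
    description: str = ""
    seq: str = ""  # Biosequence.seq
    alphabet: str = "dna"  # Biosequence.alphabet
    qualifiers: List[QualifierValue] = field(default_factory=list)
    features: List[Seqfeature] = field(default_factory=list)

    def features_of_type(self, type_term):
        """Hypothetical helper: replaces a SQL join on Seqfeature and Term."""
        return [f for f in self.features if f.type_term == type_term]
```

The design choice sketched here is denormalisation: data that BioSQL spreads over several joined tables is stored inside the owning entry, so a single fetch returns everything needed to answer a query about one part.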

Implement a Dazzle plugin to support BioSQL/datastore queries

  • Instructions on how to write a new Dazzle plugin are available on the BioJava wiki (http://www.biojava.org/wiki/Dazzle:writeplugin).
  • The new plugin implements the following methods:
    • ...
    • ...

Process and Upload data from MIT Part Registry to Google App Engine (GAE)

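  • The MIT Part Registry implements a limited API to access its data:
    • a limited FASTA description of parts (a part dump in FASTA format)
    • a limited DAS description of parts (with no assembly information, for example)
  • A Biopython script processed the FASTA dump file to generate GAE upload files. The following BioBrick information was processed: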
    • BioBrick Sequence
    • BioBrick Author
    • BioBrick Category
    • BioBrick DNA Status
    • BioBrick Short Description
    • BioBrick Assembly information (subparts and superparts, derived from BLAST queries run within the Biopython script)
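
The processing step can be sketched as follows. This is not the original script: it uses only the Python standard library in place of Biopython, it replaces the BLAST queries with a naive exact-substring test, and it assumes CSV as the upload format. The header layout (part name first, free text after) and the column names are also assumptions.

```python
import csv
import io


def parse_fasta(text):
    """Yield (part_name, description, sequence) tuples from FASTA text."""
    name, description, chunks = None, "", []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, description, "".join(chunks)
            # Assumption: the first token is the part name, the rest is free text.
            header = line[1:].split(None, 1) or [""]
            name = header[0]
            description = header[1] if len(header) > 1 else ""
            chunks = []
        else:
            chunks.append(line.lower())
    if name is not None:
        yield name, description, "".join(chunks)


def find_subparts(parts):
    """Map each part name to the parts whose sequence occurs inside it.

    Stand-in for the BLAST step: exact substring matching only, and only
    strictly shorter sequences count as subparts.
    """
    return {
        name: sorted(
            other
            for other, other_seq in parts.items()
            if other_seq and len(other_seq) < len(seq) and other_seq in seq
        )
        for name, seq in parts.items()
    }


def to_upload_csv(records):
    """Render a list of parsed records as CSV text, one row per part."""
    subparts = find_subparts({name: seq for name, _, seq in records})
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["name", "description", "sequence", "subparts"])
    for name, description, seq in records:
        writer.writerow([name, description, seq, " ".join(subparts[name])])
    return buffer.getvalue()
```

A typical call would be to_upload_csv(list(parse_fasta(dump_text))), where dump_text holds the contents of the FASTA dump. Superparts follow by inverting the subparts mapping. Exact matching misses near-identical sequences that BLAST would report, so this only approximates the assembly information described above.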

Project resources