|
|
| |- | | |- |
| ! Jul 22 | | ! Jul 22 |
| | Finish the modification of Proportal DB Schema. Re-create all tables in a new database with all the missing foreign keys and their cascade properties. || The new database schema has been created by adding all missing foreign keys in the current version. A number of errors are to be fixed next week to synchronize the datasets stored in the development and production databases. || Add your comments | | | Finish the modification of Proportal DB Schema. Re-create all tables in a new database with all the missing foreign keys and their cascade properties. || The new [[Proportal DB Schema]] has been created by adding all missing foreign keys in the current version. A number of errors are to be fixed next week to synchronize the datasets stored in the development and production databases. || Add your comments |
| |- | | |- |
| ! Jul 29 | | ! Jul 29 |
| | Finish the update of datasets in Proportal public website. Migrate/merge all datasets stored in the development and production databases. Verify the integrity and consistency between the development and production databases and websites. || Ongoing || Add your comments | | | Finish the update of datasets in Proportal public website. Migrate/merge all datasets stored in the development and production databases. Verify the integrity and consistency between the development and production databases and websites. || Refer to the [[Release Notes]] for a number of issues resolved. || Add your comments |
| |- | | |- |
| ! Aug 5 | | ! Aug 5 |
| | Finish the modification of cluster analysis using static/manual processing. || Ongoing || Add your comments | | | Finish the modification of cluster analysis using static/manual processing. || Refer to the [[Release Notes]] for a number of issues resolved. || Add your comments |
| |- | | |- |
| ! Aug 15 | | ! Aug 15 |
| | Due date for submitting the database paper || TBD || Add your comments | | | Due date for submitting the database paper || Refer to the [[To-do List]] for a number of issues that should be addressed before the submission. || Add your comments |
| |} | | |} |
|
| |
|
| =[[Release Notes]]= | | =Proportal DB Schema= |
| | The new [[Proportal DB Schema]] was created by adding all the missing foreign keys back into the database. This process identified some orphan records, which will be fixed or removed in the next release of Proportal. |
| | |
| | =Release Notes= |
| | Some issues in the current version of Proportal have been resolved. Refer to the [[Release Notes]] for more details. |
|
| |
|
| =[[To-do List]]= | | =[[To-do List]]= |
|
| |
| =[[Proportal DB Schema]]=
| |
| [[Image:Ocean-DBschema-v2.gif|alt text]]
| |
| ==User Module==
| |
| ==Project Module==
| |
| ===Table: data_project===
| |
| A list of projects
| |
|
| |
| * 72 projects, as of 07-21-2011 (To be updated: 58 in PRO DB while 72 in DEV DB)
| |
| * Last updated: 2010-12-10
| |
| * No foreign key
| |
| Notes
| |
|
| |
| The following distinct "type" values can be moved into a separate table for a clearer definition:
| |
| * cpm, Cyanophage genomes part 1 (<B>To be updated: 18 records in PRO DB, 28 in DEV DB</B>)
| |
| * cpp, Cyanophage genomes part 2 (<B>To be updated: 8 records in PRO DB, 11 in DEV DB</B>)
| |
| * cps, Cyanophage genomes part 3 (2 records in both DBs)
| |
| * ma, physiology experiments (4 records in both DBs): Light Sensing, Nitrogen Availability, Phage Infection, and Phosphate Starvation
| |
| * mt, expression experiment (1 record in both DBs): Microbial community gene expression in ocean surface waters
| |
| * p, Prochlorococcus genomes (13 genomes in both DBs)
| |
| * pb, Prochlorococcus Publications (1 record in both DBs)
| |
| * s, Synechococcus genomes (11 genomes in both DBs)
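The suggestion above, moving the "type" codes into their own table, could look roughly like the following sketch. It uses an in-memory SQLite database only to stay self-contained; the actual database engine is not stated here. The table name data_project_type and the columns code, description and name are hypothetical.

```python
import sqlite3

# Type codes and descriptions as listed above.
PROJECT_TYPES = {
    "cpm": "Cyanophage genomes part 1",
    "cpp": "Cyanophage genomes part 2",
    "cps": "Cyanophage genomes part 3",
    "ma": "Physiology experiments",
    "mt": "Expression experiment",
    "p": "Prochlorococcus genomes",
    "pb": "Prochlorococcus publications",
    "s": "Synechococcus genomes",
}


def create_type_schema(conn):
    """Create a hypothetical lookup table and make data_project.type reference it."""
    # SQLite only enforces foreign keys when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute(
        "CREATE TABLE data_project_type ("
        " code TEXT PRIMARY KEY, description TEXT NOT NULL)"
    )
    conn.execute(
        "CREATE TABLE data_project ("
        " id INTEGER PRIMARY KEY, name TEXT,"
        " type TEXT NOT NULL REFERENCES data_project_type(code))"
    )
    conn.executemany(
        "INSERT INTO data_project_type (code, description) VALUES (?, ?)",
        list(PROJECT_TYPES.items()),
    )
```

With such a table in place, a project row with an unknown type code is rejected by the database instead of being stored silently.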
| |
|
| |
| The link for "tax_id" is defined in data_url table.
| |
| * type_id = 59919
| |
| * source = tax
| |
| * url = http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?id=59919
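As an illustration of how the stored url relates to the tax_id, the link in the example record can be rebuilt from the id alone. This is only a sketch; the function name taxonomy_url is hypothetical and the base URL is copied from the record above.

```python
from urllib.parse import urlencode

# Base URL taken from the example data_url record above.
NCBI_TAXONOMY_BROWSER = "http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi"


def taxonomy_url(tax_id):
    """Build the NCBI Taxonomy Browser link for a numeric tax_id."""
    query = urlencode({"id": int(tax_id)})
    return f"{NCBI_TAXONOMY_BROWSER}?{query}"
```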
| |
|
| |
| ===Table: data_projectpub===
| |
| A list of publications from various projects.
| |
|
| |
| * 32 publications as of 07-21-2011
| |
| * Foreign keys:
| |
| ** project_id = data_project.id
| |
| ** pubmed_id = data_publication.id
| |
|
| |
| Notes
| |
|
| |
| This table is used for mapping projects with related publications, which is displayed on the strain page in the "Genomes" section, for instance: [http://proportal.mit.edu/genome/id=1/ MED4].
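The two foreign keys of this mapping table, together with a cascade property, can be sketched as follows. SQLite is used only to keep the example self-contained; the title and name columns and the choice of ON DELETE CASCADE are assumptions, not the confirmed Proportal definition.

```python
import sqlite3


def create_projectpub_schema(conn):
    """Create simplified project, publication and mapping tables."""
    # SQLite only enforces foreign keys when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(
        """
        CREATE TABLE data_project (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE data_publication (id INTEGER PRIMARY KEY, title TEXT);
        CREATE TABLE data_projectpub (
            id INTEGER PRIMARY KEY,
            project_id INTEGER NOT NULL
                REFERENCES data_project(id) ON DELETE CASCADE,
            pubmed_id INTEGER NOT NULL
                REFERENCES data_publication(id) ON DELETE CASCADE
        );
        """
    )


def publications_for_project(conn, project_id):
    """Return the titles of all publications mapped to one project."""
    rows = conn.execute(
        "SELECT pub.title FROM data_projectpub AS pp"
        " JOIN data_publication AS pub ON pub.id = pp.pubmed_id"
        " WHERE pp.project_id = ? ORDER BY pub.id",
        (project_id,),
    )
    return [title for (title,) in rows]
```

With the cascade in place, deleting a project also removes its mapping rows, so the mapping table cannot be left with orphan records.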
| |
|
| |
| ===Table: data_genepub===
| |
| This table is empty. Consider using the data_publication table instead?
| |
|
| |
| * Foreign key: data_project
| |
|
| |
| ===Table: data_publication===
| |
| A list of publications related to Prochlorococcus, Cyanophage, and Synechococcus.
| |
|
| |
| * 2528 publications listed as of 08-1-2011 (both DEV and PRO DBs are updated)
| |
| * Not referenced by any other table
| |
| * pubmed_id can be used as a foreign key.
| |
| * "year": last updated 2010
| |
|
| |
| ===Table: data_url_map===
| |
| This table is empty.
| |
| ===Table: data_url===
| |
| The list of data links or data folders.
| |
| ==Meta Data Module==
| |
| ===Table: data_bats_ts===
| |
| Information about field investigation.
| |
|
| |
| * No foreign key
| |
|
| |
| ===Table: data_meta_data===
| |
| Information about field investigation for each project
| |
|
| |
| * 66 meta data sets, as of 07-22-2011
| |
| * Foreign key: data_project.id, to be fixed.
| |
| <B>Error</B>
| |
| * project_id 26 is used in this table but is missing from the data_project table.
| |
| * Six projects defined in data_project table (all in year 2008) do not have meta data defined in this table.
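Both errors above can be detected with two simple queries. The sketch below assumes only the columns data_project.id and data_meta_data.project_id, and uses Python's sqlite3 module for illustration; the function names are hypothetical.

```python
import sqlite3


def orphan_project_ids(conn):
    """project_id values in data_meta_data that have no row in data_project."""
    rows = conn.execute(
        "SELECT DISTINCT m.project_id FROM data_meta_data AS m"
        " LEFT JOIN data_project AS p ON p.id = m.project_id"
        " WHERE p.id IS NULL ORDER BY m.project_id"
    )
    return [row[0] for row in rows]


def projects_without_meta_data(conn):
    """ids of projects that have no meta data record at all."""
    rows = conn.execute(
        "SELECT p.id FROM data_project AS p"
        " WHERE NOT EXISTS ("
        "  SELECT 1 FROM data_meta_data AS m WHERE m.project_id = p.id)"
        " ORDER BY p.id"
    )
    return [row[0] for row in rows]
```

The first query reports values that would block the foreign key from being added; the second reports gaps that the foreign key itself would not catch.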
| |
|
| |
| ==Genome Data Module==
| |
| ===Table: data_scaffold===
| |
| A list of strains/genomes used in various projects.
| |
|
| |
| * Last updated: 12-10-2010
| |
| * 213 strains, as of 07-22-2011
| |
| * Foreign key: data_project.id
| |
| Questions
| |
| * "refseg_id" not defined
| |
| * "seq" field can be removed because its content is further defined in data_dna and data_protein tables.
| |
|
| |
| ===Table: data_position===
| |
| List of start and end positions of gene/DNA for each strain defined in Table data_scaffold.
| |
|
| |
| * 67516 pairs of positions, as of 07-22-2011
| |
| * 9 types of sequences are defined: 16s, 23s, 5s, as, m, n, orf, ps, t
| |
| * Foreign key: data_scaffold.id.
| |
|
| |
| ===Table: data_dna===
| |
| A list of DNA sequences corresponding to the sequence position information defined in the data_position table.
| |
|
| |
| * 67516 pieces of DNA sequences stored, as of 07-22-2011
| |
| * Three foreign keys: data_position.id, data_scaffold.id and data_protein.id
| |
| Error
| |
| * Foreign key pos_id has errors:
| |
| ** Two position ids in the data_position table, 37163 and 46814, are missing from this table
| |
| ** Two pos_id values, 36978 and 37113, do not exist in the data_position table.
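The two kinds of mismatch above are the two directions of a set difference between data_position.id and data_dna.pos_id. A minimal sketch, assuming both id lists have already been read from the database (the function name and result keys are hypothetical):

```python
def compare_position_ids(position_ids, dna_pos_ids):
    """Report ids present on only one side of the data_position/data_dna link."""
    position_ids = set(position_ids)
    dna_pos_ids = set(dna_pos_ids)
    return {
        # Positions with no DNA record pointing at them.
        "missing_in_data_dna": sorted(position_ids - dna_pos_ids),
        # DNA records pointing at a position that does not exist.
        "dangling_in_data_dna": sorted(dna_pos_ids - position_ids),
    }
```

Only the second list blocks the foreign key; the first list is a completeness problem rather than a constraint violation.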
| |
|
| |
| ===Table: data_protein===
| |
| A list of protein sequences.
| |
|
| |
| * 65909 proteins defined, as of 07-22-2011 (1607 DNA sequences are not present in this table)
| |
| * Two foreign keys: data_scaffold.id and data_protein.id
| |
| Notes
| |
| * "cluster_id" should be removed from this table
| |
|
| |
| ===Table: data_ortholog===
| |
| Protein orthologs.
| |
|
| |
| * 830944 orthology pairs defined, as of 07-22-2011
| |
| * Foreign keys: protein_id and ortholog_id
| |
|
| |
| ===Table: data_protein_xref===
| |
| Definition: ?
| |
| * 36774 records stored, as of 07-22-2011
| |
| * Foreign key: data_protein.id, to be fixed
| |
| Error
| |
| * Two records reference protein_id values that are missing from the data_protein table: 36950 and 45482
| |
|
| |
| ==Affychip Expression Module==
| |
| ===Table: data_affychip===
| |
| Information about each affychip used.
| |
|
| |
| * 1 chip defined, as of 07-22-2011 (same for DEV and PRO DBs)
| |
| * No foreign key
| |
|
| |
| ===Table: data_affyexp===
| |
| A list of affychip experiments.
| |
|
| |
| * 20 affychip experiments, as of 07-22-2011 (same for DEV and PRO DBs)
| |
| * Foreign key: project_id, only three projects involved affychip experiments.
| |
|
| |
| ===Table: data_affyprobeset===
| |
| A list of probe sets for various affychip experiments.
| |
|
| |
| * 9966 records, as of 07-22-2011
| |
| * Three foreign keys:
| |
| ** chip_id
| |
| ** scaffold_id: has missing keys
| |
| ** feature_id: not defined
| |
| Notes
| |
| * feature_id is not defined as a key, but it is actually in one-to-one correspondence with gene_id in the data_diel table, which is further linked to the data_protein table using protein_id.
| |
| * Use "begin" and "end" to match DNA\gene\protein?
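The one-to-one correspondence between feature_id and gene_id noted above can be verified before relying on it as a key. The sketch below assumes the (feature_id, gene_id) pairs have already been selected from the two tables; the function name is hypothetical.

```python
def is_one_to_one(pairs):
    """Return True if every left value maps to exactly one right value and vice versa."""
    forward, backward = {}, {}
    for left, right in pairs:
        # setdefault stores the first partner seen and returns it on later calls.
        if forward.setdefault(left, right) != right:
            return False
        if backward.setdefault(right, left) != left:
            return False
    return True
```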
| |
|
| |
| ===Table: data_affyprobe===
| |
| A list of probes for various affychip experiments.
| |
| * 89749 records, as of 07-22-2011
| |
| * Foreign key: probeset_id
| |
|
| |
| ===Table: data_affydata===
| |
| The expression results of Affychip experiments.
| |
|
| |
| * 110848 records, as of 07-22-2011
| |
| * Foreign keys:
| |
| ** exp_id
| |
| ** probeset_id
| |
| Notes
| |
| * No DNA\gene\protein info, use probeset_id?
| |
|
| |
| ===Table: data_diel===
| |
| The mapping between gene id and protein id, which is used in Affychip expression experiments.
| |
|
| |
| * 1695 records, as of 07-22-2011
| |
| * Foreign keys:
| |
| ** probeset_id
| |
| ** protein_id
| |
| ** gene_id: not defined
| |
| Notes
| |
| * gene_id is not defined as a key, but it is actually in one-to-one correspondence with feature_id in data_affyprobeset
| |
|
| |
| ===Table: data_dieltimepoint===
| |
| Time courses of Affychip experiments.
| |
|
| |
| * 42375 records, as of 07-22-2011
| |
| * Foreign key: diel_id
| |
|
| |
| ==Cog Module==
| |
| ===Table: data_cog_fun===
| |
| A list of Cog gene functions.
| |
|
| |
| * 24 function categories, as of 07-22-2011
| |
| * No foreign key
| |
|
| |
| ===Table: data_cog===
| |
| A list of Cog genome annotations
| |
|
| |
| * 4874 records, as of 07-22-2011
| |
| * Foreign key: data_cog_fun.funcode ?
| |
| Notes
| |
| * data_cog_fun.funcode can't be regarded as a foreign key because some of the funcodes in this table are missing from the data_cog_fun table.
| |
|
| |
| ===Table: data_protein_cog===
| |
| The mapping between Cog genome and proteins.
| |
|
| |
| * 18498 records, as of 07-22-2011
| |
| * Foreign keys: data_protein.id and data_cog.id
| |
|
| |
| ==Microarray Module==
| |
| ===Table: data_gos_site===
| |
| A list of Gos field experiments, such as sites of experiments etc.
| |
|
| |
| * 78 records, as of 07-22-2011
| |
| * No foreign key
| |
|
| |
| ===Table: data_gos_read===
| |
| A list of field reads for various Gos experiments.
| |
|
| |
| * 9893120 records, as of 07-22-2011
| |
| * Foreign key: data_gos_site.id, no error
| |
|
| |
| ===Table: data_gos_to_protein===
| |
| The mapping between Gos genomes and proteins.
| |
|
| |
| * 926072 records, as of 07-22-2011
| |
| * Foreign keys:
| |
| ** data_protein.id, has error, to be fixed
| |
| ** data_gos_read.id, has error, to be fixed
| |
| Error
| |
|
| |
| * The foreign key: read_id=0 is not defined in data_gos_read table for id=1 and id=705172 in this table
| |
| * The foreign key: protein_id=0 is not defined in data_protein table for id=1 and id=705172 in this table
| |
|
| |
| ===Table: data_gos_blastn===
| |
| A list of sequences from Gos experiments.
| |
|
| |
| * 8666847 records, as of 07-22-2011
| |
| * Foreign keys:
| |
| ** data_scaffold.id, has error, to be fixed
| |
| ** data_gos_read.id, has error, to be fixed
| |
| Error
| |
|
| |
| * The foreign key: scaffold_id=0 is not defined in data_scaffold table for 211 records in this table
| |
| * The foreign key: read_id=0 is not defined in data_gos_read table for 56438 records in this table
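The errors above all involve the placeholder value 0. One possible fix, sketched here under the assumption that 0 means "no reference" (this is not confirmed), is to count the affected rows and replace 0 with NULL so that the foreign key can be enforced. The function names are hypothetical and SQLite is used only for illustration.

```python
import sqlite3


def _check_identifier(name):
    # Table and column names cannot be bound as parameters, so validate them.
    if not name.isidentifier():
        raise ValueError(f"unsafe identifier: {name!r}")
    return name


def count_zero_references(conn, table, column):
    """Count rows whose reference column holds the placeholder value 0."""
    table, column = _check_identifier(table), _check_identifier(column)
    row = conn.execute(
        f"SELECT COUNT(*) FROM {table} WHERE {column} = 0"
    ).fetchone()
    return row[0]


def null_out_zero_references(conn, table, column):
    """Replace placeholder 0 with NULL; returns the number of rows changed."""
    table, column = _check_identifier(table), _check_identifier(column)
    cursor = conn.execute(
        f"UPDATE {table} SET {column} = NULL WHERE {column} = 0"
    )
    return cursor.rowcount
```

For example, count_zero_references(conn, "data_gos_blastn", "read_id") would report how many rows still need attention. Whether NULL is acceptable depends on whether the column is meant to be mandatory.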
| |
|
| |
| ==Cluster Module==
| |
| ===Table: data_protein_cluster===
| |
| A list of protein clusters.
| |
|
| |
| * 5597 records, as of 07-22-2011
| |
| * No foreign key
| |
| Notes
| |
| * Two distinct "type": phCOG and CyCog
| |
| * "gene_name" not in use
| |
|
| |
| ===Table: data_protein_cluster_synonym===
| |
| The table is empty.
| |
| ===Table: data_protein_cluster_xref===
| |
|
| |
| * 1100 records, as of 07-22-2011
| |
| * Foreign key: data_protein_cluster.id, has error, to be fixed
| |
| Notes
| |
| * Only one "type": c
| |
| * "xref": COG reference id, which may correspond to multiple cluster ids
| |
| Error
| |
| * The foreign key: some cluster_ids are not defined in data_protein_cluster table for about 880 records.
| |
|
| |
| ===Table: data_protein_cluster_cog===
| |
| This table is empty.
| |
| ===Table: data_clusterlink===
| |
| A list of pairs of clusters.
| |
|
| |
| * 71 records, as of 07-22-2011
| |
| * Foreign key: data_protein_cluster.id, has error, to be fixed
| |
| Notes
| |
| * "evidence" is not in use
| |
| Error
| |
| * The foreign key: cluster_id=0 is not defined in data_protein_cluster table.
| |