Revision as of 03:03, 23 October 2007

Home About Resources Research & Projects Softwares Publications ImpLinks Contact

DATABASES

Nucleic Acids Research Volume 35 | Database issue | 2007

PharmGED: Pharmacogenetic Effect Database

Synopsis

Prediction and elucidation of pharmacogenetic effects is important for facilitating the development of personalized medicines. Knowledge of polymorphism-induced and other types of drug-response variations is needed for facilitating such studies. Although databases of pharmacogenetic knowledge, polymorphism and toxicogenomic information have appeared, some of the relevant data are provided in separate web-pages and in terms of relatively long descriptions quoted from the literature. To facilitate easy and quick assessment of the relevant information, it is helpful to develop databases that provide all of the information related to a pharmacogenetic effect in the same web-page and in brief descriptions. We developed a database, Pharmacogenetic Effect Database (PharmGED), for providing sequence, function, polymorphism, affected drugs and pharmacogenetic effects. PharmGED can be accessed at http://bidd.cz3.nus.edu.sg/phg/ free of charge for academic use. It currently contains 1825 entries covering 108 disease conditions, 266 distinct proteins, 693 polymorphisms and 414 drugs/ligands, cited from 856 references.

Nucleic Acids Research Volume 35 | Database issue | 2007

PhenomicDB: a new cross-species genotype/phenotype resource

Synopsis

Phenotypes are an important subject of biomedical research for which many repositories have already been created. Most of these databases are either dedicated to a single species or to a single disease of interest. With the advent of technologies to generate phenotypes in a high-throughput manner, not only is the volume of phenotype data growing fast but also the need to organize these data in more useful ways. We have created PhenomicDB (freely available at http://www.phenomicdb.de), a multi-species genotype/phenotype database, which shows phenotypes associated with their corresponding genes and grouped by gene orthologies across a variety of species. We have enhanced PhenomicDB recently by additionally incorporating quantitative and descriptive RNA interference (RNAi) screening data, by enabling the usage of phenotype ontology terms and by providing information on assays and cell lines. We envision that integration of classical phenotypes with high-throughput data will bring new momentum and insights to our understanding. Modern analysis tools under development may help exploiting this wealth of information to transform it into knowledge and, eventually, into novel therapeutic approaches.

Nucleic Acids Research Volume 35 | Database issue | 2007

PROTCOM: searchable database of protein complexes enhanced with domain–domain structures

Synopsis

The database of protein complexes (PROTCOM) is a compilation of known 3D structures of protein–protein complexes enriched with artificially created domain–domain structures using the available entries in the Protein Data Bank. The domain–domain structures are generated by parsing single chain structures into loosely connected domains and are important features of the database. The database (http://www.ces.clemson.edu/compbio/protcom) can be used to benchmark docking and other algorithms for predicting 3D structures of protein–protein complexes. It can also be used as a template database in homology or threading methods for modeling the 3D structures of unknown protein–protein complexes. PROTCOM provides the scientific community with an integrated set of tools for browsing, searching, visualizing and downloading a pool of protein complexes. The user is given the option to select a subset of entries using a combination of up to 10 different criteria. As of July 2006 the database contains 1770 entries, each of which consists of the known 3D structures and additional relevant information that can be displayed either in text-only or in visual mode.

Nucleic Acids Research Volume 35 | Database issue | 2007

CellCircuits: a database of protein network models

Synopsis

CellCircuits (http://www.cellcircuits.org) is an open-access database of molecular network models, designed to bridge the gap between databases of individual pairwise molecular interactions and databases of validated pathways. CellCircuits captures the output from an increasing number of approaches that screen molecular interaction networks to identify functional subnetworks, based on their correspondence with expression or phenotypic data, their internal structure or their conservation across species. This initial release catalogs 2019 computationally derived models drawn from 11 journal articles and spanning five organisms (yeast, worm, fly, Plasmodium falciparum and human). Models are available either as images or in machine-readable formats and can be queried by the names of proteins they contain or by their enriched biological functions. We envision CellCircuits as a clearinghouse in which theorists may distribute or revise models in need of validation and experimentalists may search for models or specific hypotheses relevant to their interests. We demonstrate how such a repository of network models is a novel systems biology resource by performing several meta-analyses not currently possible with existing databases.

Nucleic Acids Research Volume 35 | Database issue | 2007

HMDB: the Human Metabolome Database

Synopsis

The Human Metabolome Database (HMDB) is currently the most complete and comprehensive curated collection of human metabolite and human metabolism data in the world. It contains records for more than 2180 endogenous metabolites with information gathered from thousands of books, journal articles and electronic databases. In addition to its comprehensive literature-derived data, the HMDB also contains an extensive collection of experimental metabolite concentration data compiled from hundreds of mass spectra (MS) and nuclear magnetic resonance (NMR) metabolomic analyses performed on urine, blood and cerebrospinal fluid samples. This is further supplemented with thousands of NMR and MS spectra collected on purified, reference metabolites. Each metabolite entry in the HMDB contains an average of 90 separate data fields including a comprehensive compound description, names and synonyms, structural information, physico-chemical data, reference NMR and MS spectra, biofluid concentrations, disease associations, pathway information, enzyme data, gene sequence data, SNP and mutation data as well as extensive links to images, references and other public databases. Extensive searching, relational querying and data browsing tools are also provided. The HMDB is designed to address the broad needs of biochemists, clinical chemists, physicians, medical geneticists, nutritionists and members of the metabolomics community. The HMDB is available at: www.hmdb.ca

Nucleic Acids Research Volume 35 | Database issue | 2007

LMSD: LIPID MAPS structure database

Synopsis

The LIPID MAPS Structure Database (LMSD) is a relational database encompassing structures and annotations of biologically relevant lipids. Structures of lipids in the database come from four sources: (i) LIPID MAPS Consortium's core laboratories and partners; (ii) lipids identified by LIPID MAPS experiments; (iii) computationally generated structures for appropriate lipid classes; (iv) biologically relevant lipids manually curated from LIPID BANK, LIPIDAT and other public sources. All the lipid structures in LMSD are drawn in a consistent fashion. In addition to a classification-based retrieval of lipids, users can search LMSD using either text-based or structure-based search options. The text-based search implementation supports data retrieval by any combination of these data fields: LIPID MAPS ID, systematic or common name, mass, formula, category, main class, and subclass data fields. The structure-based search, in conjunction with optional data fields, provides the capability to perform a substructure search or exact match for the structure drawn by the user. Search results, in addition to structure and annotations, also include relevant links to external databases. The LMSD is publicly available at www.lipidmaps.org/data/structure/

Nucleic Acids Research Volume 35 | Database issue | 2007

BindingDB: a web-accessible database of experimentally determined protein–ligand binding affinities

Synopsis

BindingDB (http://www.bindingdb.org) is a publicly accessible database currently containing ~20 000 experimentally determined binding affinities of protein–ligand complexes, for 110 protein targets including isoforms and mutational variants, and ~11 000 small molecule ligands. The data are extracted from the scientific literature, data collection focusing on proteins that are drug-targets or candidate drug-targets and for which structural data are present in the Protein Data Bank. The BindingDB website supports a range of query types, including searches by chemical structure, substructure and similarity; protein sequence; ligand and protein names; affinity ranges and molecular weight. Data sets generated by BindingDB queries can be downloaded in the form of annotated SDfiles for further analysis, or used as the basis for virtual screening of a compound database uploaded by the user. The data in BindingDB are linked both to structural data in the PDB via PDB IDs and chemical and sequence searches, and to the literature in PubMed via PubMed IDs.
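As a rough illustration of the "further analysis" step, the sketch below reads the data items of an SD file using only the general SDfile layout (records end with `$$$$`, data items start with a `> <TAG>` header and end with a blank line). The record content and the tag names `Ki_nM` and `Target` are hypothetical placeholders, not BindingDB's actual field names.

```python
def read_sdf_data_fields(text):
    """Return one {tag: value} dict per SD file record (data items only)."""
    records = []
    for block in text.split("$$$$"):  # "$$$$" terminates each record
        if not block.strip():
            continue
        fields = {}
        lines = block.splitlines()
        i = 0
        while i < len(lines):
            line = lines[i]
            if line.startswith(">") and "<" in line:
                # Data header looks like "> <TAG>"; pull out the tag name.
                start = line.index("<") + 1
                end = line.index(">", start)
                tag = line[start:end]
                i += 1
                values = []
                # Value lines run until the first blank line.
                while i < len(lines) and lines[i].strip():
                    values.append(lines[i].strip())
                    i += 1
                fields[tag] = "\n".join(values)
            else:
                i += 1
        records.append(fields)
    return records


# Hypothetical single-record SD file with an empty molblock.
sample_sdf = "\n".join([
    "ligand-1",
    "  hypothetical",
    "",
    "  0  0  0  0  0  0  0  0  0  0999 V2000",
    "M  END",
    "> <Ki_nM>",
    "12.5",
    "",
    "> <Target>",
    "HIV-1 protease",
    "",
    "$$$$",
    "",
])

records = read_sdf_data_fields(sample_sdf)
```

The parser ignores the structure block entirely; a cheminformatics toolkit would be the usual choice when the molecular structure itself is needed.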

Nucleic Acids Research Volume 35 | Database issue | 2007

Phospho3D: a database of three-dimensional structures of protein phosphorylation sites

Synopsis

Phosphorylation is the most common protein post-translational modification. Phosphorylated residues (serine, threonine and tyrosine) play critical roles in the regulation of many cellular processes. Since the amount of data produced by screening assays is growing continuously, the development of computational tools for collecting and analysing experimental data has become a pivotal task for unravelling the complex network of interactions regulating eukaryotic cell life. Here we present Phospho3D, http://cbm.bio.uniroma2.it/phospho3d, a database of 3D structures of phosphorylation sites, which stores information retrieved from the phospho.ELM database and is enriched with structural information and annotations at the residue level. The database also collects the results of a large-scale structural comparison procedure providing clues for the identification of new putative phosphorylation sites.

http://www.3dcomplex.org/

PLoS Computational Biology Volume 2 | Issue 10 | OCTOBER 2006

3D Complex: a Structural Classification of Protein Complexes

Synopsis

Most of the proteins in a cell assemble into complexes to carry out their function. It is therefore crucial to understand the physico-chemical properties as well as the evolution of interactions between proteins. The Protein Data Bank represents an important source of information for such studies, because more than half of the structures are homo- or heteromeric protein complexes. Here Emmanuel D. Levy et al. propose the first hierarchical classification of whole protein complexes of known three-dimensional structure, based on representing their fundamental structural features as a graph. This classification provides the first overview of all the complexes in the Protein Data Bank and allows non-redundant sets to be derived at different levels of detail. This reveals that between one half and two thirds of known structures are multimeric, depending on the level of redundancy accepted. They also analyse the structures in terms of the topological arrangement of their subunits, and find that they form a small number of arrangements compared to all theoretically possible ones. This is because most complexes contain four subunits or fewer, and the large majority are homomeric. In addition, there is a strong tendency for symmetry in complexes, even for heteromeric complexes. Finally, through comparison of Biological Units in the Protein Data Bank with the Protein Quaternary Structure database, they identified many possible errors in quaternary structure assignments. Their classification, available as a database and web server at http://www.3Dcomplex.org, will be a starting point for future work aimed at understanding the structure and evolution of protein complexes.

Welcome to web page of Abhishek Tiwari

Welcome to my OpenWetWare page.

Conferences and Events In News

14 January 2008 Poster submission due

You can get more information about upcoming conferences and events related to computational biology

ISCB Event Board http://www.iscb.org/events/event_board.php

Top Hits

Applications of Workflow technology in Cheminformatics, Bioinformatics and Drug Discovery

Workflow technology is a mechanism to integrate data, applications and services. It enables scientists to dynamically construct their own research protocols for scientific analytics and decision making by connecting various resources and software applications together in an innovative way. Workflow technology is being increasingly applied in discovery informatics to organize and analyze data. SciTegic's Pipeline Pilot is one chemically intelligent implementation of a workflow technology known as data pipelining. It allows scientists to construct and execute workflows using components that encapsulate many cheminformatics-based algorithms. Workflow technology is generic, so analytics workflows can be built for any area, such as gene expression analysis, sequence analysis, proteomics, systems biology and so on. Workflow technology provides an interface where software from different vendors can be assembled according to scientific requirements. Read More
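The idea of data pipelining can be sketched in a few lines of plain Python: each component takes a stream of records and returns a new stream, and a workflow is just the components chained in order. This is only an illustrative sketch, not Pipeline Pilot's API; the component names (`parse_records`, `filter_by_weight`, `names_only`) and the input format are assumptions made for the example.

```python
from functools import reduce


def pipeline(*components):
    """Chain components so each one consumes the previous one's output."""
    def run(records):
        return reduce(lambda stream, component: component(stream), components, records)
    return run


def parse_records(lines):
    # Hypothetical input format: "name, molecular weight"
    for line in lines:
        name, weight = line.split(",")
        yield {"name": name.strip(), "mol_weight": float(weight)}


def filter_by_weight(max_weight):
    # A configurable component: keeps records at or below max_weight.
    def component(records):
        return (r for r in records if r["mol_weight"] <= max_weight)
    return component


def names_only(records):
    return [r["name"] for r in records]


workflow = pipeline(parse_records, filter_by_weight(300.0), names_only)
result = workflow(["aspirin, 180.16", "paclitaxel, 853.9", "caffeine, 194.19"])
# result == ["aspirin", "caffeine"]
```

Because every component shares the same stream-in, stream-out shape, components can be swapped or reordered without changing the others, which is the property that lets tools from different sources be combined in one workflow.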

Chemical Informatics or Cheminformatics Toolkits

"Any idiot can stand up and say that virtual screening doesn't work. It takes real brains to show how to improve it!" - Mark McGann

Currently many toolkits (Daylight Toolkit, Chemaxon Toolkit, OpenEye Toolkit, MOE SVL, Accord, CDK, JOELib, etc.) are available from different vendors and organizations. Most of them are equally good, but the choice may vary with the user's perspective. I will try to give a summarized overview of some commonly used chemical informatics toolkits. For academic users, Chemaxon JChem (which is free for academic users) and open source toolkits like CDK and JOELib will be a better choice. If your budget permits, you can use Daylight, Accord, OpenEye, MOE SVL or any other toolkit depending on your needs, but Chemaxon and MOE are low-budget, high-quality options. Read More

Bioinformatics Toolkits

Compared to chemistry and chemoinformatics, where many commercial as well as open source toolkits are available, bioinformatics has very few options. Most toolkits specialize in sequence manipulation and database access, and they are not very diverse. Some major toolkits are Geneious API, BioPerl API, BioJava API, BioPython API, Bioconductor, Mathworks Bioinformatics Toolbox, MBT, NCBI Toolkit and many more. At this moment none of these toolkits is complete, and most of them have some specialized applications. Read More

Hot Computational Biology Papers

Hot Computational Biology Papers is one of the most highlighted sections of my OpenWetWare page. This section provides synopses of and comments on selected articles related to computational biology. Readers are also invited to comment on and modify the section.

Using SOAP/WSDL to access the Biological/Bioinformatics Databases

SOAP (Simple Object Access Protocol) (http://www.w3.org/TR/soap) based Web Services technology (http://www.w3.org/TR/wsdl) has gained much attention as an open standard enabling interoperability among applications across heterogeneous architectures and different networks. Today, biological databases are large collections of data that are relatively difficult to maintain outside the centres and institutions that produce them. These data are traditionally accessed using browser-based World Wide Web interfaces. When large amounts of data need to be retrieved and analysed, browser-based access often proves to be tedious and impractical. Read More
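To make the contrast with browser access concrete, here is a minimal sketch of what a program sends to a SOAP service: an XML envelope whose body names an operation and its parameters. It uses only the Python standard library and the SOAP 1.1 envelope namespace. The service namespace `http://example.org/biodb`, the operation `getEntry` and the parameter names are hypothetical; a real service defines its own in its WSDL file.

```python
import xml.etree.ElementTree as ET

SOAP_ENV_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace


def build_soap_envelope(operation, service_ns, params):
    """Return a SOAP 1.1 request envelope as an XML string."""
    envelope = ET.Element(f"{{{SOAP_ENV_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV_NS}}}Body")
    # The operation element lives in the service's own namespace.
    call = ET.SubElement(body, f"{{{service_ns}}}{operation}")
    for name, value in params.items():
        # Each parameter becomes a child element of the operation element.
        ET.SubElement(call, name).text = str(value)
    return ET.tostring(envelope, encoding="unicode")


# Hypothetical service namespace, operation and parameter names.
request_xml = build_soap_envelope(
    "getEntry", "http://example.org/biodb", {"db": "uniprot", "id": "P12345"}
)
```

The resulting string would be sent as the body of an HTTP POST to the service endpoint. In practice a SOAP client library that reads the WSDL usually generates such envelopes automatically, so a script can loop over thousands of identifiers without any manual browsing.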

Spotfire based Analytic Solutions for Life Sciences Informatics

The Spotfire Enterprise Analytics platform offers a radically faster business intelligence experience and is far more adaptable to specific industry and business challenges than traditional alternatives. Read More

Visual Programming Series

Learn what visual programming is and how to use different freely available tools for visual programming in Computational Biology and Chemistry. Read More

@@ Line 35: / Line 35: @@
 The database of protein complexes (PROTCOM) is a compilation of known 3D structures of protein–protein complexes enriched with artificially created domain–domain structures using the available entries in the Protein Data Bank. The domain–domain structures are generated by parsing single chain structures into loosely connected domains and are important features of the database. The database (http://www.ces.clemson.edu/compbio/protcom) can be used to benchmark docking and other algorithms for predicting 3D structures of protein–protein complexes. It can also be used as a template database in homology or threading methods for modeling the 3D structures of unknown protein–protein complexes. PROTCOM provides the scientific community with an integrated set of tools for browsing, searching, visualizing and downloading a pool of protein complexes. The user is given the option to select a subset of entries using a combination of up to 10 different criteria. As of July 2006 the database contains 1770 entries, each of which consists of the known 3D structures and additional relevant information that can be displayed either in text-only or in visual mode.
+<html>
+<a href="http://www.cellcircuits.org/"><img src="http://www.cellcircuits.org/CC-logo-large.jpg" alt="CellCircuits"  border="0"></a>
+</html>
 '''''Nucleic Acids Research''''' Volume 35 | Database issue | 2007
@@ Line 80: / Line 84: @@
 Most of the proteins in a cell assemble into complexes to carry out their function. It is therefore crucial to understand the physico-chemical properties as well as the evolution of interactions between proteins. The Protein Data Bank represents an important source of information for such studies, because more than half of the structures are homo- or heteromeric protein complexes. Here Emmanuel D. Levy et al. propose the first hierarchical classification of whole protein complexes of known three-dimensional structure, based on representing their fundamental structural features as a graph. This classification provides the first overview of all the complexes in the Protein Data Bank and allows non-redundant sets to be derived at different levels of detail. This reveals that between one half and two thirds of known structures are multimeric, depending on the level of redundancy accepted. They also analyse the structures in terms of the topological arrangement of their subunits, and find that they form a small number of arrangements compared to all theoretically possible ones. This is because most complexes contain four subunits or fewer, and the large majority are homomeric. In addition, there is a strong tendency for symmetry in complexes, even for heteromeric complexes. Finally, through comparison of Biological Units in the Protein Data Bank with the Protein Quaternary Structure database, they identified many possible errors in quaternary structure assignments. Their classification, available as a database and web server at http://www.3Dcomplex.org, will be a starting point for future work aimed at understanding the structure and evolution of protein complexes.
+[[Abhishek Tiwari:Projects | <font face="trebuchet ms" style="color:#ffffff"> '''Research & Projects''' </font>]]
+&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
+[[Abhishek Tiwari:Softwares | <font face="trebuchet ms" style="color:#ffffff"> '''Softwares''' </font>]] &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
+[[Abhishek Tiwari:Reprints | <font face="trebuchet ms" style="color:#ffffff"> '''Publications''' </font>]] &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
+[[Abhishek Tiwari:Links | <font face="trebuchet ms" style="color:#ffffff"> '''ImpLinks''' </font>]] &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
+[[Abhishek Tiwari:Contact | <font face="trebuchet ms" style="color:#ffffff"> '''Contact''' </font>]] &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
+</center>
+</div>
+=Welcome to web page of Abhishek Tiwari=
+Welcome to my OpenWetWare page.
+=Conferences and Events In News=
+<html>
+<a href="http://www.iscb.org/ismb2008/"><img src="http://www.iscb.org/ismb2008/images/ISMB08_web.gif" alt="ISMB 2008"  border="0"></a>
+</html>
+----
+<html>
+<a href="http://eccb08.org/"><img src="http://eccb08.org/logo_A.png" alt="ECCB 2008"  border="0"></a>
+</html>
+----
+<html>
+<a href="http://www.comp.nus.edu.sg/~recomb08/"><img src="http://openwetware.org/images/c/c2/RECOMB.PNG" alt="RECOMB 2008"  border="0"></a>
+</html>
+'''14 January 2008  Poster submission due'''
+'''You can get more information about upcoming conferences and events related to computational biology'''
+*ISCB Event Board http://www.iscb.org/events/event_board.php
+=Top Hits=
+==Applications of Workflow technology in Cheminformatics, Bioinformatics and Drug Discovery==
+[[Image:Data_Pipelining.jpg]]
+Workflow technology is a mechanism to integrate data, applications and services. It enables scientists to dynamically construct their own research protocols for scientific analytics and decision making by connecting various resources and software applications together in an innovative way. Workflow technology is being increasingly applied in discovery informatics to organize and analyze data. SciTegic's Pipeline Pilot is one chemically intelligent implementation of a workflow technology known as data pipelining. It allows scientists to construct and execute workflows using components that encapsulate many cheminformatics-based algorithms.
+	Workflow technology is generic, so analytics workflows can be built for any area, such as gene expression analysis, sequence analysis, proteomics, systems biology and so on. Workflow technology provides an interface where software from different vendors can be assembled according to scientific requirements.
+[[Abhishek Tiwari:Workflow technology|Read More]]
+== Chemical Informatics or  Cheminformatics Toolkits==
+"Any idiot can stand up and say that virtual screening doesn't work. It takes real brains to show how to improve it!" - Mark McGann
+Currently many toolkits (Daylight Toolkit, Chemaxon Toolkit, OpenEye Toolkit, MOE SVL, Accord, CDK, JOELib, etc.) are available from different vendors and organizations. Most of them are equally good, but the choice may vary with the user's perspective. I will try to give a summarized overview of some commonly used chemical informatics toolkits. For academic users, Chemaxon JChem (which is free for academic users) and open source toolkits like CDK and JOELib will be a better choice. If your budget permits, you can use Daylight, Accord, OpenEye, MOE SVL or any other toolkit depending on your needs, but Chemaxon and MOE are low-budget, high-quality options.
+[[Abhishek Tiwari:Chemical Informatics Toolkits|Read More]]
+==Bioinformatics Toolkits ==
+Compared to chemistry and chemoinformatics, where many commercial as well as open source toolkits are available, bioinformatics has very few options. Most toolkits specialize in sequence manipulation and database access, and they are not very diverse. Some major toolkits are Geneious API, BioPerl API, BioJava API, BioPython API, Bioconductor, Mathworks Bioinformatics Toolbox, MBT, NCBI Toolkit and many more. At this moment none of these toolkits is complete, and most of them have some specialized applications.
+[[Abhishek Tiwari:Bioinformatics Toolkits|Read More]]
+==Hot Computational Biology Papers==
+[[Abhishek Tiwari:Hot Computational Biology Papers-By Category|Hot Computational Biology Papers]] is one of the most highlighted sections of my OpenWetWare page. This section provides synopses of and comments on selected articles related to computational biology. Readers are also invited to comment on and modify the section.
+==Using SOAP/WSDL to access the Biological/Bioinformatics Databases==
+SOAP (Simple Object Access Protocol) (http://www.w3.org/TR/soap) based Web Services technology (http://www.w3.org/TR/wsdl) has gained much attention as an open standard enabling interoperability among applications across heterogeneous architectures and different networks. Today, biological databases are large collections of data that are relatively difficult to maintain outside the centres and institutions that produce them. These data are traditionally accessed using browser-based World Wide Web interfaces. When large amounts of data need to be retrieved and analysed, browser-based access often proves to be tedious and impractical.
+[[Abhishek Tiwari:SOAP/WSDL based webservices for Biological Databases|Read More]]
+== Spotfire based Analytic Solutions for Life Sciences Informatics ==
+The Spotfire Enterprise Analytics platform offers a radically faster business intelligence experience and is far more adaptable to specific industry and business challenges than traditional alternatives.
+[[Abhishek Tiwari:Spotfire|Read More]]
+== Visual Programming Series ==
+Learn what visual programming is and how to use different freely available tools for visual programming in Computational Biology and Chemistry. [[Abhishek Tiwari:Visual Programming|Read More]]
+----
+<html>
+<a href="http://www.ploscompbiol.org"><img src="http://www.plos.org/images/pcbi_234x60.png" alt="PLoS Computational Biology - www.ploscompbiol.org"  border="0"></a>
+<a href="http://www.plos.org/journals/"><img src="http://www.plos.org/images/banners/library_button.gif" alt="I Support the Public Library of Science"  border="0"></a>
+<a href="http://www.biomedcentral.com/home/"><img src="http://www.biomedcentral.com/graphics/advocacy/bmclogo2.gif" alt="Support Open Access. Spread the Word"  border="0"></a>
+<a href="http://www.iscb.org/"><img src="http://iscb.org/images/index_r3_c1.gif" alt=" The International Society for Computational Biology (ISCB)"  border="0"></a>
+<a href="http://www.prchecker.info/" target="_blank">
+<img src="http://openwetware.org/images/e/ee/Pr1.gif" alt="PageRank 7" border="0" /></a>
+</html>

Abhishek Tiwari:DATABASES: Difference between revisions
