User:Tthakuri: Difference between revisions

From OpenWetWare



Revision as of 06:34, 15 September 2011

(M/Z)
(Digging)

Current Status

I am doing my Master's thesis at the Proteomics Department (Turku Centre for Biotechnology) under Garry Carthols' supervision. The main purpose of the thesis is to develop a database for the contents of different file formats, and to identify monoisotopic masses by querying that database through a graphical user interface (GUI). I learned about OpenWetWare online, and I've joined to maintain a diary of my Master's thesis.

My Notebook

My Research Topic

Background

The proteomics facility identifies hundreds to thousands of proteins in numerous studies. For each study there is a specific set of identified proteins, which are related to a health state, biological state or cell state. The experiments are often conducted in an iterative manner, where information can overlap; this overlap is not to be confused with redundancy. For example, the increase of a protein in a tissue over time may result in the identification of that protein, as well as in a measurement of the quantitative difference (amount of protein). There are numerous sophisticated computational techniques available in proteomics to mine data, and ultimately to convert data to protein identifications or quantitative values. Other analytical tools also exist, yet most of them only enable the analysis of single files (experimental measurements) or the comparison of "sets of files".

Querying the whole data set is currently difficult and not at all customisable, and a simple query such as "has this protein been identified before in this study?" is not possible. This is a considerable shortcoming, and it is where the research should be focussed.

Essentially, what we need is a set of tools or scripts, accessible from a single GUI, that can initiate searches and display information from an in-house database containing proteome experimental information from various studies and public repositories. We also need more meta-information than just protein identifications, such as patient information, tissue(s) and sample treatment.
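As a rough illustration of how such a database could answer the question "has this protein been identified before in this study?", here is a minimal sketch using Python's built-in sqlite3 module. The table names (study, sample, identification), the column names and the demo values are all assumptions made for this example; they are not the actual thesis schema.

```python
import sqlite3

# Hypothetical schema: studies contain samples (with meta-information),
# and each sample has a list of identified proteins.
SCHEMA = """
CREATE TABLE study (
    id INTEGER PRIMARY KEY,
    name TEXT UNIQUE NOT NULL
);
CREATE TABLE sample (
    id INTEGER PRIMARY KEY,
    study_id INTEGER NOT NULL REFERENCES study(id),
    patient TEXT,
    tissue TEXT,
    treatment TEXT
);
CREATE TABLE identification (
    id INTEGER PRIMARY KEY,
    sample_id INTEGER NOT NULL REFERENCES sample(id),
    accession TEXT NOT NULL
);
CREATE INDEX idx_identification_accession ON identification(accession);
"""


def protein_seen_in_study(conn, study_name, accession):
    """Return (patient, tissue, treatment) for every sample of the study
    in which the given protein accession was identified."""
    return conn.execute(
        """
        SELECT sample.patient, sample.tissue, sample.treatment
        FROM identification
        JOIN sample ON sample.id = identification.sample_id
        JOIN study ON study.id = sample.study_id
        WHERE study.name = ? AND identification.accession = ?
        ORDER BY sample.id
        """,
        (study_name, accession),
    ).fetchall()


# Demo data (placeholder values only).
conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.execute("INSERT INTO study (id, name) VALUES (1, 'demo_study')")
conn.execute(
    "INSERT INTO sample (id, study_id, patient, tissue, treatment) "
    "VALUES (1, 1, 'patient_A', 'liver', 'untreated')"
)
conn.execute(
    "INSERT INTO identification (sample_id, accession) VALUES (1, 'ACC_0001')"
)

print(protein_seen_in_study(conn, "demo_study", "ACC_0001"))
```

An empty result means the protein has not been recorded for that study. Keeping the meta-information in the sample table means the same query also reports in which tissue and under which treatment the protein was seen.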

Aims

  • Construction of a database.
  • Design of a set of tools or scripts, accessible from a single GUI, that can initiate searches and display information from the database.
  • Widen the search functionalities to public repositories and meta-information.


Phases

  • In the first phase of the project, a database is designed and built to accommodate "RAW" mass spectrometry data. This database could be used for queries such as: "Have we seen this monoisotopic mass before in other experiments?"
  • The second phase would be the addition of processed data (Mascot, Sequest or Scaffold), which allows searching for peptide sequences.
  • In the third phase, the search function could be developed further to support protein identification data or other meta-information, as well as relevant data from external sources.
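The first-phase query ("Have we seen this monoisotopic mass before in other experiments?") could be sketched as a tolerance search, since measured masses rarely match exactly. The example below assumes a hypothetical precursor table and a tolerance expressed in parts per million (ppm); the table, its columns, the default tolerance and the demo values are illustrative assumptions only.

```python
import sqlite3


def ppm_window(mass, ppm):
    """Return the (low, high) mass bounds for a tolerance given in ppm."""
    delta = mass * ppm / 1_000_000
    return mass - delta, mass + delta


def find_mass(conn, mass, ppm=10.0):
    """Return (experiment, monoisotopic_mass) rows within the ppm window."""
    low, high = ppm_window(mass, ppm)
    return conn.execute(
        "SELECT experiment, monoisotopic_mass FROM precursor "
        "WHERE monoisotopic_mass BETWEEN ? AND ? "
        "ORDER BY monoisotopic_mass",
        (low, high),
    ).fetchall()


# Hypothetical table of precursor masses, one row per observation.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE precursor ("
    "id INTEGER PRIMARY KEY, "
    "experiment TEXT NOT NULL, "
    "monoisotopic_mass REAL NOT NULL)"
)
# An index on the mass column lets the range query avoid a full table scan.
conn.execute("CREATE INDEX idx_precursor_mass ON precursor(monoisotopic_mass)")
conn.executemany(
    "INSERT INTO precursor (experiment, monoisotopic_mass) VALUES (?, ?)",
    [("exp_1", 1000.000), ("exp_2", 1000.005), ("exp_3", 1000.500)],
)

# 10 ppm of about 1000 Da is about 0.01 Da, so exp_1 and exp_2 match.
hits = find_mass(conn, 1000.002, ppm=10)
print(hits)
```

A relative (ppm) window is used here because mass accuracy in mass spectrometry is usually quoted relative to the measured mass; an absolute window in Da would work the same way with a fixed delta.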

So, two main points have to be addressed here: the construction of a database, and the design of a GUI (that would initiate scripts) to access and query the DB.

The DATABASE should

  • Import different file types (.RAW, .WIFF, .DTA, .pepXML) in a semi-automated manner, and mine and store selected information from these files.
  • Allow advanced searches that would support additional parameters such as instrumentation, signal intensity threshold, retention time etc.
  • Import selected data from public resources and repositories (e.g. PRIDE, Tranche and PeptideAtlas) and from publications and supplementary information (typically .xls and .xlsx files).
  • Run on an institute-housed server, with password-protected access to the information on the server.
  • Allow for basic queries (to be elaborated on) and the exporting of the results.
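The advanced-search and export requirements above could be sketched as follows: optional filters are turned into a parameterised WHERE clause, and the result rows are written out as CSV. The spectrum table, its columns, the filter names and the demo values are assumptions for illustration; the real database would define its own.

```python
import csv
import io
import sqlite3

COLUMNS = ["source_file", "instrument", "monoisotopic_mass",
           "intensity", "retention_time"]


def search(conn, instrument=None, min_intensity=None, rt_range=None):
    """Search the hypothetical spectrum table; filters left as None are ignored."""
    clauses, params = [], []
    if instrument is not None:
        clauses.append("instrument = ?")
        params.append(instrument)
    if min_intensity is not None:
        clauses.append("intensity >= ?")
        params.append(min_intensity)
    if rt_range is not None:
        clauses.append("retention_time BETWEEN ? AND ?")
        params.extend(rt_range)
    where = (" WHERE " + " AND ".join(clauses)) if clauses else ""
    sql = "SELECT " + ", ".join(COLUMNS) + " FROM spectrum" + where + " ORDER BY id"
    return conn.execute(sql, params).fetchall()


def export_csv(rows, header=COLUMNS):
    """Return the rows as CSV text with a header line."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buf.getvalue()


# Demo data (placeholder values only).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE spectrum ("
    "id INTEGER PRIMARY KEY, source_file TEXT, instrument TEXT, "
    "monoisotopic_mass REAL, intensity REAL, retention_time REAL)"
)
conn.executemany(
    "INSERT INTO spectrum (source_file, instrument, monoisotopic_mass, "
    "intensity, retention_time) VALUES (?, ?, ?, ?, ?)",
    [
        ("run1.raw", "instrument_A", 1000.0, 5000.0, 12.5),
        ("run2.raw", "instrument_B", 1200.0, 200.0, 30.0),
        ("run3.raw", "instrument_A", 1500.0, 800.0, 45.0),
    ],
)

rows = search(conn, instrument="instrument_A", min_intensity=1000)
print(export_csv(rows))
```

Only the user-supplied values are passed as query parameters; the column names in the SQL text are fixed in the code, which keeps the search safe to call from a GUI form.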

Research interests

  1. Development of bioinformatics pipelines for the analysis of high-throughput data
  2. Design and analysis of gene expression and genotyping studies
  3. Next Generation Sequencing and Microarray data analysis
  4. Gene ontology (GO) and network analysis
  5. Proteomics & Comparative Genomics
  6. Protein-Protein Interaction network
  7. Evolutionary Sequence Analysis and alignment

Education

  • 2011, MS, University of Turku
  • 2006, BS, Kantipur College of Management & Information Technology (KCMIT)

Contact Info

Thaman Chand
University of Turku
Turku Centre for Biotechnology, Tykistokatu 6, 20520
Turku, Finland
E-mail : tchand (at) btk (dot) fi

Useful links