Wikiomics:Pathway analysis

From OpenWetWare

Latest revision as of 19:00, 11 November 2013

After determining a list of genes involved in a given biological process, the next step is to map these genes to known pathways/Gene Ontology terms and determine, for example, which pathways are overrepresented in the gene list.
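
One common way to decide whether a pathway is overrepresented in a gene list is a one-sided hypergeometric test (equivalent to a one-sided Fisher's exact test). The sketch below is only an illustration of that idea, not the method used by any particular tool on this page; the function names and the toy gene identifiers are hypothetical, and no multiple-testing correction is applied.

```python
from math import comb


def hypergeom_enrichment_p(n_background, n_pathway, n_selected, n_overlap):
    """Return P(X >= n_overlap) for a hypergeometric draw.

    n_selected genes are drawn without replacement from n_background genes,
    of which n_pathway are annotated to the pathway of interest.
    """
    max_overlap = min(n_pathway, n_selected)
    total = comb(n_background, n_selected)
    tail = sum(
        comb(n_pathway, k) * comb(n_background - n_pathway, n_selected - k)
        for k in range(n_overlap, max_overlap + 1)
    )
    return tail / total


def pathway_overrepresentation(gene_list, pathway_genes, background_genes):
    """Count the overlap and compute its enrichment p-value.

    Genes outside the background are ignored, so all three sets are
    compared on the same universe.
    """
    background = set(background_genes)
    pathway = set(pathway_genes) & background
    selected = set(gene_list) & background
    overlap = len(pathway & selected)
    p_value = hypergeom_enrichment_p(
        len(background), len(pathway), len(selected), overlap
    )
    return overlap, p_value


if __name__ == "__main__":
    # Hypothetical toy data: 20 background genes, 5 of them in the pathway.
    background = [f"gene{i}" for i in range(20)]
    pathway = background[:5]
    selected = ["gene0", "gene1", "gene2", "gene10"]
    print(pathway_overrepresentation(selected, pathway, background))
```

In practice the test is repeated for every pathway or GO term, so the resulting p-values need a multiple-testing correction before they are interpreted.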

Recent review (January 2008): Nam, Dougu, and Seon-Young Kim. “Gene-set approach for expression pattern analysis.” Brief Bioinform (17, 2008): bbn001. See Table 1 of the review for a complete list of tools.

Recommended

  • g:Profiler a web-based toolset for functional profiling of gene lists from large-scale experiments. Easy-to-use web server.
  • KOBAS server, used e.g. for elucidating pathways in addiction
    • takes both FASTA files and lists of genes
    • caveats
      • excise gi| from typical FASTA NCBI entry to get unique IDs
      • only about 1/3 of genes will get annotated in the first step
    • Li, Chuan-Yun, Xizeng Mao, and Liping Wei. “Genes and (Common) Pathways Underlying Drug Addiction.” PLoS Computational Biology 4, no. 1 (1, 2008)
  • GSEA with MSigDB: "Gene Set Enrichment Analysis (GSEA) is a computational method that determines whether an a priori defined set of genes shows statistically significant, concordant differences between two biological states"

Objections to GSEA (Damian D, Gorfine M. Statistical concerns about the GSEA procedure): http://www.nature.com/ng/journal/v36/n7/full/ng0704-663a.html; reply: http://www.nature.com/ng/journal/v36/n7/full/ng0704-663b.html
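
To make the GSEA idea concrete: genes are ranked by how strongly they differ between the two states, and a running sum walks down the ranking, stepping up at gene-set members and down otherwise. The sketch below computes only that running-sum enrichment score in its commonly described weighted form; it is an illustration, not the Broad implementation. The function name, the `weight` parameter and the toy data are assumptions, and the permutation step GSEA uses to assess significance is not shown.

```python
def enrichment_score(ranked_genes, scores, gene_set, weight=1.0):
    """Running-sum enrichment score for one gene set.

    ranked_genes: gene identifiers sorted by their ranking metric.
    scores: ranking metric for each gene, in the same order.
    gene_set: identifiers of the a priori defined gene set.
    weight: exponent applied to |score| for hits (0 gives equal steps).
    """
    gene_set = set(gene_set)
    hits = [gene in gene_set for gene in ranked_genes]
    n_hits = sum(hits)
    n_miss = len(ranked_genes) - n_hits
    if n_hits == 0 or n_miss == 0:
        raise ValueError("gene set must cover some, but not all, ranked genes")

    hit_weights = [abs(s) ** weight if h else 0.0 for s, h in zip(scores, hits)]
    total_hit_weight = sum(hit_weights)
    if total_hit_weight == 0:
        raise ValueError("all gene-set members have a zero score")

    running = 0.0
    best = 0.0  # largest deviation from zero seen so far (keeps its sign)
    for w, is_hit in zip(hit_weights, hits):
        if is_hit:
            running += w / total_hit_weight
        else:
            running -= 1.0 / n_miss
        if abs(running) > abs(best):
            best = running
    return best


if __name__ == "__main__":
    # Hypothetical ranking of four genes by a made-up score.
    genes = ["a", "b", "c", "d"]
    metric = [4.0, 3.0, 2.0, 1.0]
    print(enrichment_score(genes, metric, {"a", "b"}))
    print(enrichment_score(genes, metric, {"c", "d"}))
```

A set concentrated at the top of the ranking gives a score near +1 and one concentrated at the bottom a score near -1; the score alone says nothing about significance.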

  • ErmineJ "ErmineJ performs analyses of gene sets in expression microarray data. A typical goal is to determine whether particular biological pathways are "doing something interesting" in the data. The software is designed to be used by biologists with little or no informatics background."
  • GAGE is applicable independent of sample sizes, experimental design, assay platforms, and other types of heterogeneity (Luo et al. 2009). This Bioconductor package also provides functions and data for pathway, GO and gene set analysis in general. Tutorials describe both RNA-Seq and microarray data analysis workflows.
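
The KOBAS caveat above about excising gi| from NCBI FASTA entries can be handled with a small filter. This is a minimal sketch: it assumes headers of the form ">gi|<number>|..." and that dropping the literal "gi|" prefix is what is wanted. The function name and the example record are hypothetical.

```python
def strip_gi_prefix(fasta_lines):
    """Yield FASTA lines with a leading 'gi|' removed from header lines.

    Sequence lines and headers that do not start with '>gi|' pass
    through unchanged.
    """
    prefix = ">gi|"
    for line in fasta_lines:
        if line.startswith(prefix):
            yield ">" + line[len(prefix):]
        else:
            yield line


if __name__ == "__main__":
    # Hypothetical two-line FASTA record.
    record = [">gi|12345|ref|NP_000001.1| example protein", "MKVLAAGIVG"]
    for out_line in strip_gi_prefix(record):
        print(out_line)
```

Because the function is a generator over lines, it can be applied directly to an open file handle and the result written to a new file without loading the whole FASTA file into memory.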

Other tools to check

  • GEPAT Genome Expression Pathway Analysis Tool. Performs standard microarray analyses plus "Ensembl database and provides information about gene names, chromosomal location, GO categories and enzymatic activity for each probe on the chip." Installation is complex (Java JARs, MySQL, etc.).
  • Nonparametric multivariate analysis (Nettleton et al.). R code available from the author.

Pathway/graph visualisation

  • Cytoscape leader in the field
  • ONDEX "enables data from diverse biological data sets to be linked, integrated and visualised through graph analysis techniques"
  • Pathview R/Bioconductor tool for pathway-based data integration and visualization, easy to integrate into pathway analysis workflows. R-Forge has an overview with some nice example plots. The work has been published in Bioinformatics.

Protein interactions

  • BIANA biological database integration and network management framework, successor of PIANA
  • MATISSE Modular Analysis for Topology of Interactions and Similarity SEts
    • automates the analysis of protein-protein interaction networks.

Pathway Databases

  • KEGG first choice for scope
  • Reactome human + model organisms pathways. Expert annotations from literature.
  • PID Pathway Interaction Database @NIH
  • BioCyc
  • Cyclone - provides an open source Java API for easier access to BioCyc.
  • RegulonDB E. coli K12 DB (operons/genes/regulatory elements)
  • WikiPathways open curation of biological pathways
  • Pathway Commons access to biological pathway information collected from public pathway databases.

Pathway specific languages

  • BioPAX Biological Pathway Exchange Language

Stuff to check

  • GenMAPP, Pathway Processor, GeneXpress; see: Cavalieri D, De Filippo C. Bioinformatic methods for integrating whole-genome expression results into cellular networks. Drug Discov Today. 2005;10:727–734. doi: 10.1016/S1359-6446(05)03433-1
  • KaPPA-View
  • VANTED
  • [1] HTML OSML Editor

Pathway analysis

  • PATIKA (http://www.patika.org/) and PATIKAweb (http://www.cs.bilkent.edu.tr/~patikaweb/)

Related pages on OpenWetWare

  • Summer 2006 Workshop (http://openwetware.org/wiki/Summer_2006_Workshop)

Bibliography

  1. Luo W, Friedman M, Shedden K, Hankenson KD, Woolf JP (2009). "GAGE: generally applicable gene set enrichment for pathway analysis". BMC Bioinformatics 10: 161: http://www.biomedcentral.com/1471-2105/10/161.
  2. Aittokallio, Tero, and Benno Schwikowski. “Graph-based methods for analysing networks in cell biology.” Brief Bioinform 7, no. 3 (September 1, 2006): 243-255.
  3. Li, Chuan-Yun, Xizeng Mao, and Liping Wei. “Genes and (Common) Pathways Underlying Drug Addiction.” PLoS Computational Biology 4, no. 1 (1, 2008): e2.
  4. Nam, Dougu, and Seon-Young Kim. “Gene-set approach for expression pattern analysis.” Brief Bioinform (17, 2008): bbn001.
  5. “Resources for integrative systems biology: from data through databases to networks and dynamic system models -- Ng et al. 7 (4): 318 -- Briefings in Bioinformatics.” http://bib.oxfordjournals.org/cgi/content/full/7/4/318.
  6. Stromback, Lena, Vaida Jakoniene, He Tan, and Patrick Lambrix. “Representing, storing and accessing molecular interaction data: a review of models and tools.” Brief Bioinform 7, no. 4 (December 1, 2006): 331-338.
  7. “Tools for visually exploring biological networks -- Suderman and Hallett 23 (20): 2651 -- Bioinformatics.” http://bioinformatics.oxfordjournals.org/cgi/content/full/23/20/2651.
  8. "Pathways to the analysis of microarray data", Trends in Biotechnology, Volume 23, Issue 8, August 2005, Pages 429-435, R. Keira Curtis, Matej Oresic, Antonio Vidal-Puig
  9. "Bioinformatics applications for pathway analysis of microarray data", Current Opinion in Biotechnology, Volume 19, Issue 1, February 2008, Pages 50-54, Thomas Werner