User talk:Darek Kedra/sandbox 1: Difference between revisions

From OpenWetWare
== chip-chip ==


* Ringo (R package): http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=1906858
check (from ringo paper)
 
MAT [5], TileMap [6], HGMM [8],
check:
mpeak [4], TiMAT http://bdtnp.lbl.gov/TiMAT, MAT [5], TileMap [6], ACME [7], HGMM [8], and ChIPOTle [9]


== Mass spec progs + sites ==

Revision as of 14:07, 23 November 2007

Pages to review:


microarrays

databases

overview [1]

Videos bioinformatics

Positive selection

How to detect positive selection of certain amino acids in a set of homologous sequences?


Sequence assembly

First generation

Genome assemblers used in current genomic projects

  • JAZZ -> @JGI in house only
  • RAMEN (not published yet, used for medaka and silkworm projects)

New Programs

  • Minimus suitable for bacterial genomes
  • EULER P.Pevzner graph algorithm producing superior contigs

requires phrap and patched ReAligner

  • MIRA Version 2.9.8 enables true hybrid sequence assembly (454 data with Sanger reads).

Mutation/SNP detection

  • PolyScan ref New program claimed to be superior to MutationSurveyor/PolyPhred/SNPdetector
  • MutationSurveyor $$$ program. 30-day demo available. MS3 has a limit of 400 traces in a single project. GUI. Used in several diagnostic labs.
  • novoSNP ref Windows/Linux 2.0.3. GUI & command line on Linux.
  • InSNP Windows only, ABI base calls? Detects substitution and indel SNPs in sequencing traces

ESTs clustering

Paper:

Programs:

  • Lucy2 standalone (Win Linux and MacOS)

Sequence assembly with EST enhancements


See also table with less frequently used algorithms: http://biolinfo.org/EST/assembly.htm

Alternative splicing

Sequence comparisons

Protein orthologues

Genomic DNA

GATA: a graphic alignment tool for comparative sequence analysis

Orthologue retrieval

Gene ontology

tutorials 2 check

Genome annotation

  • Jigsaw meta-program designed to use the output from gene finders, splice site prediction programs and sequence alignments to predict gene models.


de novo predictions (eukaryote, genomic level)

citation: http://genomebiology.com/2006/7/s1/S11

"""The Conrad gene caller is tool for predicting gene structures in DNA based on the DNA sequence and other available evidence. The gene caller uses semi-Markov Conditional Random Fields. The Conrad CRF engine is a general purpose CRF engine used by the gene caller to provide the structure and algorithms for gene calling."""

To check: http://www.wormbase.org/wiki/index.php/NGASP

To read: http://compbiol.plosjournals.org/perlserv/?request=get-document&doi=10.1371%2Fjournal.pcbi.0030054


Pathways

  • g:Profiler a web-based toolset for functional profiling of gene lists from large-scale experiments. Easy to use web server

objections (Damian D, Gorfine M. Statistical concerns about the GSEA procedure): http://www.nature.com/ng/journal/v36/n7/full/ng0704-663a.html and reply: http://www.nature.com/ng/journal/v36/n7/full/ng0704-663b.html

Other ones to check:

Blast tabulated output

Columns:

  • Identifier of query sequence
  • Identifier of subject sequence (matching sequence in database)
  • Percent identity
  • Alignment length
  • Number of mismatches
  • Number of gap openings
  • Start of alignment in query sequence
  • End of alignment in query sequence
  • Start of alignment in subject sequence
  • End of alignment in subject sequence
  • E-value
  • Bit-score
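
The twelve columns listed above map directly onto a small record type. A minimal Python sketch, assuming tab-separated input with one hit per line and optional comment lines starting with "#"; the names BlastHit and parse_blast_tab are illustrative only and not part of any BLAST distribution.

```python
from typing import Iterable, Iterator, NamedTuple


class BlastHit(NamedTuple):
    """One row of BLAST tabular output, in the column order listed above."""
    query_id: str
    subject_id: str
    pct_identity: float
    aln_length: int
    mismatches: int
    gaps: int
    q_start: int
    q_end: int
    s_start: int
    s_end: int
    evalue: float
    bit_score: float


def parse_blast_tab(lines: Iterable[str]) -> Iterator[BlastHit]:
    """Yield a BlastHit for every data line; skip blank and comment lines."""
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        f = line.split("\t")
        if len(f) < 12:
            raise ValueError(f"expected 12 columns, got {len(f)}: {line!r}")
        yield BlastHit(
            f[0], f[1], float(f[2]),
            int(f[3]), int(f[4]), int(f[5]),
            int(f[6]), int(f[7]), int(f[8]), int(f[9]),
            float(f[10]), float(f[11]),
        )


# Hypothetical example row, not taken from a real search.
example = ["# comment line", "q1\ts1\t98.50\t200\t3\t0\t1\t200\t51\t250\t1e-100\t380"]
hits = list(parse_blast_tab(example))
```

Filtering is then a one-liner, e.g. keeping only hits with hit.evalue below a chosen threshold.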

Phylogeny

tree checks

CONSEL: for assessing the confidence of phylogenetic tree selection Hidetoshi Shimodaira and Masami Hasegawa http://bioinformatics.oxfordjournals.org/cgi/content/abstract/17/12/1246

http://bioinformatics.oxfordjournals.org/cgi/content/full/22/12/1540

tree builders

  • BioNJ (NJ trees -> same speed better accuracy for large number of taxa)

alignments

  • PANDIT database of multiple sequence alignments and phylogenetic trees covering many common protein domains.

genome variants DB

  • TCAG curated catalogue of structural variation in the human genome


stuff to incorporate

Protein vs mRNA level differences

Protein vs mRNA level differences

Stability of mRNA

  • Decapping of mRNA
  • Deadenylation

Regulation of translation

  • number of ribosomes per mRNA transcript
  • binding of specific RBP (RNA binding proteins) to mRNA

Graph visualisation

EST 2 genome alignment

ref

Portugal


chip-chip

check (from ringo paper)

MAT [5], TileMap [6], HGMM [8],

Mass spec progs + sites

metagenomics

Taverna


read

check

  • Transcription factor recognition

http://bioinformatics.oxfordjournals.org/cgi/content/full/22/9/1047

Protein 3D


other courses

Mass Spec links

Microarray


ChIP-chip

uses gff format

Microsoft Excel macros. Restriction on maximum number of rows accepted by Excel (ca 64k).

paper: http://scholar.google.com/url?sa=U&q=http://repositories.cdlib.org/cgi/viewcontent.cgi%3Farticle%3D1074%26context%3Duclastat Win only ver.2 ?not working?

BED/GFF file format (BED format: http://genome.ucsc.edu/FAQ/FAQformat#format1) so far only human genome?

paper: http://bioinformatics.oxfordjournals.org/cgi/content/full/21/18/3629
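
Since the tools in this section expect either BED or GFF input, converting between the two may be needed. A minimal Python sketch of a BED-to-GFF conversion; the function name bed_to_gff and the default source and feature labels are made up for illustration. The one point that matters is the coordinate shift: BED starts are 0-based with an exclusive end, while GFF coordinates are 1-based and inclusive.

```python
def bed_to_gff(bed_line: str, source: str = "bed2gff", feature: str = "region") -> str:
    """Convert one tab-separated BED line (3 to 6 columns) to a 9-column GFF line."""
    fields = bed_line.rstrip("\n").split("\t")
    if len(fields) < 3:
        raise ValueError("a BED line needs at least chrom, start and end")
    chrom = fields[0]
    start = int(fields[1])  # 0-based, inclusive
    end = int(fields[2])    # 0-based, exclusive
    # Optional BED columns; "." marks a missing value in GFF.
    name = fields[3] if len(fields) > 3 else "."
    score = fields[4] if len(fields) > 4 else "."
    strand = fields[5] if len(fields) > 5 else "."
    # GFF start is BED start + 1; the end value stays numerically the same.
    return "\t".join(
        [chrom, source, feature, str(start + 1), str(end), score, strand, ".", name]
    )


# Hypothetical example interval.
gff_line = bed_to_gff("chr1\t99\t200\tpeak1\t500\t+")
```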

  • TileHGMM

http://www.stat.wisc.edu/~keles/software.html

R package, requires replicates from chips, plus probe location file. For computational reasons data should be separated for each chromosome.
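
Separating the data per chromosome can be done before loading anything into R. A minimal Python sketch, assuming a tab-delimited probe file with the chromosome name in the first column; the actual layout of the probe location file is not given here, so the column index and the function name split_by_chromosome are assumptions.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def split_by_chromosome(
    lines: Iterable[str], chrom_col: int = 0, sep: str = "\t"
) -> Dict[str, List[str]]:
    """Group data lines by the value in the chromosome column."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        groups[line.split(sep)[chrom_col]].append(line)
    return dict(groups)


# Hypothetical probe rows: chromosome, position, probe id.
probes = ["chr1\t100\tp1", "chr2\t50\tp2", "chr1\t300\tp3"]
by_chrom = split_by_chromosome(probes)
```

Each list in by_chrom can then be written to its own file, one per chromosome.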