User talk:Darek Kedra: Difference between revisions

From OpenWetWare
== Mass spec progs + sites ==
* http://proteome.gs.washington.edu/
* MyriMatch http://pubs.acs.org/cgi-bin/abstract.cgi/jprobs/2007/6/i02/abs/pr0604054.html
* MyriMatch http://www.mc.vanderbilt.edu/msrc/bioinformatics/

Revision as of 03:59, 22 November 2007

Pages to review:


microarrays

databases

overview [1]

Videos bioinformatics

Positive selection

How to detect positive selection of certain amino acids (AA) in a set of homologous sequences?


Sequence assembly

First generation

Genome assemblers used in current genomic projects

  • JAZZ (in-house use at JGI only)
  • RAMEN (not published yet, used for medaka and silkworm projects)

New Programs

  • Minimus suitable for bacterial genomes
  • EULER P.Pevzner graph algorithm producing superior contigs

requires phrap and patched ReAligner

  • MIRA Version 2.9.8 enables true hybrid sequence assembly (454 data with Sanger reads).

Mutation/SNP detection

  • PolyScan ref New program claimed to be superior to MutationSurveyor/PolyPhred/SNPdetector
  • MutationSurveyor $$$ program. 30-day demo available. MS3 has a limit of 400 traces in a single project. GUI. Used in several diagnostic labs.
  • novoSNP ref Windows/Linux 2.0.3. GUI & command line on Linux.
  • InSNP Windows only. ABI base calls? Detects substitution and indel SNPs in sequencing traces.

ESTs clustering

Paper:

Programs:

  • Lucy2 standalone (Windows, Linux and MacOS)

Sequence assembly with EST enhancements


See also table with less frequently used algorithms: http://biolinfo.org/EST/assembly.htm

Alternative splicing

Sequence comparisons

Protein orthologues

Genomic DNA

GATA: a graphic alignment tool for comparative sequence analysis

Orthologue retrieval

Gene ontology

tutorials 2 check

Genome annotation

  • Jigsaw meta-program designed to use the output from gene finders, splice site prediction programs and sequence alignments to predict gene models.


de novo predictions (eukaryote, genomic level)

citation: http://genomebiology.com/2006/7/s1/S11

"The Conrad gene caller is a tool for predicting gene structures in DNA based on the DNA sequence and other available evidence. The gene caller uses semi-Markov Conditional Random Fields. The Conrad CRF engine is a general purpose CRF engine used by the gene caller to provide the structure and algorithms for gene calling."

To check: http://www.wormbase.org/wiki/index.php/NGASP

To read: http://compbiol.plosjournals.org/perlserv/?request=get-document&doi=10.1371%2Fjournal.pcbi.0030054

syngr3

 	 RALGDS 2kb upstream

>chromosome:NCBI36:9:134962328:135016542:-1
TGGCCCCAGGAGGCTAGAGTTGGGGTATAGGGACCAGGTCACACAGGGTTCTGAATGCCA
GGCTAGGAGGCAGGGGTCACCGCAGGCCTGTCCCAGCAGGGGGATCAATATATATGGGGC
CCAAGCGCTGGACTCAGGGGATACCTGGCCAGCGAGGCCCCAACACAGGATGAGGCCTCT
GATGACCAGCTCGGGGATGAGTGACGGGGAAGAGCCATGAAGTGGGGGTCACCACGCACA
GCAGGGCCTGGCCACTTACAGCTGAGTGGCCTGCAGTGCTGGTGCCTTGGCTCTGGTATA
AAATTGAATGAAGCCGGGCACAGTGGCTCACGCCTGTAATTGCGGCACTTTGGGAGGCCA
AGGCAGGAGGATCTCTGGAGCCCAAGAGTTCCAAACCAGCCTGGGCAACATAGTGAGACT
TCATCTCTACATAGTTTTTTTAAAAATAAAAAAGGCCGGGCACAGTGGCTCACGCCTGTA
ATCCCAGCACTTTGGGAGGCCAAGGTGGGCAGATTACGAGGTCAGAAGTTCCAGACCAGC
CTGGCCAACATAGTGAAACCCCGCCTCTACTAAAAATACAAAAATTAGCCGGGTGTGATG
GCACATGCCTGTAGTCCCAGCTACTCGGGAGGCTGAGACGGGAGAATCACTTGAACCTGG
GAGGTGGAGGTTGCAGTGAGCTGAGACTGTGCCATTGCACTCCAGTCTGGGTGACAGAGT
GAGACTCTGTCTCAAAAAAATAAAATAAATAAATAAATCAAAAGGGTTGGCCAGGCAAAG
TGGCTCACGCCTGTAATCCCAACACTTTGGGAGGCCGAGGAGGGCAGATCACCTGAGGTC
AGAAGTTCGAGACCAGCCTGGCCAATATGGTGAAACTCTGTCTCTACTAAAAACACAAAA
ATTAGTCGGGCGTGGTGGTGGCAGCGTGCACCTGTAATCCCAGCTACTTGGGAGGCTGAG
ACAGGAGAATCGCTTGAACCCAGGAGGCAGAGGTTGCAATAAGCCGTGATCTAGCCACTG
CATTCCAGCCTGGTCGACAAGAAGGAGACTGCGGCTGGTGCAGTGGCTCATGCCTGTAAT
CCCAGCACTTTGGGAGGCCAAGGTGGGAGGATCACCTGACGTCTGGAGTTTGAGACCAGC
CTGGCCAACATGGTGAAACCCCATCTCTACTAAAAATACAAAATTAGCCGGGCGTGGTGG
CAGGTGCCTGTAATCCCAGCTACTCAGGAGGCTGAGGCAGGAGAATTGCTTGAACCTGGG
ACGTGGAGGTTGCAGTGAGCCGAGATCGCGCCATTGCACTCCAGCCTGGGGGACAAGAGT
GAGACTTCGTCTCAAAAAACAAACAAACGAGACTCCATCTCAAAAACAAAAGTAAATAAA
AAATAAAAAGGGACGAAATCAGTTAGCCCTAATAACACTGACTACCCCCAACAGTCGCTC
AACCCCAGTGTGAGGCCTCCTTCCCCCAGGCTCCCTTGAGCCCCATCTTACACCAGCTGA
AGCTGCAGCTGGACCTTGTGGTCCCGGGTGCTGTCAGGGAAGACAAGGAGGGAGGTGGGG
AAGAGGAGGGGAAGGGGAGACCTGACTTTCTCCCTGCCCAGGCTGAGTTCGCTGTCACCT
CGGGTCCCCCAGCTCCCAGCCATCCGCCGAGCCGAGTCCAGCAGGTGGCATCGGGGTGCT
GGGCGCCAGGGTGAACGTGTATATTTGGTGACGCCGGCGCGCCGACTCAGCGGCCCCCGC
CTGGCTGGGGCGAGGTTGGCCCTAGGTCCTAGCGGGGTGGGGAGTACTGAGCAGGCGGCT
GGGGCGGACGGACACGTGAGATCGGCCGCACATGGCGCTGGGAGCGTGGCGCGTGCGCGC
GGCGAAGCGGAGTGACGTGCACGCGCTGTCTGCGGTCCCGCGCAGGCCCCGTGTGCGCGC
CCGGCCTTGGACAACAGGCCCGGCTGCCCCGCGGGGGGAACACCCGCGTCGGCCCGCGGG
AGGGAGGCCTGAGCGCGCCCCCGAGCGCGTCCCCGAGCTCACGCGGCGGGGCGCGCCCCT
CGCACCTGCGGGCGGGCTGGGGCGGGGCCGCGGCTGTCCCCGCCCACCCGGGCCCGAGCC
CGGGGAGCCGGGGCGGAACCGAGCGGCGAGGCCCGAGCGGCCGGAGCGCGGCGCGGCGCA
GACAATGGGAGCGGCGCTGGCGGCTGCCGGGGCGGCCCCGAGGGCCGCAGAGTCCTGGGC
CCGGCGGGGACCGGAGGCCGCGCCATGAGCCCCGCAGCCGGGCGCACCCCTGCGCCGCGC
GCGGGCCCAGGCGAGCGGCCTCTGCGAGCCCCGGGTCCCGCCCTGGGGCCGGCGATGTGC
CACCGAGGCTGAGGATGATGGTAGATTGCCAG
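
The FASTA header above looks like a colon-separated region string. A minimal Python sketch for splitting it into fields follows. The field names (coord_system, assembly, seq_region, start, end, strand) are an assumption about the header layout, not something stated on this page, and the helper names parse_region_header and gc_fraction are made up for illustration.

```python
from typing import NamedTuple


class Region(NamedTuple):
    # Field names are assumed; the page only shows the raw header string.
    coord_system: str
    assembly: str
    seq_region: str
    start: int
    end: int
    strand: int


def parse_region_header(header: str) -> Region:
    """Split a '>a:b:c:start:end:strand' FASTA header into typed fields."""
    fields = header.strip().lstrip(">").split(":")
    if len(fields) != 6:
        raise ValueError(f"expected 6 colon-separated fields, got {len(fields)}")
    coord_system, assembly, seq_region, start, end, strand = fields
    return Region(coord_system, assembly, seq_region,
                  int(start), int(end), int(strand))


def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in a sequence (0.0 for an empty string)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


# Header copied from the record above.
region = parse_region_header(">chromosome:NCBI36:9:134962328:135016542:-1")
# Inclusive length of the interval named in the header.
span = region.end - region.start + 1
print(region.seq_region, region.strand, span)
```

The span computed from the header is the length of the interval it names, which can be compared with the length of the pasted sequence as a quick sanity check.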

Pathways

CPath

Blast tabulated output

Columns:

  • Identity of query sequence
  • Identity of subject sequence (matching sequence in database)
  • Percent identity
  • Alignment length
  • Number of mismatches
  • Number of gap openings
  • Start of query sequence
  • End of query sequence
  • Start of subject sequence
  • End of subject sequence
  • E-value
  • Bit-score
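
The twelve columns above can be read with a few lines of Python. This is a minimal sketch, assuming tab-separated lines with exactly these twelve fields and optional '#' comment lines. The class name BlastHit, the function name parse_blast_tabular and the example line are made up for illustration and are not real BLAST output.

```python
import io
from typing import Iterable, Iterator, NamedTuple


class BlastHit(NamedTuple):
    # One field per column, in the order listed above.
    query_id: str
    subject_id: str
    percent_identity: float
    alignment_length: int
    mismatches: int
    gap_opens: int
    query_start: int
    query_end: int
    subject_start: int
    subject_end: int
    evalue: float
    bit_score: float


def parse_blast_tabular(lines: Iterable[str]) -> Iterator[BlastHit]:
    """Yield one BlastHit per data line, skipping blank and '#' lines."""
    for line in lines:
        line = line.rstrip("\r\n")
        if not line or line.startswith("#"):
            continue
        f = line.split("\t")
        if len(f) != 12:
            raise ValueError(f"expected 12 columns, got {len(f)}: {line!r}")
        yield BlastHit(f[0], f[1], float(f[2]),
                       int(f[3]), int(f[4]), int(f[5]),
                       int(f[6]), int(f[7]), int(f[8]), int(f[9]),
                       float(f[10]), float(f[11]))


# Invented example line, for illustration only.
EXAMPLE = "# comment line\nquery1\tsubj1\t98.50\t200\t3\t0\t1\t200\t101\t300\t1e-100\t370\n"

hits = list(parse_blast_tabular(io.StringIO(EXAMPLE)))
# Keep only strong hits; the thresholds here are arbitrary.
strong = [h for h in hits if h.evalue < 1e-10 and h.percent_identity >= 90.0]
print(len(hits), len(strong))
```

A named tuple keeps the column order from the list visible in one place, so filters can refer to fields by name rather than by index.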

Phylogeny

tree checks

CONSEL: for assessing the confidence of phylogenetic tree selection. Hidetoshi Shimodaira and Masami Hasegawa. http://bioinformatics.oxfordjournals.org/cgi/content/abstract/17/12/1246

http://bioinformatics.oxfordjournals.org/cgi/content/full/22/12/1540

tree builders

  • BioNJ (NJ trees -> same speed, better accuracy for large numbers of taxa)

alignments

  • PANDIT database of multiple sequence alignments and phylogenetic trees covering many common protein domains.

genome variants DB

  • TCAG curated catalogue of structural variation in the human genome


stuff to incorporate

Protein vs mRNA level differences

Protein vs mRNA level differences

Stability of mRNA

  • Decapping of mRNA
  • Deadenylation

Regulation of translation

  • number of ribosomes per mRNA transcript
  • binding of specific RBP (RNA binding proteins) to mRNA

Graph visualisation

EST 2 genome alignment

ref

Portugal

motifs again...

chip-chip

check: mpeak [4], TiMAT http://bdtnp.lbl.gov/TiMAT, MAT [5], TileMap [6], ACME [7], HGMM [8], and ChIPOTle [9]

Mass spec progs + sites