User talk:Darek Kedra: Difference between revisions
Darek Kedra (talk | contribs) m (sandbox) |
Darek Kedra (talk | contribs) m (chip-chip) |
* [http://nar.oxfordjournals.org/cgi/content/abstract/33/suppl_2/W442 FOOTER paper] web: http://biodev.hgen.pitt.edu/footer_php/Footerv2_0.php
Revision as of 06:33, 20 November 2007
Pages to review:
- http://www.biomedcentral.com/1471-2105/7/499 (sequence alignment refinement paper)
- http://nar.oxfordjournals.org/cgi/content/full/33/22/7120 Automatic assessment of alignment quality
- ftp://ftp-igbmc.u-strasbg.fr/pub/RASCAL RASCAL 1.1
- http://www.dbi.tju.edu/dbi/index.php?menu=42 PAINT promoter database
- http://www.pdg.cnb.uam.es/fabascal/ tutorials ++
microarrays
- http://depts.washington.edu/l2l/ Database of up/down regulated genes
databases
- http://base.thep.lu.se/ BASE
- http://www.longhornarraydatabase.org/ LAD
- http://genome.tugraz.at/mars/mars_description.shtml MARS (2006) (doc HTML)
- http://www.sbeams.org/Microarray/ SBEAMS
overview [1]
Videos bioinformatics
- http://calit2.net/events/algorithmicbio/archive.php San Diego ALGORITHMIC BIOLOGY 2006
Positive selection
How to detect positive selection of certain amino acids in a set of homologous sequences:
- Selection (WEB) (requires >=5 ORFs + query ORF; can handle PDB files)
- SWAKK (WEB)
- SWAPSC (standalone, Windows and Linux), PHYLIP-formatted input
Sequence assembly
First generation
Genome assemblers used in current genomic projects
- JAZZ -> @JGI in house only
- RAMEN (not published yet, used for medaka and silkworm projects)
New Programs
requires phrap and patched ReAligner
- MIRA Version 2.9.8 enables true hybrid sequence assembly (454 data with Sanger reads).
Mutation/SNP detection
- MutationSurveyor: commercial ($$$) program, 30-day demo available. MS3 has a limit of 400 traces in a single project. GUI. Used in several diagnostic labs.
- InSNP windows only, ABI base calls? detects substitution and indel SNPs in sequencing traces
- PolyPhred [ref http://nar.oxfordjournals.org/cgi/content/full/25/14/2745] 1997 program, integrated with phred and Consed.
ESTs clustering
Paper:
- EMBnet course (PDF)
Programs:
- Lucy2 standalone (Win Linux and MacOS)
Sequence assembly with EST enhancements
- MIRA2 also detects SNPs
See also table with less frequently used algorithms: http://biolinfo.org/EST/assembly.htm
Alternative splicing
- ASTRA (Alternative Splicing and TRanscription Archives)
- ECgene Genome Annotation for Alternative Splicing. Large number of putative splice forms.
Sequence comparisons
Protein orthologues
- Exonerate: tool from Sanger
- Inparanoid
Genomic DNA
GATA: a graphic alignment tool for comparative sequence analysis
Orthologues retrieval
Gene ontology
- Tutorials http://www.geneontology.org/GO.teaching.resources.shtml Very extensive!
tutorials 2 check
Genome annotation
- Jigsaw meta-program designed to use the output from gene finders, splice site prediction programs and sequence alignments to predict gene models.
de novo predictions (eukaryote, genomic level)
citation: http://genomebiology.com/2006/7/s1/S11
- Conrad java standalone
"""The Conrad gene caller is a tool for predicting gene structures in DNA based on the DNA sequence and other available evidence. The gene caller uses semi-Markov Conditional Random Fields. The Conrad CRF engine is a general purpose CRF engine used by the gene caller to provide the structure and algorithms for gene calling."""
To check: http://www.wormbase.org/wiki/index.php/NGASP To read: http://compbiol.plosjournals.org/perlserv/?request=get-document&doi=10.1371%2Fjournal.pcbi.0030054
syngr3
RALGDS 2kb upstream
>chromosome:NCBI36:9:134962328:135016542:-1 TGGCCCCAGGAGGCTAGAGTTGGGGTATAGGGACCAGGTCACACAGGGTTCTGAATGCCA GGCTAGGAGGCAGGGGTCACCGCAGGCCTGTCCCAGCAGGGGGATCAATATATATGGGGC CCAAGCGCTGGACTCAGGGGATACCTGGCCAGCGAGGCCCCAACACAGGATGAGGCCTCT GATGACCAGCTCGGGGATGAGTGACGGGGAAGAGCCATGAAGTGGGGGTCACCACGCACA GCAGGGCCTGGCCACTTACAGCTGAGTGGCCTGCAGTGCTGGTGCCTTGGCTCTGGTATA AAATTGAATGAAGCCGGGCACAGTGGCTCACGCCTGTAATTGCGGCACTTTGGGAGGCCA AGGCAGGAGGATCTCTGGAGCCCAAGAGTTCCAAACCAGCCTGGGCAACATAGTGAGACT TCATCTCTACATAGTTTTTTTAAAAATAAAAAAGGCCGGGCACAGTGGCTCACGCCTGTA ATCCCAGCACTTTGGGAGGCCAAGGTGGGCAGATTACGAGGTCAGAAGTTCCAGACCAGC CTGGCCAACATAGTGAAACCCCGCCTCTACTAAAAATACAAAAATTAGCCGGGTGTGATG GCACATGCCTGTAGTCCCAGCTACTCGGGAGGCTGAGACGGGAGAATCACTTGAACCTGG GAGGTGGAGGTTGCAGTGAGCTGAGACTGTGCCATTGCACTCCAGTCTGGGTGACAGAGT GAGACTCTGTCTCAAAAAAATAAAATAAATAAATAAATCAAAAGGGTTGGCCAGGCAAAG TGGCTCACGCCTGTAATCCCAACACTTTGGGAGGCCGAGGAGGGCAGATCACCTGAGGTC AGAAGTTCGAGACCAGCCTGGCCAATATGGTGAAACTCTGTCTCTACTAAAAACACAAAA ATTAGTCGGGCGTGGTGGTGGCAGCGTGCACCTGTAATCCCAGCTACTTGGGAGGCTGAG ACAGGAGAATCGCTTGAACCCAGGAGGCAGAGGTTGCAATAAGCCGTGATCTAGCCACTG CATTCCAGCCTGGTCGACAAGAAGGAGACTGCGGCTGGTGCAGTGGCTCATGCCTGTAAT CCCAGCACTTTGGGAGGCCAAGGTGGGAGGATCACCTGACGTCTGGAGTTTGAGACCAGC CTGGCCAACATGGTGAAACCCCATCTCTACTAAAAATACAAAATTAGCCGGGCGTGGTGG CAGGTGCCTGTAATCCCAGCTACTCAGGAGGCTGAGGCAGGAGAATTGCTTGAACCTGGG ACGTGGAGGTTGCAGTGAGCCGAGATCGCGCCATTGCACTCCAGCCTGGGGGACAAGAGT GAGACTTCGTCTCAAAAAACAAACAAACGAGACTCCATCTCAAAAACAAAAGTAAATAAA AAATAAAAAGGGACGAAATCAGTTAGCCCTAATAACACTGACTACCCCCAACAGTCGCTC AACCCCAGTGTGAGGCCTCCTTCCCCCAGGCTCCCTTGAGCCCCATCTTACACCAGCTGA AGCTGCAGCTGGACCTTGTGGTCCCGGGTGCTGTCAGGGAAGACAAGGAGGGAGGTGGGG AAGAGGAGGGGAAGGGGAGACCTGACTTTCTCCCTGCCCAGGCTGAGTTCGCTGTCACCT CGGGTCCCCCAGCTCCCAGCCATCCGCCGAGCCGAGTCCAGCAGGTGGCATCGGGGTGCT GGGCGCCAGGGTGAACGTGTATATTTGGTGACGCCGGCGCGCCGACTCAGCGGCCCCCGC CTGGCTGGGGCGAGGTTGGCCCTAGGTCCTAGCGGGGTGGGGAGTACTGAGCAGGCGGCT GGGGCGGACGGACACGTGAGATCGGCCGCACATGGCGCTGGGAGCGTGGCGCGTGCGCGC GGCGAAGCGGAGTGACGTGCACGCGCTGTCTGCGGTCCCGCGCAGGCCCCGTGTGCGCGC 
CCGGCCTTGGACAACAGGCCCGGCTGCCCCGCGGGGGGAACACCCGCGTCGGCCCGCGGG AGGGAGGCCTGAGCGCGCCCCCGAGCGCGTCCCCGAGCTCACGCGGCGGGGCGCGCCCCT CGCACCTGCGGGCGGGCTGGGGCGGGGCCGCGGCTGTCCCCGCCCACCCGGGCCCGAGCC CGGGGAGCCGGGGCGGAACCGAGCGGCGAGGCCCGAGCGGCCGGAGCGCGGCGCGGCGCA GACAATGGGAGCGGCGCTGGCGGCTGCCGGGGCGGCCCCGAGGGCCGCAGAGTCCTGGGC CCGGCGGGGACCGGAGGCCGCGCCATGAGCCCCGCAGCCGGGCGCACCCCTGCGCCGCGC GCGGGCCCAGGCGAGCGGCCTCTGCGAGCCCCGGGTCCCGCCCTGGGGCCGGCGATGTGC CACCGAGGCTGAGGATGATGGTAGATTGCCAG
Pathways
Blast tabulated output
Columns:
- Identity of query sequence
- Identity of subject sequence (matching sequence in database)
- Percent identity
- Alignment length
- Number of mismatches
- Number of gaps
- Start of query sequence
- End of query sequence
- Start of subject sequence
- End of subject sequence
- E-value
- Bit-score
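A minimal Python sketch for reading this 12-column format, assuming tab-separated fields in the order listed above and that lines starting with `#` are comments. The `BlastHit` field names and the `parse_blast_tabular` helper are my own labels for illustration, not part of BLAST itself.

```python
from typing import Iterable, Iterator, NamedTuple


class BlastHit(NamedTuple):
    """One row of BLAST tabular output; field names are illustrative."""
    query_id: str
    subject_id: str
    percent_identity: float
    alignment_length: int
    mismatches: int
    gaps: int
    query_start: int
    query_end: int
    subject_start: int
    subject_end: int
    evalue: float
    bit_score: float


def parse_blast_tabular(lines: Iterable[str]) -> Iterator[BlastHit]:
    """Yield one BlastHit per data line, skipping blanks and '#' comments."""
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) != 12:
            raise ValueError(f"expected 12 columns, got {len(fields)}: {line!r}")
        yield BlastHit(
            fields[0],
            fields[1],
            float(fields[2]),
            int(fields[3]),
            int(fields[4]),
            int(fields[5]),
            int(fields[6]),
            int(fields[7]),
            int(fields[8]),
            int(fields[9]),
            float(fields[10]),
            float(fields[11]),
        )


def filter_hits(hits: Iterable[BlastHit], max_evalue: float) -> list:
    """Keep only hits whose E-value is at or below the given threshold."""
    return [hit for hit in hits if hit.evalue <= max_evalue]
```

Typical use would be `filter_hits(parse_blast_tabular(open("hits.tab")), 1e-10)`, where `hits.tab` is a hypothetical file name.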
Phylogeny
tree checks
CONSEL: for assessing the confidence of phylogenetic tree selection Hidetoshi Shimodaira and Masami Hasegawa http://bioinformatics.oxfordjournals.org/cgi/content/abstract/17/12/1246
- for clusters: PVCLUST R package
http://bioinformatics.oxfordjournals.org/cgi/content/full/22/12/1540
tree builders
- BioNJ (NJ trees -> same speed, better accuracy for a large number of taxa)
alignments
- PANDIT database of multiple sequence alignments and phylogenetic trees covering many common protein domains.
genome variants DB
- TCAG curated catalogue of structural variation in the human genome
stuff to incorporate
Protein vs mRNA level differences
Stability of mRNA
- Decapping of mRNA
- Deadenylation
Regulation of translation
- number of ribosomes per mRNA transcript
- binding of specific RBP (RNA binding proteins) to mRNA
Graph visualisation
- [http://www.cytoscape.org/ Cytoscape]
- ONDEX
EST 2 genome alignment
Portugal
motifs again...
chip-chip
- Ringo (R package): http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=1906858
check: mpeak [4], TiMAT http://bdtnp.lbl.gov/TiMAT, MAT [5], TileMap [6], ACME [7], HGMM [8], and ChIPOTle [9]