User talk:Darek Kedra/sandbox 11

=Software=

 * AYB is a base caller for the Illumina GA II platform (initial release 21 May 2009). No publication yet (Dec 2009).
 * language: R with C helper functions
 * installation: OK, but requires editing the config files (location of /usr/lib64/R and location of the final install directory)
 * to run:

run_ayb.sh [-nc=45] -prefix2=output_dir -prefix3=intensity_dir -tile=tile_prefix [-compression=gzip] [-matrix=/path/to/matrix.txt] [-mpi=5] [-I] [-niter=5] [-paired] [-saveR] [-tol=1e-5]

where:
 * compression: gzip, bzip2 or none. With "gzip" the intensity files are *_int.txt.gz, with "bzip2" they are *_int.txt.bz2, with "none" they are *_int.txt (default "none").
 * I: Read intensities in IPAR format (the number of cycles must be given).
 * matrix: Use a predetermined phasing matrix (e.g. the one estimated by the Illumina pipeline). Switches off cross-talk estimation; faster, but may give worse results (optional).
 * mpi: Use MPI to run on multiple processors. The value is the number of processors; if omitted, the number available on the computer is used (optional).
 * nc: Number of cycles to analyse; should be less than or equal to the number of cycles in the intensity file (no default, required for IPAR).
 * niter: Number of full tolerance iterations to do (default 5).
 * paired: Treat the read as paired-end and split it into two reads of length nc/2 (optional).
 * prefix2: Path to the directory in which output files are created (default "").
 * prefix3: Path to the directory from which the intensity files are read (default "").
 * saveR: Save the final R data structures to tile.RData (optional).
 * tile: Prefix of the tile, e.g. s_1_0015. Filenames are completed automatically, so -tile=s_1 does all of lane 1 and -tile=s_1_00 does the first 99 tiles (no default, required).
 * tol: Tolerance for iterations (default 1e-5).
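
The filename completion described for the tile option can be pictured with a short Python sketch. This is only an illustration of the prefix-matching behaviour as described in the option text, not AYB's own code; the function name matching_tiles and the example filenames are hypothetical, and the suffixes are the ones listed under the compression option.

```python
# Sketch of tile-prefix completion as described for the tile option.
# Assumption: intensity files are named <tile>_int.txt[.gz|.bz2].
# matching_tiles is a hypothetical helper, not part of AYB.

SUFFIXES = {
    "none": "_int.txt",
    "gzip": "_int.txt.gz",
    "bzip2": "_int.txt.bz2",
}


def matching_tiles(filenames, tile_prefix, compression="none"):
    """Return the sorted tile names whose intensity file matches the prefix."""
    suffix = SUFFIXES[compression]
    tiles = []
    for name in filenames:
        if name.startswith(tile_prefix) and name.endswith(suffix):
            # Strip the suffix to recover the tile name, e.g. s_1_0015.
            tiles.append(name[: -len(suffix)])
    return sorted(tiles)


if __name__ == "__main__":
    files = [
        "s_1_0001_int.txt.gz",
        "s_1_0015_int.txt.gz",
        "s_1_0100_int.txt.gz",
        "s_2_0001_int.txt.gz",
    ]
    # A prefix of s_1_00 selects only tiles 0001 to 0099 of lane 1.
    print(matching_tiles(files, "s_1_00", "gzip"))
```

With a four-digit tile number, the prefix s_1_00 can only match tiles 0001 to 0099, which is why it selects the first 99 tiles of lane 1.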


 * BayesCall
 * current version: 0.3 (naive) bayesCall (speedup)
 * W. C. Kao, K. Stevens, and Y. S. Song, “BayesCall: A model-based base-calling algorithm for high-throughput short-read sequencing,” Genome Research 19, no. 10 (2009): 1884.
 * requirements:
 * GNU Scientific Library (GSL) (version >= 1.12). Note that a CBLAS library might be required by GSL.
 * Python (version >= 2.5)
 * SciPy (version >= 0.7)
 * NumPy (version >= 1.3)
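
The minimum versions listed above can be checked with a small comparison helper. The following Python sketch is not part of BayesCall; parse_version and meets_minimum are hypothetical names, and only the minimum version numbers come from the requirements list. Comparing the dotted components numerically matters here, because a plain string comparison would rank "1.9" above "1.12".

```python
# Sketch: compare an installed version string against a required minimum.
# parse_version and meets_minimum are hypothetical helpers for illustration.
import platform

# Minimum versions taken from the requirements list above.
MINIMUMS = {"GSL": "1.12", "Python": "2.5", "SciPy": "0.7", "NumPy": "1.3"}


def parse_version(text):
    """Turn '1.12' or '2.5.1rc1' into a tuple of ints, e.g. (1, 12) or (2, 5, 1)."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if ch.isdigit():
                digits += ch
            else:
                break  # ignore trailing tags such as 'rc1'
        parts.append(int(digits) if digits else 0)
    return tuple(parts)


def meets_minimum(installed, minimum):
    """True if the installed version is at least the minimum version."""
    a = parse_version(installed)
    b = parse_version(minimum)
    width = max(len(a), len(b))
    # Pad with zeros so that '1.3' and '1.3.0' compare as equal.
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


if __name__ == "__main__":
    # Check the running Python interpreter against the listed minimum.
    print(meets_minimum(platform.python_version(), MINIMUMS["Python"]))
```

The installed GSL version string would have to be obtained separately and passed in as text.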


 * SWIFT from Sanger


 * requirements (packages):
 * gsl gsl-dev, libtiff libtiff-dev, fftw3 fftw3-dev


 * input: image data
 * output: base calls (format?)


 * Rolexa R package
 * installation from Bioconductor (start R, then run):

source("http://bioconductor.org/biocLite.R")
biocLite("Rolexa")

 * requirements
 * mclust from http://cran.r-project.org/web/packages/mclust/index.html
 * fork from http://cran.r-project.org/web/packages/fork/index.html
 * installation from source tarballs:

R CMD INSTALL mclust_3.3.2.tar.gz
R CMD INSTALL fork_1.2.2.tar.gz
R CMD INSTALL Rolexa_1.2.0.tar.gz


 * running instructions (a bit complicated): http://www.bioconductor.org/packages/2.5/bioc/vignettes/Rolexa/inst/doc/Rolexa-vignette.pdf
 * output: base calls from an alternative probabilistic base-calling method based on the fluorescence intensity quantifications; ambiguous bases are coded with the extended IUPAC alphabet
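
To illustrate what coding ambiguous bases with the extended IUPAC alphabet means, here is a minimal Python sketch. It is not Rolexa's algorithm: the function iupac_call and its cumulative-probability cutoff rule are assumptions made for illustration. Only the IUPAC code table itself is standard.

```python
# Sketch: turn per-base probabilities into a single IUPAC symbol.
# iupac_call and its cutoff rule are hypothetical, not Rolexa's method.

# Standard IUPAC nucleotide codes for each set of possible bases.
IUPAC = {
    frozenset("A"): "A", frozenset("C"): "C",
    frozenset("G"): "G", frozenset("T"): "T",
    frozenset("AG"): "R", frozenset("CT"): "Y",
    frozenset("CG"): "S", frozenset("AT"): "W",
    frozenset("GT"): "K", frozenset("AC"): "M",
    frozenset("CGT"): "B", frozenset("AGT"): "D",
    frozenset("ACT"): "H", frozenset("ACG"): "V",
    frozenset("ACGT"): "N",
}


def iupac_call(probs, cutoff=0.9):
    """Add bases in order of decreasing probability until their total
    probability reaches the cutoff, then return the matching IUPAC code."""
    ranked = sorted(probs.items(), key=lambda item: item[1], reverse=True)
    chosen = set()
    total = 0.0
    for base, p in ranked:
        chosen.add(base)
        total += p
        if total >= cutoff:
            break
    return IUPAC[frozenset(chosen)]


if __name__ == "__main__":
    # A confident call stays a plain base; an A/G ambiguity becomes R.
    print(iupac_call({"A": 0.97, "C": 0.01, "G": 0.01, "T": 0.01}))
    print(iupac_call({"A": 0.50, "G": 0.45, "C": 0.03, "T": 0.02}))
```

The point of such a coding is that a position where two bases are nearly equally likely is reported as a two-base symbol (for example R for A or G) instead of being forced to a single, possibly wrong, base.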


 * Alta-Cyclic: slow!
 * requirements: here