BioMicroCenter:PPR Program

From OpenWetWare

Latest revision as of 05:19, 16 June 2016


PPR PROGRAM

Guaranteeing high-quality next-generation sequencing (NGS) data in a rapidly changing environment is an ongoing challenge. The recent introduction of the Illumina NextSeq500 and the deprecation of specific metrics from Illumina's Sequencing Analysis Viewer (SAV) have made it more difficult to directly determine the baseline error rate of sequencing runs. We have created an open-source tool to construct the Percent Perfect Reads (PPR) plot previously provided by the Illumina sequencers. The PPR program is compatible with HiSeq2000/2500, MiSeq, and NextSeq500 instruments, and provides an alternative to Illumina's Q scores for determining run quality.

  • The PPR program can be downloaded as a tarball (.tgz) file: NGS_sequencing_QC_v1.1.tgz

    The software is designed to be run in a UNIX/LINUX environment.

    Dependencies:

  • FASTX-Toolkit (confirmed for v0.0.13)
  • bowtie2 (confirmed for v2.2.3)
  • bedtools (confirmed for v2.20.1)
  • R (confirmed for v2.15.3)
  • perl (confirmed for v5.20.0)
  • samtools (confirmed for v1.0)

    Test input fastq file and expected output:

    An example HiSeq single-end test fastq input file is available for download as Example.fq.gz. This file is compressed, so decompress it before your test run; otherwise, the tool will not work. The expected output for this test file is shown in ExampleL1.jpg. If your output matches that image, all the dependencies are installed correctly.
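
    Because the tool needs an uncompressed fastq, the test file has to be decompressed first. A minimal sketch of that step, assuming the download is gzip-compressed and ends in .gz; the helper name decompress_fastq is hypothetical:

```shell
# Decompress a gzipped fastq next to the original and print the new path.
# gzip -dc writes to stdout, so the .gz file itself is left untouched.
decompress_fastq() {
    out="${1%.gz}"                      # strip the trailing .gz
    gzip -dc "$1" > "$out" && printf '%s\n' "$out"
}
```

    For example, decompress_fastq /home/ubunto/Example.fq.gz would write /home/ubunto/Example.fq and print that path, which can then be passed to the script.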

    Commands:

    GENERIC:

     cd [CODE DIRECTORY]
     perl ./NGS_missmatch_qc.pl [RUNTYPE] (ABSOLUTE_PATH/FASTQ FILES)
    

    IMPORTANT NOTE:

     The input fastq files cannot be zipped or tarred.
     Each input fastq file must be given as an absolute path.
     For example, if the input file read1.fq is in the current directory /home/ubunto,
     pass /home/ubunto/read1.fq instead of read1.fq.
     If the input file read1.fq is in the parent directory /home,
     pass /home/read1.fq instead of ../read1.fq.
     Otherwise, the script will not work.
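
    The absolute-path rule above can be handled in the shell rather than by hand. This is a sketch under the assumption that the file's directory already exists; abs_path is a hypothetical helper, not part of the PPR package.

```shell
# Print the absolute form of a (possibly relative) file path.
# Uses only cd, pwd, dirname and basename, so it does not need realpath.
abs_path() {
    dir=$(cd "$(dirname "$1")" && pwd -P) || return 1
    # Avoid a doubled slash when the file sits directly under /
    case $dir in /) dir="" ;; esac
    printf '%s/%s\n' "$dir" "$(basename "$1")"
}
```

    With this in place, a relative name can be converted at the call site, for example: perl ./NGS_missmatch_qc.pl nextseq_single_end "$(abs_path ../read1.fq)"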
    


    SPECIFIC:

    For paired-end NextSeq sequencing:

     ./NGS_missmatch_qc.pl nextseq_paired_end absolute_path/read1.fq absolute_path/read2.fq
    

    For single-end NextSeq sequencing:

     ./NGS_missmatch_qc.pl nextseq_single_end absolute_path/read1.fq
    

    For paired-end MiSeq sequencing:

     ./NGS_missmatch_qc.pl miseq_paired_end absolute_path/read1.fq absolute_path/read2.fq
    

    For single-end MiSeq sequencing:

     ./NGS_missmatch_qc.pl miseq_single_end absolute_path/read1.fq
    

    For paired-end HiSeq sequencing:

     ./NGS_missmatch_qc.pl hiseq_paired_end absolute_path/Lane1_1.fq absolute_path/Lane1_2.fq \
     absolute_path/Lane2_1.fq absolute_path/Lane2_2.fq absolute_path/Lane3_1.fq absolute_path/Lane3_2.fq \
     absolute_path/Lane4_1.fq absolute_path/Lane4_2.fq absolute_path/Lane5_1.fq absolute_path/Lane5_2.fq \
     absolute_path/Lane6_1.fq absolute_path/Lane6_2.fq absolute_path/Lane7_1.fq absolute_path/Lane7_2.fq \
     absolute_path/Lane8_1.fq absolute_path/Lane8_2.fq
    

    For single-end HiSeq sequencing:

     ./NGS_missmatch_qc.pl hiseq_single_end absolute_path/Lane1.fq absolute_path/Lane2.fq \
     absolute_path/Lane3.fq absolute_path/Lane4.fq absolute_path/Lane5.fq absolute_path/Lane6.fq \
     absolute_path/Lane7.fq absolute_path/Lane8.fq
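
    The HiSeq commands above take one file (or one pair of files) per lane, which is easy to mistype. The sketch below generates the paired-end argument list for lanes 1 to 8. It assumes the files follow the LaneN_1.fq / LaneN_2.fq naming used in the example and all sit in one directory; hiseq_paired_args is a hypothetical helper.

```shell
# Print the paired-end HiSeq arguments for lanes 1-8, one path per line.
# $1 is the absolute directory that holds the Lane<N>_<read>.fq files.
hiseq_paired_args() {
    lane=1
    while [ "$lane" -le 8 ]; do
        printf '%s/Lane%d_1.fq\n%s/Lane%d_2.fq\n' "$1" "$lane" "$1" "$lane"
        lane=$((lane + 1))
    done
}
```

    The list can then be spliced into the command, for example: ./NGS_missmatch_qc.pl hiseq_paired_end $(hiseq_paired_args /home/ubunto). The unquoted $(...) relies on word splitting, so this only works when the directory name contains no spaces.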
    

    Results:

     The output is a jpg file named after the time of job submission.
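
    Since the output name depends on the submission time, it can be easier to look for the most recently written jpg than to guess the name. This is a minimal sketch; the directory the plot is written to is not stated here, so the directory argument is an assumption, and newest_jpg is a hypothetical helper.

```shell
# Print the most recently modified .jpg in the given directory.
# Prints nothing if the directory contains no .jpg files.
newest_jpg() {
    # ls -t sorts by modification time, newest first
    ls -t "$1"/*.jpg 2>/dev/null | head -n 1
}
```

    For example, newest_jpg "$PWD" run from the code directory after a job finishes. Parsing ls output is fragile for file names containing newlines, which is acceptable for a quick check like this one.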