BioMicroCenter:RNA LIB

Wang ET, et al. Nature 2008

The BioMicro Center supports a broad variety of standard library preparation methods for RNAseq. The choice of method depends heavily on the type of input, the amount of input RNA available, and the quality of the input RNA. The key in all RNAseq methods is the avoidance of ribosomal RNA, which would otherwise dominate the library. Below is a summary of the methods we use routinely in the core. For High-Throughput RNA library preparation, please check out our new page for methods designed specifically for large sample batches.

Please note some methods are currently in transition to try to improve data quality and reduce library preparation costs.

Amount of RNA | Quality of RNA | Method Recommended
>25ng         | RIN:9.0        | Kapa Hyperprep
              |                | Illumina NeoPrep or Illumina Truseq (>100ng) - Retired 7/17
>1ug          | RIN:5.0        | Kapa RiboErase
              |                | Epicenter RiboZero - only for non-human/mouse
10pg-25ng     | RIN:9.0        | Clontech SMARTer v4
1ng-1ug       | RIN:5.0        | Clontech Pico Ribosomal Depletion (ZapR)
smallRNA      | NA             | BIOO SmallRNA Kit or Qiagen miRNA kit
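
As a rough illustration, the table above can be read as a lookup from input amount and RIN to candidate methods. The sketch below is only one possible reading: the function name `recommend_method` is hypothetical, the RIN values are assumed to be minimum quality thresholds, the range boundaries are assumed to be inclusive, and the retired NeoPrep/TruSeq option is left out. Please consult the core before choosing a method.

```python
def recommend_method(input_ng, rin=None, small_rna=False):
    """Return the candidate methods from the summary table for one sample.

    Assumptions (not stated in the table): RIN values are minimum
    thresholds and range boundaries are inclusive.
    """
    if small_rna:
        return ["BIOO SmallRNA Kit or Qiagen miRNA kit"]
    if rin is None:
        raise ValueError("a RIN value is needed for non-smallRNA samples")

    methods = []
    if rin >= 9.0:
        if input_ng > 25:
            methods.append("Kapa Hyperprep")
        elif input_ng >= 0.01:  # 10 pg expressed in ng
            methods.append("Clontech SMARTer v4")
    if rin >= 5.0:
        if input_ng > 1000:  # more than 1 ug
            methods.append(
                "Kapa RiboErase (human/mouse) or Epicenter RiboZero (other species)"
            )
        if 1 <= input_ng <= 1000:
            methods.append("Clontech Pico Ribosomal Depletion (ZapR)")
    return methods


# Hypothetical samples: (input in ng, RIN)
for amount, quality in [(500, 9.5), (0.1, 9.2), (2000, 6.0)]:
    print(amount, quality, recommend_method(amount, quality))
```

A sample can match more than one row (for example, a high-quality 500ng sample matches both the polyA and the ribosomal depletion rows), so the function returns a list rather than a single answer.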



Kapa mRNA Hyperprep

With the retirement of the NeoPrep, the BioMicro Center is converting standard library preparation over to the Kapa mRNA HyperPrep kit. This workflow is very similar to Illumina's TruSeq chemistry but costs less and is streamlined for automation. This chemistry uses polyT beads to isolate the mRNA from the rRNA and tRNA. The use of these beads requires that the RNA be of very high quality; otherwise only the 3' end of transcripts will be isolated. Purified mRNA is then fragmented with metal, and random priming is used to convert the sample to cDNA. Once double-stranded cDNA is generated, ligation-mediated PCR (LMPCR) is performed to create the indexed Illumina library. The BioMicro Center will be offering mRNA HyperPrep as a single sample reaction or in batches of 24 done on the TecanEvo 150s. We are also exploring reduced volume reactions using the Mosquito HV to further reduce costs.

TruSeq Chemistry
Sample Data

Kapa RiboErase & EpiCenter RiboZero

For samples with degraded RNA, or samples where you are interested in looking at non-polyA RNAs, the BioMicro Center utilizes the Kapa RNA RiboErase kit for Human/Mouse samples and the Epicenter RiboZero kit for other species. RiboErase uses RNase H to degrade rRNAs, while RiboZero uses magnetic beads coupled to rRNA sequences to remove these sequences from the solution. The remaining RNA fragments can then be converted into cDNA and prepared using the Kapa mRNA Hyperprep kit to produce the Illumina library.

Clontech SMARTseq Low-Input

For samples with less than 50ng of input, the BioMicro Center utilizes the Clontech SMARTseq v4 system. This system differs from the TruSeq chemistry in that it begins with cDNA generation using polyT priming followed by strand-switching oligos. The use of polyT priming requires the RNA to be of high quality. Full-length double-stranded cDNAs are generated and amplified by PCR. These cDNAs are then prepared into Illumina libraries using the NexteraXT chemistry from Illumina. Data from this system is of similar quality to samples created with Illumina TruSeq chemistry. Single samples can be prepared by hand. Batches of 24, 96 or 384 samples can be prepared using the Mosquito HV as a 1/8th reaction, resulting in significantly lower costs per sample.

Clontech system. Image from Clontech.

Clontech SMARTer Stranded Total RNA-Seq Kit - Pico Input Mammalian -- aka Clontech ZapR

For samples with less than 100ng of input, our kit of choice is the Clontech SMARTer Stranded Total RNAseq Kit - Pico Input, or more simply, Clontech ZapR. This kit utilizes the same template switching as the v4 kit but uses random primers on fragmented RNA. The key is the ZapR enzyme, which is used after library production to make targeted breaks in Illumina library molecules that contain rRNA sequence. These breaks make the rRNA-containing molecules unreadable. Currently this chemistry is only available as single samples, but we are working to adapt it to the Mosquito HV system.

Additional Chemistries Available in the BioMicro Center

Size Selection

For some applications of RNAseq, such as splice choice determination, having a precise knowledge of the insert size is critical. While the SPRIworks does provide some size selection (typically restricting fragments to between 150 and 350bp), this can be too wide for some methodologies. In these cases, after libraries are amplified, they can be run on the Sage BluePippin (either singly or pooled). Here the size distribution can be much tighter, with most of the DNA fragments being within a 50nt range.
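
One way to check how tight a size selection actually is would be to find the narrowest length window that holds a chosen fraction of the fragments. The sketch below is illustrative only: `narrowest_window` is a hypothetical helper, the fragment lengths are made up, and in practice the lengths would come from an instrument trace or from paired-end alignments.

```python
import math


def narrowest_window(lengths, fraction=0.5):
    """Return (low, high) for the narrowest span of fragment lengths
    that contains at least `fraction` of the fragments."""
    if not lengths:
        raise ValueError("no fragment lengths given")
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    ordered = sorted(lengths)
    needed = max(1, math.ceil(fraction * len(ordered)))
    best = None
    # Slide a window of `needed` consecutive fragments over the sorted lengths.
    for i in range(len(ordered) - needed + 1):
        low, high = ordered[i], ordered[i + needed - 1]
        if best is None or high - low < best[1] - best[0]:
            best = (low, high)
    return best


# Made-up insert sizes in nt
sizes = [200, 240, 250, 255, 260, 270, 300, 340]
low, high = narrowest_window(sizes, fraction=0.75)
print(low, high, "within 50nt" if high - low <= 50 else "wider than 50nt")
```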

Comparison of the RNAseq methods

The BioMicro Center has done testing in head-to-head competitions of the TruSeq, NuGEN v2 and Clontech kits. These data were presented at AGBT 2012 as part of a poster. The authors were: Avanti Shrikumar, Zachary Banks, Manlin Luo, Ryan Sinapius, Paola Favaretto, Jessica Hurt, Chris Burge, and Stuart S. Levine. Selections of the poster are shown below:

Summary

Sequencing of the transcriptome (RNAseq) has become an increasingly important tool in the molecular biology toolkit and is rapidly replacing microarrays as the primary method for determining genome-wide expression levels. Several vendors have created pre-packaged kits for creating RNAseq libraries for Illumina sequencing. These kits differ significantly in methodology and in the amounts of input required. Here we provide a head-to-head test of five different RNAseq kits in a core setting. The kits were evaluated on two experimental samples with similar expression patterns, murine embryonic stem cells and the same cells with a single factor knocked down by RNAi, to determine the sensitivity of each method. Each kit was additionally evaluated across three different concentrations of RNA input. We found that the different methodologies show different RPKM levels for each transcript and also vary in their technical reproducibility. The different methods resulted in small but largely distinct lists of differentially expressed genes, which we compared to genes with known expression changes.

AGBT12 1.png

Kit Type Home Made (Burge Lab) Illumina TruSeq NuGen v1 NuGen v2 Clontech SMARTer
1ug X X
100ng X X
10ng X* X X X
1ng X* X X
0.1ng X* X
Experimental Design: ES cells were transfected with siRNA targeting a splicing factor or a control siRNA. The tested splicing factor normally blocks inclusion of specific exons that had been previously identified by RT-PCR. Reduction of the splicing factor’s levels should lead to an increase in the amounts of these specific transcripts. RNA was collected from the cells and analyzed by RNA-seq. Samples were sequenced to a depth of at least 7.5 million reads of 40nt length on either a GAIIx or a HiSeq2000.

Testing Matrix: A single control sample and a single splicing factor knockdown sample were serially diluted. Aliquots of the diluted samples were tested against the 5 methods to determine the ability of each kit to identify differentially expressed genes in a biologically challenging situation, as well as to assess each kit's sensitivity to different input amounts and to quantify the amount of technical variation. * indicates the amount tested was below the minimum recommended input.
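
The input amounts in the testing matrix (1ug down to 0.1ng) step down tenfold at each level. A minimal sketch of generating such a series is shown here; `dilution_series` is a hypothetical helper, and the tenfold step is inferred from the amounts listed in the matrix rather than stated on the poster.

```python
def dilution_series(start_ng, fold=10, steps=5):
    """Return the input amounts (in ng) for a serial dilution.

    start_ng: amount at the first level
    fold:     dilution factor between consecutive levels
    steps:    number of levels, including the starting amount
    """
    if fold <= 1 or steps < 1:
        raise ValueError("fold must be > 1 and steps must be >= 1")
    return [start_ng / fold ** i for i in range(steps)]


# Starting from 1 ug (1000 ng), five tenfold levels
for amount in dilution_series(1000):
    print(f"{amount:g} ng")
```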
AGBT12 2.png
AGBT12 3.png
AGBT12 4.png
Fraction of reads aligned: Reads aligning once or multiple times to the mouse genome (mm9) are shown. Reads were aligned with Bowtie. NuGen samples show an increase in non-aligned reads at low input amounts.

3’ Bias: Read densities were calculated along the exonic portions of each transcript. Transcripts were grouped by length and plotted as metagenes with both 5’ and 3’ ends locked. Clear 3’ bias can be observed in the samples processed using the Clontech kit.
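
To give a feel for how a metagene profile exposes 3’ bias, here is a simplified sketch. It is not the analysis used for the poster: instead of grouping transcripts by length with both ends locked, it rescales every transcript to the same number of bins from 5’ to 3’. The function name `metagene_profile` and the coverage values are hypothetical.

```python
def metagene_profile(coverages, n_bins=10):
    """Average relative coverage in n_bins equal slices from 5' to 3'.

    coverages: list of per-base read depth lists, one per transcript,
               ordered 5' to 3' along the exonic sequence.
    Each transcript is normalized by its own mean depth, so a value of
    1.0 in a bin means "as covered as the transcript average".
    """
    totals = [0.0] * n_bins
    used = 0
    for cov in coverages:
        length = len(cov)
        total = sum(cov)
        if length < n_bins or total == 0:
            continue  # too short to bin, or no reads at all
        mean_depth = total / length
        for b in range(n_bins):
            start = b * length // n_bins
            end = (b + 1) * length // n_bins
            bin_mean = sum(cov[start:end]) / (end - start)
            totals[b] += bin_mean / mean_depth
        used += 1
    if used == 0:
        raise ValueError("no usable transcripts")
    return [t / used for t in totals]


# Made-up transcripts: one evenly covered, one with depth rising toward the 3' end
even = [5] * 100
biased = list(range(100))
print(metagene_profile([even]))
print(metagene_profile([biased]))
```

In a profile like this, values rising from the first bin to the last indicate 3’ bias, while an even library stays close to 1.0 across all bins.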

Variation in RPKM score across the transcript: (right) The evenness of coverage within the transcript was measured by comparing the read density in the two largest exons. Only exons greater than 200nt and genes with RPKMs over 10 were included in this analysis (n = 1,876 genes including 3,676 exons). (left) The average number of exons in the above data set with very low coverage (<5 reads) is shown.
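
For readers unfamiliar with the metric, RPKM is reads per kilobase of transcript per million mapped reads. The sketch below computes it and then compares read density in the two largest exons of a gene. Expressing the comparison as a ratio is an assumption, since the poster does not say how the two densities were compared; the function names and the example numbers are hypothetical.

```python
def rpkm(read_count, length_nt, total_mapped_reads):
    """Reads per kilobase of transcript per million mapped reads."""
    if length_nt <= 0 or total_mapped_reads <= 0:
        raise ValueError("length and total mapped reads must be positive")
    return read_count * 1e9 / (length_nt * total_mapped_reads)


def largest_exon_density_ratio(exons, min_exon_nt=200):
    """Ratio of read density (reads per nt) between the two largest exons.

    exons: list of (length_nt, read_count) tuples for one gene.
    Exons of min_exon_nt or shorter are ignored. Returns None when fewer
    than two exons qualify or the second exon has no reads.
    """
    eligible = sorted((e for e in exons if e[0] > min_exon_nt), reverse=True)
    if len(eligible) < 2:
        return None
    (len_a, reads_a), (len_b, reads_b) = eligible[:2]
    density_a = reads_a / len_a
    density_b = reads_b / len_b
    if density_b == 0:
        return None
    return density_a / density_b


# Made-up gene: 1,500 nt of exon, 150 reads, 7.5 million mapped reads
gene_rpkm = rpkm(150, 1500, 7_500_000)
exons = [(1000, 120), (350, 21), (150, 9)]  # (length_nt, read_count)
if gene_rpkm > 10:  # same expression cutoff as in the analysis above
    print(round(gene_rpkm, 2), largest_exon_density_ratio(exons))
```

A ratio near 1 means the two exons are covered evenly; values far from 1 point to uneven coverage within the transcript.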

Choosing a read length and read depth

RNAseq References

A few notable references. Please feel free to add more:

<biblio>

  1. Paper1 pmid=18978772
  2. Paper2 pmid=19015660
  3. Paper3 pmid=18550803
  4. Paper4 pmid=20711195

</biblio>