September 16, 2011
2012 Jan 19
PRC2 core function: suz12, eed, ezh2, jarid2, pcl2/mtf2
function of suz12? suz12gt and wt chip data comparison: 2420 genes lose h3k27me3, 970 genes maintain h3k27me3.
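A minimal sketch of how the lose/maintain split could be computed from two lists of h3k27me3-marked genes. The function name and the toy gene lists are hypothetical; the real lists would come from peak calls on the wt and suz12gt chip data.

```python
def compare_marked_genes(wt_marked, mutant_marked):
    """Split wt-marked genes by whether the mark is still present in the mutant."""
    wt = set(wt_marked)
    mutant = set(mutant_marked)
    return {
        "lost": wt - mutant,        # marked in wt only
        "maintained": wt & mutant,  # marked in both
        "gained": mutant - wt,      # marked in the mutant only
    }


# Toy gene lists, for illustration only (not real results)
wt_genes = ["Hoxd1", "Hoxd3", "Gata4", "Pax6"]
gt_genes = ["Hoxd1", "Pax6", "Sox17"]
result = compare_marked_genes(wt_genes, gt_genes)
for status, genes in result.items():
    print(status, len(genes), sorted(genes))
```

The "gained" set is included only as a sanity check; the notes above track lost and maintained genes.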
pick prc2 components from Joe and other RNA-seq and chip-seq datasets
generate a sub-network for all the components
hoxD cluster genes? 1,
suz12gt, suz12gt-bgal-kd , ezh1 kd, ezh2 kd,
Paul 2012 Jan 10
intro active silent poised expression figure
h2a.z interacting partners hmgn2 interaction with h2a.z
a. Use hmgn2 expression in as many tissues as possible b. Use mass spectrum to identify protein interaction partners c.
hmg proteins hmgn1, 2, 3a, 3b, 4, and 5
why hmgn?
hmgn2 chip-seq peaked at tts
how is it calculated? distribution of chip-seq data, percentage of binding sites:
promoter 1000bp 42.4%
promoter 1000-2000bp 2.4%
distal intergenic 19.5%
intron 14.3%
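One way such percentages could be calculated, assuming each peak has already been assigned a single annotation label by some annotation tool. The label names and toy peak list are hypothetical, and this is a sketch rather than the method actually used for the hmgn2 numbers.

```python
from collections import Counter


def annotation_percentages(labels):
    """Return the percentage of peaks falling into each annotation category."""
    counts = Counter(labels)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: round(100.0 * n / total, 1) for label, n in counts.items()}


# Toy peak annotations, for illustration only
peaks = ["promoter_0_1kb"] * 5 + ["intron"] * 2 + ["distal_intergenic"] * 3
pct = annotation_percentages(peaks)
print(pct)
```

Each peak must carry exactly one label for the percentages to sum to 100; how overlapping categories are resolved (e.g. promoter vs. intron) is a choice of the annotation step.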
h2a.z knock down chip-seq analysis: use chip-seq - great for enrichment analysis; k-means clustering, use 6 or 7 clusters, genes vs tss distance; enrichment analysis of clusters not done yet.
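A small sketch of k-means clustering on per-gene signal profiles (one value per tss-distance bin). This is a plain standard-library version of Lloyd's algorithm written for illustration; the function name, seed, and toy profiles are assumptions, and a real analysis would more likely use an established library with k set to 6 or 7.

```python
import random


def kmeans(points, k, n_iter=100, seed=0):
    """Cluster equal-length numeric vectors with Lloyd's algorithm."""
    rng = random.Random(seed)
    # Start from k distinct data points chosen at random
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = [None] * len(points)
    for _ in range(n_iter):
        # Assign every point to its nearest centroid (squared Euclidean distance)
        new_labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Move each centroid to the mean of its members; keep it if the cluster is empty
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Toy profiles (two bins per gene), for illustration only
profiles = [(0, 0), (1, 1), (2, 2), (100, 100), (101, 101), (102, 102)]
labels, centroids = kmeans(profiles, k=2)
print(labels)
```

Cluster numbering depends on the random start, so downstream enrichment analysis should refer to cluster membership rather than to a specific label value.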
a small compendium of chip-seq cluster analysis? try mutual information
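A sketch of the mutual information idea, assuming the comparison is between two discrete labelings of the same genes (for example, cluster assignments from two chip-seq datasets). The function name and toy labels are hypothetical.

```python
import math
from collections import Counter


def mutual_information(x, y):
    """Mutual information (in bits) between two equal-length label sequences."""
    n = len(x)
    px = Counter(x)
    py = Counter(y)
    pxy = Counter(zip(x, y))
    mi = 0.0
    for (a, b), c in pxy.items():
        # p(a,b) * log2( p(a,b) / (p(a) * p(b)) ), written with raw counts
        mi += (c / n) * math.log2(c * n / (px[a] * py[b]))
    return mi


# Toy cluster labels for four genes, for illustration only
same = mutual_information([0, 0, 1, 1], [0, 0, 1, 1])
unrelated = mutual_information([0, 0, 1, 1], [0, 1, 0, 1])
print(same, unrelated)
```

Identical labelings of two balanced clusters give 1 bit, and independent labelings give 0, which makes the score usable for ranking pairs of datasets by how much their clusterings agree.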
meet with Paul and the team to answer questions: consequence of h2a.z and hmgn? mechanism of h2a.z and hmgn, tss? promoter? time or location? confirmation, take a target set to do pcr?
August 30, 2011
Identify differentially regulated lncRNAs during cardiomyocyte differentiation using the sequencing files from the cardiomyocyte differentiation time course. Move forward in further analyzing these data to create essentially Figure 1 of the next paper.
Global analysis of differential lncRNA expression during cardiomyocyte differentiation.
The lncRNA annotation file, compiled from all lncRNAs contained in ENSEMBL and from the Guttman et al. Nature paper, can be used to identify lncRNAs in the cardiomyocyte expression data sets.
Some ideas for moving forward:
- Cluster lncRNA data to find stage-specific expression patterns for the lncRNAs. What is the best representation for this?
- Cluster lncRNA with the rest of the expression data to determine broader clusters.
- Determine potential pathways (GO, Ingenuity, Gene Set Enrichment Analysis?) based on the broader clusters.
- Compare chromatin patterns to the lncRNA expression cluster data (we have these data as part of the cardiac consortium)
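One simple representation for stage-specific patterns is to z-score each lncRNA across the time course and label it by the stage where it peaks. The sketch below assumes FPKM values at D0, D4, D5.3, and D10; the gene names and values are hypothetical.

```python
from statistics import mean, pstdev

STAGES = ["D0", "D4", "D5.3", "D10"]


def zscore(values):
    """Scale one expression profile to mean 0 and standard deviation 1."""
    mu = mean(values)
    sd = pstdev(values)
    if sd == 0:
        return [0.0] * len(values)  # flat profile, no stage preference
    return [(v - mu) / sd for v in values]


def peak_stage(fpkm_by_gene, stages=STAGES):
    """Map each gene to the stage with its highest z-score (None if flat)."""
    result = {}
    for gene, values in fpkm_by_gene.items():
        z = zscore(values)
        result[gene] = stages[z.index(max(z))] if any(z) else None
    return result


# Toy FPKM profiles, for illustration only
toy = {"lncA": [10, 1, 1, 1], "lncB": [0, 0, 5, 20], "lncC": [3, 3, 3, 3]}
stage_of = peak_stage(toy)
print(stage_of)
```

The z-scored profiles can also be fed directly into a clustering step or drawn as a heat map with rows ordered by peak stage.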
Use these data to identify candidate lncRNAs for further genetic analysis and to derive informative data from the expression analysis in order to learn more about the function of these lncs and to possibly learn more about their regulation.
August 29, 2011
EB differentiation time course
A report was delivered on Aug. 29, 2011. The missing D4 experiment will be conducted and its dataset included in the next round of analysis.
August 16, 2011
EB differentiation time course
Analyze the RNA-Seq data generated for the lncRNA (lnc011) knockdown (kd) and control ESC lines (0d) that were differentiated into EBs (6d and 9d).
The goal is to display the differentially expressed genes in a figure and to further analyze the genes that are mis-regulated as a function of the kd relative to the control (scrambled). A heat map or a more informative presentation of the data is expected.
Learn more about how the lncRNA (lnc11) may control cell fate specification by "plotting" the expression of the mis-regulated genes (in the GO categories) along the cardiomyocyte differentiation pathway using the RNA-Seq data for the various time points (namely D0, D4, D5.3, and D10).
Tables attached including fold change relative to control.
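For reference, a sketch of how fold change relative to control could be computed from FPKM values and binned as up, down, or unchanged. The pseudocount of 1 is an assumption added here to avoid division by zero, the default 1.5x threshold mirrors the cut-off used in the preliminary GO analysis, and the function names are hypothetical; the attached tables may have been produced differently.

```python
import math


def log2_fold_change(kd_fpkm, control_fpkm, pseudocount=1.0):
    """log2 of kd over control, with a pseudocount (assumption) on both values."""
    return math.log2((kd_fpkm + pseudocount) / (control_fpkm + pseudocount))


def classify(kd_fpkm, control_fpkm, cutoff=1.5, pseudocount=1.0):
    """Label a gene as up, down, or unchanged at the given fold-change cutoff."""
    lfc = log2_fold_change(kd_fpkm, control_fpkm, pseudocount)
    threshold = math.log2(cutoff)
    if lfc >= threshold:
        return "up"
    if lfc <= -threshold:
        return "down"
    return "unchanged"


# Toy FPKM values, for illustration only
print(log2_fold_change(3, 1))                               # log2(4/2)
print(classify(3, 1), classify(1, 3), classify(5, 5))
```

Working on the log2 scale keeps up- and down-regulation symmetric around zero, which is convenient for heat map color scales.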
The file (all.ComparisonExpn.txt) contains counts and FPKM values that were generated from the sequencing files.
The big spreadsheet contains all the tests for differential expression. There are some semi-redundant columns for counts and FPKMs. If there are no reads for d0 and d4, there won't be any counts or a stat test, but if there are reads for d5.3, then day0 and day4 will be shown as 0 and a stat test will be conducted.
Ideally generate a GO-type figure and gene network figure possibly by selecting the genes included in over-represented GO categories. The preliminary GO analysis via GOStat (attached) using a 1.5x cut-off for "down-regulated genes" shows enrichment for categories that have roles in heart function including muscle contraction, sarcomere function, heart development, blood circulation, etc.
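To make "over-represented GO categories" concrete, the sketch below computes a one-sided hypergeometric p-value for one category. This is a generic over-representation test written for illustration, not necessarily the statistic GOStat applies; the function name and the toy numbers are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(n_background, n_category, n_selected, n_overlap):
    """P(overlap >= n_overlap) when n_selected genes are drawn without replacement.

    n_background: all genes considered
    n_category:   background genes annotated to the GO category
    n_selected:   genes in the list of interest (e.g. down-regulated genes)
    n_overlap:    selected genes annotated to the category
    """
    total = comb(n_background, n_selected)
    upper = min(n_category, n_selected)
    tail = sum(
        comb(n_category, k) * comb(n_background - n_category, n_selected - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Toy numbers, for illustration only: 10 genes, 5 in the category, 2 selected, both in it
p = hypergeom_enrichment_p(10, 5, 2, 2)
print(p)
```

With many GO categories tested at once, the resulting p-values would need a multiple-testing correction before categories are chosen for the figure.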
The report was delivered on Aug. 29, 2011. The missing D4 experiment will be conducted and its dataset included in the next round of analysis.