Kasey E. O'Connor Week 9 Journal

From OpenWetWare

Revision as of 14:57, 2 April 2013

Microarray Data Analysis

Process

For this assignment, I began with the raw GLN3 data. Before analysis, the values had to be scaled and centered so that the replicates could be compared with one another more accurately. To do this, I found the average and standard deviation for each replicate at each time point. Once the data were scaled and centered, I performed the statistical analysis: I computed the average log fold change across the replicates for each time point, then used those values to find the p-value for each gene at every time point. With this, I filtered the genes and counted how many showed a significant change in expression at predetermined p-value cutoffs. This let me see how gene expression changed in response to the cold shock and determine whether there was significant up- or downregulation.
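The steps in this process can be sketched in Python as a rough illustration. The journal does not name the statistical test that was used, so the one-sample t-test against zero below is an assumption, and the closed-form p-value only applies to 3 degrees of freedom (four replicates). All function names and data values are hypothetical and are not taken from the GLN3 dataset.

```python
import math
import statistics


def scale_and_center(column):
    """Subtract the column mean and divide by the column standard deviation."""
    mean = statistics.mean(column)
    stdev = statistics.stdev(column)  # sample standard deviation, like Excel's STDEV
    return [(value - mean) / stdev for value in column]


def avg_log_fc(replicates):
    """Average log fold change of one gene across the replicates of a time point."""
    return statistics.mean(replicates)


def t_statistic(replicates):
    """One-sample t statistic against zero (assumed test, not confirmed by the journal)."""
    n = len(replicates)
    return statistics.mean(replicates) / (statistics.stdev(replicates) / math.sqrt(n))


def two_tailed_p_df3(t):
    """Two-tailed p-value from Student's t distribution with exactly 3 degrees of freedom."""
    x = abs(t) / math.sqrt(3)
    cdf = 0.5 + (x / (1 + x * x) + math.atan(x)) / math.pi
    return 2 * (1 - cdf)


# Hypothetical log ratios for five genes on one replicate chip.
chip = [0.8, -0.3, 1.5, -1.1, 0.2]
scaled = scale_and_center(chip)

# Hypothetical scaled values for one gene across the four replicates of a time point.
gene_replicates = [1.0, 1.5, 0.9, 1.4]
print(avg_log_fc(gene_replicates))
print(two_tailed_p_df3(t_statistic(gene_replicates)))
```

Every value in a column is scaled with the same mean and standard deviation, which is the role the fixed average and standard deviation cells play in the spreadsheet version.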

Questions

  1. The number of replicates for each time point in the data.
    • There were four replicates for each of the time points: t15, t30, t60, t90, and t120.
  2. Why is the use of the dollar sign symbols in front of the number important?
    • The dollar signs in front of the number lock the row, so the equation always refers to the cells that hold the average and standard deviation. Without them, Excel would shift the reference and pull data from the wrong cells as the master equation is copied and pasted down the whole column.
  3. How many genes have p value < 0.05?
    • t15: 781
    • t30: 1539
    • t60: 1559
    • t90: 538
    • t120: 564
  4. What about p < 0.01?
    • t15: 218
    • t30: 456
    • t60: 384
    • t90: 129
    • t120: 114
  5. What about p < 0.001?
    • t15: 21
    • t30: 55
    • t60: 51
    • t90: 9
    • t120: 16
  6. What about p < 0.0001?
    • t15: 1
    • t30: 4
    • t60: 10
    • t90: 3
    • t120: 5
  7. How many of the genes are still significantly changed at p < 0.05 after the Bonferroni correction?
    • t15: 1
    • t30: 0
    • t60: 2
    • t90: 1
    • t120: 0
  8. For time point t60, keeping the "Pval" filter at p < 0.05, filter the "AvgLogFC" column to show all genes with an average log fold change greater than zero. How many meet these two criteria?
    • 760
  9. Keeping the "Pval" filter at p < 0.05, filter the "AvgLogFC" column to show all genes with an average log fold change less than zero. How many meet these two criteria?
    • 799
  10. Keeping the "Pval" filter at p < 0.05, how many have an average log fold change of > 0.25 and p < 0.05?
    • 727
  11. How many have an average log fold change of < -0.25 and p < 0.05?
    • 745
  12. Find NSR1 in your dataset. Is its expression significantly changed at any timepoint? Record the average fold change and p value for NSR1 for each timepoint in your dataset.
    • Average Fold Change
      • t15: 1.2
      • t30: 1.98
      • t60: 1.96
      • t90: -0.75
      • t120: -0.63
    • P-Value
      • t15: 0.0046
      • t30: 0.0180
      • t60: 0.0151
      • t90: 0.0676
      • t120: 0.1061
    • For t15, t30, and t60, the change in expression is significant at the p < 0.05 level; however, only t15 is significant at the p < 0.01 level.
  13. Which gene has the smallest p value in your dataset (at any time point)? Why do you think the cell is changing this gene's expression upon cold shock?
    • SFH5 (YJL145W) has the smallest p-value in the dataset, at time t90. This gene is responsible for protein transport into the plasma membrane, as well as transfer from the Golgi body. It would make sense for this gene to be downregulated during recovery: as the cell recuperates after the cold shock, most of its effort goes into repair within the cell, and there is less need to bring molecules into the cell.
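The threshold checks for NSR1 in the answers above can be reproduced with a short Python sketch that uses the p-values recorded in this journal. The Bonferroni step multiplies each p-value by the number of genes tested and caps the result at 1. The gene count `N_GENES = 6189` is an assumption (the journal does not state how many genes are in the dataset), and the function names are hypothetical.

```python
# NSR1 p-values as recorded in the journal answers.
NSR1_PVALUES = {
    "t15": 0.0046,
    "t30": 0.0180,
    "t60": 0.0151,
    "t90": 0.0676,
    "t120": 0.1061,
}

# Assumed number of genes tested; not stated in the journal.
N_GENES = 6189


def bonferroni(p_value, n_tests):
    """Bonferroni-corrected p-value: multiply by the number of tests, cap at 1."""
    return min(1.0, p_value * n_tests)


def significant_timepoints(pvalues, alpha):
    """Time points whose p-value falls below the cutoff alpha."""
    return [timepoint for timepoint, p in pvalues.items() if p < alpha]


print(significant_timepoints(NSR1_PVALUES, 0.05))  # ['t15', 't30', 't60']
print(significant_timepoints(NSR1_PVALUES, 0.01))  # ['t15']

corrected = {t: bonferroni(p, N_GENES) for t, p in NSR1_PVALUES.items()}
print(significant_timepoints(corrected, 0.05))  # []
```

Under this assumed gene count, a raw p-value has to be below roughly 0.05 / 6189 to survive the correction, which is consistent with only a handful of genes remaining significant at each time point.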

Useful Links