BIOL368/F11:DNA Microarrays: Difference between revisions

From OpenWetWare
(started this page with pasting in text from week 10 assignment)
 
(→‎Week 13: links to vibrio stats protocol)
 
(18 intermediate revisions by the same user not shown)
<!--
{{BIOL368/F11}}
=== Introduction to DNA Microarrays ===
 
<div style="padding: 10px; width: 720px; border: 5px solid #000000;">


==== Read ==== 
== Background ==


* Brown, P.O. & Botstein, D. (1999) [http://www.nature.com/ng/journal/v21/n1s/full/ng0199supp_33.html Exploring the new world of the genome with DNA microarrays] ''Nature Genetics''  21: 33-37.
* Campbell, A.M. and Heyer, L.J. (2003), “Chapter 4:  Basic Research with DNA Microarrays”, in ''Discovering Genomics, Proteomics, and Bioinformatics'', Cold Spring Harbor Laboratory Press, pp. 107-124. ([https://mylmuconnect.lmu.edu/webapps/portal/frameset.jsp?tab_tab_group_id=_2_1&url=%2Fwebapps%2Fblackboard%2Fexecute%2Flauncher%3Ftype%3DCourse%26id%3D_33586_1%26url%3D Available on MyLMUConnect])
* Dahlquist, K.D., Salomonis, N., Vranizan, K., Lawlor, S.C., & Conklin, B.R. (2002) [http://www.nature.com/ng/journal/v31/n1/full/ng0502-19.html GenMAPP, A New Tool for Viewing and Analyzing Microarray Data on Biological Pathways.] Nature Genetics 31: 19-20.
* DeRisi, J.L., Iyer, V.R., and Brown, P.O.  (1997)  [http://www.sciencemag.org/content/278/5338/680.full Exploring the Metabolic and Genetic Control of Gene Expression on a Genomic Scale.]  ''Science'' 278: 680-686.
* [http://genomebiology.com/2003/4/1/R7 Doniger et al. (2003)]
* [http://www.biomedcentral.com/1471-2105/8/217 Salomonis et al. (2007)]
== Groups ==


==== Answer the following Discovery Questions from Chapter 4 ====
* Alex, Bobby, Zeb - ''Staphylococcus aureus'' MRSA252
* Chris, Nicki - ''Mycobacterium smegmatis''
* Isaiah, Sam - ''Helicobacter pylori''
 
== Week 10 ==
 
=== Introduction to DNA Microarrays ===
 
==== Answer the following Discovery Questions from Campbell & Heyer Chapter 4 ====


* Number 5 from p. 110:  Choose two genes from Figure 4.6 (PDF of figures on MyLMUConnect) and draw a graph to represent the change in transcription over time.
** Alex, Bobby, Zeb - ''Staphylococcus aureus'' MRSA252
* You may choose to work ahead towards this presentation by finding your Journal Club article and corresponding microarray dataset with which you will perform your project.  Your task is to find a published microarray dataset that measures gene expression from one of the following species:
** ''Escherichia coli'' K12
** ''Helicobacter pylori''
** ''Plasmodium falciparum''
** ''Pseudomonas aeruginosa''
** ''Saccharomyces cerevisiae'' (yeast)
** ''Salmonella typhimurium''
** ''Staphylococcus aureus'' MRSA252
** In addition, microarray data can sometimes be found as supplementary information with a journal article or on an investigator's own web site.  
* All journal club articles/microarray datasets are subject to approval by the instructor.
-->
 
== Week 11 ==
 
Find your journal club article and microarray dataset and get approval from the instructor if you have not already done so.
 
=== Preparation for Next Week's Journal Club ===
 
In preparation for the Journal Club, each individual will do the following assignment on their individual [[BIOL368/F11:Week 11 | Week 11 Journal page]].
# Make a list of at least 10 biological terms for which you did not know the definitions when you first read the article.  Define each of the terms.  You can use the glossary in any molecular biology, cell biology, or genetics text book as a source for definitions, or you can use one of many available online biological dictionaries.  List the citation(s) for the dictionary(s) you use; a proper citation of a web site includes the URL and the date accessed.
# Write an outline of the article.  The length should be the equivalent of 2 pages of standard 8 1/2 by 11 inch paper.  Your outline can be in any form you choose, but you should utilize the wiki syntax of headers and either numbered or bulleted lists to create it.  The text of the outline does not have to be complete sentences, but it should answer the questions listed below and have enough information so that others can follow it.  However, your outline should be in YOUR OWN WORDS, not copied straight from the article.
#*What is the main result presented in this paper?  (Hint:  look at the last sentence of the introduction and restate it in plain English.)
#*What is the importance or significance of this work?
#*What were the limitations in previous studies that led them to perform this work?
#*What were the methods used in the study?
#** What samples did they collect and use for the microarray experiment?
#** How many microarray chips did they hybridize in the experiment?
#** Which samples were paired to hybridize on the chip?
#** Which was labeled red (Cy5)?  Which was labeled green (Cy3)?
#** How many replicates did they perform of each type?
#*** Biological replicates are made from entirely different biological samples.
#*** Technical replicates are made when one biological sample is split at a particular stage in the procedure and then carried through to the end of the procedure.
#** What do they say about how they performed each of the steps listed in the [[BIOL368/F11:DNA_Microarrays#Overview_of_DNA_Microarray_Analysis | Overview of Microarray Data Analysis]] section below?
#*Briefly state the result shown in each of the figures and tables.
#*How do the results of this study compare to the results of previous studies? (See the Discussion.)
# Upload your completed PowerPoint slides to your journal page by the Week 11 journal deadline. You may make changes before your presentation Tuesday morning, but I will be evaluating the presentation you upload.
 
== Week 12 ==
 
Journal Club 3 Presentations will take place at the beginning of class.
 
=== Overview of DNA Microarray Analysis ===
 
This is a list of steps required to analyze DNA microarray data.
 
# Quantitate the fluorescence signal in each spot in the microarray image.
#* Typically performed by the scanner software, although third party software packages do exist.
#* The image of the microarray slide and this quantitation are considered the "raw-est" form of the data.
#* Ideally, this type of raw data would be made publicly available upon publication. 
#* In practice, the image data is usually not made available because the raw image file of one slide could be up to 100 MB in size.
#* Also, some journals do not require data deposition for publication, so published data are often not available anywhere for download.
#* Microarray data is not centrally located on the web.  Some major sources are:
#** [http://www.ncbi.nlm.nih.gov/geo/ NCBI GEO]
#** [http://www.ebi.ac.uk/microarray-as/ae/ EBI ArrayExpress]
#** [http://smd.stanford.edu/ Stanford Microarray Database]
#** [http://puma.princeton.edu/ PUMAdb (Princeton Microarray Database)]
#** In addition, microarray data can sometimes be found as supplementary information with a journal article or on an investigator's own web site.
# Calculate the ratio of red/green fluorescence
# Log(base 2) transform the ratios
# Normalize the log ratios on each microarray slide
# Normalize the log ratios for a set of slides in an experiment
# Perform statistical analysis on the log ratios
# Compare individual genes with known data
# Look for patterns (expression profiles) in the data (many programs are available to do this)
# Perform Gene Ontology term enrichment analysis (we will use MAPPFinder for this)
# Map onto biological pathways (we will use GenMAPP for this)
 
=== Begin Microarray Data Analysis ===
 
==== Getting to know your microarray data ====
 
The task for this week is to download and organize the microarray data corresponding to your paper to get it ready for analysis next week.
# Go to the [http://www.ebi.ac.uk/arrayexpress/ ArrayExpress] site for your data and select "view all available files".
# Download the following files:
#* Experiment ReadMe
#* raw.zip
#* processed.zip
#* sdrf.xls
#* Array ReadMe
#* adf.xls
# Upload these files to the OpenWetWare wiki and link to them on your individual journal pages.  Only one member of your group needs to upload them; the other members can link to the same files.
# From the methods section of your microarray paper, you need to figure out the following:
#* What samples did they collect and use for the microarray experiment?
#* How many microarray chips did they hybridize in the experiment?
#* Which samples were paired to hybridize on the chip?
#* Which was labeled red (Cy5)?  Which was labeled green (Cy3)?
#* How many replicates did they perform of each type?
#** Biological replicates are made from entirely different biological samples.
#** Technical replicates are made when one biological sample is split at a particular stage in the procedure and then carried through to the end of the procedure.
# Record this information on your individual journal pages.  If you have this from your journal outline, you can copy and paste it into your new journal page.
# Using the sdrf.xls file, you need to then find the names of the files that correspond to the names of your samples from the paper.  Make a list that says which file corresponds to which sample.
# The instructor will then show you which columns of data to copy into a new Master spreadsheet.  You will upload this spreadsheet to the wiki and then link to it from your journal page.
 
== Week 13 ==
 
# Project work session to complete statistical analysis of your microarray data and MAPPFinder analysis.
#* You will use Excel to calculate the average log fold changes and perform a modified t test for your experimental groups.
#** We will be following the protocol on this page (with modifications): [[BIOL398-01/S10:Sample_Microarray_Analysis_Vibrio_cholerae | Sample Microarray Analysis for ''Vibrio cholerae'']].
#** You will then use GenMAPP and MAPPFinder to find out which Gene Ontology terms are over-represented in the data.
#* [[BIOL398-01/S10:GenMAPP and MAPPFinder Protocols | GenMAPP and MAPPFinder Protocols]]
# Your presentation for Week 14 will be formatted similarly to your previous research presentations.
#* Your presentation will be 15 minutes long (approximately 15 slides, one per minute).  Include:
#** Title slide
#** Outline slide
#** Background about your microarray dataset (from your journal club)
#** The experimental design of the microarray experiment
#** A table that lists the number of genes with significant changes in expression at the p value cut-offs of < 0.05, < 0.01, < 0.001, and Bonferroni-corrected p < 0.05.
#** A list of the top 10 "most significant" genes and their functions.
#** Tables that list the MAPPFinder results for increased and decreased gene expression.
#** Discussion and interpretation of your results, in comparison to the original journal club paper.
#* Upload your slides to the OpenWetWare wiki by the Week 13 journal assignment deadline.  You may make changes to your slides in advance of your presentation, but you will be graded on what you upload by the journal deadline.
#** Use these [[Media:PresentationGuidelines.ppt‎ | Presentation Guidelines]] when preparing your PowerPoint slides.
#** Your presentation will also be graded on the following [[Media:PresentationCritiques.pdf | Guide to critiquing talks]].
<!--** For the groups working with ''Arabidopsis thaliana'', ''Staphylococcus aureus'', and ''Vibrio cholerae'', you will need to download the GenMAPP Gene Database for those species from [http://sourceforge.net/projects/xmlpipedb/files/ the XMLPipeDB SourceForge site].-->
 
== Week 14 ==
 
* Final project presentations in class.
* Course evaluations
 
</div>

Latest revision as of 12:31, 30 November 2011

BIOL368: Bioinformatics Laboratory

Loyola Marymount University


Background

Groups

  • Alex, Bobby, Zeb - Staphylococcus aureus MRSA252
  • Chris, Nicki - Mycobacterium smegmatis
  • Isaiah, Sam - Helicobacter pylori

Week 10

Introduction to DNA Microarrays

Answer the following Discovery Questions from Campbell & Heyer Chapter 4

  • Number 5 from p. 110: Choose two genes from Figure 4.6 (PDF of figures on MyLMUConnect) and draw a graph to represent the change in transcription over time.
  • Number 6b from p. 110: Look at Figure 4.7, which depicts the loss of oxygen over time and the transcriptional response of three genes. These data are the ratios of transcription for genes X, Y, and Z during the depletion of oxygen. Using the color scale from Figure 4.6, determine the color for each ratio in Figure 4.7b.
  • Number 7 from p. 110: Were any of the genes in Figure 4.7b transcribed similarly?
  • Number 9 from p. 118: Why would most spots be yellow at the first time point?
  • Number 10 from p. 118: Go to http://www.yeastgenome.org and search for the gene TEF4; you will see it is involved in translation. Look at the time point labeled OD 3.7 in Figure 4.12, and find the TEF4 spot. Over the course of this experiment, was TEF4 induced or repressed? Hypothesize why TEF4’s gene regulation was part of the cell’s response to a reduction in available glucose (i.e., the only available food).
  • Number 11 from p. 120: Why would TCA cycle genes be induced if the glucose supply is running out?
  • Number 12 from p. 120: What mechanism could the genome use to ensure genes for enzymes in a common pathway are induced or repressed simultaneously?
  • Number 13 from p. 121: Given rule one on page 109, what color would you see on a DNA chip when cells had their repressor gene TUP1 deleted?
  • Number 14 from p. 121: What color spots would you expect to see on the chip when the transcription factor Yap1p is overexpressed?
  • Number 15 from p. 121: Could the loss of a repressor or the overexpression of a transcription factor result in the repression of a particular gene?
  • Number 16 from p. 121: What types of control spots would you like to see in this type of experiment? How could you verify that you had truly deleted or overexpressed a particular gene?

Finding a Journal Club Article/Microarray Dataset

  • Next week you will begin the DNA Microarray Project by preparing for your next Journal Club presentation that will take place in Week 12. You will work in groups of 2 or 3 on this project. Groups are:
    • Chris, Nicki - Mycobacterium smegmatis
    • Isaiah, Sam
    • Alex, Bobby, Zeb - Staphylococcus aureus MRSA252
  • You may choose to work ahead towards this presentation by finding your Journal Club article and corresponding microarray dataset with which you will perform your project. Your task is to find a published microarray dataset that measures gene expression from one of the following species:
    • Escherichia coli K12
    • Helicobacter pylori
    • Mycobacterium smegmatis
    • Mycobacterium tuberculosis
    • Plasmodium falciparum
    • Pseudomonas aeruginosa
    • Saccharomyces cerevisiae (yeast)
    • Salmonella typhimurium
    • Staphylococcus aureus MRSA252
    • Vibrio cholerae
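  • Microarray data is not centrally located on the web. Some major sources are:
    • NCBI GEO (http://www.ncbi.nlm.nih.gov/geo/)
    • EBI ArrayExpress (http://www.ebi.ac.uk/microarray-as/ae/)
    • Stanford Microarray Database (http://smd.stanford.edu/)
    • PUMAdb, the Princeton Microarray Database (http://puma.princeton.edu/)
    • In addition, microarray data can sometimes be found as supplementary information with a journal article or on an investigator's own web site.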
  • All journal club articles/microarray datasets are subject to approval by the instructor.

Week 11

Find your journal club article and microarray dataset and get approval from the instructor if you have not already done so.

Preparation for Next Week's Journal Club

In preparation for the Journal Club, each individual will do the following assignment on their individual Week 11 Journal page.

  1. Make a list of at least 10 biological terms for which you did not know the definitions when you first read the article. Define each of the terms. You can use the glossary in any molecular biology, cell biology, or genetics text book as a source for definitions, or you can use one of many available online biological dictionaries. List the citation(s) for the dictionary(s) you use; a proper citation of a web site includes the URL and the date accessed.
  2. Write an outline of the article. The length should be the equivalent of 2 pages of standard 8 1/2 by 11 inch paper. Your outline can be in any form you choose, but you should utilize the wiki syntax of headers and either numbered or bulleted lists to create it. The text of the outline does not have to be complete sentences, but it should answer the questions listed below and have enough information so that others can follow it. However, your outline should be in YOUR OWN WORDS, not copied straight from the article.
    • What is the main result presented in this paper? (Hint: look at the last sentence of the introduction and restate it in plain English.)
    • What is the importance or significance of this work?
    • What were the limitations in previous studies that led them to perform this work?
    • What were the methods used in the study?
      • What samples did they collect and use for the microarray experiment?
      • How many microarray chips did they hybridize in the experiment?
      • Which samples were paired to hybridize on the chip?
      • Which was labeled red (Cy5)? Which was labeled green (Cy3)?
      • How many replicates did they perform of each type?
        • Biological replicates are made from entirely different biological samples.
        • Technical replicates are made when one biological sample is split at a particular stage in the procedure and then carried through to the end of the procedure.
      • What do they say about how they performed each of the steps listed in the Overview of Microarray Data Analysis section below?
    • Briefly state the result shown in each of the figures and tables.
    • How do the results of this study compare to the results of previous studies? (See the Discussion.)
  3. Upload your completed PowerPoint slides to your journal page by the Week 11 journal deadline. You may make changes before your presentation Tuesday morning, but I will be evaluating the presentation you upload.

Week 12

Journal Club 3 Presentations will take place at the beginning of class.

Overview of DNA Microarray Analysis

This is a list of steps required to analyze DNA microarray data.

  1. Quantitate the fluorescence signal in each spot in the microarray image.
    • Typically performed by the scanner software, although third party software packages do exist.
    • The image of the microarray slide and this quantitation are considered the "raw-est" form of the data.
    • Ideally, this type of raw data would be made publicly available upon publication.
    • In practice, the image data is usually not made available because the raw image file of one slide could be up to 100 MB in size.
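    • Also, some journals do not require data deposition for publication, so published data are often not available anywhere for download.
    • Microarray data is not centrally located on the web. Some major sources are:
      • NCBI GEO (http://www.ncbi.nlm.nih.gov/geo/)
      • EBI ArrayExpress (http://www.ebi.ac.uk/microarray-as/ae/)
      • Stanford Microarray Database (http://smd.stanford.edu/)
      • PUMAdb, the Princeton Microarray Database (http://puma.princeton.edu/)
      • In addition, microarray data can sometimes be found as supplementary information with a journal article or on an investigator's own web site.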
  2. Calculate the ratio of red/green fluorescence
  3. Log(base 2) transform the ratios
  4. Normalize the log ratios on each microarray slide
  5. Normalize the log ratios for a set of slides in an experiment
  6. Perform statistical analysis on the log ratios
  7. Compare individual genes with known data
  8. Look for patterns (expression profiles) in the data (many programs are available to do this)
  9. Perform Gene Ontology term enrichment analysis (we will use MAPPFinder for this)
  10. Map onto biological pathways (we will use GenMAPP for this)
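
Steps 2 through 4 above can be sketched in a few lines of Python. This is only an illustration, not the course protocol: the intensity values are made up, the function names are hypothetical, and the within-slide normalization shown here (subtract the slide mean, divide by the slide standard deviation) is an assumed method, since the list above does not say which normalization to use.

```python
import math
from statistics import mean, stdev


def log2_ratios(red, green):
    """Steps 2-3: red (Cy5) / green (Cy3) ratio for each spot, log2-transformed."""
    return [math.log2(r / g) for r, g in zip(red, green)]


def normalize_slide(log_ratios):
    """Step 4 (assumed method): center on the slide mean and scale by the
    slide's sample standard deviation."""
    m = mean(log_ratios)
    s = stdev(log_ratios)
    return [(x - m) / s for x in log_ratios]


# Made-up spot intensities for one slide (illustration only).
red = [200.0, 400.0, 100.0, 800.0]
green = [100.0, 100.0, 100.0, 100.0]

ratios = log2_ratios(red, green)      # a 2-fold increase becomes +1, no change becomes 0
normalized = normalize_slide(ratios)  # slide now has mean 0 and standard deviation 1
```

Working in log2 makes induction and repression symmetric around zero: a 2-fold increase is +1 and a 2-fold decrease is -1.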

Begin Microarray Data Analysis

Getting to know your microarray data

The task for this week is to download and organize the microarray data corresponding to your paper to get it ready for analysis next week.

  1. Go to the ArrayExpress site for your data and select "view all available files".
  2. Download the following files:
    • Experiment ReadMe
    • raw.zip
    • processed.zip
    • sdrf.xls
    • Array ReadMe
    • adf.xls
  3. Upload these files to the OpenWetWare wiki and link to them on your individual journal pages. Only one member of your group needs to upload them; the other members can link to the same files.
  4. From the methods section of your microarray paper, you need to figure out the following:
    • What samples did they collect and use for the microarray experiment?
    • How many microarray chips did they hybridize in the experiment?
    • Which samples were paired to hybridize on the chip?
    • Which was labeled red (Cy5)? Which was labeled green (Cy3)?
    • How many replicates did they perform of each type?
      • Biological replicates are made from entirely different biological samples.
      • Technical replicates are made when one biological sample is split at a particular stage in the procedure and then carried through to the end of the procedure.
  5. Record this information on your individual journal pages. If you have this from your journal outline, you can copy and paste it into your new journal page.
  6. Using the sdrf.xls file, you need to then find the names of the files that correspond to the names of your samples from the paper. Make a list that says which file corresponds to which sample.
  7. The instructor will then show you which columns of data to copy into a new Master spreadsheet. You will upload this spreadsheet to the wiki and then link to it from your journal page.
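
The sample-to-file list in step 6 can also be built with a short script instead of by hand. The sketch below assumes the sdrf file has been saved as tab-delimited text and uses the MAGE-TAB column headings "Source Name" and "Array Data File"; check the headings in your own file, since they may differ. The function name, sample names, and file names are hypothetical.

```python
import csv
import io


def sample_to_files(sdrf_text, sample_col="Source Name", file_col="Array Data File"):
    """Map each sample name to the list of data files it appears in.

    Column names are assumptions; pass the headings used in your sdrf file.
    """
    reader = csv.DictReader(io.StringIO(sdrf_text), delimiter="\t")
    mapping = {}
    for row in reader:
        mapping.setdefault(row[sample_col], []).append(row[file_col])
    return mapping


# Made-up sdrf content (illustration only): two slides, two samples per slide.
example_sdrf = (
    "Source Name\tLabel\tArray Data File\n"
    "control_rep1\tCy3\tslide01.txt\n"
    "treated_rep1\tCy5\tslide01.txt\n"
    "control_rep2\tCy3\tslide02.txt\n"
    "treated_rep2\tCy5\tslide02.txt\n"
)

mapping = sample_to_files(example_sdrf)
for sample, files in mapping.items():
    print(sample, "->", ", ".join(files))
```

For a real file, read its contents with `open(path, newline="").read()` and pass the resulting string to the function.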

Week 13

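  1. Project work session to complete statistical analysis of your microarray data and MAPPFinder analysis.
    • You will use Excel to calculate the average log fold changes and perform a modified t test for your experimental groups.
      • We will be following the protocol on this page (with modifications): Sample Microarray Analysis for Vibrio cholerae.
      • You will then use GenMAPP and MAPPFinder to find out which Gene Ontology terms are over-represented in the data.
    • GenMAPP and MAPPFinder Protocols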
  2. Your presentation for Week 14 will be formatted similarly to your previous research presentations.
    • Your presentation will be 15 minutes long (approximately 15 slides, one per minute). Include:
      • Title slide
      • Outline slide
      • Background about your microarray dataset (from your journal club)
      • The experimental design of the microarray experiment
      • A table that lists the number of genes with significant changes in expression at the p value cut-offs of < 0.05, < 0.01, < 0.001, and Bonferroni-corrected p < 0.05.
      • A list of the top 10 "most significant" genes and their functions.
      • Tables that list the MAPPFinder results for increased and decreased gene expression.
      • Discussion and interpretation of your results, in comparison to the original journal club paper.
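    • Upload your slides to the OpenWetWare wiki by the Week 13 journal assignment deadline. You may make changes to your slides in advance of your presentation, but you will be graded on what you upload by the journal deadline.
      • Use these Presentation Guidelines when preparing your PowerPoint slides.
      • Your presentation will also be graded according to the Guide to critiquing talks.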
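
The table of significant-gene counts requested above can be cross-checked against your Excel results with a short Python sketch. It rests on several assumptions: the replicate values and p values are made up, the function names are hypothetical, and the t statistic is a plain one-sample t statistic against zero, which may differ from the "modified t test" used in the course protocol. The Bonferroni correction multiplies each p value by the number of genes tested and caps the result at 1.

```python
import math
from statistics import mean, stdev


def average_log_fold_change(replicates):
    """Average of the replicate log2 ratios for one gene."""
    return mean(replicates)


def t_statistic(replicates):
    """One-sample t statistic against zero (an assumed stand-in for the
    protocol's modified t test)."""
    n = len(replicates)
    return mean(replicates) / (stdev(replicates) / math.sqrt(n))


def count_significant(p_values, cutoffs=(0.05, 0.01, 0.001)):
    """Count genes below each p value cutoff, plus Bonferroni-corrected p < 0.05."""
    n = len(p_values)
    counts = {f"p < {c}": sum(p < c for p in p_values) for c in cutoffs}
    bonferroni = [min(1.0, p * n) for p in p_values]
    counts["Bonferroni p < 0.05"] = sum(p < 0.05 for p in bonferroni)
    return counts


# Made-up values (illustration only).
gene_replicates = [1.0, 2.0, 3.0]            # log2 ratios for one gene
p_values = [0.0004, 0.003, 0.02, 0.2, 0.6]   # one p value per gene

avg = average_log_fold_change(gene_replicates)
t = t_statistic(gene_replicates)
table = count_significant(p_values)
for label, count in table.items():
    print(label, count)
```

Because the Bonferroni correction scales with the number of genes, far fewer genes pass it on a real array with thousands of spots than pass the uncorrected p < 0.05 cutoff.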

Week 14

  • Final project presentations in class.
  • Course evaluations