AninditaVarshneya BIOL368 Week 14
- What database did you access? (link to the home page of the database)
- I accessed the eggNOG database
- What is the purpose of the database?
- The purpose of this database is to provide orthologous groups (OGs) of proteins at different taxonomic levels, and provide functional annotations. The annotations in eggNOG were recently expanded to include Gene Ontology terms, KEGG pathways, and SMART/Pfam domains.
- What biological information does it contain?
- It contains orthologous groups of proteins and their functional annotations. The name eggNOG stands for evolutionary genealogy of genes: Non-supervised Orthologous Groups; the groups are built using a graph-based clustering algorithm.
- What species are covered in the database?
- 2031 eukaryotic and prokaryotic organisms are included in the database, with an additional 1655 prokaryotes. Key groups of species covered include Drosophilidae, Chordata, Fungi, Methanobacteria, and many more.
- What biological questions can it be used to answer?
- The data in this database could be used to analyze the functional divergence of different sequences, because evolutionary forces affect different classes of sequences differently.
- What type (or types) of database is it (sequence, structure model organism, or specialty [what?]; primary or “meta”; curated electronically, manually [in-house], manually [community])?
- It is a sequence database with additional information about each sequence: annotations, raw algs, trimmed algs, trees, and hmm. Sequences are collected from other public databases (meta). Sequences are curated manually in-house using quality controls (that they did not explicitly specify). "Algs" is the abbreviation that eggNOG uses for alignments. HMM is an abbreviation for hidden Markov model, a statistical model used by the database.
- What individual or organization maintains the database?
- The database is maintained by the Computational Biology group - EMBL, Heidelberg and the EggNOG database Team. The main researcher in the eggNOG database is Sean Powell.
- What is their funding source(s)?
- European Commission MetaCardis [FP7-HEALTH-305312]; International Human Microbiome [HEALTH-FP7-2010-261376]; LTFCOFUND2013 [PCOFUND-GA-2013-609409]; European Research Council CancerBiome project [contract number 268985]; European Molecular Biology Organization [ALTF 721-2015]; CellNetworks (Excellence Initiative of the University of Heidelberg); Novo Nordisk Foundation [NNF14CC0001]; European Molecular Biology Laboratory (EMBL)
- Is there a license agreement or any restrictions on access to the database?
- There is no license agreement or restrictions on access to the database. It is funded for open access via EMBL. eggNOG data is available through the Creative Commons Attribution License which means that no restrictions can occur that prevent users from using, distributing, or reproducing any data in this database, so long as the original work is cited.
- How often is the database updated? When was the last update?
- The database is updated every couple of months to a year. The last update was for version 4.5 in October 2015. Before that, updates came out in May 2015, Dec 2013, and Nov 2011.
- Are there links to other databases?
- Can the information be downloaded? And in what file formats?
- For each organism, members, annotations, raw_algs, trimmed_algs, trees, and hmm data can be downloaded. All data is downloaded with a .fa file type, which is a FASTA format file type. All files come double compressed.
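As a rough illustration of working with a downloaded file, the sketch below reads FASTA records in Python. It assumes a single layer of gzip compression; the actual eggNOG archives may need an extra decompression step first, and the function names here are hypothetical helpers, not part of eggNOG.

```python
import gzip
from typing import Dict, Iterable


def parse_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Collect FASTA records into a {header: sequence} dictionary."""
    records: Dict[str, str] = {}
    header = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new record starts; drop the leading ">" from the header.
            header = line[1:]
            records[header] = ""
        elif header is not None:
            # Sequence lines are concatenated onto the current record.
            records[header] += line
    return records


def read_fasta_gz(path: str) -> Dict[str, str]:
    """Read a gzip-compressed FASTA file (assumes one layer of gzip)."""
    with gzip.open(path, "rt") as handle:
        return parse_fasta(handle)
```

For example, `parse_fasta([">seq1", "ACGT", "TTGA"])` returns a dictionary with one record whose sequence is the two lines joined together.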
- Evaluate the “user-friendliness” of the database.
- Is the Web site well-organized?
- The site is very well-organized. It is easy to access the sequence data from any page on the website because they have several different search menu options, including one in the top bar of the page that is there no matter which page you are on. The website also provides an opportunity to do specific sequence searches in FASTA format by copying and pasting the sequence or by uploading a compatible file. Unfortunately, multiple-sequence FASTA files are not allowed. Even though the website is easy to maneuver through, it doesn't make itself accessible to non-specialist scientists because it uses several abbreviations and acronyms throughout the website, but does not provide any documentation regarding what those abbreviations/acronyms mean. However, in defense of the database, those who are using the database for OGs must have some level of background knowledge so the database doesn't necessarily need
- Does it have a help section or tutorial?
- This database does provide a brief tutorial that outlines major methods that users should use to access annotations from the various programs with which the database is interacting.
- Run a sample query. Do the results make sense?
- Running a sample query is incredibly simple. The search menu in the top bar of the page provides options for similar OGs when you start typing a query. The results from the search provide information about the number of proteins and species with that OG as well as details about which ortholog different organisms have. Clicking on the tabs underneath the results summary provides more detail. For example, selecting the taxonomic profile tab provides users with GO terms, KEGG pathways, and domains that are associated with their search. My only criticism of this feature is that the database doesn't link the GO terms and KEGG pathways to the original website so users can learn more about those terms and pathways. Users can also access functional profiles, alignment data, and phylogenetic tree data. Directly from search results, users can also choose to download particular types of data.
- The following data is provided directly under the results summary.
- Mannoproteins: glycoproteins that contain 15 to 90% mannose by weight.
- Trehalose biosynthesis - is a two step process in which glucose 6-phosphate plus UDP-glucose is converted to alpha,alpha-trehalose 6-phosphate by trehalose-6-phosphate synthase (TPS), and then alpha,alpha-trehalose 6-phosphate and water are converted to trehalose and phosphate by trehalose-6-phosphate phosphatase (TPP).
- Prototrophic: “having the nutritional requirements of the normal or wild type”
- Cryostat: “an apparatus for maintaining a constant low temperature especially below 0°C”
- Orthologues: A homologous gene that is related to those in different organisms by descent from the DNA of a common ancestor and that may or may not have the same function.
- Permeases: Any of a group of membrane-bound carriers (enzymes) that effect the transport of solute through a semipermeable membrane; this term is not typically used to describe eukaryotes.
- Chemostat: an apparatus for growing bacterial cultures at a constant rate by controlling the supply of nutrient medium
- Trehalose: a disaccharide of glucose, viewed as a storage molecule that aids the release of glucose for carrying out cellular functions
- Hypergeometric distribution: “a probability function f(x) that gives the probability of obtaining exactly x elements of one kind and n − x elements of another if n elements are chosen at random without replacement from a finite population containing N elements of which M are of the first kind and N − M are of the second kind and that has the form” $f(x) = \binom{M}{x}\binom{N-M}{n-x} / \binom{N}{n}$
- Immunoprecipitation: “precipitation resulting from interaction of specific antibody and antigen.”
- Sphingolipids: Any of a group of lipids, such as the ceramides, that yield sphingosine or its derivatives upon hydrolysis. They are major components of cell membranes and play a role in signaling and regulatory functions.
- Desaturase: An enzyme that introduces a carbon-carbon double bond into a fatty acid chain, for example converting a saturated chain into an unsaturated one.
- Diurnal: recurring every day, having a daily cycle
- Homeoviscous: A compositional adaptation of membrane lipids that serves to maintain the correct membrane fluidity under new conditions.
- Ceramidase: an enzyme that catalyzes the hydrolysis of ceramides to generate sphingosine (SPH), which is phosphorylated to form sphingosine-1-phosphate (S1P).
- DEAD-box proteins: proteins that are ubiquitous in RNA-mediated processes and function by coupling cycles of ATP binding and hydrolysis to changes in affinity for single-stranded RNA.
- Nitrogen Catabolite Repression: A transcription regulation process in which the presence of one nitrogen source leads to a decrease in the frequency, rate, or extent of transcription of specific genes involved in the metabolism of other nitrogen sources.
- Upregulation: An increase in the number of receptors on the surface of target cells, making the cells more sensitive to a hormone or another agent.
- Shake-flask Culture: A culture (fermentation) carried out in a shake flask, typically an Erlenmeyer flask.
- Cis-regulatory sequences: DNA sequences, such as enhancers and promoters, that control development and physiology by regulating gene expression.
- Transcriptomics: the study of transcriptomes and their functions
- Coordinate regulation: regulation of expression of several different genes at once
- Translocation: 1. Transposition of two segments between nonhomologous chromosomes as a result of abnormal breakage and refusion of reciprocal segments. 2. Transport of a metabolite across a biomembrane.
- Sparging: a technique that involves bubbling a chemically inert gas, such as nitrogen, argon, or helium, through a liquid.
- Exogenous: developed or originating outside the organism.
- Ergosterol: a sterol occurring mainly in yeast and forming ergocalciferol (vitamin D2) on ultraviolet irradiation or electronic bombardment.
- Oleate: 1. A salt or ester of oleic acid. 2. A solution of an alkaloid or other basic drug in oleic acid.
- DBP2: ATP-dependent RNA helicase of the DEAD-box protein family; has strong preference for dsRNA; interacts with YRA1; required for assembly of Yra1p, Nab2p and Mex67p onto mRNA and formation of nuclear mRNP; involved in mRNA decay and rRNA processing; may be involved in suppression of transcription from cryptic initiation sites
- Transcriptomics: Transcriptomics is the study of the transcriptome—the complete set of RNA transcripts that are produced by the genome, under specific circumstances or in a specific cell—using high-throughput methods, such as microarray analysis.
- Bonferroni Correction: The Bonferroni correction is an adjustment made to P values when several dependent or independent statistical tests are being performed simultaneously on a single data set. To perform a Bonferroni correction, divide the critical P value (α) by the number of comparisons being made.
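The Bonferroni arithmetic defined above can be sketched in a few lines of Python. This is only an illustration of the definition; the function names and the example p-values are made up for this sketch and are not taken from the paper.

```python
from typing import List, Sequence


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Divide the critical P value (alpha) by the number of comparisons."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


def bonferroni_significant(p_values: Sequence[float], alpha: float = 0.05) -> List[bool]:
    """Flag each p-value that passes the Bonferroni-corrected threshold."""
    threshold = bonferroni_threshold(alpha, len(p_values))
    return [p <= threshold for p in p_values]


# Illustrative values only: three tests at alpha = 0.05 give a threshold of 0.05 / 3.
flags = bonferroni_significant([0.001, 0.02, 0.04])
```

With three tests, only the first of these example p-values falls below the corrected threshold, even though all three are below the uncorrected 0.05.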
- What is the main result presented in this paper?
- The results of this paper presented data that showed that, in comparison to cold shock and batch culture tests, the amount of transcription of environmental stressor genes was reduced at 12 degrees Celsius. Also, the results of this experiment showed that in steady-state low-temperature conditions, trehalose is not involved in transcription processes as it is in cold shock adaptation.
- What is the importance or significance of this work?
- The significance of this paper is that it shows a large difference between the chemostat-based transcriptome data of this experiment and data presented in other literature, with regard to both the transcriptional reprogramming during long-term low-temperature acclimation and the transcriptional responses to a rapid transition to low temperature.
- What were the limitations in previous studies that led them to perform this work?
- In one previous study conducted by Sahara et al., cold shock was used instead of a slow transition to low temperature. This entails a rapid decrease in temperature to between 10-20 degrees Celsius. In another paper by Homma et al., cold shock was also used, but in that experiment the temperature was dropped to below 10 degrees Celsius.
- These studies presented data that showed inconsistencies in the expression of ribosomal protein genes.
- Transcriptional induction of trehalose genes is common under near-freezing conditions, and trehalose appears to be indispensable only at temperatures below 10 degrees Celsius: Kandror et al. performed cold shock between 10-20 degrees Celsius on mutant strains that lacked trehalose, and these showed no growth defects or viability loss.
- The Msn2p/Msn4p complex has been implicated in the cold shock response, but no low temperature-specific transcriptional network has been identified so far.
- All previous studies were performed using the batch culture method, which is not optimal for this study, which aims to examine the prolonged effects of low temperature. The batch culture method makes it difficult to separate the effect of temperature on transcription from the effect of the specific growth rate of the cells.
- How did they treat the yeast cells (what experiment were they doing?)
- Draw a diagram or flow chart of the experimental design.
- What strain(s) of yeast did they use? Were the strain(s) haploid or diploid?
- They used the Saccharomyces cerevisiae strain CEN.PK113-7D (MATa) provided by P. Kötter (Institut für Mikrobiologie, J. W. Goethe Universität Frankfurt, Frankfurt, Germany).
- This strain was a prototrophic, haploid reference strain.
- What media did they grow them in? What temperature? What type of incubator? For how long?
- Cultures were grown in a defined synthetic medium that was limited by carbon or by nitrogen with all other growth requirements in excess.
- Cultures were grown at 12 degrees Celsius and 30 degrees Celsius.
- Biomass dry weight, metabolites, dissolved oxygen, and gas profiles were constant for at least three volume changes before sampling.
- The cultures were grown in an incubator with stirring capabilities (at 600 rpm). The pH of the medium was also maintained at 5 with automatic addition of 2M KOH by the Applikon ADI 1030 Biocontroller.
- What controls did they use?
- Cultures grown at 30 degrees Celsius functioned as the control because they were exposed to normal conditions. The cultures were provided with full nitrogen and carbon.
- How many replicates did they perform per treatment or timepoint?
- The results for each condition were identified through three independently cultured replicates.
- What method did they use to prepare the RNA, label it and hybridize it to the microarray? (very brief description)
- Preparation, labeling, and hybridization of the RNA to the microarray were performed using the methods described by Piper et al. (2002), which follow the Affymetrix users’ manual.
- What mathematical/statistical methods did they use to analyze the data? (very brief description)
- The researchers used significance analysis of microarrays, Venn diagrams, heat map visualizations, promoter analysis, statistical analysis of overrepresentation of GO biological process categories, and analysis of overrepresentation of transcription-factor binding sites.
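To make the idea of overrepresentation concrete, here is a minimal Python sketch of a hypergeometric tail test, using the same symbols as the hypergeometric definition in the vocabulary list (N, M, n, x). It is an assumption that this mirrors the paper's calculation; the exact procedure and any multiple-testing correction used by the authors are not reproduced here, and the function names are hypothetical.

```python
from math import comb


def hypergeom_pmf(x: int, N: int, M: int, n: int) -> float:
    """P(exactly x of the n sampled genes belong to a category of size M out of N)."""
    return comb(M, x) * comb(N - M, n - x) / comb(N, n)


def overrepresentation_p(x: int, N: int, M: int, n: int) -> float:
    """P(at least x category members among n sampled genes): upper-tail sum."""
    upper = min(M, n)  # cannot observe more members than exist or than were sampled
    return sum(hypergeom_pmf(k, N, M, n) for k in range(x, upper + 1))


# Toy numbers (not from the paper): 10 genes total, 4 in the category,
# 3 genes in the responsive set, 2 of which are in the category.
p = overrepresentation_p(2, N=10, M=4, n=3)
```

A small p-value would suggest that the category appears in the responsive gene set more often than expected by chance.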
- Are the data publicly available for download? From which web site?
- Yes, the microarray data can be found at the Gene Expression Omnibus database (http://www.ncbi.nlm.nih.gov/geo/) under the series number GSE6190.
- Briefly state the result shown in each of the figures and tables.
- Table 1: This table shows numerous characteristics of the yeast (Saccharomyces cerevisiae) at varying conditions, such as temperature and limiting nutrient. The results show that growth rates remain largely unaffected by temperature variation.
- Table 2: This table shows the nitrogen, protein, trehalose, and glycogen content of the yeast at two different temperatures and with two different limiting nutrients (glucose and ammonium). The table shows markedly higher levels of cell protein and nitrogen content in ammonium-limited yeast at 12 degrees Celsius compared to 30 degrees Celsius.
- Table 3: The table reveals a number of overrepresented binding motifs related to specific regulatory clusters in response to low temperature. The second part of the table shows a list of transcription factor binding targets organized by regulatory cluster and the factor that binds to the target.
- Figure 1: This figure shows a Venn diagram comparing the numbers of genes that are temperature responsive in carbon-limited and nitrogen-limited chemostat cultures. The diagram shows that more genes respond to temperature in a nitrogen-limited environment than in a carbon-limited one. A total of 235 genes respond to the stress of low temperature regardless of the limiting nutrient.
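The Venn comparison in Figure 1 amounts to set operations on two gene lists. The sketch below shows the idea in Python; the gene identifiers are placeholders chosen for illustration, not genes reported in the paper, and `venn_counts` is a hypothetical helper.

```python
from typing import Dict, Iterable


def venn_counts(set_a: Iterable[str], set_b: Iterable[str]) -> Dict[str, int]:
    """Count genes unique to each list and genes shared by both."""
    a, b = set(set_a), set(set_b)
    return {
        "only_a": len(a - b),  # responsive only in condition A
        "both": len(a & b),    # responsive regardless of condition
        "only_b": len(b - a),  # responsive only in condition B
    }


# Placeholder gene names, for illustration only.
carbon_limited = ["geneA", "geneB", "geneC"]
nitrogen_limited = ["geneB", "geneC", "geneD", "geneE"]
counts = venn_counts(carbon_limited, nitrogen_limited)
```

In the paper's terms, the "both" count corresponds to the genes that respond to low temperature regardless of the limiting nutrient.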
- Figure 2: Heat map showing the specific genes that are up and downregulated in response to temperature change as well as nutrient limitation that allows for acclimation of the yeast to stress conditions.
- Figure 3: Genes that are transcriptionally regulated in response to low temperature in three previous studies. A total of 259 genes are common to all three studies. The second part of the figure shows a heat map of these 259 genes, broken down into upregulated (91), downregulated (48), and differentially regulated (120).
- Figure 4: A heat map comparison between the chemostat cultures to the batch cultures from 3 previous studies. Overlapping genes showing the same temperature response were mapped out, and two heat maps were produced, one for upregulated genes and one for downregulated genes. The brackets show genes that show the same response across all 4 studies.
- Figure 5: Two studies that examined changes in gene regulation in response to a decrease in growth rate are compared to both the batch studies and the present study for overlap in gene regulation. The results show a very small number of genes that respond to both temperature decrease and growth rate decrease.
- Figure 6: The Venn diagrams compare genes expressed in three sets of studies: this one, the batch studies, and the ESR study. The overlap shows that one third of the low temperature response genes in the batch studies are attributed to ESR. An opposite response was shown for 233 genes in low temperature compared to the results of the microarray study.
- How does this work compare with previous studies?
- Comparing the layout and content of this paper with other studies, several points distinguish it from similar previous work. As a frame of reference, the first three references listed in the paper were accessed to make these assessments. The introductions of the other papers were more concise, yet they did not consistently describe the objectives, motivations, and/or methods of the study. There are merits to a concise introduction: it keeps the reader more engaged and prevents the paper from becoming muddled with rhetoric or otherwise unnecessary disclosures. The downside is that the paper may take longer to read, because information missing from the introduction means the bigger picture was not set up, so points that are seemingly haphazardly introduced later slow the reader down.
- What are the important implications of this work?
- This experiment shows that transcriptional responses to low temperature and low specific growth rate are linked when it comes to batch cultures and can be discerned when using chemostat cultures.
- Slow rate low temperature acclimation of yeast does not solely involve transcriptional reprogramming.
- This study also shows the significance of batch culture vs. chemostat cultures when it comes to discriminating between phases of physiological adaptation.
- What future directions should the authors take?
- The authors commented on the persistence of a residual glucose concentration in the cultures grown at 12°C. They refer to glucose-limited cultures as a possible way to ameliorate this issue, but comparing glucose-limited cultures alone would have led to what the authors call a “contamination” of the temperature-responsive gene sets, because transcription of those genes is influenced by glucose. It seems difficult to believe that this obstacle is unavoidable, and the authors should find a new way of refining the data so that its clarity isn’t compromised. Another interesting direction, which the authors themselves mention, is to investigate how temperature responses in S. cerevisiae are influenced by the availability of oxygen.
- Give a critical evaluation of how well you think the authors supported their conclusions with the data they showed. Are there any major flaws to the paper?
- It was interesting to see a paper on organism performance in a cold environment, given that heat shock is the more popular research topic. The chemostat approach was inventive in its effort to describe transcriptional “acclimation” to steady-state growth under temperature-driven environmental conditions.
Revised Presentation Slides
My group, Will Fuchs, Isai Lopez, and Shivum Desai, worked together to complete all aspects of this assignment in class on 12/29, and outside of class on 12/4 and 12/5 for several hours. Thank you to Dr. Kam D. Dahlquist for helping me complete this assignment. This group journal entry was completed by every member of my group equally, and was not copied from another source.
- Huerta-Cepas, Jaime, et al. "eggNOG 4.5: a hierarchical orthology framework with improved functional annotations for eukaryotic, prokaryotic and viral sequences." Nucleic acids research (2015): gkv1248.
- EggNOG Database
- Week 14 Assignment
- Tai, S. L., Daran-Lapujade, P., Walsh, M. C., Pronk, J. T., & Daran, J. M. (2007). Acclimation of Saccharomyces cerevisiae to low temperature: a chemostat-based transcriptome analysis. Molecular Biology of the Cell, 18(12), 5100-5112. doi: 10.1091/mbc.E07-02-0131
- Slides we revised: PowerPoint slides for Tai et al. (2007)
User Page: Anindita Varshneya
Bioinformatics Lab: Fall 2016
Class Page: BIOL 368-01: Bioinformatics Laboratory, Fall 2016