Richard Brous Week 7

From OpenWetWare

Richard Brous

  1. Heidelberg JF, Eisen JA, Nelson WC, Clayton RA, Gwinn ML, Dodson RJ, Haft DH, Hickey EK, Peterson JD, Umayam L, Gill SR, Nelson KE, Read TD, Tettelin H, Richardson D, Ermolaeva MD, Vamathevan J, Bass S, Qin H, Dragoi I, Sellers P, McDonald L, Utterback T, Fleischmann RD, Nierman WC, White O, Salzberg SL, Smith HO, Colwell RR, Mekalanos JJ, Venter JC, and Fraser CM. DNA sequence of both chromosomes of the cholera pathogen Vibrio cholerae. Nature. 2000 Aug 3;406(6795):477-83. DOI:10.1038/35020000 | PubMed ID:10952301 | HubMed [Paper3]

Preparation for Next Week's Journal Club - 10 Definitions

  1. Pathogenicity - The ability of a parasite to inflict damage on the host. link
  2. Plasmid - A linear or circular double-stranded DNA that is capable of replicating independently of the chromosomal DNA. link
  3. Aetiological - Causing a disease or a pathological condition, as in: aetiological agent of a disease. link
  4. Antigen - Any of the various substances that, when recognized as non-self by the adaptive immune system, trigger an immune response, stimulating the production of an antibody that specifically reacts with it. link
  5. Aldehyde - A compound in which a carbon atom is double-bonded to an oxygen, single-bonded to a hydrogen, and single-bonded to another chemical group (such as a methyl group, a phenyl group, or another hydrogen). link
  6. Anion - a negatively-charged ion. link
  7. Nucleoside - link
    1. Definition - A nitrogenous base (purine or pyrimidine) bound to a pentose sugar (ribose or deoxyribose).
    2. Definition - A glycoside formed from the hydrolysis of nucleic acid.
    3. Supplement - A phosphate group attached to a nucleoside would make a nucleotide.
  8. Solute - link
    1. Definition - a component of a solution: in a solution, the dissolving substance is called a solvent whereas the dissolved substance is called a solute.
    2. Definition - a substance (usually in lesser amount) dissolved in another substance. A typical example of a solution is sugar dissolved in water: sugar is the solute and water is the solvent.
  9. Catabolic - (Science: biochemistry) Of or pertaining to catabolism; as, catabolic processes, which give rise to substances (catastates) of decreasing complexity and increasing stability. link
  10. Reductase - An enzyme that catalyzes a reduction; since all enzymes catalyze reactions in either direction, any reductase can, under the proper conditions, behave as an oxidase and vice versa, hence the term oxidoreductase. For individual reductases, see the specific names. link

Preparation for Next Week's Journal Club - Outline of Article

  1. Abstract
    1. Completed the genome sequencing of the proteobacterium Vibrio cholerae
    2. 2 circular chromosomes, together encoding 3,885 ORFs
      1. large chromosome
        1. 2,961,146 bp
        2. Most recognizable genes for important cell functions and pathogenicity
      2. small chromosome
        1. 1,072,314 bp
        2. Contains a higher percentage of hypothetical genes, plus a gene capture system and host addiction genes
        3. Suggestion that the small chromosome was a megaplasmid captured far in the past by an ancestral Vibrio species
    3. MAIN RESULT - Sequence provides a starting point for understanding this bacterial pathogen affecting humans
  2. What Vibrio cholerae is
    1. Vibrio cholerae causes the human disease cholera
    2. Many known epidemics throughout recorded history
    3. Many strains
      1. Pathogenic
      2. Non-Pathogenic
    4. Present in oceans, coastal waters and estuaries
    5. IMPORTANCE - Article is a jumping-off point for analysing how pathogenic strains affect humans and the role horizontal gene transfer plays
  3. Genomic comparative analysis
    1. Sequencing performed using the whole-genome random sequencing method
    2. Two circular chromosomes with 3,885 predicted ORFs
    3. 792 predicted Rho-independent terminators
      1. Larger
        1. 2,961,146 bp
        2. Av G+C 46.9%
        3. Growth and viability genes (though some are on the smaller chromosome)
        4. 2,770 ORFs, 599 Rho-independent terminators
      2. Smaller
        1. 1,072,314 bp
        2. Av G+C 47.7%
        3. small number of genes used for cell function
        4. intermediary metabolic pathways encoded only here
        5. 1,115 ORFs, 193 Rho-independent terminators
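
As a quick consistency check on the figures in this section, the per-chromosome counts can be summed and compared with the genome-wide totals quoted in the outline. This is only an arithmetic sketch: the dictionary layout and the function name `genome_totals` are illustrative choices, not anything taken from the paper.

```python
# Per-chromosome figures exactly as listed in the outline above.
chromosomes = {
    "large": {"bp": 2_961_146, "orfs": 2_770, "terminators": 599},
    "small": {"bp": 1_072_314, "orfs": 1_115, "terminators": 193},
}


def genome_totals(chroms):
    """Sum each feature (bp, ORFs, terminators) across all chromosomes."""
    totals = {}
    for features in chroms.values():
        for name, value in features.items():
            totals[name] = totals.get(name, 0) + value
    return totals


totals = genome_totals(chromosomes)

# Fraction of the whole genome carried by the large chromosome.
large_share = chromosomes["large"]["bp"] / totals["bp"]

print(totals)  # ORFs sum to 3,885 and terminators to 792, matching the outline
print(f"large chromosome share of genome: {large_share:.1%}")
```

The sums agree with the 3,885 ORFs and 792 Rho-independent terminators given for the whole genome, and the two chromosomes together come to 4,033,460 bp.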
  4. Results
    1. Figure 1 - Linear representation of the V. cholerae chromosomes
      1. Location of predicted coding regions
        1. coded by:
          1. biological role
          2. RNA genes
          3. tRNA
          4. other RNA
          5. Rho-independent terminators
          6. VCRs
        2. Transcription direction of each predicted region shown by arrows
        3. Numbers next to tRNA indicate number of tRNA at a locus
        4. Numbers next to GES indicate number of predicted membrane-spanning domains
          1. GES = Goldman, Engelman and Steitz scale, calculated by TopPred for the protein
        5. Gene names available from TIGR
    2. Figure 2 - Circular representation of the V. cholerae genome
      1. Two chromosomes, large and small
      2. From outside inward:
        1. first and second circles
          1. predicted protein-coding regions on the plus and minus strands by role using color legend
        2. third circle
          1. recently duplicated genes on the same chromosome (black) and on different chromosomes (green)
        3. fourth circle
          1. transposition-related (black), phage-related (blue), VCRs (pink) and pathogenesis genes (red)
        4. fifth circle
          1. regions with significant values for trinucleotide composition in the 2,000-bp window
        5. sixth circle
          1. % G+C in relation to the mean G+C for the chromosome
        6. seventh circle
          1. tRNA
        7. eighth circle
          1. rRNA
    3. Figure 3 - Overview of metabolism and transport in V. cholerae
      1. Shows pathways for energy production and the metabolism of organic compounds, acids and aldehydes
        1. Transporters are grouped by substrate
          1. cations (green)
          2. anions (red)
          3. carbohydrates (yellow)
          4. nucleosides, purines and pyrimidines (purple)
          5. amino acids/peptides/amines (dark blue)
          6. other (light blue)
        2. Question marks associated with transporters indicate:
          1. a putative gene
          2. uncertainty in substrate specificity
          3. uncertainty in direction of transport
        3. Permeases represented as ovals
        4. ABC transporters are composites of ovals, diamonds and circles
          1. Porins are represented as 3 ovals
          2. large-conductance mechanosensitive channel shown as gated cylinder
          3. outer membrane transporters or receptors shown as other cylinders
          4. all other transporters shown as rectangles
          5. export or import of solutes indicated by direction of arrow through transporter
          6. If a precise substrate could not be determined for transporter, no gene name used and a general common name for substrate was used
        5. Gene location of both transporters and metabolic steps indicated by colored arrow:
          1. genes located on large chromosome (black)
          2. genes located on small chromosome (blue)
          3. all genes needed for complete pathway on one chromosome but a duplicate copy of one or more genes on the other chromosome (purple)
          4. required genes on both chromosomes (red)
          5. complete pathway on both chromosomes (green)
        6. Gene numbers on both chromosomes shown in parentheses and use same color code as above
        7. Substrates underlined and capitalized can be used as sources of energy
          1. PRPP, phosphoribosyl-pyrophosphate
          2. PEP, phosphoenolpyruvate
          3. PTS, phosphoenolpyruvate-dependent phosphotransferase system
          4. ATP, adenosine triphosphate
          5. ADP, adenosine diphosphate
          6. MCP, methyl-accepting chemotaxis protein
          7. NAG, N-acetylglucosamine
          8. G3P, glycerol-3-phosphate
          9. glyc, glycerol
          10. NMN, nicotinamide mononucleotide
    4. Table 1 - General features of the Vibrio cholerae genome
      1. Replicative origin in chromosome 1
        1. similar to Vibrio harveyi
        2. similar to Escherichia coli
          1. co-localization of genes often found near the origin in prokaryotic genomes
            1. dnaA
            2. dnaN
            3. recF
            4. gyrA
          2. GC nucleotide skew
            1. GC skew = (G-C)/(G+C) analysis
        3. conclusion 1 - base pair 1 designated in an intergenic region located in the putative origin of replication
        4. conclusion 2 - only GC skew was useful to identify a putative origin on chromosome 2
      2. Genomic sequence confirmed presence of a large integron island (gene capture system of approx. 125,300 bp) on chromosome 2
        1. Integron island contains all copies of the VCR sequences and 216 ORFs
          1. But - most ORFs have no homology to other sequences
        2. Among recognizable genes are those that encode:
          1. products that could provide drug resistance
            1. chloramphenicol acetyltransferase
            2. fosfomycin resistance protein
            3. glutathione transferase
          2. DNA metabolism enzymes
            1. MutT
            2. transposase
            3. an integrase
          3. possible virulence genes
            1. haemagglutinin
            2. lipoproteins
          4. genes encoding products similar to 'host addiction' proteins, which plasmids use to select for their own maintenance in host cells
            1. higA
            2. higB
            3. doc
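
The GC skew analysis mentioned under Table 1 can be illustrated with a small sliding-window calculation of (G-C)/(G+C). This is a minimal sketch and not the authors' pipeline: the window and step sizes, the function names, and the toy sequence are all assumptions made for illustration. A running total of the windowed skew is included because a change in the sign of the skew is commonly used as a hint for where an origin may lie; that reading is a general convention rather than something stated in the outline.

```python
def gc_skew(seq):
    """Return (G - C) / (G + C) for one sequence; 0.0 if it has no G or C."""
    seq = seq.upper()
    g = seq.count("G")
    c = seq.count("C")
    return (g - c) / (g + c) if (g + c) else 0.0


def windowed_gc_skew(seq, window, step):
    """Return a list of (start, skew) pairs, one per full window."""
    return [
        (start, gc_skew(seq[start:start + window]))
        for start in range(0, len(seq) - window + 1, step)
    ]


def cumulative_skew(skews):
    """Return the running total of the skew values, window by window."""
    running, out = 0.0, []
    for _, value in skews:
        running += value
        out.append(running)
    return out


# Toy sequence (hypothetical): C-rich first half, G-rich second half,
# so the skew changes sign in the middle.
toy = "C" * 6 + "G" * 6
skews = windowed_gc_skew(toy, window=4, step=2)
running = cumulative_skew(skews)

for (start, value), total in zip(skews, running):
    print(start, value, total)
```

On a real chromosome the window would be thousands of base pairs wide; the toy values here are only meant to show the sign change.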
  5. Comparative genomics
    1. Comparison types used
      1. between the two V. cholerae chromosomes
        1. Significant asymmetrical distribution of genes known for growth and virulence between the chromosomes
        2. Chromosome 1 encodes
          1. DNA replication and repair
          2. transcription
          3. translation
          4. cell-wall synthesis
          5. several central catabolic and biosynthetic pathways
          6. genes important to bacterial pathogenicity
            1. toxin co-regulated pilus
            2. cholera toxin
            3. lipopolysaccharide
            4. extracellular protein secretion machinery
        3. Chromosome 2 contains
          1. a larger fraction (59%) of hypothetical genes and genes of unknown function compared with chromosome 1 (42%)
            1. The hypothetical genes are highly localized in the integron island
          2. carries 3-hydroxy-3-methylglutaryl-CoA reductase - apparently acquired from an archaeon
      2. V. cholerae chromosomes and chromosomes of other microbial species
        1. Figure 4 - Percentage of total Vibrio cholerae open reading frames in biological roles compared with other Proteobacteria
          1. V. cholerae chromosome 1 (blue)
          2. V. cholerae chromosome 2 (red)
          3. Escherichia coli (yellow)
          4. Haemophilus influenzae (pale blue)
        2. figure 4 conclusions
          1. The largest share of V. cholerae ORFs are most similar to E. coli genes (1,454 ORFs)
            1. But - 499 ORFs showed highest similarity to other V. cholerae genes, suggesting recent duplications
              1. functions related to:
                1. regulatory functions
                2. chemotaxis
                3. transport and binding
                4. transposition
                5. pathogenicity
                6. unknown functions or hypothetical
          2. 105 duplications with at least one ORF on each chromosome
            1. suggests recent crossovers between chromosomes
          3. significant duplication of scavenging behavior genes
            1. chemotaxis
            2. solute transport
            3. suggests high importance in V. cholerae biology - the ability to exist in many diverse environments
            4. in turn suggests these environments may have selected for duplication and divergence of genes to support specialized functions
            5. various strains have different numbers and locations of these genes
              1. suggests that virulence gene numbers are subject to pressures which affect copy number and location
        3. Figure 5 - Comparison of the V. cholerae ORFs with those of other completely sequenced genomes
          1. Protein sequences from NCBI, TIGR and Caenorhabditis elegans (wormpep16) database
          2. V. cholerae ORFs (chromosome 1 in blue and chromosome 2 in red) compared against all other genomes with FASTA3
          3. Number of V. cholerae ORFs with greatest similarity displayed proportionately to the total ORFs of that genome
          4. No ORFs similar to Mycoplasma pneumoniae ORF
        4. Figure 5 conclusions
          1. central spike in middle of figure indicated highest ORF similarity
        5. Figure 6 - Phylogenetic tree of methyl-accepting chemotaxis protein (MCP) homologues in completed genomes
          1. Homologues of MCP identified through FASTA3 searches of all the complete genomes available
          2. Amino acid sequences aligned using CLUSTALW
          3. The neighbor-joining phylogenetic tree generated using a PAM-based distance calculator
          4. Hypervariable regions of the alignment and positions with gaps were excluded
          5. significant bootstrap value nodes designated by ** and *
        6. Figure 6 conclusions
          1. ORFs with seemingly identical functions exist on both chromosomes, which suggests acquisition by lateral gene transfer
            1. Example: glyA found on both chromosomes
  6. Overall paper conclusion
    1. V. cholerae genome sequence is now the de facto starting point to study its environmental and pathobiological characteristics
    2. Attention should be focused on the gene expression patterns that govern its survival and replication during human infection as well as in the various natural environments in which it is found
    3. Generally the sequence will assist greatly the continued study of this model organism
      1. genomic sequence comparison should help to clarify the origin of the new smaller chromosome and its specific functions
      2. discover important clues in understanding metabolic and regulatory links between the two chromosomes
      3. Finally, the sequence is a good basis for studying how several horizontally acquired loci on each chromosome can still interact operationally at the regulatory, cell and biochemical levels.
  7. Methods
    1. Whole-genome random sequencing procedure
      1. V. cholerae grown from a single isolated colony; cloning, sequencing and assembly as described by TIGR
    2. ORF prediction
      1. Initial set of ORFs identified with Glimmer; ORFs searched against a non-redundant protein database
      2. ORFs analyzed with two sets of Markov models constructed for a number of conserved protein families
      3. TopPred used to identify membrane-spanning domains in proteins
      4. Paralogous gene families built by searching ORFs against themselves with BLASTX, then clustering matches into multi-gene families
      5. Multiple alignments generated with CLUSTALW
    3. Phylogenetic tree of homologues
      1. Homologues of genes identified using BLASTP and FASTA3 database searches
      2. Homologues aligned using the CLUSTALW program
      3. Phylogenetic tree generated from alignments using the neighbor-joining algorithm implemented in PAUP
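
The step in Methods that clusters self-against-self matches into multi-gene families can be sketched as follows. The outline does not say which clustering rule was used, so this example assumes single-linkage grouping (any chain of pairwise matches joins ORFs into one family) and implements it with a union-find structure. The ORF identifiers and the function name `cluster_families` are hypothetical, and the input is taken to be a list of ORF pairs that already passed whatever similarity cut-off was applied.

```python
def cluster_families(pairs):
    """Group ORFs into families by single linkage over pairwise matches.

    `pairs` is an iterable of (orf_a, orf_b) matches. Returns a sorted list
    of families, each a sorted list of ORF ids; singletons are dropped.
    """
    parent = {}

    def find(x):
        # Register unseen ids as their own root, then walk up with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a  # merge the two families

    families = {}
    for orf in list(parent):
        families.setdefault(find(orf), set()).add(orf)

    # Keep only true multi-gene families (two or more members).
    return sorted(sorted(members) for members in families.values() if len(members) > 1)


# Hypothetical matches: A-B and B-C chain into one family, D-E form another,
# and the self-match F-F is ignored as a singleton.
hits = [("orfA", "orfB"), ("orfB", "orfC"), ("orfD", "orfE"), ("orfF", "orfF")]
families = cluster_families(hits)
print(families)  # [['orfA', 'orfB', 'orfC'], ['orfD', 'orfE']]
```

Single linkage is the most permissive choice; a stricter rule (for example, requiring every member to match every other) would produce smaller families from the same matches.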