# Janelle N. Ruiz Assignment 3

## In Class Activity

• Question 1: Record in your online notebook (your individual week 3 journal page) a summary paragraph of the information you already know about HIV (it's OK if you don't know much yet).

Human Immunodeficiency Virus (HIV) is a retrovirus, more specifically a lentivirus, which means that its genetic material is composed of RNA rather than DNA. To infect a host cell, HIV utilizes its RNA in conjunction with reverse transcriptase (as well as other viral proteins) to integrate its genome into an infected host’s DNA. In this way, the virus is able to exploit the mechanisms and cellular components of the host’s cell in order to transcribe more viral RNA/protein and ultimately infect more cells. The disease caused by HIV is called Acquired Immune Deficiency Syndrome (AIDS), which causes devastating immune dysfunction. HIV primarily infects a host’s CD4+ T cells, which are central to proper function of the entire adaptive immune system; however, HIV can also infect macrophages and dendritic cells. HIV is transmitted through bodily fluids via sexual contact, intravenous drug use, blood transfusions, etc. It is considered a worldwide pandemic, with the majority of infections occurring in Sub-Saharan Africa, where both access to healthcare and proper education regarding viral transmission and prevention are lacking.

• Question 2: Write three questions (or more) that you have about HIV that you would like answered.
1. How do the various viral proteins associated with HIV work to infect a host cell?
2. How is HIV similar to or different from other retroviruses?
3. What is the current status of progress on the HIV vaccine and/or how is the scientific community currently exploiting the mechanisms of "elite controllers" of HIV infection to produce a vaccine that would protect against the devastating effects of the virus?
• Question 3: In Chapter 2 of Bioinformatics for Dummies, follow the protocol on "Becoming an Instant Expert with PubMed/Medline", using the examples shown in the book. Notice the differences between the instructions and screenshots shown in the book and what you see on today's version of PubMed.

The layout of PubMed has changed slightly since the Bioinformatics book was published. For example, suggestions for articles similar to the search article are given. Also, the links to reviews, etc. have changed location and format on the screen. When I type in dUTPase, 404 references rather than 200 references are found. Though the book states that the author's name can be clicked on the Results list, the title of the article must now be clicked before you are able to link to the authors' names (though they are still shown on the screen).

When I typed "Abergel", 100 more results came up than are listed in the book. Also, when "Abergel dUTPase" was inputted, two results rather than one were found.

Instead of searching with the "limits" link, you now need to go to "advanced search" to narrow your search results to specific categories.

• Question 4: Now use your new skills to find a recent scholarly review about the HIV virus. Record the full citation of the review you found on your journal page using the wiki syntax. (Hint: you can see an example of how to use it in the source for this page.) Compare your search results with Google Scholar and the ISI Web of Science (a commercial site that LMU subscribes to).
1. Hariri S and McKenna MT. Epidemiology of human immunodeficiency virus in the United States. Clin Microbiol Rev. 2007 Jul;20(3):478-88, table of contents. DOI:10.1128/CMR.00006-07

The three search engines/online databases gave me many results; however, I thought that PubMed gave me the best results in terms of specificity when I used the "advanced search" features. Google Scholar did not give me results in order from most recent to least recent, as PubMed and Web of Science did. With Google Scholar, you need to restrict the search settings to do this, whereas the other databases do this automatically. Though I restricted my searches on PubMed and Web of Science to full text, when I did not do this, I could typically find the full-text article on Google Scholar when I could not find it on PubMed and/or Web of Science.

• Question 5: Using ISI Web of Science perform a prospective search on the Markham et al. (1998) article to find out what articles cite that article since its publication in 1998.
• The Markham article was cited 52 times. The top 5 articles that included the citation were:
1. A comparative study of HIV-1 clade C env evolution in a Zambian infant with an infected rhesus macaque during disease progression
2. Multiple-infection and recombination in HIV-1 within a longitudinal cohort of women
3. HIV-1 evolution in gag and env is highly correlated but exhibits different relationships with viral load and the immune response
4. Relationship of Injection Drug Use, Antiretroviral Therapy Resistance, and Genetic Diversity in the HIV-1 pol Gene
5. Dynamic Correlation between Intrahost HIV-1 Quasispecies Evolution and Disease Progression

## Preparation for Week 4 Journal Club

1. Markham RB, Wang WC, Weisstein AE, Wang Z, Munoz A, Templeton A, Margolick J, Vlahov D, Quinn T, Farzadegan H, and Yu XF. Patterns of HIV-1 evolution in individuals with differing rates of CD4 T cell decline. Proc Natl Acad Sci U S A. 1998 Oct 13;95(21):12568-73. DOI:10.1073/pnas.95.21.12568

### Vocabulary

Make a list of at least 10 biological terms for which you did not know the definitions when you first read the article. Define each of the terms. You can use the glossary in any molecular biology, cell biology, or genetics text book as a source for definitions, or you can use one of many available online biological dictionaries (links below). List the citation(s) for the dictionary(s) you use, providing a URL to the page is fine.

1. Nonsynonymous mutation: A nucleotide substitution that changes the amino acid specified (i.e., AGC to AGA, or serine → arginine). (Compare with synonymous mutation, which is a nucleotide substitution that does not change the amino acid specified.) http://www.whfreeman.com/thelifewire6e/content/glossary_htm/gloss_n.html
2. Variant: a group of organisms within a species that differ in trivial ways from similar groups; a new strain of microorganisms. http://www.biology-online.org/dictionary/Variant
3. Frequency dependent selection is the term given to an evolutionary process where the fitness of a phenotype is dependent on its frequency relative to other phenotypes in a given population. In positive frequency dependent selection, the fitness of a phenotype increases as it becomes more common. In negative frequency dependent selection, the fitness of a phenotype increases as it becomes less common. Negative frequency dependent selection is a particular mechanism of balancing selection. http://en.wikipedia.org/wiki/Frequency_dependent_selection
4. Seroconversion: The development of detectable antibodies in the blood directed against an infectious agent. It normally takes some time for antibodies to develop after the initial exposure to the agent. Following seroconversion, a person tests positive in tests based on the presence of antibodies (such as ELISA). http://www.medterms.com/script/main/art.asp?articlekey=9388
5. Peripheral Blood Mononuclear Cell (PBMC) is any blood cell having a round nucleus. For example: a lymphocyte, a monocyte or a macrophage. These blood cells are a critical component in the immune system to fight infection and adapt to intruders. The lymphocyte population consists of T cells (CD4 and CD8 positive ~75%), B cells and NK cells (~25% combined). http://en.wikipedia.org/wiki/PBMC
6. Consensus Sequence: In molecular biology and bioinformatics, a consensus sequence is a way of representing the results of a multiple sequence alignment, where related sequences are compared to each other, and similar functional sequence motifs are found. The consensus sequence shows which residues are most abundant in the alignment at each position. http://en.wikipedia.org/wiki/Consensus_sequence
7. Monophyletic means common descent from a single ancestor. Biologists have introduced a taxonomy: if a group is made up of a common ancestor (or parent) and all of its descendants (children), they call that group monophyletic (Greek: "of one race"). http://simple.wikipedia.org/wiki/Monophyletic
8. Divergent evolution: the accumulation of differences between groups which can lead to the formation of new species, usually as a result of diffusion of the same species into different environments, where natural selection determines the success of specific mutations. Divergent evolution can be seen in some higher-level characters of structure and function that are readily observable in organisms. For example, the vertebrate limb is one example of divergent evolution. The limb in many different species has a common origin, but has diverged somewhat in overall structure and function. Alternatively, "divergent evolution" can be applied to molecular biology characteristics. This could apply to a pathway in two or more organisms or cell types, for example. This can apply to genes and proteins, such as nucleotide sequences or protein sequences that derive from two or more homologous genes. Both orthologous genes (resulting from a speciation event) and paralogous genes (resulting from gene duplication within a population) can be said to display divergent evolution. Because of the latter, it is possible for divergent evolution to occur between two genes within a species. http://en.wikipedia.org/wiki/Divergent_evolution
9. Phylogenetic tree: A tree showing the evolutionary relationships among various biological species or other entities that are known to have a common ancestor. In a rooted phylogenetic tree, each node with descendants represents the most recent common ancestor of the descendants, and the edge lengths in some trees correspond to time estimates. Each node is called a taxonomic unit. Internal nodes are generally called hypothetical taxonomic units (HTUs) as they cannot be directly observed. http://en.wikipedia.org/wiki/Phylogenetic_tree
10. Primer: a strand of nucleic acid that serves as a starting point for DNA replication. They are required because the enzymes that catalyze replication, DNA polymerases, can only add new nucleotides to an existing strand of DNA. The polymerase starts replication at the 3'-end of the primer, and copies the opposite strand. In most cases of natural DNA replication, the primer for DNA synthesis and replication is a short strand of RNA (which can be made de novo). This RNA is produced by primase, and is later removed and replaced with DNA by a repair polymerase. Many of the laboratory techniques of biochemistry and molecular biology that involve DNA polymerase, such as DNA sequencing and the polymerase chain reaction (PCR), require DNA primers. These primers are usually short, chemically synthesized oligonucleotides, with a length of about twenty bases. They are hybridized to a target DNA, which is then copied by the polymerase. http://en.wikipedia.org/wiki/Primer_(molecular_biology)

### Outline

Patterns of HIV-1 evolution in individuals with differing rates of CD4 T cell decline

#### Introduction

• HIV-1 has high mutation and replication rates
1. This allows for the rapid adaptation of the virus to changes in its host
2. In a stable environment, the most fit virus would be selected for and replicate until it represented the dominant genetic variant within the population (all other variants would exhibit neutral mutations which would generally not affect fitness)
3. In an unstable environment, high genetic mutation of the virus occurs
• What contributes to an unstable environment within a host?
1. Dynamic host immune response
2. Differential display of co-receptors
• If these responses...
1. worked to eliminate viral variants by randomly selecting against existing variants, viral diversity would be reduced and only a select few variants (those that were originally most numerous) would survive.
2. worked to eliminate viral variants by targeting only the most abundant viral variant in the population, this would cause reduction in viral load but would not reduce genetic diversity of population -- causing increased diversity overall as minority populations expand and mutate. This increased diversity increases the likelihood that viral variants with the ability to supersede the effects of the immune system will arise.
• In order to determine the type of selective forces and the efficiency of these responses on influencing viral evolution, it is important to examine patterns of diversity during HIV-1 evolution. By doing this, we can understand how HIV-1 adapts to the changing host environment and use this information in the development of possible therapeutics.
• Limitations of previous studies:
1. Previous studies of HIV-1 genetic evolution examined very small cohorts, characterized HIV-1 genetic variants using techniques that did not involve direct sequencing of patient samples, and analyzed viral variants in patient samples at few time points.
• This study examined HIV-1 evolution in 15 subjects during the time following seroconversion. HIV-1 evolution was analyzed at frequent intervals over 4 years.
1. Results show that different patterns of selection occur between non-progressor and progressor participants.
2. Results also show that higher levels of viral genetic diversity were associated with more rapid CD4+ T cell decline.

#### Material and Methods

• The Study Population:
1. 15 participants from cohort of IV drug users followed from the point of seroconversion of HIV-1
2. Participants had different levels of CD4 T cells
3. Rapid progressors: less than 200 CD4 T cells within two years
4. Moderate progressors: 200-650 CD4 T cells within 4 year period
5. Non-progressors: more than 650 CD4 T cells throughout observation period.
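
The three progressor categories above can be sketched as a small classification function. This is only an illustration of the thresholds as recorded in these notes, not the authors' procedure: the function name, the input format (pairs of years since seroconversion and CD4 count), and the handling of cases the notes do not cover (returned as "unclassified") are all assumptions.

```python
def classify_progressor(measurements):
    """Classify a participant from (years_since_seroconversion, cd4_count) pairs.

    Thresholds follow the outline notes; the edge-case handling is an assumption.
    """
    # Rapid: CD4 count falls below 200 within the first two years.
    if any(cd4 < 200 for years, cd4 in measurements if years <= 2):
        return "rapid"
    # Non-progressor: CD4 count stays above 650 at every recorded visit.
    if all(cd4 > 650 for _, cd4 in measurements):
        return "non-progressor"
    # Moderate: lowest count within four years lands in the 200-650 range.
    lowest = min((cd4 for years, cd4 in measurements if years <= 4), default=None)
    if lowest is not None and 200 <= lowest <= 650:
        return "moderate"
    # Anything else is not covered by the definitions in the notes.
    return "unclassified"


# Hypothetical visit data, made up for illustration only.
print(classify_progressor([(0.5, 700), (1.5, 150)]))
print(classify_progressor([(1, 800), (3, 500)]))
print(classify_progressor([(1, 900), (4, 700)]))
```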
• Sequencing of HIV-1 env Genes:
1. Nested PCR amplified 285 bp region of env gene from PBMCs: Primers for PCR contained BamHI and EcoRI restriction sites; Amplified sequences cloned into plasmid pUC19 and sequenced using Sanger chain termination method; The idea was to sequence the env gene for unique viral genomes and determine where mutations were occurring and also to see how many env genetic variants existed within a given patient.
1. Determined via RT-PCR
• Generation of Phylogenetic Trees:
1. Trees were constructed using MEGA computer package
2. Taxon labels show time each strain was isolated and the number of identical sequences in sample
• Correlation Analysis:
1. Purpose: to determine correlation between genetic diversity and CD4 T cell count one year later
• Determination of dS/dN ratios:
1. Purpose: to compare initial consensus sequence for each subject with each subsequently observed strain consensus sequence.
2. This allowed authors to see whether mutations were synonymous or non-synonymous
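
To make the synonymous/non-synonymous distinction concrete, here is a minimal sketch that compares codons between a reference sequence and an observed sequence using the standard genetic code. It only counts the two kinds of changes; it does not normalize by the number of potential sites, so it is not a full dS/dN calculation and is not the authors' method. The function names are hypothetical, and the sequences are assumed to be in-frame, ungapped, and limited to the letters A, C, G, and T.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G (first base slowest).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_substitution(codon_before, codon_after):
    """Return 'synonymous' if both codons encode the same amino acid."""
    before = CODON_TABLE[codon_before.upper()]
    after = CODON_TABLE[codon_after.upper()]
    return "synonymous" if before == after else "nonsynonymous"


def count_substitutions(reference, sequence):
    """Count changed codons as (synonymous, nonsynonymous).

    Simplification: a codon that differs at more than one position
    is still counted as a single change.
    """
    syn = nonsyn = 0
    usable = min(len(reference), len(sequence))
    for i in range(0, usable - 2, 3):
        ref_codon, seq_codon = reference[i:i + 3], sequence[i:i + 3]
        if ref_codon == seq_codon:
            continue
        if classify_substitution(ref_codon, seq_codon) == "synonymous":
            syn += 1
        else:
            nonsyn += 1
    return syn, nonsyn


# AGC (serine) to AGA (arginine) is the example from the vocabulary list.
print(classify_substitution("AGC", "AGA"))
print(count_substitutions("AGCAGCAAA", "AGAAGTAAA"))
```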
• Examination of Source of Greater Initial Visit Diversity in Subjects 9 and 15:
1. Why? High genetic variation was observed in subjects 9 and 15 at the first visit. Were they infected with two different viruses? No, monophyletic (same ancestor)
• Comparison of the Rate of Change of Divergence and Diversity:
1. The divergence/diversity of the viral genome was tracked over time for each individual
2. Participants were grouped into three groups by progressor type and their divergence/diversity over time averaged.

#### Results

• CD4 T cell decline patterns across participants were highly variable
• Non-progressor group: generally low viral load at early time points compared with other two groups
• Genetic sequence variants focused on env region near V3 (third hypervariable) domain.
1. Why this region? It is an important site of host-virus interaction and is a site of frequent mutation
• Changes in HIV-1 sequence over time quantified by:
1. The genetic diversity in env at each visit defined as the avg number of nuc differences b/w env sequences
2. Divergence, defined as the median % of nuc per env clone at a given visit that differed from the consensus env sequence from first visit
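
The two definitions above can be sketched in code. This is a simplified illustration that assumes the clones from a visit are already aligned to equal length with no gaps; the function names are hypothetical, and the toy sequences are made up for the example rather than taken from the paper.

```python
from collections import Counter
from itertools import combinations
from statistics import mean, median


def hamming(a, b):
    """Number of positions at which two aligned sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def consensus(sequences):
    """Most common nucleotide at each aligned position."""
    return "".join(
        Counter(column).most_common(1)[0][0] for column in zip(*sequences)
    )


def diversity(sequences):
    """Average number of nucleotide differences over all pairs of clones.

    Needs at least two sequences from the same visit.
    """
    return mean(hamming(a, b) for a, b in combinations(sequences, 2))


def divergence(sequences, first_visit_consensus):
    """Median percent of nucleotides per clone differing from the first-visit consensus."""
    length = len(first_visit_consensus)
    return median(
        100 * hamming(seq, first_visit_consensus) / length for seq in sequences
    )


# Toy aligned clones (hypothetical).
first_visit = ["ACGTAC", "ACGTAC", "ACGTAA"]
later_visit = ["ACGTAC", "ACCTAA", "TCGTAA"]

reference = consensus(first_visit)
print(reference)
print(diversity(later_visit))
print(divergence(later_visit, reference))
```

Diversity compares clones within a single visit to each other, while divergence compares each clone to a fixed starting point, so a population can diverge from the founder sequence without becoming more diverse.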
• Table 1:
1. CD4 count, virus copy number, annual rate of CD4 decline between participants
2. Rate of change in…
Diversity: -2.94 to 5.10 nt per clone per year
Divergence:  0.13% - 2.09% per clone per year

1. Viral homogeny:
13 of 15 subjects had homogenous virus upon initial visit
2 subjects had heterogeneous viruses on first visit -- dual infection? Recombinant viruses? Mistiming of seroconversion?

1. The ratio of the rate of synonymous mutations per potential site for that mutation type (dS) to the rate of non-synonymous mutations per potential site for that mutation type (dN) was measured for participants in each subject group.
Rapid and moderate progressors: median dS/dN ratio of 0.4 -- meaning that mutations were not accumulating randomly; selection favored NS mutations.
Non-progressors: 1.6 -- meaning mutation may be occurring randomly without selection, but shows a trend toward selection against NS mutations.
Non-progressors distinguished from progressor status by whether or not selection favored NS mutations

• Fig 2:
1. Diversity and divergence increased over time in all three progressor categories
2. Increase in diversity and divergence per year was greater in rapid progressor than moderate progressor, which was greater than the non-progressor group. Sig difference b/w rapid progressor and non-progressor in increase in divergence and diversity over time. Sig differences b/w non-progressors and moderate progressor in diversity. All other differences = trends and were non-sig.
• Correlation b/w viral genetic diversity/divergence at a given visit and CD4 T cell count decline over subsequent year was examined.
1. Diversity and divergence significantly negatively correlated with CD4 T cell count 12 mos later. This means that as viral diversity and divergence increased, CD4 T cell count 12 mos later decreased.
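
A negative correlation of this kind can be illustrated with a short calculation. The sketch below uses the Pearson coefficient as an assumption, since these notes do not record which correlation statistic the authors used, and the numbers are invented purely to show the direction of the relationship; they are not data from the study.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / sqrt(spread_x * spread_y)


# Made-up values: diversity at a visit, and CD4 count 12 months later.
diversity_at_visit = [1.0, 2.0, 3.0, 4.0]
cd4_one_year_later = [800, 600, 500, 200]

r = pearson(diversity_at_visit, cd4_one_year_later)
print(r)  # a value below zero indicates a negative correlation
```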
• Fig 3:
1. Phylogenetic trees from 10 of 15 participants showed no evidence of single strain variant persisting over time
Viral variants start off similar in first few visits and become more genetically different in subsequent visits.
Limited progression pattern -- viral variants move away from original variants and return later to those similar variants -- seen in four randomly selected individuals from cohort shown in figure


#### Discussion

• Higher HIV-1 diversity and divergence associated with greater decline in CD4 count
• NS mutations were seen three times as frequently in progressors as in non-progressors
1. Pattern: Viral strains in non-progressors showed selection against NS mutations while viral strains in progressors showed selection for NS mutation. Inconsistent with model which contends that progression occurs when most fit initial viral strain proliferates rapidly upon initial infection
• Two previous studies of association b/w genetic variation and disease progression - conflicting results
1. McDonald et al.: Found greater genetic diversity in portion of rapid progressors, but intra-visit diversity in rapid progressors less than that observed in slow progressors -- not consistent with findings of current study.
Explanation for difference? McDonald study participants not followed from time of seroconversion and fewer time points analyzed.

1. Wolinsky et al: observed less viral diversity in two subjects with most rapidly declining CD4 count as compared to participants with more slowly declining CD4 count. Not consistent with current study.
Explanation for difference? May be an exceptional circumstance in which patients could not develop an effective immune response to virus.  Also, Wolinsky had small sample size.

• Nowak model:
1. Current study results consistent with this model which proposes that increasing viral genetic diversity leads to increase in CD4 T cell decline.
2. Nowak hypothesized that reason for this is that evolution of virus ultimately results in variants that are outside of T cell repertoire of host, meaning that the T cell-mediated immune response cannot respond to the viral infection
If this is true, should expect to see increased viral diversity followed by decreased diversity limited to select number of variants outside of T cell repertoire.
Current study did not see this
Current study sees evolution consistent with frequency-dependent selection or with independent evolution of viral variants from different sites in body (model proposed by Sala et al.)

• Immune system seems to be targeting most frequent viral variants (frequency-dependent selection) instead of a broad range of viral variants, which may explain the persistent increase in viral genetic diversity. Immune system is not evolving to fight HIV-1 infection in the same way that HIV-1 is evolving to overcome the immune response.
• Does immune response respond more effectively in non-progressors?
1. Maybe not, might just be because particular HIV-1 not replicating to some critical threshold required for targeting by immune response. Selection against NS mutations may be preferred because these may be more replicatively competent and at the same time more recognizable by host immune system.