# Non: Week 13

## Purpose

The purpose of this week's lab was to use various tools to explore the structure of the SARS-CoV-2 spike protein.

## Combined Methods/Results

1. First, I copied the FASTA sequence of the SARS-CoV-2 spike protein from the Week 13 Protocol into the NCBI Open Reading Frame Finder.
• The website generated the following screenshot:
• The most likely reading frame is the first one, ORF1, because it encompasses the entire gene.
• This was confirmed by checking the NCBI protein record, which lists the same protein sequence as the translated reading frame.
2. Next, I went to the UniProt database to look up information on the SARS-CoV protein.
• Searching for "SARS-CoV" generated 833 results.
• I looked at P59594 as a reference to see what types of information could be accessed.
• That entry page provides information about the protein's function, taxonomy, subcellular location, pathology, PTM/processing, interactions, structure, family and domains, sequence, and similar proteins.
3. Then, I used PredictProtein to predict the structure of the surface glycoprotein of SARS-CoV-2 mentioned in Data S1 of the Walls et al. article.
• It generated the following screenshot:
 >YP_009724390.1 surface glycoprotein [Severe acute respiratory syndrome coronavirus 2]
SGTNGTKRFDNPVLPFNDGVYFASTEKSNIIRGWIFGTTLDSKTQSLLIVNNATNVVIKVCEFQFCNDPF
LGVYYHKNNKSWMESEFRVYSSANNCTFEYVSQPFLMDLEGKQGNFKNLREFVFKNIDGYFKIYSKHTPI
NLVRDLPQGFSALEPLVDLPIGINITRFQTLLALHRSYLTPGDSSSGWTAGAAAYYVGYLQPRTFLLKYN
ENGTITDAVDCALDPLSETKCTLKSFTVEKGIYQTSNFRVQPTESIVRFPNITNLCPFGEVFNATRFASV
YNYKLPDDFTGCVIAWNSNNLDSKVGGNYNYLYRLFRKSNLKPFERDISTEIYQAGSTPCNGVEGFNCYF
PLQSYGFQPTNGVGYQPYRVVVLSFELLHAPATVCGPKKSTNLVKNKCVNFNFNGLTGTGVLTESNKKFL
PTWRVYSTGSNVFQTRAGCLIGAEHVNNSYECDIPIGAGICASYQTQTNSPRRARSVASQSIIAYTMSLG
AENSVAYSNNSIAIPTNFTISVTTEILPVSMTKTSVDCTMYICGDSTECSNLLLQYGSFCTQLNRALTGI
LGDIAARDLICAQKFNGLTVLPPLLTDEMIAQYTSALLAGTITSGWTFGAGAALQIPFAMQMAYRFNGIG
VTQNVLYENQKLIANQFNSAIGKIQDSLSSTASALGKLQDVVNQNAQALNTLVKQLSSNFGAISSVLNDI
LSRLDKVEAEVQIDRLITGRLQSLQTYVTQQLIRAAEIRASANLAATKMSECVLGQSKRVDFCGKGYHLM
SFPQSAPHGVVFLHVTYVPAQEKNFTTAPAICHDGKAHFPREGVFVSNGTHWFVTQRNFYEPQIITTDNT
FVSGNCDVVIGIVNNTVYDPLQPELDSFKEELDKYFKNHTSPDVDLGDISGINASVVNIQKEIDRLNEVA
KNLNESLIDLQELGKYEQYIKWPWYIWLGFIAGLIAIVMVTIMLCCMTSCCSCLKGCCSCGSCCKFDEDD
SEPVLKGVKLHYT
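
Before submitting a record like the one above to a prediction server, it can help to confirm exactly what is being sent. The following is a minimal sketch in plain Python, not part of the Week 13 Protocol: it parses the first record of FASTA-formatted text and reports the header, the sequence length, and the most common residues. The function name `parse_fasta` and the short example record are hypothetical stand-ins.

```python
from collections import Counter


def parse_fasta(text):
    """Return (header, sequence) for the first record in FASTA-formatted text."""
    header = None
    chunks = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                break  # a second record starts here; stop reading
            header = line[1:].strip()
        elif header is not None:
            chunks.append(line)
    if header is None:
        raise ValueError("no FASTA header line found")
    return header, "".join(chunks)


# Hypothetical toy record; replace with the text of the real record.
example = """>toy_protein example record
MKTAYIAK
QRQISFVK
"""

header, seq = parse_fasta(example)
print(header)
print(len(seq))
print(Counter(seq).most_common(3))
```

Running the same function on the full record text gives the residue count that the prediction server should also report.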

4. Finally, I viewed the structure of the glycoprotein from the article, specifically 6VYB.
• The following screenshot of the structure was generated, mirroring the perspective of Fig. 3D of the Walls et al. article:
• The N terminus is A46, while the C terminus is S1146.

• The sequence chain image from RCSB shows all of the secondary structure features. These features do not closely match the prediction from PredictProtein.

• The article mentions 14 amino acids that are important for ACE2 binding: T402, R426, Y436, Y440, Y442, L472, N473, Y475, N479, Y484, T486, T487, G488, and Y491.
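
Residue labels such as A46 or T402 combine a one-letter amino acid code with a 1-based position, so they can be checked against a sequence mechanically. The sketch below assumes the sequence is numbered from 1 with no gaps, which may not hold for a deposited structure or a truncated record. The function name `check_residues` and the toy sequence are hypothetical.

```python
import re


def check_residues(sequence, labels):
    """Map each label such as 'T402' to True, False, or None.

    True:  the sequence has that residue at that 1-based position.
    False: a different residue is at that position.
    None:  the position lies outside the sequence.
    """
    results = {}
    for label in labels:
        match = re.fullmatch(r"([A-Z])(\d+)", label)
        if match is None:
            raise ValueError(f"unrecognized residue label: {label!r}")
        residue, position = match.group(1), int(match.group(2))
        if position < 1 or position > len(sequence):
            results[label] = None
        else:
            # Convert the 1-based label position to a 0-based string index.
            results[label] = sequence[position - 1] == residue
    return results


# Hypothetical toy sequence, numbered M1, K2, T3, A4, Y5, ...
toy = "MKTAYIAKQR"
print(check_residues(toy, ["M1", "T3", "Y4", "R10", "G42"]))
```

A `False` or `None` result is a prompt to check which numbering scheme the labels follow rather than proof that the label is wrong.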

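The reading-frame comparison in step 1 can also be sketched in a few lines. This is a simplified illustration, not how NCBI's ORF Finder is implemented: it scans only the three forward frames, requires an ATG start codon, and uses the standard stop codons TAA, TAG, and TGA. The function names and the toy DNA string are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(dna):
    """Yield (frame, start, end) for each ATG...stop ORF in the forward frames.

    frame is 1, 2, or 3; start and end are 0-based string indices,
    end is exclusive, and the stop codon is included in the span.
    """
    dna = dna.upper()
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first in-frame start codon opens an ORF
            elif start is not None and codon in STOP_CODONS:
                yield frame + 1, start, i + 3
                start = None  # look for the next ORF in this frame


def longest_orf(dna):
    """Return the longest forward ORF as (frame, start, end), or None."""
    return max(find_orfs(dna), key=lambda orf: orf[2] - orf[1], default=None)


# Hypothetical toy sequence with one short ORF in frame 1 and a longer one in frame 3.
toy = "CCATGAAATTTGGGTAACATGCCCTAG"
best = longest_orf(toy)
print(best)
if best is not None:
    print(toy[best[1]:best[2]])
```

Picking the longest ORF mirrors the reasoning used above, where the frame that covers the whole gene was taken as the most likely one.
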
## Research Question

1. What question will you answer about sequence-->structure-->function relationships in a SARS-CoV-2 protein?
• What makes ACE2 the optimal receptor for SARS-CoV-2/SARS-CoV binding?
2. What sequences will you use? I want you to take advantage of sequence data available to perform a multiple sequence alignment as part of your project.
• SARS-CoV-2 spike protein; SARS-CoV spike protein
• different ACE2 sequences, from different organisms?
• other common receptors utilized by viruses
3. What protein tools will you use for analysis and answering your question?
• RCSB, UniProt, Phylogeny.fr
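
Once the sequences are aligned, pairwise percent identity is a simple first measure to compare them. The sketch below assumes the alignment is already available as equal-length strings with `-` marking gaps; the sequence names and the toy alignment are hypothetical. Columns where either sequence has a gap are skipped, which is only one of several conventions for computing identity.

```python
from itertools import combinations


def percent_identity(a, b):
    """Percent of identical residues over columns where neither sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for x, y in zip(a, b):
        if x == "-" or y == "-":
            continue  # skip gapped columns
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


# Hypothetical toy alignment; real rows would come from the alignment tool's output.
alignment = {
    "seq_A": "MKT-AYIAK",
    "seq_B": "MKTGAYLAK",
    "seq_C": "MRT-AYLA-",
}

for name1, name2 in combinations(alignment, 2):
    identity = percent_identity(alignment[name1], alignment[name2])
    print(name1, name2, round(identity, 1))
```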

## Scientific Conclusion

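There are a variety of free online tools that allow for in-depth analysis of protein structure and sequence. Predicting the structure of a protein remains difficult: the PredictProtein output did not closely match the secondary structure features shown for 6VYB on RCSB.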

## Acknowledgements

• I worked with my partners Jenny and Carolyn in figuring out what our topic of analysis should be.
• I used the Week 13 Protocol for this assignment.
• Except for what is noted above, this individual journal entry was completed by me and not copied from another source.

Non (talk) 22:35, 22 April 2020 (PDT)