# 20.109(S13):Assess protein function (Day8)

From OpenWetWare

Revision as of 18:45, 3 February 2013 by AgiStachowiak

## Introduction

## Protocols

### Part 4: Analysis

Begin by applying the practice analysis from Day 3, Part 5 of this module to your real data. Recall that there you plotted titration curves in Excel and made a first crude estimate of $K_D$ values. Next, you will use MATLAB to get improved estimates of the $K_D$ values and also assess cooperativity. **The MATLAB code is now up to date.**

#### Preparation

- Download these three files: S12_Fit_Main, Fit_SingleKD, and Fit_KDn. If using a computer in 16-336, the files will be available in the *Downloads* folder under your username. Move them to the username/Documents/MATLAB folder on your PC.
- Double-click on the MATLAB icon to start up this software.
- The main window that opens is called the command window: here is where you run programs (or directly input commands) and view outputs. You can also see and access the command history, workspace, and current directory windows, but you likely won’t need to today.
- In the command window, type *more on*; this command allows you to scroll through multi-page output (using the spacebar), such as help files.
- In addition to the command area, MATLAB comes with an editor. Click *File* → *Open* and select the program **S12_Fit_Main**. It has the .m extension and thus is executable by MATLAB. Read the introductory comments (the beginning of a comment is indicated by a % sign), and then input your fluorescence data.
- Read through the program, and as you encounter unfamiliar terms, return to the command window and type *help functionname*. Feel free to ask questions of the teaching faculty as well.
    - You might read about such built-in functions as *logspace* and *nlinfit*.
    - You will also want to open and read **Fit_SingleKD**, a user-defined function called by **S12_Fit_Main**, in the MATLAB editor.
    - If you type *help function* you will learn the syntax for a function header.
    - Note that a dot preceding an operator (such as A ./ B or A .* B) is a way of telling MATLAB to perform element-by-element rather than matrix algebra.
    - Also note that when a line of code is *not* followed by a semi-colon, the value(s) resulting from the operation will be displayed in the command window.
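The language features mentioned in the list above can be tried directly in the command window. The following is a minimal sketch: the variable names and numbers are illustrative only and do not come from S12_Fit_Main.

```matlab
% Illustrative values only; none of these names appear in the course files.

% No trailing semicolon, so MATLAB displays the result:
% 5 points from 1e-9 to 1e-5, evenly spaced on a log scale.
conc = logspace(-9, -5, 5)

A = [1 2; 3 4];
B = [10 20; 30 40];

elementwise = A .* B;   % multiplies matching entries: [10 40; 90 160]
matrixprod  = A * B;    % true matrix multiplication:  [70 100; 150 220]
ratio       = B ./ A;   % element-by-element division: [10 10; 10 10]

% The semicolons above suppressed output, so display one result explicitly.
disp(elementwise)
```

Comparing `elementwise` with `matrixprod` shows why the dot matters when a model is evaluated over a whole vector of concentrations at once.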

#### Analysis

- Once you more-or-less follow Part 1 of the program, type **S12_Fit_Main** in the command window, hit return to run the program, and consider the following questions:
    - Why must the fluorescence data be transformed (from *S* to *Y*) prior to use in the model?
    - What $K_D$ values are output in the command window, and how do they compare to the values you estimated from your Excel plots?
    - Figure 1 should display your wild type and mutant data points and model curves. How do they look in comparison to the curves you plotted in Excel?
    - Figure 2 should display the residuals (difference between data and model) for your three proteins. If the absolute values are low, this indicates good numerical agreement between the model and the data. Whether or not this is the case, also check whether the residuals are evenly and randomly distributed about the zero-line. If there is a pattern to the errors, there is likely a systematic difference between the data and the model, and thus the model does not reflect the actual binding process well. What are the residuals like for each of your modeled proteins?
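As a rough illustration of what a single-$K_D$ fit and its residuals involve, here is a minimal sketch on synthetic data. Several things in it are assumptions rather than the contents of S12_Fit_Main or Fit_SingleKD: the model form $Y = [L]/(K_D + [L])$, the min-max rescaling of *S* to *Y* (which also assumes the signal rises with ligand), all variable names, and the use of `fminsearch` in place of `nlinfit` so that only base MATLAB is required.

```matlab
% Synthetic example; all names and numbers are hypothetical, not from S12_Fit_Main.
L = logspace(-9, -3, 13);                 % ligand concentrations (M)
KD_true = 1e-6;                           % value used to generate the fake data
S = 200 + 800 * L ./ (KD_true + L);       % fake raw signal: baseline 200, span 800

% Assumed transformation: rescale raw signal S to fractional saturation Y in [0, 1].
Y = (S - min(S)) ./ (max(S) - min(S));

% Assumed single-site binding model.
model = @(KD, conc) conc ./ (KD + conc);

% Least-squares fit, searching over log10(KD) to keep the search well scaled.
sse = @(logKD) sum((Y - model(10.^logKD, L)).^2);
logKD_fit = fminsearch(sse, -5);          % starting guess: KD = 1e-5 M
KD_fit = 10^logKD_fit;

% Residuals: data minus model at each concentration.
resid = Y - model(KD_fit, L);
semilogx(L, resid, 'o');                  % residuals vs. concentration on a log axis
xlabel('[L] (M)'); ylabel('Y_{data} - Y_{model}');
```

Because the fake data here are generated from the same model that is fitted, the residuals come out small and patternless; real data that follow a different binding process would instead show a systematic trend.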
- Now move on to Part 2 of the **S12_Fit_Main** program. Part 2 also fits the data to a model with a single, ‘apparent’ value of $K_D$, but it allows for multiple binding sites and tests for cooperativity among them. The parameter used to measure cooperativity is called the Hill coefficient. A Hill coefficient of 1 indicates independent binding sites, while greater or lesser values reflect positive or negative cooperativity, respectively. Let the following questions guide you as you proceed:
    - Visually, which model appears to fit your wild-type data better (Fig. 3 vs. Fig. 1)? Your mutant data?
    - Do the respective residuals support your qualitative assessment (Fig. 4 vs. Fig. 2)?
    - Numerically, how do the values of $K_D$ compare for the two models? How does the value of *n* compare to the implicitly assumed value in Part 1?
    - Do you see changes in binding affinity and/or cooperativity between the wild-type, E67K/T79P/M124S, and X#Z samples? Do they match your *a priori* predictions?
    - **Don't forget to save any figures you want to use in your report!** If the legends are covering up your data, you can simply move them over with your mouse.
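A fit with an apparent $K_D$ and a Hill coefficient can be sketched in the same spirit. This is an illustration under stated assumptions, not the contents of Fit_KDn: the Hill form $Y = [L]^n/(K_D^n + [L]^n)$, the synthetic noise-free data, the variable names, and the use of `fminsearch` (base MATLAB) are all choices made here.

```matlab
% Synthetic example; names, numbers, and model form are assumptions, not from Fit_KDn.
L = logspace(-9, -3, 13);                        % ligand concentrations (M)
KD_true = 1e-6;                                  % apparent KD used to make fake data
n_true  = 2;                                     % Hill coefficient used to make fake data
Y = L.^n_true ./ (KD_true^n_true + L.^n_true);   % noise-free fractional saturation

% Assumed Hill model; parameter vector p = [log10(KD), n].
hill = @(p, conc) conc.^p(2) ./ ((10^p(1))^p(2) + conc.^p(2));

% Sum of squared residuals, minimized over both parameters at once.
sse = @(p) sum((Y - hill(p, L)).^2);
p_fit = fminsearch(sse, [-5, 1]);                % start at KD = 1e-5 M, n = 1

KD_fit = 10^p_fit(1);
n_fit  = p_fit(2);
fprintf('KD = %.3g M, n = %.2f\n', KD_fit, n_fit);
```

In this assumed form, fixing `n` at 1 gives back a single-site curve, which is one way to think about how the two models relate.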

- Finally, you can skim Part 3 of the **S12_Fit_Main** program. Don’t worry too much about the coding details, but do read through the comments.
    - Look at Part 1 of Figure 5: are the binding curves asymptotic, sigmoidal, or other? What does this shape indicate? You can use the zoom button to get a closer look at part of the plot, or the *axis* command present in the code. (Don't worry too much about this question if it is unclear.)
    - Now look in the command window. What values of *n* do you get for your three proteins? How do the
    - Comparing the model and data points by eye (Part 2 of Figure 5), do you think it is a good model for any of your proteins? If so, which ones? What experimental limitations might prevent Hill analysis from working well, especially for some mutants?
    - Why should only the transition region be analyzed in a Hill plot?
    - What is the relationship between slope and *n*, and intercept and *n*?
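For orientation, a Hill plot can be sketched as a straight-line fit of $\log_{10}[Y/(1-Y)]$ against $\log_{10}[L]$. The example below is hypothetical: it assumes the Hill form $Y = [L]^n/(K_D^n + [L]^n)$, uses synthetic noise-free data, and picks an arbitrary 10% to 90% saturation window as the transition region. Part 3 of S12_Fit_Main may define these differently.

```matlab
% Synthetic example; names, numbers, and the 10-90% window are assumptions.
L = logspace(-8, -4, 33);                        % ligand concentrations (M)
KD_true = 1e-6;
n_true  = 2;
Y = L.^n_true ./ (KD_true^n_true + L.^n_true);   % noise-free fractional saturation

% Keep only the transition region, where Y is well away from 0 and 1.
inTransition = (Y > 0.1) & (Y < 0.9);
x = log10(L(inTransition));
y = log10(Y(inTransition) ./ (1 - Y(inTransition)));

% Straight-line fit: coeffs(1) is the slope, coeffs(2) the intercept.
coeffs = polyfit(x, y, 1);
n_est  = coeffs(1);                              % slope of the Hill plot
KD_est = 10^(-coeffs(2) / coeffs(1));            % from intercept = -n * log10(KD)

plot(x, y, 'o', x, polyval(coeffs, x), '-');
xlabel('log_{10}[L]'); ylabel('log_{10}[Y/(1-Y)]');
```

Near the plateaus, `Y` or `1 - Y` approaches zero, so small measurement errors are greatly magnified in the ratio; restricting the fit to the middle of the curve avoids this.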
- If your mutant proteins are not well-described by any of the models so far, what kind of model(s) (qualitatively speaking) do you think might be useful?
- Optional: If your data might be well-described by a model with two $K_D$ values (or if you are interested in exploring some sample data that are), download and run Fit_TwoKD and Fit_TwoKD_Func.