Beauchamp:CIMS: Difference between revisions

Latest revision as of 06:19, 9 December 2013

Fitting the Causal Inference of Multisensory Speech Model to Data

System setup

  1. Install GNU R
  2. Download the R code zip file
  3. Extract the zip file to your Desktop/ or other preferred location

Data setup

The model code assumes the data are stored as a matrix with one row per subject and one column per asynchrony. The first row of the file holds the column labels. Each cell stores the number of times the subject judged the stimulus at that asynchrony to be synchronous. With 17 subjects and 15 asynchronies, the file has 18 rows (the first is the header row) and 15 columns. For multi-condition experiments, place each asynchrony/condition combination in a separate column; the asynchronies within a condition must be contiguous. If the study has 4 conditions and 15 asynchronies, columns 1-15 are treated as condition 1, columns 16-30 as condition 2, and so on. See data.csv in the code pack for the data used in Magnotti, Ma, & Beauchamp (2013).
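As a rough illustration of this layout, the R sketch below simulates counts for a hypothetical study with 17 subjects, 2 conditions and 15 asynchronies, writes them to a temporary CSV file, and reads them back with the same read.csv call used in fit_models.R. The column labels, the number of conditions and the response probabilities are assumptions made up for the example; they are not taken from data.csv.

```r
# Hypothetical study dimensions (assumptions for this sketch only)
n_subjects   <- 17
n_conditions <- 2
max_count    <- 12
asyncs <- c(-300, -267, -200, -133, -100, -67, 0, 67, 100, 133, 200, 267, 300, 400, 500)

set.seed(1)
# Made-up probability of a "synchronous" response at each asynchrony
p_sync <- exp(-((asyncs - 50) / 200)^2)

# One row per subject; the matrix is filled column by column, so
# columns 1-15 belong to condition 1 and columns 16-30 to condition 2
cell_prob  <- rep(rep(p_sync, each = n_subjects), times = n_conditions)
sim_counts <- matrix(rbinom(length(cell_prob), size = max_count, prob = cell_prob),
                     nrow = n_subjects)

# Arbitrary column labels; the first row of the file is only used as labels
colnames(sim_counts) <- paste0("c", rep(seq_len(n_conditions), each = length(asyncs)),
                               "_a", rep(seq_along(asyncs), times = n_conditions))

# Write the file, then read it back as a numeric matrix
csv_path <- tempfile(fileext = ".csv")
write.csv(sim_counts, csv_path, row.names = FALSE)
count_mat <- as.matrix(read.csv(file = csv_path))

dim(count_mat)  # subjects x (conditions * asynchronies)
```

The file on disk has one more line than the matrix has rows, because the header line is consumed as column labels when the file is read.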

If you are unfamiliar with R, the best approach is to run through all the model building steps using the included data.csv file, and then try with your own data.

Program setup

We need to ensure R can find the data and code files.

1. Launch R.

2. Open the file fit_models.R: File -> Open Document

3. Make 4 changes before running the code. See the comments in the code file for additional direction.

Set the path to the location of the downloaded files. If you extracted the code pack to your desktop, the path may already be correct.

  setwd('~/Desktop/cims_code_pack/')

Set the location of the data file to be fit.

  count_mat = as.matrix(read.csv(file='data.csv') )

Set the value of max_count to be the total number of trials at each asynchrony for each condition.

  max_count = 12

Set the asyncs used in each condition. The order of the asynchronies must match the order in the data file. For multi-condition experiments, list the asynchronies only once.

  asyncs = c(-300, -267, -200, -133, -100, -67, 0, 67, 100, 133, 200, 267, 300, 400, 500)

4. Run the setup code to make sure there are no errors.

  1. Highlight lines 1 through 16 using the mouse.
  2. Execute the code by using the R menu: Edit -> Execute
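Before fitting, it can help to confirm that the three settings agree with each other. The sketch below is one possible check, not part of the code pack: check_setup is a hypothetical helper, and the count matrix is random stand-in data so the example runs on its own. With real data, count_mat, max_count and asyncs would come from fit_models.R.

```r
# Stand-in values; in practice these are set in fit_models.R
asyncs    <- c(-300, -267, -200, -133, -100, -67, 0, 67, 100, 133, 200, 267, 300, 400, 500)
max_count <- 12
set.seed(2)
count_mat <- matrix(sample(0:max_count, 17 * 30, replace = TRUE), nrow = 17)

# Hypothetical helper (not in the code pack): stops with an error if the
# settings are inconsistent, otherwise returns the implied number of conditions
check_setup <- function(count_mat, max_count, asyncs) {
  stopifnot(
    is.numeric(count_mat),                    # counts must be numbers
    !anyNA(count_mat),                        # no missing cells
    all(count_mat >= 0, count_mat <= max_count),  # counts cannot exceed trials
    ncol(count_mat) %% length(asyncs) == 0    # whole number of conditions
  )
  ncol(count_mat) / length(asyncs)
}

n_conditions <- check_setup(count_mat, max_count, asyncs)
n_conditions
```

A count larger than max_count or a column total that is not a multiple of length(asyncs) usually points to a mismatch between the data file and the settings rather than a problem with the model.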

Fitting the model

Highlight and execute each of the following lines in turn

  cl = makeCluster(detectCores())
  # This takes about 15 seconds per repetition on a fast computer
  cims.model = cims(n.reps=512)
  gauss.model = gauss(n.reps=512)
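For readers new to R's parallel tools, the sketch below shows the general life cycle of a cluster like the one created above. It assumes makeCluster and detectCores come from R's bundled parallel package; how the code pack uses the cluster internally is not shown here. The worker count of 2 and the toy squaring task are choices made only for this example.

```r
library(parallel)

# A small fixed number of workers for this sketch; the page uses detectCores()
n_workers <- 2
cl <- makeCluster(n_workers)

# Run a toy task on the workers: square each input value
squares <- parLapply(cl, 1:4, function(x) x^2)

# Release the worker processes when finished
stopCluster(cl)

unlist(squares)
```

Calling stopCluster once all fitting is done frees the background R processes, which otherwise stay alive until the R session ends.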

Model Parameters

  1. The resulting parameters for each model are saved to cims_out.csv and gauss_out.csv.
  2. The predicted values for each model are saved to cims_predicted.csv and gauss_predicted.csv.

Model Comparisons

   # Make sure these values match the number of free parameters (degrees of freedom) of each model
   n.par.cims = 8
   n.par.gauss = 12
   # Calculate the number of conditions
   n.conditions = ncol(count_mat) / length(asyncs)
   # Calculate the BIC for each model; the separate=T argument returns the log likelihood for each condition
   BIC.c = -2*logLik(cims.model, separate=T) + (log(max_count*ncol(count_mat)) * n.par.cims) / n.conditions
   BIC.g = -2*logLik(gauss.model, separate=T) + (log(max_count*ncol(count_mat)) * n.par.gauss) / n.conditions
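To make the arithmetic in these lines concrete, the sketch below repeats the same BIC calculation with made-up numbers in place of fitted models. The per-condition log likelihoods are invented for illustration, and the sketch assumes that logLik(..., separate=T) returns one value per condition; n_columns stands in for ncol(count_mat) and n_asyncs for length(asyncs).

```r
# Settings matching the two-condition example on this page
max_count    <- 12
n_columns    <- 30   # stands in for ncol(count_mat)
n_asyncs     <- 15   # stands in for length(asyncs)
n.par.cims   <- 8
n.par.gauss  <- 12
n.conditions <- n_columns / n_asyncs

# Invented per-condition log likelihoods (NOT results from the paper)
lnL.cims  <- c(-410.2, -398.7)
lnL.gauss <- c(-415.9, -401.3)

# Penalty term: log(number of observations) * parameters, shared across conditions
penalty <- function(n.par) log(max_count * n_columns) * n.par / n.conditions

BIC.c <- -2 * lnL.cims  + penalty(n.par.cims)
BIC.g <- -2 * lnL.gauss + penalty(n.par.gauss)

# Positive differences favor the CIMS model (lower BIC is better)
BIC.g - BIC.c
```

Because the total penalty is divided by the number of conditions, summing the per-condition BIC values gives back the usual whole-data BIC for each model.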

Advanced Functions

We mention here some other useful functions for those comfortable analyzing data with the R language. These functions assume you have run all the code in the previous section. Intrepid users are encouraged to let the source be their guide.

Load previously fitted models

  cims.model = load_cims_model('cims_out.csv')
  gauss.model = load_gauss_model('gauss_out.csv')

Obtain predicted values

  cims.p = predict(cims.model)
  gauss.p = predict(gauss.model)

Plot model fits

  #create proportion data from count data
  actual = count_mat / max_count
  #create condition indices
  c1 = 1:15
  c2 = 16:30
  #plot conditions (see function in model_plotters.R for further plot customization)
  plotMeanWithFitted(asyncs, actual[,c1], cims.p[,c1], gauss.p[,c1])
  plotMeanWithFitted(asyncs, actual[,c2], cims.p[,c2], gauss.p[,c2])
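The hard-coded indices c1 and c2 fit a study with 15 asynchronies and 2 conditions. For other designs, the column indices can be derived from the data dimensions instead. The sketch below is one way to do this in base R; n_columns and n_asyncs are stand-ins for ncol(count_mat) and length(asyncs), and cond_cols is a name introduced here for illustration.

```r
n_asyncs  <- 15   # stands in for length(asyncs)
n_columns <- 30   # stands in for ncol(count_mat)

# Label each column with its condition number, then group the column indices
condition_id <- rep(seq_len(n_columns / n_asyncs), each = n_asyncs)
cond_cols    <- split(seq_len(n_columns), condition_id)

c1 <- cond_cols[[1]]   # same columns as 1:15
c2 <- cond_cols[[2]]   # same columns as 16:30

# Example use with stand-in proportion data: mean proportion per asynchrony
set.seed(3)
actual  <- matrix(runif(17 * n_columns), nrow = 17)
mean_c1 <- colMeans(actual[, c1])
length(mean_c1)
```

With this approach, a loop over cond_cols can call the plotting function once per condition without editing the indices by hand.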

Obtain model log likelihoods

  cims.nlnL = -logLik(cims.model, separate=T)
  gauss.nlnL = -logLik(gauss.model, separate=T)

NB: Remove the separate=T argument to return a logLik object representing the total log likelihood. The generic functions AIC and BIC will also work (e.g., AIC(cims.model)), but check that the degrees of freedom reported by the function are appropriate for your model:

  attr(logLik(cims.model), 'df')
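The role of the df attribute can be seen with any model that has a logLik method. The sketch below uses a simple linear regression on R's built-in cars dataset, not the CIMS model, to show how AIC and BIC are built from the log likelihood and its reported degrees of freedom.

```r
# A stand-in model: linear regression on a dataset that ships with R
fit <- lm(dist ~ speed, data = cars)

ll <- logLik(fit)
df <- attr(ll, "df")   # parameters counted: intercept, slope, residual variance

# Rebuild the information criteria by hand from the log likelihood
aic_by_hand <- -2 * as.numeric(ll) + 2 * df
bic_by_hand <- -2 * as.numeric(ll) + log(nobs(fit)) * df

# Both comparisons should report TRUE
all.equal(aic_by_hand, AIC(fit))
all.equal(bic_by_hand, BIC(fit))
```

If the df attribute of a fitted model does not match the number of free parameters you intend to count, the values from AIC and BIC will be off by the corresponding penalty, so computing them by hand as above is a reasonable fallback.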

Contact Information

If you run into trouble with any step, please contact me: john dot magnotti at gmail dot com. If the model fitting fails to converge for your dataset, you may need to send me at least a portion of the data so I can replicate the error.

Copyright/Licensing


This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License.

If you find this code useful, please cite our work:

Magnotti JF, Ma W and Beauchamp MS (2013). Causal inference of asynchronous audiovisual speech. Front. Psychol. 4:798. doi: 10.3389/fpsyg.2013.00798