Beauchamp:TensorECOG

Tensor Analysis for ECoG Data

POC: Kelly Geyer, kelly.l.geyer@rice.edu

ECoG Tensor Decomposition

Analogous to regularized PCA, RHOPCA helps identify influential components and relationships between trials, electrodes, frequencies, and time. Given ECoG data in the form of a tensor X, we can find K vectors that correspond to the strongest patterns in each mode. This allows us to analyze modes individually or in combination. We compute decompositions using a variant of the CP decomposition, which decomposes X into a sum of outer products between vectors, as shown below.

[Figure: Ecog cp decomp.png, the CP decomposition of the tensor into a sum of outer products]
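In symbols, the decomposition can be written roughly as follows. This is a sketch in notation we assume here to match the factor names u, v, w, t and the scaling constants D used elsewhere on this page: the tensor is the (trials x electrode x frequency x time) data, the circle denotes the vector outer product, and d_k is the scaling constant of the k-th component.

```latex
\mathcal{X} \;\approx\; \sum_{k=1}^{K} d_k \; u_k \circ v_k \circ w_k \circ t_k ,
\qquad
\mathcal{X}_{ijlm} \;\approx\; \sum_{k=1}^{K} d_k \, u_{k,i} \, v_{k,j} \, w_{k,l} \, t_{k,m}
```

Here u_k, v_k, w_k, and t_k are the k-th factors for trials, electrodes, frequencies, and time, respectively.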

We ensure that factors are interpretable by computing u, v, w, and t using a regularized version of the tensor power algorithm that solves the following optimization problem:

[Figure: Opteqn.png, the regularized optimization problem]
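For readers without access to the image, one plausible form of the rank-one problem is sketched below. It follows the sparse and functional higher-order PCA formulations of references 4 and 5; the exact expression in the original figure may differ. The symbols are assumptions: the lambdas are the L1 penalties for trials, electrodes, and frequencies, Omega is the time-smoothing matrix, and alpha is a smoothing coefficient that can also be absorbed into Omega.

```latex
\begin{aligned}
\underset{u,\,v,\,w,\,t}{\text{maximize}} \quad
  & \mathcal{X} \times_1 u \times_2 v \times_3 w \times_4 t
    \;-\; \lambda_u \lVert u \rVert_1
    \;-\; \lambda_v \lVert v \rVert_1
    \;-\; \lambda_w \lVert w \rVert_1 \\
\text{subject to} \quad
  & u^\top u \le 1, \quad v^\top v \le 1, \quad w^\top w \le 1, \quad
    t^\top \left( I + \alpha\,\Omega \right) t \le 1
\end{aligned}
```

Under this reading, later components would be found by repeating the problem on the residual tensor after subtracting the fitted rank-one term.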

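The parameters λ_u, λ_v, and λ_w (lamu, lamv, and lamw in the code examples on this page) control sparsity (L1 penalty) for trials, electrodes, and frequencies, respectively, and the matrix Ω (omega in the code) smooths over time.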
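To see why a tridiagonal matrix encourages smoothness, the sketch below compares the quadratic penalty t' * Omega * t for a smooth and a rapidly oscillating time course. It uses gallery('tridiag', n), the same call that appears in the examples on this page; the variable names and the toy time courses are illustrative assumptions, not part of the package.

```matlab
% Sketch: how the tridiagonal matrix penalizes rough time courses.
% Names (n, Omega, t_smooth, t_rough) and data are illustrative only.
n = 50;                                   % number of time points (made up)
Omega = gallery('tridiag', n);            % sparse: 2 on the diagonal, -1 next to it

% A smooth time course and a sign-flipping one, both with unit length
t_smooth = sin(linspace(0, pi, n))';
t_smooth = t_smooth / norm(t_smooth);
signs    = (-1).^((1:n)');
t_rough  = t_smooth .* signs;

% Quadratic smoothness penalty t' * Omega * t
pen_smooth = full(t_smooth' * Omega * t_smooth);
pen_rough  = full(t_rough'  * Omega * t_rough);

% The penalty equals the sum of squared successive differences
% plus the two squared endpoint values
check = sum(diff(t_rough).^2) + t_rough(1)^2 + t_rough(end)^2;

fprintf('smooth: %.4f, rough: %.4f\n', pen_smooth, pen_rough);
```

The oscillating vector receives a much larger penalty, so a constraint or penalty built on this matrix favors time factors that change gradually.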

Additionally, this can be extended to a supervised case when a response Y is available. In this case, the covariance of X and the response is decomposed. The response can be either continuous or categorical.
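As a rough illustration of the supervised case, the sketch below forms the covariance between a design tensor and a single continuous response, giving one covariance value per (electrode, frequency, time) entry. This is only our reading of "the covariance of X and the response"; the sizes, the toy data, and the variable names are assumptions, and the package may handle multi-column or categorical responses differently.

```matlab
% Sketch only: cross-covariance between a design tensor and a continuous
% response. Sizes and data are made up; this is not the package's code.
dims = [8, 3, 4, 5];                      % trials x electrode x frequency x time
X = sin(reshape(1:prod(dims), dims));     % toy design tensor
y = cos((1:dims(1))');                    % toy response, one value per trial

n = dims(1);
Xflat = reshape(X, n, []);                % trials x (electrode*frequency*time)
Xc = bsxfun(@minus, Xflat, mean(Xflat, 1));   % center each entry over trials
yc = y - mean(y);                         % center the response

% Sample covariance of every (electrode, frequency, time) entry with y
C = reshape((Xc' * yc) / (n - 1), dims(2:4));
```

The result C is a 3-way (electrode x frequency x time) tensor that a decomposition could then be applied to.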

About RHOP Package

This page documents our MATLAB package for the tensor factorization algorithms Regularized Higher-Order Partial Least Squares (RHOPLS) and Regularized Higher-Order Principal Component Analysis (RHOPCA). These algorithms are specifically designed for electrocorticography (ECoG) data, with dimensions consisting of trials, electrodes, frequency, and epoch time.

Current capabilities:

  1. Data processing: multiple methods for scaling data and removing outlier trials
  2. Regularized Higher-Order PCA: Implementation of RHOPCA for ECoG data
  3. Regularized Higher-Order PLS: Implementation of RHOPLS for ECoG data
  4. Parameter tuning for RHOPLS: Tuning function to select best parameters for RHOPLS

Upcoming capabilities:

  1. R wrapper functions: Ability to use this package in R
  2. Integration with RAFE: Ability to visualize RHOPCA heatmaps with RAFE tool, as well as SUMA
  3. Parameter tuning for RHOPCA: Tuning function to select best parameters for RHOPCA

Requirements

Required Software

  1. Developed with R (version 3.4.1)
  2. Developed with MATLAB (R2015b & R2016b)

Obtaining Code

This package is available on the GitLab page https://gitlab.com/klgeyer/rhop. This is currently a private page. Contact Kelly Geyer to receive a copy of the code.

Configuration

Configuration for Mac OSX

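1. Make the 'matlab' command available in the terminal by adding the MATLAB bin directory to your PATH. For example, <path to matlab bin> could be "/Applications/MATLAB_R2015a.app/bin". The first command below appends the export line to ~/.bash_profile so that the change persists; the second reloads the profile in the current session.

   $ echo 'export PATH=$PATH:<path to matlab bin>' >> ~/.bash_profile
   $ source ~/.bash_profile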

Test this by typing the following, which should return "<path to matlab bin>/matlab"

   $ which matlab

2. Configure variables for the rhopls code by editing the config.yaml file as needed. You can look up the R and Rscript paths with the commands below.

   $ which R 
   $ which Rscript

3. Make R recognize the matlab command by running the following terminal commands:

   $ touch ~/.Renviron
   $ echo "PATH=\"/usr/bin:/bin:/usr/sbin:/sbin:/usr/local/bin:$PATH\"" >> ~/.Renviron
   $ touch ~/.Renviron.site
   $ echo "PATH=\"/usr/bin:/bin:/usr/sbin:/sbin:/usr/local/bin:$PATH\"" >> ~/.Renviron.site

Test this step by running the following command in R

   > system("matlab -nodesktop")

4. We use an R package named "rhdf5", which is available from Bioconductor [3]. Enter the following commands in R to install it, and select the default options if prompted. This is only required if you plan to use RAFE.

   > source("https://bioconductor.org/biocLite.R")
   > biocLite("rhdf5")

Install Package

Install the MATLAB package with the following commands (in MATLAB), where rhopls_dir is the directory containing the package code. It is helpful to include these commands at the beginning of any .m script that uses this package.

<source lang="matlab">

   % Install RHOPLS package
   cd(rhopls_dir);
   addpath(pwd);
   savepath;
   % Install required toolboxes
   rhopls_setup();

</source>

Processing ECoG Data for Analysis

The RHOPLS package provides several methods of processing ECoG data before an analysis. These include several methods of standardization, as well as removal of outlier trials. The function process_patient_data implements these in a user-friendly wrapper. Ultimately it creates the design tensor for a single patient, which is a 4-dimensional tensor with dimensions (trials x electrode x frequency x time).

Data Standardization Methods

  1. none: When used with the load_patient_data function, this skips standardization entirely. However, it is not an option for the standardize_data function.
  2. flatten_trials: This method flattens the ECoG tensor by trial (the first dimension) and then centers and scales the data according to user options.
  3. array_normal: This method centers and scales the data using the array normal method, according to user options. The array normal distribution is a higher-order generalization of the multivariate normal distribution. We estimate a mean and covariance assuming the tensor was generated from the array normal distribution, and use the estimated parameters to center and scale the tensor.
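The flatten_trials idea can be sketched in plain MATLAB as below. This is our reading of the description above (flatten by trial, then center and scale columns), not the package's actual code; the toy data and all variable names are assumptions, and the package function may differ in details such as how constant columns are handled.

```matlab
% Sketch of trial-wise flattening followed by column standardization.
% Not the package implementation; data and names are illustrative.
dims = [6, 2, 3, 4];                      % trials x electrode x frequency x time
X = sin(reshape(1:prod(dims), dims));     % toy ECoG tensor
center = true;                            % subtract column means
scaled = true;                            % divide by column standard deviations

% Flatten so each row is one trial and each column one
% (electrode, frequency, time) entry
Xflat = reshape(X, dims(1), []);
if center
    Xflat = bsxfun(@minus, Xflat, mean(Xflat, 1));
end
if scaled
    sd = std(Xflat, 0, 1);
    sd(sd == 0) = 1;                      % leave constant columns unscaled
    Xflat = bsxfun(@rdivide, Xflat, sd);
end

% Restore the original 4-dimensional shape
Xstd = reshape(Xflat, dims);
```

After this step every (electrode, frequency, time) entry has mean zero and unit standard deviation across trials.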

Example

In the following example, we flatten the ECoG data by trial and then center and scale the columns as a method of standardization. Additionally, outlier trials are removed. Users can edit these options and save the processed data as shown below.

<source lang="matlab">

   % Set parameters
   rhop_code_dir = <directory containing rhop matlab code>
   data = <your ecog data with dimensions (trials x electrode x frequency x time)>
   saveOutputAs = 'processed_data.h5';      % Could be .mat file
   % Install RHOPLS package
   cd(rhop_code_dir);
   addpath(pwd);
   savepath;
   % Install required toolboxes
   rhopls_setup();
   % Process data
   % 1. Standardizes data (two methods available)
   % 2. Remove outliers
   % 3. Save files of processed data with outliers removed
   % 4. Save list of trials identified as outliers
   sd_method = 'flatten_trials';      % 'array_normal', 'none'
   center = true;                      
   scaled = true;                      
   rm_outliers = true;                 
   [X, outliers] = process_patient_data(data, ...
       'standardizationMethod', sd_method, ...
       'center', center, ...
       'scaled', scaled, ...
       'removeOutliers', rm_outliers, ...
       'saveOutputAs', saveOutputAs);

</source>


Regularized Higher-Order Models

This package implements both supervised and unsupervised methods leveraging RHOPCA.

  1. RHOPCA
  2. RHOPLS for continuous and categorical responses
  3. Parameter tuning for RHOPLS

RHOPCA

The following is an example of running RHOPCA on an ECoG tensor X. It returns U, a cell array of all CP vectors computed, and D, the scaling constants.

<source lang="matlab">

   % Set parameters
   output_fn = 'cp_decomp_results.mat';            % Could be .h5 file
   K = 3;                                          % Number of CP-factors
   lamu = 0;                                       % L1 penalty for trials
   lamv = 0.1;                                     % L1 penalty for electrodes
   lamw = 0;                                       % L1 penalty for frequencies
   omega = 0.1*gallery('tridiag', size(X,4));      % Smoothing over time
   % Perform RHOPCA
   [U, D, Xhat, objVals] = hopca_cptpa_ecog(X, K, ...
       'lams', [lamu, lamv, lamw], ...
       'omega', omega, ...
       'saveOutputAs', output_fn);

</source>
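To make the relationship between the factors and the reconstructed tensor concrete, the sketch below rebuilds a 4-way tensor from CP factors as a weighted sum of outer products. The layout is an assumption, not something confirmed by the package: we take U{m} to be a (size of mode m) x K matrix with one column per factor and D to be a K x 1 vector. The toy factors are made up.

```matlab
% Sketch: rebuild a 4-way tensor from CP factors.
% Assumed layout (hypothetical): U{m} is (size of mode m) x K, D is K x 1.
dims = [5, 4, 3, 6];                      % trials x electrode x frequency x time
K = 2;                                    % number of CP factors
U = cell(1, 4);
for m = 1:4
    U{m} = cos(reshape(1:dims(m)*K, dims(m), K));            % toy factors
    U{m} = bsxfun(@rdivide, U{m}, sqrt(sum(U{m}.^2, 1)));    % unit-norm columns
end
D = [3; 1.5];                             % toy scaling constants

Xhat = zeros(dims);
for k = 1:K
    % Vectorized outer product of the k-th factors; with column-major
    % storage the first mode must vary fastest, hence the nesting order
    comp = kron(U{4}(:, k), kron(U{3}(:, k), kron(U{2}(:, k), U{1}(:, k))));
    Xhat = Xhat + D(k) * reshape(comp, dims);
end
```

Each entry Xhat(i,j,l,m) is then the sum over k of D(k) * U{1}(i,k) * U{2}(j,k) * U{3}(l,k) * U{4}(m,k).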

Tuning RHOPLS Parameters

The following is an example of tuning parameters for RHOPLS.

<source lang="matlab">

   % Notice that X is a (trials x electrode x frequency x time) tensor and
   % trial_labels is a vector of class labels
   % Standardization parameters
   standardize_method = 'flatten_trials';
   center = true;
   scaled = true;
   % Tuning parameters
   nSplits = 25;                                         % Number of cross-validation splits
   K = 4;                                                % Number of components
   lamu = 0;                                             % L1 penalty for trials
   lamv = 5*[10:70];                                     % L1 penalty for electrodes
   lamw = 0;                                             % L1 penalty for frequency
   omega = gallery('tridiag', size(X,4));                % Time differencing matrix (default)
   alpha_list = [.1,.2,.3,.4,.5,logspace(.001,1.5,5)];   % Smoothing coefficient
   % Format Y to be a binary matrix with dimensions (trials x classes)
   two_class_labels = mk_2class_labels(trial_labels);
   [Y, label2column] = labels_2_multiclass(two_class_labels);
   % Nested cross-validation for tuning RHOPLS parameters
   [U, V, W, T, D, lams, alps] = hopls_cptpa_ecog_4D_cv(X, Y, K, ...
       'lamu', lamu, ...
       'lamvList', lamv, ...
       'lamw', lamw, ...
       'alphaList', alpha_list, ...
       'standardizationMethodX', standardize_method, ...
       'centerY', center, 'centerX', center, ...
       'scaledY', scaled, 'scaledX', scaled, ...
       'nSplits', nSplits);

</source>
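For readers unfamiliar with the binary (trials x classes) response format, the sketch below builds such a matrix from a list of labels using only base MATLAB. It is a stand-in written for illustration, not the code of the package helpers mk_2class_labels or labels_2_multiclass, whose internals are not shown on this page; the labels are made up.

```matlab
% Sketch: turn class labels into a binary (trials x classes) matrix.
% Illustrative stand-in only; the package helpers may behave differently.
trial_labels = {'rock'; 'rain'; 'rain'; 'rock'; 'rain'};   % made-up labels

% classes holds the distinct labels in sorted order;
% idx maps every trial to the position of its class
[classes, ~, idx] = unique(trial_labels);
nTrials = numel(trial_labels);

% Put a single 1 in each row, in the column of that trial's class
Y = zeros(nTrials, numel(classes));
Y(sub2ind(size(Y), (1:nTrials)', idx(:))) = 1;
```

In this toy case the first column corresponds to 'rain' and the second to 'rock', so every row of Y sums to one.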

Results

These are results for RHOPCA (decomposed into 3 factors) from an ECoG study. In this study, patients watched a video of a person saying either "rock" or "rain"; in some trials the visual or audio was distorted.

Look at single modes:

[Two images: RHOPCA results for single modes]

Consider two modes at once:

[Two images: RHOPCA results for pairs of modes]

References

  1. Frederick Campbell, "Interpretable Brain Decoding for Electrocorticography Data through Regularized High Order Partial Least Squares." Publication in progress.
  2. Brett W. Bader, Tamara G. Kolda and others. MATLAB Tensor Toolbox Version 2.6, Available online, February 2015. URL: http://www.sandia.gov/~tgkolda/TensorToolbox/.
  3. Bernd Fischer and Gregoire Pau (2016). rhdf5: HDF5 interface to R. R package version 2.18.0.
  4. G. I. Allen, "Regularized Tensor Factorizations and Higher-Order Principal Components Analysis", Rice University Technical Report No. TR2012-01, arXiv:1202.2476, 2012.
  5. G. I. Allen, "Sparse Higher-Order Principal Components Analysis", In Proceedings of the 15th International Conference on Artificial Intelligence and Statistics, 2012.