User:Timothee Flutre/Notebook/Postdoc/2011/11/20

From OpenWetWare



  • Prepare journal club on "Analysis of population structure: A unifying framework and novel methods based on sparse factor analysis." by Engelhardt & Stephens (PLoS Genetics 2010).
  • From "Inference of population structure using multilocus genotype data" by Pritchard, Stephens & Donnelly (Genetics 2000):
    • data: genotypes at L loci for N individuals (matrix X: N x L) from several populations (K, unknown)
    • aim: jointly assign individuals to populations and estimate the population allele frequencies P; the model allows admixture, and inference is done by MCMC
  • From "Applied Multivariate Statistical Analysis" (Amazon):
    • Let \mathbf{X} be a vector of p observed variables with mean vector \mathbf{\mu} and covariance matrix \mathbf{\Sigma}.
    • A principal component analysis is concerned with explaining the variance-covariance structure of \mathbf{X} through a few linear (and uncorrelated) combinations of these variables. Although p components are required to reproduce the total variability, often much of this variability can be accounted for by a small number k of the principal components that depend solely on \mathbf{\Sigma}.
    • A factor analysis attempts to describe the covariance relationships among the X's in terms of a few underlying, but unobservable, random quantities called factors. It postulates that \mathbf{X} is linearly dependent upon k random variables F_1, F_2, \ldots, F_k, called factors, and p additional sources of variation \epsilon_1, \epsilon_2, \ldots, \epsilon_p, called errors. A p x k matrix \mathbf{\Lambda} contains the loadings l_{ij} of the ith variable on the jth factor: \mathbf{X} = \mathbf{\mu} + \mathbf{\Lambda} \mathbf{F} + \mathbf{\epsilon}.
    • The difference between the factor analysis model above and the multivariate linear regression model, \mathbf{Y} = \mathbf{X} \mathbf{B} + \mathbf{\epsilon}, is that in the latter both \mathbf{Y} and \mathbf{X} are observed, whereas in the former \mathbf{F} is not.
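
To make the factor model and its link to PCA concrete, here is a minimal sketch in plain Python. It simulates \mathbf{X} = \mathbf{\mu} + \mathbf{\Lambda} \mathbf{F} + \mathbf{\epsilon} with p = 3 and k = 1, then compares the sample covariance with the covariance implied by the model. Several things here are assumptions that are not in the notes above: the factors are taken to be independent standard normal, the errors independent normal with a diagonal covariance \mathbf{\Psi} (the usual orthogonal factor model, which gives \mathbf{\Sigma} = \mathbf{\Lambda} \mathbf{\Lambda}^T + \mathbf{\Psi}), and all numbers and function names are made up for illustration. The first principal component is obtained by power iteration on the sample covariance, a simple stand-in for a full eigendecomposition.

```python
import random


def simulate_factor_model(n, mu, loadings, noise_sd, seed=0):
    """Draw n observations of X = mu + Lambda F + eps (one list per observation)."""
    rng = random.Random(seed)
    p, k = len(loadings), len(loadings[0])
    rows = []
    for _ in range(n):
        # Unobserved factors F_1..F_k, assumed independent standard normal.
        f = [rng.gauss(0.0, 1.0) for _ in range(k)]
        # Observed variables: mean + loadings times factors + independent error.
        x = [mu[i]
             + sum(loadings[i][j] * f[j] for j in range(k))
             + rng.gauss(0.0, noise_sd[i])
             for i in range(p)]
        rows.append(x)
    return rows


def sample_covariance(rows):
    """Unbiased sample covariance matrix of the columns of rows."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[i] for r in rows) / n for i in range(p)]
    return [[sum((r[i] - means[i]) * (r[j] - means[j]) for r in rows) / (n - 1)
             for j in range(p)]
            for i in range(p)]


def implied_covariance(loadings, noise_sd):
    """Lambda Lambda^T + Psi, with Psi = diag(noise_sd^2) (assumed model)."""
    p, k = len(loadings), len(loadings[0])
    return [[sum(loadings[i][m] * loadings[j][m] for m in range(k))
             + (noise_sd[i] ** 2 if i == j else 0.0)
             for j in range(p)]
            for i in range(p)]


def first_principal_component(cov, iters=200):
    """Leading eigenvalue and unit eigenvector of cov, by power iteration."""
    p = len(cov)
    v = [1.0] * p
    for _ in range(iters):
        w = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    # Rayleigh quotient v^T cov v gives the eigenvalue for the unit vector v.
    eigval = sum(v[i] * sum(cov[i][j] * v[j] for j in range(p)) for i in range(p))
    return eigval, v


# Hypothetical parameters: p = 3 observed variables, k = 1 factor.
mu = [0.0, 1.0, -1.0]
loadings = [[0.9], [0.8], [0.7]]
noise_sd = [0.3, 0.4, 0.5]

rows = simulate_factor_model(5000, mu, loadings, noise_sd)
S = sample_covariance(rows)
Sigma = implied_covariance(loadings, noise_sd)
eigval, pc1 = first_principal_component(S)
# Share of the total variance (trace of S) carried by the first component.
share = eigval / sum(S[i][i] for i in range(len(S)))

print("sample covariance:", S)
print("implied covariance:", Sigma)
print("first PC direction:", pc1, "variance share:", share)
```

With a single strong factor, the first principal component should carry most of the total variance and point roughly along the loadings. Unlike the regression setting, \mathbf{F} never appears as data: only `rows` would be available to an analyst, and \mathbf{\Lambda} has to be inferred from the covariance structure alone.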

