Research Article Open Access

Inversion of Covariance Matrix for High Dimension Data

Samruam Chongcharoen

Abstract

Problem statement: Consider testing the mean vector of independent and identically distributed multivariate normal random vectors with unknown covariance matrix when the sample size does not exceed the dimension, n ≤ p. Such data arise, for example, from DNA microarrays, where a large number of gene expression levels are measured on relatively few subjects. In this setting the p×p sample covariance matrix S has no inverse, so any statistic that involves the inverse of S does not exist. Approach: In this study, we consider a modification of S of the form S + cI and look for the smallest real value c ≠ 0 for which (S + cI)⁻¹ exists. Results: The study shows that, as the dimension p tends to infinity and with the smallest change to S, (S + cI)⁻¹ exists when c = 1. Conclusion: In statistical analysis of high-dimensional data where the inverse of the sample covariance matrix does not exist, one way to obtain an invertible matrix is to replace S with S + cI, and we recommend choosing c = 1.
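The singularity of S when n ≤ p, and the effect of adding cI, can be illustrated numerically. The following is a minimal sketch and not the paper's derivation: it assumes NumPy, simulated standard normal data, and illustrative sizes n = 5 and p = 20 that do not come from the article. It relies only on the standard fact that S is positive semidefinite, so each eigenvalue of S + cI equals an eigenvalue of S plus c and is therefore at least c when c > 0.

```python
import numpy as np

# Illustrative sizes (assumptions, not from the article): fewer subjects than variables.
n, p = 5, 20
rng = np.random.default_rng(0)

# Simulated data: n observations (rows) of a p-dimensional normal vector.
X = rng.standard_normal((n, p))

# p x p sample covariance matrix; its rank is at most n - 1 < p, so it is singular.
S = np.cov(X, rowvar=False)
rank_S = np.linalg.matrix_rank(S)

# Shifted matrix S + cI with the value c = 1 recommended in the abstract.
c = 1.0
S_c = S + c * np.eye(p)

# Eigenvalues of the symmetric matrices: those of S_c are those of S shifted by c.
eig_S = np.linalg.eigvalsh(S)
eig_Sc = np.linalg.eigvalsh(S_c)

# S_c is positive definite, so the inverse exists.
S_c_inv = np.linalg.inv(S_c)

print("rank of S:", rank_S, "out of p =", p)
print("smallest eigenvalue of S + cI:", eig_Sc.min())
```

In this sketch the smallest eigenvalue of S + cI is c up to rounding, because S has at least p − n + 1 zero eigenvalues. Any c > 0 would make the shifted matrix invertible here; the choice c = 1 follows the abstract's recommendation.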

Journal of Mathematics and Statistics
Volume 7 No. 3, 2011, 227-229

DOI: https://doi.org/10.3844/jmssp.2011.227.229

Submitted On: 16 March 2011 Published On: 27 July 2011

How to Cite: Chongcharoen, S. (2011). Inversion of Covariance Matrix for High Dimension Data. Journal of Mathematics and Statistics, 7(3), 227-229. https://doi.org/10.3844/jmssp.2011.227.229

Keywords

  • DNA microarrays
  • eigenvalue
  • positive semidefinite
  • positive definite
  • gene expression
  • covariance matrix
  • statistic value
  • real vector
  • real number
  • determinant
  • symmetric matrix
  • definite matrix