Gaussian graphical model estimation with measurement error

Xianglu Wang

doi:10.52396/JUSTC-2022-0108

Open Access JUSTC Mathematics 15 January 2024

Department of Statistics and Finance, School of Management, University of Science and Technology of China, Hefei 230026, China

Cite this:

https://doi.org/10.52396/JUSTC-2022-0108

Author Bio:
Xianglu Wang is currently a master's student at the School of Management, University of Science and Technology of China (USTC). He received his B.S. degree from USTC in 2019. His research mainly focuses on high-dimensional variable selection and inference.
Corresponding author: E-mail: wz124517@mail.ustc.edu.cn
Received Date: 28 July 2022
Accepted Date: 23 October 2022

Available Online: 15 January 2024

Abstract

It is well known that regression methods designed for clean data will lead to erroneous results if directly applied to corrupted data. Despite the recent methodological and algorithmic advances in Gaussian graphical model estimation, how to achieve efficient and scalable estimation under contaminated covariates is unclear. Here a new methodology called convex conditioned innovative scalable efficient estimation (COCOISEE) for Gaussian graphical models under both additive and multiplicative measurement errors is developed. It combines the strengths of the innovative scalable efficient estimation in the Gaussian graphical model and the nearest positive semidefinite matrix projection, thus enjoying stepwise convexity and scalability. Comprehensive theoretical guarantees are provided and the effectiveness of the proposed methodology is demonstrated through numerical studies.
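
To make the correction step concrete, here is a minimal Python sketch for the additive case only. It assumes observations Z = X + A with zero-mean rows and a known noise covariance, forms the unbiased surrogate Z'Z/n minus that covariance, and projects the result onto the positive semidefinite cone. The abstract does not state which norm defines the projection; this sketch uses the Frobenius norm (eigenvalue clipping), which is an assumption and may differ from the paper's choice. The function names, the noise level `tau`, and the simulated data are illustrative and not taken from the paper.

```python
import numpy as np


def additive_surrogate_cov(Z, sigma_a):
    """Unbiased covariance surrogate for Z = X + A.

    Assumes zero-mean rows and a known noise covariance sigma_a = cov(A).
    """
    n = Z.shape[0]
    return Z.T @ Z / n - sigma_a


def project_psd(S, eps=0.0):
    """Frobenius-norm projection onto symmetric matrices with eigenvalues >= eps.

    This is eigenvalue clipping; it is a stand-in for whichever
    nearest-PSD projection the paper actually uses.
    """
    S = (S + S.T) / 2.0                      # enforce exact symmetry
    w, V = np.linalg.eigh(S)                 # eigen-decomposition of a symmetric matrix
    return (V * np.maximum(w, eps)) @ V.T    # rebuild with clipped eigenvalues


# Illustrative high-dimensional setting: fewer samples than variables.
rng = np.random.default_rng(0)
n, p, tau = 20, 50, 0.5
X = rng.standard_normal((n, p))              # clean data (unobserved in practice)
A = tau * rng.standard_normal((n, p))        # additive measurement error
Z = X + A                                    # what is actually observed

S_raw = additive_surrogate_cov(Z, tau**2 * np.eye(p))
S_psd = project_psd(S_raw)

min_eig_raw = np.linalg.eigvalsh(S_raw).min()
min_eig_psd = np.linalg.eigvalsh(S_psd).min()
print(f"smallest eigenvalue before projection: {min_eig_raw:.3f}")
print(f"smallest eigenvalue after projection:  {min_eig_psd:.3e}")
```

Because p exceeds n here, Z'Z/n is rank deficient, so subtracting the noise covariance leaves negative eigenvalues in the raw surrogate. The projection removes them, which is what allows the surrogate to be used inside a convex estimation step.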

Graphical abstract

The process of recovering the precision matrix in the presence of additive and multiplicative measurement errors.

Public Summary

We propose a new methodology, COCOISEE, to achieve scalable and interpretable estimation for Gaussian graphical models under both additive and multiplicative measurement errors.
The method is stepwise convex, computationally stable, efficient and scalable.
Both theoretical and simulation results verify the feasibility of our method.
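
The multiplicative error case named in the summary can be illustrated with a small simulation. This is a minimal sketch, assuming observations Z = X * M (elementwise) with M independent of X, X zero-mean, and a known second-moment matrix E[m m'] for one row m of M. Under those assumptions, dividing the sample second-moment matrix of Z elementwise by E[m m'] gives an unbiased surrogate for the covariance of X. The covariance `Sigma`, the uniform error distribution, and the function name are illustrative choices, not taken from the paper.

```python
import numpy as np


def multiplicative_surrogate_cov(Z, second_moment_m):
    """Unbiased covariance surrogate for Z = X * M (elementwise).

    Assumes X is zero-mean, M is independent of X, and
    second_moment_m[j, k] = E[M_ij * M_ik] is known.
    """
    n = Z.shape[0]
    return (Z.T @ Z / n) / second_moment_m   # elementwise division


rng = np.random.default_rng(1)
n = 50_000
Sigma = np.array([[1.0, 0.5, 0.0],
                  [0.5, 1.0, 0.3],
                  [0.0, 0.3, 1.0]])          # illustrative true covariance
X = rng.multivariate_normal(np.zeros(3), Sigma, size=n)

# Independent multiplicative errors, uniform on [0.5, 1.5]:
# mean 1, variance 1/12, so E[M_j M_k] = 1 off the diagonal
# and 1 + 1/12 on the diagonal.
M = rng.uniform(0.5, 1.5, size=(n, 3))
second_moment_m = np.ones((3, 3)) + np.eye(3) / 12.0

Z = X * M                                    # observed, corrupted data

S_naive = Z.T @ Z / n                        # ignores the measurement error
S_corr = multiplicative_surrogate_cov(Z, second_moment_m)

err_naive = np.abs(S_naive - Sigma).max()
err_corr = np.abs(S_corr - Sigma).max()
print(f"max abs error, naive:     {err_naive:.4f}")
print(f"max abs error, corrected: {err_corr:.4f}")
```

In this low-dimensional, large-sample setting the corrected surrogate is already close to the true covariance. In high dimensions it can fail to be positive semidefinite, which is the situation a nearest positive semidefinite projection is meant to repair.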



Volume 53 Issue 11 page: 1105
