ISSN 0253-2778

CN 34-1054/N

Open Access JUSTC Mathematics; Life Sciences Article

Sib-pair genetic longitudinal studies with missing not at random data

Cite this: JUSTC, 2024, 54(12): 1203
https://doi.org/10.52396/JUSTC-2024-0026
CSTR: 32290.14.JUSTC-2024-0026
More Information
  • Author Bio:

    Siyu Jiang is currently a graduate student under the supervision of Prof. Hong Zhang at the University of Science and Technology of China (USTC). His research mainly focuses on statistical genetics.

    Hong Zhang is a Full Professor at USTC. He received his Bachelor’s degree in Mathematics and his Ph.D. degree in Statistics from USTC in 1997 and 2003, respectively. His research mainly focuses on statistical genetics, causal inference, and machine learning.

  • Corresponding author:

    Hong Zhang, E-mail: zhangh@ustc.edu.cn

  • Received Date: February 23, 2024
  • Accepted Date: April 24, 2024
  • In the interdisciplinary realm of statistics, genetics, and epidemiology, longitudinal sibling-pair data offer a unique perspective for investigating complex diseases and traits: by controlling for numerous confounding factors, they allow the dynamic processes of gene expression to be explored over time. Missing-not-at-random (MNAR) data are commonly encountered in such studies, yet no statistical method in the literature has been specifically tailored to handle MNAR data in complex longitudinal settings. Here, we propose a new statistical method that jointly models the longitudinal data from sib-pairs and the MNAR data. Extensive simulations demonstrate the excellent finite-sample properties of the proposed method.

    H-GEE method flowchart.

    • A central challenge in longitudinal studies is handling missing-not-at-random (MNAR) data effectively, since such missingness complicates data analysis.
    • The proposed H-GEE method, combining the Heckman model and generalized estimating equations (GEE), aims to overcome MNAR challenges for robust data analysis.
    • Extensive simulations validate H-GEE’s effectiveness in handling MNAR data, highlighting its potential for advancing genetic and epidemiological research.
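
    For orientation, the two ingredients named in the highlights can be written in their standard textbook forms. This is only a sketch: the notation ($x_i$, $z_i$, $\gamma$, $\rho$, $\sigma$, $\psi$, $R(\alpha)$, and so on) is assumed here for illustration and is not taken from the article, and the way H-GEE actually combines the two pieces is not specified in this section.

```latex
% Classic Heckman selection model (illustrative notation, assumed here)
\begin{align*}
  y_i &= x_i^\top \beta + \varepsilon_i
      && \text{(outcome equation)} \\
  r_i &= \mathbf{1}\{\, z_i^\top \gamma + u_i > 0 \,\}
      && \text{(selection equation: } y_i \text{ is observed iff } r_i = 1\text{)} \\
  (\varepsilon_i, u_i)^\top &\sim N\!\left( \mathbf{0},\;
      \begin{bmatrix} \sigma^2 & \rho\sigma \\ \rho\sigma & 1 \end{bmatrix} \right) \\
  E[\, y_i \mid x_i, z_i, r_i = 1 \,] &= x_i^\top \beta
      + \rho\sigma\, \lambda(z_i^\top \gamma),
      \qquad \lambda(t) = \frac{\phi(t)}{\Phi(t)}
\end{align*}

% Standard GEE for cluster i, with mean vector \mu_i(\beta),
% diagonal variance-function matrix A_i, working correlation R(\alpha),
% and dispersion parameter \psi
\begin{align*}
  \sum_{i=1}^{n} D_i^\top V_i^{-1} \bigl( y_i - \mu_i(\beta) \bigr) &= 0,
  \qquad D_i = \frac{\partial \mu_i}{\partial \beta^\top},
  \qquad V_i = \psi\, A_i^{1/2} R(\alpha)\, A_i^{1/2}
\end{align*}
```

    Here $\phi$ and $\Phi$ denote the standard normal density and distribution function, and $\lambda$ is the inverse Mills ratio. In the Heckman model the correction term $\rho\sigma\,\lambda(z_i^\top\gamma)$ vanishes when $\rho = 0$, that is, when selection is unrelated to the unobserved outcome error; a nonzero $\rho$ is what makes the complete-case mean of $y_i$ differ from $x_i^\top\beta$.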
