ISSN 0253-2778

CN 34-1054/N

Open Access JUSTC Original Paper

An outlier sample eliminating algorithm based on joint XY distances for near infrared spectroscopy analysis

Cite this:
https://doi.org/10.3969/j.issn.0253-2778.2016.03.005
  • Received Date: 27 August 2015
  • Accepted Date: 01 December 2015
  • Revision Received Date: 01 December 2015
  • Publish Date: 30 March 2016
  • Outlier samples in near-infrared (NIR) spectroscopy analysis can strongly influence the performance of a prediction model. To detect and eliminate such samples, a new outlier sample eliminating algorithm based on joint XY distances (ODXY) is presented, and the relation of the XY distances of NIR data is proposed and proved. In this study, 102 lamb samples were collected, and their NIR spectra and moisture content were measured and analyzed. First, the Mahalanobis distance method, the Monte-Carlo sampling method and the ODXY method were each employed to eliminate outlier samples, and a partial least squares (PLS) prediction model was built on the remaining samples. Then, the root mean square error of prediction (RMSEP) and the coefficient of determination (R²) were used to evaluate the performance of the prediction model. Finally, the generalization of the eliminating algorithm was tested on new calibration and validation sets. The experiments show that the ODXY method achieves better performance and better generalization ability than the other methods tested.
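The two evaluation metrics named in the abstract can be illustrated with a minimal sketch. The abstract does not state which definition of the coefficient of determination the authors used, so the form below (one minus the ratio of residual to total sum of squares) is an assumption, as are the function names `rmsep` and `r_squared` and the moisture values, which are made-up placeholders rather than data from the paper.

```python
import math


def rmsep(y_true, y_pred):
    """Root mean square error of prediction on a validation set."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_error = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return math.sqrt(squared_error / len(y_true))


def r_squared(y_true, y_pred):
    """Coefficient of determination, assumed here to be 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("reference values have zero variance")
    return 1.0 - ss_res / ss_tot


# Hypothetical reference and predicted moisture contents (%), for illustration only.
measured = [70.0, 72.0, 74.0, 76.0]
predicted = [71.0, 71.0, 75.0, 75.0]

print("RMSEP:", rmsep(measured, predicted))
print("R2:", r_squared(measured, predicted))
```

A lower RMSEP and an R² closer to 1 on the validation set would indicate a better prediction model, which is how the compared outlier elimination methods are ranked in the abstract.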