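• A Guide to the Core Journals of China
  • Chinese Science and Technology Core Journals
  • Chinese Science Citation Database (CSCD)
  • Chinese Science and Technology Paper and Citation Database (CSTPCD)
  • Chinese Science Abstracts Database (CSAD)
  • China Academic Journals (Online Edition) (CNKI)
  • Chinese Science and Technology Journal Database
  • Wanfang Data Knowledge Service Platform
  • Chaoxing Journal Domain Publishing Platform
  • National Science and Technology Academic Journal Open Platform
  • Scopus abstract and citation database, Netherlands (SCOPUS)
  • Japan Science and Technology Agency database (JST)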

近红外光谱分析中的一种基于XY变量联合的异常样本剔除算法

An outlier sample eliminating algorithm based on joint XY distances for near infrared spectroscopy analysis

  • Abstract: In near-infrared (NIR) spectroscopy analysis, outlier samples can degrade the performance of the prediction model. To remove such samples and improve predictive ability, an XY distance relation theorem is first proposed and proved; on that basis, a new outlier sample elimination algorithm based on joint XY variables (ODXY) is designed. In this study, the NIR spectra and moisture content of 102 lamb samples were measured. On this sample set, the commonly used Mahalanobis distance method, the Monte Carlo sampling method, and the proposed ODXY algorithm were each used to identify and eliminate outlier samples, and a partial least squares (PLS) prediction model was built from the remaining samples. The root mean square error of prediction (RMSEP) and the coefficient of determination (R2) were then used to evaluate model performance. Finally, the generalization ability of the algorithm was tested by reassigning the calibration (training) and validation sets. The experimental results show that the prediction model built after outlier elimination with the ODXY algorithm performed best and generalized better than the other methods tested.
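As a rough illustration of the Mahalanobis distance baseline and the two evaluation metrics named in the abstract, the sketch below may help. It is not the ODXY algorithm, whose details the abstract does not give. The function names, the mean-plus-three-standard-deviations cutoff, and the synthetic data are all assumptions made for illustration; in practice the distances would usually be computed on a few principal component scores of the spectra, and the predictions passed to the metrics would come from a PLS model.

```python
import numpy as np


def mahalanobis_distances(X):
    """Mahalanobis distance of each row of X from the sample mean."""
    X = np.asarray(X, dtype=float)
    centered = X - X.mean(axis=0)
    cov = np.atleast_2d(np.cov(centered, rowvar=False))
    # The pseudo-inverse tolerates a singular covariance matrix.
    inv_cov = np.linalg.pinv(cov)
    d2 = np.einsum("ij,jk,ik->i", centered, inv_cov, centered)
    return np.sqrt(np.maximum(d2, 0.0))


def keep_mask(X, n_std=3.0):
    """True for samples to keep. The cutoff (mean + n_std * std of the
    distances) is an assumed rule, not one taken from the paper."""
    d = mahalanobis_distances(X)
    return d <= d.mean() + n_std * d.std()


def rmsep(y_true, y_pred):
    """Root mean square error of prediction."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


def r2(y_true, y_pred):
    """Coefficient of determination."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)


# Synthetic stand-in for 102 samples with 5 features each (hypothetical data).
rng = np.random.default_rng(0)
X = rng.normal(size=(102, 5))
X[0] = 50.0  # plant one obvious outlier
mask = keep_mask(X)
print("samples kept:", int(mask.sum()), "of", len(mask))
```

Note that this baseline looks only at the X (spectral) side of the data, whereas the approach described in the abstract is said to combine X and Y information.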

     
