
Machine learning in data envelopment analysis: A smart mechanism for indicator selection

  • Abstract: Indicator selection is a long-standing problem in data envelopment analysis (DEA). With the advent of the big data era, scholars face increasingly complex indicator selection situations, and the rapid development of machine learning offers an opportunity to address them. However, in scenarios prone to overfitting or underfitting, an inappropriate method is likely to select poor-quality indicators. Some scholars have pioneered the use of the least absolute shrinkage and selection operator (LASSO) for indicator selection in overfitting scenarios, but no study has yet classified the big data scenarios encountered in DEA into overfitting-prone and underfitting-prone cases, nor attempted to develop a complete indicator selection system covering both. To fill these gaps, this study employs machine learning methods and, building on them, proposes a mean score approach. Monte Carlo simulations show that LASSO performs very well in overfitting scenarios but often fails to select good indicators in underfitting scenarios, whereas ensemble methods hold some advantage in underfitting scenarios; the proposed mean score approach performs well in both. Based on the strengths and limitations of the different methods, a smart indicator selection mechanism is proposed to assist DEA scholars in selecting indicators.
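As a rough illustration of LASSO-based indicator selection, the sketch below runs a small coordinate-descent LASSO on synthetic data. Everything in it is an assumption made for illustration and is not taken from the study: the single-output regression setup, the data-generating process, the penalty `lam`, the reading of an overfitting-prone case as more candidate indicators than decision-making units, and the helper names `soft_threshold` and `lasso_cd`.

```python
import numpy as np


def soft_threshold(z, t):
    """Soft-thresholding operator used in the LASSO coordinate update."""
    return np.sign(z) * max(abs(z) - t, 0.0)


def lasso_cd(X, y, lam, n_iter=200):
    """Minimise (1/2n)||y - X b||^2 + lam * ||b||_1 by cyclic coordinate descent.

    Hypothetical helper: expects standardised columns in X and a centred y.
    """
    n, p = X.shape
    beta = np.zeros(p)
    resid = y.copy()                     # residual y - X @ beta (beta starts at 0)
    col_sq = (X ** 2).sum(axis=0) / n    # per-column mean square
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Correlation of column j with the residual that excludes column j.
            rho = X[:, j] @ resid / n + col_sq[j] * beta[j]
            new = soft_threshold(rho, lam) / col_sq[j]
            resid += X[:, j] * (beta[j] - new)   # keep the residual in sync
            beta[j] = new
    return beta


# Synthetic, illustrative data: 30 units, 40 candidate indicators,
# of which only the first three actually drive the output.
rng = np.random.default_rng(0)
n_units, n_candidates = 30, 40
X_raw = rng.uniform(1.0, 10.0, size=(n_units, n_candidates))
y_raw = 3.0 * X_raw[:, 0] + 3.0 * X_raw[:, 1] + 3.0 * X_raw[:, 2]
y_raw = y_raw + rng.normal(0.0, 0.5, size=n_units)

# Standardise so that a single penalty level is comparable across indicators.
X = (X_raw - X_raw.mean(axis=0)) / X_raw.std(axis=0)
y = (y_raw - y_raw.mean()) / y_raw.std()

beta = lasso_cd(X, y, lam=0.1)
selected = [j for j in range(n_candidates) if beta[j] != 0.0]
print("selected candidate indicators:", selected)
```

Indicators whose coefficients are shrunk exactly to zero are dropped; the survivors would then be the candidates passed to a DEA model. In practice the penalty would usually be chosen by cross-validation rather than fixed by hand.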

     
