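  • A Guide to the Core Journals of China
  • Chinese Science and Technology Core Journals
  • Chinese Science Citation Database (CSCD)
  • Chinese Scientific and Technical Papers and Citations Database (CSTPCD)
  • Chinese Academic Journal Abstracts Database (CSAD)
  • China Academic Journals (Online Edition) (CNKI)
  • Chinese Science and Technology Journal Database
  • Wanfang Data Knowledge Service Platform
  • Chaoxing Journal Domain Publishing Platform
  • National Open Platform for Scientific and Technical Academic Journals
  • Scopus abstract and citation database, Netherlands (SCOPUS)
  • Japan Science and Technology Agency database (JST)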

An accelerator for kernel ridge regression algorithms based on data partition

Abstract: Kernel ridge regression (KRR) is an important regression algorithm that is widely used in pattern recognition, data mining, and related fields because of its interpretability and strong generalization performance. However, its training efficiency is low on large-scale data. To address this problem, a KRR acceleration algorithm based on data partition (PP-KRR) is proposed, following the divide-and-conquer principle. First, the space containing the training data is divided into m mutually disjoint regions by a family of parallel hyperplanes. Second, a separate KRR model is trained on each region. Finally, each unlabeled instance is predicted by the KRR model of the region in which it falls. Experimental comparisons with three traditional algorithms on real datasets show that the proposed algorithm maintains comparable prediction accuracy while requiring less training time.
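The three steps in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not say how the hyperplanes are chosen, so the sketch assumes their shared normal vector is the first principal axis of the training data and that the thresholds sit at equal-frequency quantiles of the projection. The class name `PartitionedKRR`, the RBF kernel, and the parameters `lam` and `gamma` are likewise illustrative choices.

```python
import numpy as np


def rbf_kernel(A, B, gamma):
    # Pairwise squared Euclidean distances between rows of A and rows of B.
    sq = (A ** 2).sum(1)[:, None] + (B ** 2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(sq, 0.0))


class PartitionedKRR:
    """Hypothetical sketch: one KRR model per slab between parallel hyperplanes."""

    def __init__(self, m=4, lam=1e-3, gamma=1.0):
        self.m, self.lam, self.gamma = m, lam, gamma

    def _region(self, X):
        # The hyperplanes w.x = t_k share the normal w, so they are parallel;
        # the region index is the slab that the projection X @ w falls into.
        return np.searchsorted(self.thresholds_, X @ self.w_, side="right")

    def fit(self, X, y):
        # Assumption: the normal direction is the first principal axis.
        Xc = X - X.mean(axis=0)
        _, _, vt = np.linalg.svd(Xc, full_matrices=False)
        self.w_ = vt[0]
        # Assumption: m - 1 thresholds at equal-frequency quantiles.
        qs = np.linspace(0.0, 1.0, self.m + 1)[1:-1]
        self.thresholds_ = np.quantile(X @ self.w_, qs)

        regions = self._region(X)
        self.models_ = {}
        for k in range(self.m):
            Xk, yk = X[regions == k], y[regions == k]
            # Standard KRR dual solution on this region: (K + lam I) alpha = y.
            K = rbf_kernel(Xk, Xk, self.gamma)
            alpha = np.linalg.solve(K + self.lam * np.eye(len(Xk)), yk)
            self.models_[k] = (Xk, alpha)
        return self

    def predict(self, X):
        regions = self._region(X)
        out = np.empty(len(X))
        for k, (Xk, alpha) in self.models_.items():
            idx = regions == k
            if idx.any():
                # Each instance is predicted by the model of its own region.
                out[idx] = rbf_kernel(X[idx], Xk, self.gamma) @ alpha
        return out
```

With a direct solver, training one KRR model on n instances costs on the order of n^3 operations, whereas m models on roughly n/m instances each cost about n^3/m^2 in total. This is one way to see where the training-time saving described in the abstract could come from, under the assumption of balanced regions.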

     
