
Sentiment analysis based on grid clustering

  • Abstract: Traditional Chinese sentiment analysis methods, whether based on semantic lexicons or on machine learning, are strongly affected by human subjectivity and depend to some extent on manually built lexicons that are hard to extend. To expand the lexicon, this paper uses pointwise mutual information (PMI) together with parameter thresholds to automatically identify, extract, and classify words that are not included in the HowNet sentiment lexicon but carry a sentiment orientation. On that basis, a feature vector model of product reviews is built and a sentiment classification algorithm, SCG (sentiment classification based on grid clustering), is proposed. SCG builds its classification model with a grid-based clustering algorithm; a dynamic decay factor is introduced into the clustering process and sparse grids are removed periodically, which reduces the amount of computation. Experimental results indicate that SCG achieves higher accuracy and better domain adaptability than classification algorithms such as Naive Bayes and SMO (sequential minimal optimization).
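The lexicon-expansion step can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract names PMI and a threshold but gives no formula, so the sketch assumes the common SO-PMI form (summed PMI with positive seed words minus summed PMI with negative seed words), document-level co-occurrence counts, and additive smoothing. The names `so_pmi`, `classify`, `known_lexicon`, `smoothing`, and `threshold` are hypothetical, and the toy reviews are invented for illustration.

```python
import math
from collections import Counter
from itertools import combinations


def so_pmi(docs, pos_seeds, neg_seeds, known_lexicon=(), smoothing=0.01):
    """Score candidate words by SO-PMI over tokenized documents.

    Assumption: co-occurrence means "appears in the same document".
    Words already in known_lexicon (a stand-in for the HowNet
    sentiment lexicon) and the seed words themselves are skipped.
    """
    n_docs = len(docs)
    word_df = Counter()  # number of documents containing each word
    pair_df = Counter()  # number of documents containing each word pair
    for tokens in docs:
        vocab = set(tokens)
        word_df.update(vocab)
        pair_df.update(frozenset(p) for p in combinations(sorted(vocab), 2))

    # Keep only seeds that occur in the corpus to avoid division by zero.
    pos = [s for s in pos_seeds if word_df[s] > 0]
    neg = [s for s in neg_seeds if word_df[s] > 0]

    def pmi(word, seed):
        # Smoothing keeps the logarithm finite when the pair never co-occurs.
        joint = pair_df[frozenset((word, seed))] + smoothing
        return math.log2(joint * n_docs / (word_df[word] * word_df[seed]))

    skip = set(pos) | set(neg) | set(known_lexicon)
    return {
        w: sum(pmi(w, s) for s in pos) - sum(pmi(w, s) for s in neg)
        for w in word_df
        if w not in skip
    }


def classify(score, threshold):
    """Map an SO-PMI score to a label using a symmetric threshold."""
    if score > threshold:
        return "positive"
    if score < -threshold:
        return "negative"
    return "neutral"


# Toy tokenized reviews (invented for illustration).
docs = [
    ["good", "great", "nice"],
    ["good", "nice"],
    ["bad", "awful", "poor"],
    ["bad", "poor"],
]
scores = so_pmi(docs, pos_seeds=["good"], neg_seeds=["bad"])
labels = {w: classify(s, threshold=1.0) for w, s in scores.items()}
```

Words whose score clears the threshold in either direction would be added to the lexicon with the corresponding polarity; the rest are left unlabeled. The threshold value here is arbitrary and would need tuning on real review data.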

     
