ISSN 0253-2778

CN 34-1054/N

Open Access JUSTC Original Paper

Core-points based spectral clustering for big data analysis

Cite this: JUSTC, 2016, 46(9): 757-763
https://doi.org/10.3969/j.issn.0253-2778.2016.09.007
  • Received Date: February 29, 2016
  • Revised Date: September 16, 2016
  • Accepted Date: September 16, 2016
  • Published Date: September 29, 2016
  • Spectral clustering is difficult to apply to big data because of its high computational complexity. To address this, a new spectral clustering algorithm for big data is proposed. First, core points are selected by random sampling combined with data similarity, and the full data set is grouped around them. Second, spectral clustering is applied to the core points alone. Finally, the whole data set is clustered by combining the clustering result of the core points with the grouping information. The algorithm extends spectral clustering to big data and, by working through the core points, reduces the influence of noise and abnormal data. Extensive experiments verify the effectiveness of the proposed method.
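
The three steps in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not give the selection rule, the similarity measure, or any parameters, so the distance-threshold thinning, the Gaussian similarity, the normalized-Laplacian embedding, and every name below (`select_core_points`, `radius`, `sigma`, `n_candidates`, and so on) are assumptions made for illustration only.

```python
import numpy as np


def select_core_points(X, n_candidates, radius, rng):
    """Assumed rule: sample candidates at random, then keep a candidate
    only if it is farther than `radius` from every core point kept so far."""
    candidates = rng.choice(len(X), size=n_candidates, replace=False)
    cores = []
    for i in candidates:
        if all(np.linalg.norm(X[i] - X[j]) > radius for j in cores):
            cores.append(i)
    return np.array(cores)


def assign_to_cores(X, cores):
    """Group the full data set: each point joins its nearest core point."""
    dists = np.linalg.norm(X[:, None, :] - X[cores][None, :, :], axis=2)
    return dists.argmin(axis=1)


def kmeans(U, k, n_iter=50):
    """Small Lloyd's k-means with deterministic farthest-first initialization."""
    centers = [U[0]]
    for _ in range(1, k):
        d = np.min([np.linalg.norm(U - c, axis=1) for c in centers], axis=0)
        centers.append(U[d.argmax()])
    centers = np.array(centers)
    for _ in range(n_iter):
        labels = np.linalg.norm(U[:, None] - centers[None], axis=2).argmin(axis=1)
        for c in range(k):
            if np.any(labels == c):
                centers[c] = U[labels == c].mean(axis=0)
    return labels


def spectral_cluster(P, k, sigma):
    """Normalized spectral clustering on the (small) set of core points P."""
    d2 = ((P[:, None, :] - P[None, :, :]) ** 2).sum(axis=2)
    W = np.exp(-d2 / (2.0 * sigma ** 2))          # Gaussian similarity (assumed)
    np.fill_diagonal(W, 0.0)
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(W.sum(axis=1), 1e-12))
    L_sym = np.eye(len(P)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(L_sym)               # eigenvalues in ascending order
    U = vecs[:, :k]                               # k smallest eigenvectors
    U = U / np.maximum(np.linalg.norm(U, axis=1, keepdims=True), 1e-12)
    return kmeans(U, k)


def core_point_spectral_clustering(X, k, n_candidates, radius, sigma, seed=0):
    rng = np.random.default_rng(seed)
    cores = select_core_points(X, n_candidates, radius, rng)   # step 1a
    groups = assign_to_cores(X, cores)                         # step 1b
    core_labels = spectral_cluster(X[cores], k, sigma)         # step 2
    return core_labels[groups]                                 # step 3
```

In this sketch the eigendecomposition runs only on the core points, so its cost depends on the number of core points rather than on the size of the full data set. The remaining work is one nearest-core assignment per data point. For genuinely large inputs the distance computation in `assign_to_cores` would need to be done in chunks, and the noise handling mentioned in the abstract is not modeled here.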

