ISSN 0253-2778

CN 34-1054/N

Open Access · JUSTC Original Paper

A robust approximate algorithm for large-scale clustering of visual features

Cite this:
https://doi.org/10.3969/j.issn.0253-2778.2014.10.008
  • Received Date: 13 September 2013
  • Accepted Date: 16 May 2014
  • Rev Recd Date: 16 May 2014
  • Publish Date: 30 October 2014
  • The large-scale clustering of visual features is crucial for image recognition and retrieval. The state-of-the-art algorithm, approximate k-means, only approximately guarantees clustering performance by relying on a high-precision approximate search. An improved algorithm is proposed that requires no extra memory and almost no extra computation. By exploiting more of the information available in each iteration to update the partition, the robust approximate algorithm better guarantees convergence and clustering performance: the clustering loss is non-increasing and decreases rapidly. Theoretical proofs guarantee that the algorithm converges to the converged solution of the Lloyd algorithm, regardless of the precision of the approximate search. Experimental results show that the algorithm runs about 10 times faster than the approximate k-means algorithm. The clustering performance is also verified directly by comparing images in the clustering results of global features.
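The abstract's key idea — keep a point's previous cluster assignment whenever the approximate search fails to find a closer center, so the clustering loss never increases — can be sketched as follows. This is a minimal illustration, not the paper's implementation: the approximate nearest-center search is simulated by probing a random subset of candidate centers, and the function name and parameters (`robust_approx_kmeans`, `probe`) are hypothetical.

```python
import numpy as np

def robust_approx_kmeans(X, k, iters=20, probe=3, seed=0):
    """K-means with an approximate assignment step (sketch).

    The robustness fix: a point's assignment is changed only if the
    approximate search finds a strictly closer center; otherwise the
    previous assignment is kept, so the loss is non-increasing even
    when the approximate search is imprecise.
    """
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), k, replace=False)].copy()
    assign = rng.integers(0, k, size=len(X))  # arbitrary initial partition
    for _ in range(iters):
        for i, x in enumerate(X):
            # stand-in for an approximate nearest-center search:
            # probe only a few randomly chosen candidate centers
            cand = rng.choice(k, size=min(probe, k), replace=False)
            d_cand = ((centers[cand] - x) ** 2).sum(axis=1)
            d_old = ((centers[assign[i]] - x) ** 2).sum()
            if d_cand.min() < d_old:          # keep old assignment otherwise
                assign[i] = cand[np.argmin(d_cand)]
        for j in range(k):                    # standard Lloyd center update
            mask = assign == j
            if mask.any():
                centers[j] = X[mask].mean(axis=0)
    return centers, assign
```

Both steps can only lower the loss: the guarded assignment by construction, and the mean update because the mean minimizes the within-cluster squared error.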
  • [1] Nister D, Stewenius H. Scalable recognition with a vocabulary tree[C]// 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. New York, USA: IEEE Press, 2006, 2: 2161-2168.
    [2] Philbin J, Chum O, Isard M, et al. Object retrieval with large vocabularies and fast spatial matching[C]// IEEE Conference on Computer Vision and Pattern Recognition. Minneapolis, USA: IEEE Press, 2007: 1-8.
    [3] Tuytelaars T, Lampert C H, Blaschko M B, et al. Unsupervised object discovery: A comparison[J]. International Journal of Computer Vision, 2010, 88(2): 284-302.
    [4] Philbin J, Zisserman A. Object mining using a matching graph on very large image collections[C]// Proceedings of the Indian Conference on Computer Vision, Graphics & Image Processing. Bhubaneswar, India: IEEE Press, 2008: 738-745.
    [5] Li X W, Wu C C, Zach C, et al. Modeling and recognition of landmark image collections using iconic scene graphs[C]// Proceedings of the 10th European Conference on Computer Vision. Marseille, France: Springer, 2008: 427-440.
    [6] Torralba A, Fergus R, Freeman W T. 80 million tiny images: A large data set for nonparametric object and scene recognition[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2008, 30(11): 1958-1970.
    [7] Lloyd S. Least squares quantization in PCM[J]. IEEE Transactions on Information Theory, 1982, 28(2): 129-136.
    [8] Selim S Z, Ismail M A. K-means-type algorithms: A generalized convergence theorem and characterization of local optimality[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 1984, 6(1): 81-87.
    [9] Jain A K. Data clustering: 50 years beyond K-means[J]. Pattern Recognition Letters, 2010, 31(8): 651-666.
    [10] Judd D, McKinley P K, Jain A K. Large-scale parallel data clustering[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 1998, 20(8): 871-876.
    [11] Kollios G, Gunopulos D, Koudas N, et al. Efficient biased sampling for approximate clustering and outlier detection in large data sets[J]. IEEE Transactions on Knowledge and Data Engineering, 2003, 15(5): 1170-1187.
    [12] Silpa-Anan C, Hartley R. Optimised KD-trees for fast image descriptor matching[C]// IEEE Conference on Computer Vision and Pattern Recognition. Anchorage, USA: IEEE Press, 2008: 1-8.
    [13] Muja M, Lowe D G. Fast approximate nearest neighbors with automatic algorithm configuration[C]// Proceedings of the 4th International Conference on Computer Vision Theory and Applications. Lisboa, Portugal: IEEE Press, 2009: 331-340.
    [14] Arthur D, Vassilvitskii S. k-means++: The advantages of careful seeding[C]// Proceedings of the 18th Annual ACM-SIAM Symposium on Discrete Algorithms. New Orleans, USA: SIAM, 2007: 1027-1035.

