ISSN 0253-2778

CN 34-1054/N

Open Access JUSTC Original Paper

Image annotation by searching semantically related regions

Funds:  National Natural Science Foundation (60933013).
Cite this:
https://doi.org/10.3969/j.issn.0253-2778.2014.01.008
  • Author Bio:

    DAI Lican, male, born in 1985, PhD student. Research field: Information retrieval. E-mail: licand@mail.ustc.edu.cn

  • Corresponding author: YU Nenghai
  • Received Date: 19 March 2013
  • Accepted Date: 28 April 2013
  • Rev Recd Date: 28 April 2013
  • Publish Date: 30 January 2014
  • A novel framework for image annotation is proposed that builds on the abundant partially annotated images available on the web. Using both the visual and the textual knowledge in the publicly available image database ImageNet, the framework first learns a set of weakly labeled visual concept classifiers. It then applies these classifiers to image regions and uses their outputs as region descriptors to run a region-based search for a query image in a large-scale image database. Finally, search result mining and clustering are applied to generate annotations for the query image. Compared with an image-level representation, the proposed region-based semantic representation is better at capturing the multiple objects and semantics within an image. The framework combines the advantages of traditional classification-based approaches and large-scale data-driven approaches. Experiments on 24 million web images and on a challenging image database demonstrate the effectiveness and efficiency of the proposed approach.
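The pipeline summarized in the abstract (concept-classifier outputs on image regions used as descriptors, region-based search, then aggregation of the retrieved text into annotations) can be illustrated with a small sketch. This is not the authors' implementation. The function names `cosine` and `annotate`, the three-concept toy descriptors, the use of cosine similarity, and the similarity-weighted tag voting that stands in for the paper's search result mining and clustering step are all assumptions made for illustration.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two equal-length score vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def annotate(query_regions, database, k=2, top_tags=2):
    """Annotate a query image from its region descriptors.

    query_regions: list of descriptors, one per region; each descriptor is
        a vector of concept-classifier scores (hypothetical values here).
    database: list of (region_descriptor, tags) pairs taken from
        partially annotated images.
    k: number of nearest database regions retrieved per query region.
    top_tags: number of annotations returned.
    """
    votes = Counter()
    for query in query_regions:
        # Region-based search: rank database regions by similarity.
        scored = [(cosine(query, desc), tags) for desc, tags in database]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        # Simplified stand-in for result mining: similarity-weighted votes.
        for similarity, tags in scored[:k]:
            for tag in tags:
                votes[tag] += similarity
    return [tag for tag, _ in votes.most_common(top_tags)]


# Toy data: each descriptor holds assumed scores for (sky, grass, animal).
database = [
    ([0.9, 0.1, 0.0], ["sky", "cloud"]),
    ([0.8, 0.2, 0.1], ["sky"]),
    ([0.1, 0.9, 0.1], ["grass", "field"]),
    ([0.0, 0.2, 0.9], ["dog"]),
    ([0.1, 0.1, 0.8], ["dog", "pet"]),
]
query_regions = [[0.85, 0.10, 0.05], [0.05, 0.15, 0.90]]

print(annotate(query_regions, database))
```

Because each region is searched separately, a query image with a sky-like region and an animal-like region collects tags for both, which is the multi-object behaviour that an image-level descriptor tends to blur.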

