ISSN 0253-2778

CN 34-1054/N

Open Access JUSTC Original Paper

Human activity recognition based on 3D skeletons and MCRF model

Cite this:
https://doi.org/10.3969/j.issn.0253-2778.2014.04.005
  • Received Date: 11 September 2013
  • Accepted Date: 01 January 2014
  • Revision Received Date: 01 January 2014
  • Publish Date: 30 April 2014
  • To address the shortcomings of traditional human activity recognition systems, a human activity recognition system based on 3D skeletons and an MCRF model is proposed. 3D skeleton data is compact yet retains the key information, and the MCRF model can combine more features and adaptively exploit contextual information. First, human activity is divided into global activity, arm activity, and leg activity, and several feature subsets are formed by extracting multiple features for each. Then, a conditional random field (CRF) model is trained on each feature subset to produce a CRF unit. Finally, all the CRF units are combined into the MCRF model, which is used to recognize human activity. The experimental results indicate that the proposed method can improve recognition accuracy.
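The pipeline described in the abstract can be sketched roughly as follows. This is only an illustrative sketch, not the paper's implementation: the joint names, the grouping in `JOINT_GROUPS`, the unit weights, and the weighted log-linear combination rule are all assumptions, since the abstract does not specify how the CRF units are combined. The trained CRF units are not implemented here; each is stood in for by a per-frame label distribution.

```python
import math

# Hypothetical joint grouping; the paper's actual partition of the
# skeleton into global, arm, and leg parts is not given in the abstract.
JOINT_GROUPS = {
    "global": ["head", "torso"],
    "arm": ["left_hand", "right_hand"],
    "leg": ["left_foot", "right_foot"],
}


def split_features(frame):
    """Split one frame (joint name -> (x, y, z)) into per-group feature vectors."""
    return {
        group: [coord for joint in joints for coord in frame[joint]]
        for group, joints in JOINT_GROUPS.items()
    }


def combine_units(unit_marginals, weights):
    """Combine per-unit label distributions with a weighted log-linear rule.

    unit_marginals: {unit name: {label: probability}} for one frame, standing
        in for the output of one trained CRF unit per feature subset.
    weights: {unit name: non-negative weight} (assumed, not from the paper).
    Returns a normalized distribution over labels.
    """
    labels = list(next(iter(unit_marginals.values())))
    log_scores = {
        label: sum(
            weights[unit] * math.log(marginal[label] + 1e-12)
            for unit, marginal in unit_marginals.items()
        )
        for label in labels
    }
    # Subtract the maximum before exponentiating for numerical stability.
    max_log = max(log_scores.values())
    unnormalized = {label: math.exp(s - max_log) for label, s in log_scores.items()}
    total = sum(unnormalized.values())
    return {label: value / total for label, value in unnormalized.items()}


# Toy frame with made-up coordinates.
frame = {
    "head": (0.0, 1.7, 2.0),
    "torso": (0.0, 1.2, 2.0),
    "left_hand": (-0.4, 1.5, 1.9),
    "right_hand": (0.4, 1.0, 2.0),
    "left_foot": (-0.1, 0.0, 2.0),
    "right_foot": (0.1, 0.0, 2.1),
}
subsets = split_features(frame)

# Made-up per-unit outputs for two example labels.
marginals = {
    "global": {"walk": 0.6, "wave": 0.4},
    "arm": {"walk": 0.2, "wave": 0.8},
    "leg": {"walk": 0.7, "wave": 0.3},
}
weights = {"global": 1.0, "arm": 1.0, "leg": 1.0}
combined = combine_units(marginals, weights)
predicted = max(combined, key=combined.get)
```

With equal weights this rule reduces to a normalized product of the unit distributions, so a unit that is confident about a label (here the arm unit) can outweigh two mildly disagreeing units. Other combination schemes would fit the same structure.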