Facial expression recognition based on fusion of deep learning and dense SIFT

PENG Yuqing; WANG Weihua; LIU Xuan; ZHAO Xiaosong; WEI Ming

doi:10.3969/j.issn.0253-2778.2019.02.004

Open Access JUSTC

School of Computer Science and Software, Hebei University of Technology, Tianjin 300400, China

https://doi.org/10.3969/j.issn.0253-2778.2019.02.004

Received Date: 15 June 2019
Rev Recd Date: 18 September 2019
Publish Date: 28 February 2019

Abstract

With the wide application of facial expression recognition in human-computer interaction, accurate and efficient expression recognition methods are of particular importance. A hybrid model that combines a convolutional neural network with Dense SIFT features is proposed. The network used in the hybrid model is an improvement on MobileNet, a depthwise separable convolutional neural network. Building on the separation of channel convolution (depthwise convolution) and spatial convolution (pointwise convolution), multi-scale convolution kernels are used in the pointwise convolution part of the MobileNet structure, which ensures the diversity and subtlety of the extracted features and makes the network better suited to facial expression feature extraction. Ideas from the DenseNet network structure are also introduced to improve the performance of the network. Dense SIFT's 128-dimensional descriptors offer advantages for feature description, so the Dense SIFT features are fused with the improved MobileNet network at the fully connected layer: an Eltwise layer compares the fully connected layers element by element and takes the maximum value, which preserves the diversity of the features and gives a stronger representation. On the FER2013 and JAFFE facial expression data sets, the hybrid model reaches recognition rates of 73.2% and 96.5%, respectively.
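The two building blocks named in the abstract, depthwise separable convolution and element-wise maximum fusion, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, tensor shapes, and toy feature vectors are assumptions chosen for illustration. The sketch uses a single 1x1 pointwise kernel rather than the multi-scale kernels the paper proposes, and it assumes the two fully connected outputs have already been brought to the same length.

```python
import numpy as np


def depthwise_conv(x, kernels):
    """Depthwise step: filter each channel with its own k x k kernel.

    x:       (C, H, W) input feature map
    kernels: (C, k, k), one spatial filter per input channel
    Uses stride 1 and no padding; like most CNN frameworks, this is
    cross-correlation (the kernel is not flipped).
    """
    c, h, w = x.shape
    k = kernels.shape[-1]
    out = np.zeros((c, h - k + 1, w - k + 1))
    for i in range(out.shape[1]):
        for j in range(out.shape[2]):
            patch = x[:, i:i + k, j:j + k]
            out[:, i, j] = (patch * kernels).sum(axis=(1, 2))
    return out


def pointwise_conv(x, weights):
    """Pointwise step: a 1x1 convolution that mixes channels.

    x:       (C_in, H, W)
    weights: (C_out, C_in)
    """
    return np.tensordot(weights, x, axes=([1], [0]))


def eltwise_max(a, b):
    """Fuse two equal-length feature vectors by element-wise maximum."""
    if a.shape != b.shape:
        raise ValueError("feature vectors must have the same shape")
    return np.maximum(a, b)


# Hypothetical toy sizes: 3 input channels, 8x8 map, 16 output channels.
rng = np.random.default_rng(0)
x = rng.standard_normal((3, 8, 8))
dw = depthwise_conv(x, rng.standard_normal((3, 3, 3)))
pw = pointwise_conv(dw, rng.standard_normal((16, 3)))
print(dw.shape, pw.shape)  # (3, 6, 6) (16, 6, 6)

# Hypothetical fully connected outputs from the CNN and Dense SIFT branches.
cnn_fc = np.array([0.2, 0.9, 0.1])
sift_fc = np.array([0.5, 0.3, 0.4])
fused = eltwise_max(cnn_fc, sift_fc)
print(fused)  # [0.5 0.9 0.4]
```

With a 3x3 kernel, 3 input channels, and 16 output channels, the separable form uses 3·3·3 + 3·16 = 75 weights where a standard convolution would use 3·3·3·16 = 432, which is the efficiency argument behind MobileNet-style designs.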
