An exam robot for sentence completion in high school English tests

CHEN Zhigang; LIU Qingwen; LIN Wei; WANG Yang; CHEN XiaoPing

doi:10.3969/j.issn.0253-2778.2016.03.008

Open Access JUSTC Original Paper

1. Department of Computer Science, University of Science and Technology of China, Hefei 230026, China
2. Iflytek Research, Hefei 230026, China

Received Date: 12 September 2015
Accepted Date: 29 December 2015
Revision Received Date: 29 December 2015
Publish Date: 30 March 2016

Abstract

This paper addresses sentence completion in the English section of Chinese national college and high school entrance examinations, in which the most appropriate word or phrase must be chosen from a given set to complete a sentence. To the best of our knowledge, the methods previously developed for this problem have mainly focused on language modeling (LM) and latent semantic analysis (LSA). An exam robot prototype was built by extending the LM and LSA methods with verb tense analysis and long-distance phrase extraction. Specifically, syntactic, lexical and semantic features are extracted separately by means of LM and LSA, together with verb tense analysis and phrase extraction, two methods developed by the authors. These features are then fed into a learning-to-rank model to build the exam robot. The proposed approach outperforms the LM and LSA models by 4.0 percentage points, achieving 78% accuracy on the question sets for senior high school entrance exams and 76% accuracy on the question sets for college entrance exams.
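The idea of feeding per-candidate features into a ranking model can be illustrated with a small sketch. This is not the authors' implementation: the toy corpus, the add-one-smoothed bigram language model, the `tense_flag` rule (a hypothetical stand-in for verb tense analysis) and the pairwise perceptron (a simple stand-in for a learning-to-rank model) are all assumptions chosen only to keep the example short and self-contained.

```python
import math
from collections import Counter

def train_bigram_lm(sentences):
    """Count unigrams and bigrams over whitespace-tokenised sentences."""
    unigrams, bigrams = Counter(), Counter()
    for s in sentences:
        tokens = ["<s>"] + s.lower().split() + ["</s>"]
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    return unigrams, bigrams

def lm_score(sentence, unigrams, bigrams):
    """Average add-one-smoothed bigram log-probability of a sentence."""
    tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
    vocab = len(unigrams)
    logp = 0.0
    for prev, cur in zip(tokens, tokens[1:]):
        logp += math.log((bigrams[(prev, cur)] + 1) / (unigrams[prev] + vocab))
    return logp / (len(tokens) - 1)

def tense_flag(sentence):
    """Hypothetical tense rule: 'since' expects a perfect auxiliary."""
    tokens = sentence.lower().split()
    if "since" in tokens:
        return 1.0 if ("has" in tokens or "have" in tokens) else 0.0
    return 1.0

def features(stem, option, unigrams, bigrams):
    """Feature vector for one candidate: [LM score, tense flag]."""
    filled = stem.replace("___", option)
    return [lm_score(filled, unigrams, bigrams), tense_flag(filled)]

def dot(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))

def train_pairwise(questions, n_features, epochs=20, lr=1.0):
    """Pairwise ranking perceptron: push the gold option above each wrong one."""
    w = [0.0] * n_features
    for _ in range(epochs):
        for cands, gold in questions:
            for i, x in enumerate(cands):
                if i != gold and dot(w, cands[gold]) <= dot(w, x):
                    w = [wi + lr * (g - xi) for wi, g, xi in zip(w, cands[gold], x)]
    return w

def answer(w, cands):
    """Index of the highest-scoring candidate."""
    return max(range(len(cands)), key=lambda i: dot(w, cands[i]))

# Toy data, invented for illustration only.
corpus = [
    "he has lived here since 2010",
    "she has worked here since last year",
    "they lived there in 2010",
]
unigrams, bigrams = train_bigram_lm(corpus)

train_stem = "he ___ here since 2010"
train_options = ["lives", "has lived", "living", "is live"]
train_set = [([features(train_stem, o, unigrams, bigrams) for o in train_options], 1)]
weights = train_pairwise(train_set, n_features=2)

test_stem = "she ___ here since last year"
test_options = ["works", "has worked", "working"]
best = answer(weights, [features(test_stem, o, unigrams, bigrams) for o in test_options])
print(test_options[best])
```

In this sketch each option is substituted into the blank, scored by every feature extractor, and the learned weights decide how much each feature counts when the candidates are ranked. Further feature columns, such as an LSA similarity score or a phrase-match indicator, would be added to the vector in the same way.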

References

[1] ZWEIG G, BURGES C J C. The Microsoft research sentence completion challenge[R]. Technical Report MSR-TR-2011-129, Microsoft, 2011.
[2] ZWEIG G, PLATT J C, MEEK C, et al. Computational approaches to sentence completion[C]// Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics. Jeju Island, Korea: Association for Computational Linguistics, 2012, 1: 601-610.
[3] LANDAUER T, DUMAIS S T. Latent semantic analysis[J]. Annual Review of Information Science and Technology, 2004, 38(1): 188-230.
[4] GUBBINS J, VLACHOS A. Dependency language models for sentence completion[C]// Proceedings of the Conference on Empirical Methods in Natural Language Processing. Seattle, USA: ACL Press, 2013: 1405-1410.
[5] MIKOLOV T, KARAFIÁT M, BURGET L. Recurrent neural network based language model[C]// 11th Annual Conference of the International Speech Communication Association. Makuhari, Japan: ISCA Press, 2010: 1045-1048.
[6] MIKOLOV T. Statistical language models based on neural networks[R]. Presentation at Google, Mountain View, 2nd April 2012.
[7] MCCARTHY D, NAVIGLI R. The English lexical substitution task[J]. Language Resources and Evaluation, 2009, 43(2): 139-159.
[8] RITTER A, MAUSAM, ETZIONI O. A latent Dirichlet allocation method for selectional preferences[C]// Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics. Stroudsburg, USA: ACL Press, 2010: 424-434.
[9] DE MARNEFFE M C, MACCARTNEY B, MANNING C D. Generating typed dependency parses from phrase structure parses[C]// Proceedings of the Language Resources Evaluation Conference. Stanford, USA: IEEE Press, 2006, 6: 449-454.
[10] MANNING C D, SURDEANU M, BAUER J, et al. The Stanford CoreNLP natural language processing toolkit[C]// Proceedings of 52nd Annual Meeting of the Association for Computational Linguistics. Baltimore, USA: ACL Press, 2014: 55-60.

Volume 46, Issue 3, pages 231-237
