ISSN 0253-2778

CN 34-1054/N

Open Access JUSTC Original Paper

An exam robot for sentence completion in high school English tests

Cite this:
https://doi.org/10.3969/j.issn.0253-2778.2016.03.008
  • Received Date: 12 September 2015
  • Accepted Date: 29 December 2015
  • Revision Received Date: 29 December 2015
  • Publish Date: 30 March 2016
  • This paper addresses the problem of sentence completion in Chinese national high school and college entrance English examinations, in which the most appropriate word or phrase from a given set should be chosen to complete a sentence. Although a variety of methods have been developed for this problem, to the best of our knowledge these approaches have mainly focused on language modeling (LM) and latent semantic analysis (LSA). An exam robot prototype was built by extending the LM and LSA methods with verb tense analysis and long-distance phrase extraction. Specifically, syntactic, lexical and semantic features are extracted separately by means of LM and LSA, together with verb tense analysis and phrase extraction, two methods developed by the authors. These features are then fed into a learning-to-rank model to build the exam robot. The proposed approach outperforms the LM and LSA models by 4.0 percentage points, achieving 78% accuracy on the question sets for senior entrance exams and 76% accuracy on the question sets for college entrance exams.
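
The general idea of scoring each candidate answer with several features and ranking the candidates can be illustrated with a minimal Python sketch. It is not the authors' implementation: the toy corpus, the add-one smoothed bigram model, the `tense_feature` values and the hand-set `weights` are all assumptions made for illustration. A learning-to-rank model would fit the weights from labelled exam questions instead of fixing them by hand.

```python
import math
from collections import Counter

# Toy corpus standing in for LM training data (hypothetical, not from the paper).
CORPUS = [
    "he has lived here for ten years",
    "she has worked here for two years",
    "they lived there last year",
]

def train_bigrams(sentences):
    """Count unigrams and bigrams, with <s> and </s> boundary markers."""
    unigrams, bigrams = Counter(), Counter()
    for s in sentences:
        tokens = ["<s>"] + s.split() + ["</s>"]
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    return unigrams, bigrams

def lm_score(sentence, unigrams, bigrams):
    """Add-one smoothed bigram log-probability of a completed sentence."""
    tokens = ["<s>"] + sentence.split() + ["</s>"]
    vocab = len(unigrams)
    return sum(
        math.log((bigrams[(a, b)] + 1) / (unigrams[a] + vocab))
        for a, b in zip(tokens, tokens[1:])
    )

def rank_candidates(stem, candidates, extra, weights, unigrams, bigrams):
    """Rank candidates by a weighted sum of features, best first."""
    scored = []
    for cand in candidates:
        # Feature 1: LM score of the sentence with the blank filled in.
        features = [lm_score(stem.replace("____", cand), unigrams, bigrams)]
        # Further features (here one assumed tense-agreement indicator).
        features += extra.get(cand, [0.0])
        scored.append((sum(w * f for w, f in zip(weights, features)), cand))
    return [cand for _, cand in sorted(scored, reverse=True)]

unigrams, bigrams = train_bigrams(CORPUS)
stem = "he has ____ here for ten years"
candidates = ["lived", "lives", "living", "live"]
tense_feature = {"lived": [1.0]}   # assumed output of a tense analyser
weights = [1.0, 2.0]               # hand-set; a ranking model would learn these
ranking = rank_candidates(stem, candidates, tense_feature, weights,
                          unigrams, bigrams)
print(ranking[0])
```

On this toy data the candidate "lived" is ranked first, because it is the only option whose bigrams appear in the corpus and the only one flagged by the assumed tense feature.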