Discriminative Approach to Fill-in-the-Blank Quiz Generation for
Language Learners (Sakaguchi et al., 2013)

1 Introduction

This paper focuses on automated distractor generation using a large-scale ESL corpus and a discriminative model. The authors target semantically confusing distractors, which measure a learner's competence in distinguishing word senses and selecting an appropriate word. The evaluation involves 3 native speakers and 23 non-native speakers of English; the non-native speakers' performance on the generated quizzes is then correlated with their actual TOEFL/TOEIC scores. The contributions of the paper are: 1) a method for generating reliable and valid distractors, and 2) a demonstration of the effectiveness of an ESL corpus and discriminative models for distractor generation.

2 Methodology

2.1 Error-Correction Pair Extraction

This step collects error-correction pairs from the Lang-8 corpus of Learner English. Pairs are extracted by comparing the trigrams around the replacement in the original and corrected sentences, so that the surrounding context of the target is taken into account. From these pairs, the conditional probabilities P(w_e | w_c) are calculated, which represent how likely learners are to misuse the word w_c as w_e. These probabilities form a confusion matrix that is used to generate distractors. Given a sentence, the verbs that appear in the matrix are identified and blanked, and the candidates with a high confusion probability are output as distractors. To generate semantic distractors, the correction is used as the target and the misused word (the error) as one of the distractor candidates.

2.2 Discriminative Model for Distractor Generation and Selection

A separate classifier is trained for each target word using the error-correction pairs extracted in the previous step. The classifier for a target word takes a sentence containing that word as input and outputs a verb as the best distractor for the context, using the following features: the lemmas in a 5-gram window (the 1 and 2 words on either side of the target) and the dependency type with the target child.
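The window part of this feature set can be illustrated with a minimal sketch. It assumes the sentence is already lemmatized, and it leaves out the dependency feature, which would need a parser. The function name `window_features`, the feature-string format, and the `<s>` / `</s>` padding tokens are hypothetical choices made for illustration, not details taken from the paper.

```python
from typing import Dict, List


def window_features(lemmas: List[str], target_idx: int, width: int = 2) -> Dict[str, int]:
    """Binary features for the lemmas within +/- `width` positions of the target.

    Hypothetical helper: feature names and padding tokens are assumptions.
    """
    feats: Dict[str, int] = {}
    for offset in range(-width, width + 1):
        if offset == 0:
            continue  # the target itself is the blank, so it is not a feature
        pos = target_idx + offset
        if pos < 0:
            lemma = "<s>"    # assumed padding before the sentence start
        elif pos >= len(lemmas):
            lemma = "</s>"   # assumed padding after the sentence end
        else:
            lemma = lemmas[pos]
        feats[f"w[{offset:+d}]={lemma}"] = 1
    return feats


# Toy lemmatized sentence; the target verb "see" is at index 1.
sentence = ["he", "see", "a", "movie", "yesterday"]
features = window_features(sentence, target_idx=1)
```

A dictionary of this shape could be passed to any sparse linear classifier; which learner the authors used is not stated in this summary.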
For pronouns, dates, times, and numbers, the dependent is normalized (e.g. he -> PRP) to avoid a sparse feature space.

SUMMARY BY YUNI SUSANTI

Two methods are used to train the classifiers:

DiscESL: directly uses the corrected sentences in the Lang-8 corpus, with the original word (the word misused by the learner) as the class. See Table 1.

DiscSimESL: uses an ESL-simulated native corpus built from VOA Learning English articles. For each target in a given sentence, the target is replaced with an incorrect word according to the error probabilities obtained from the learners' confusion matrix. For each training sentence, 100 samples are generated in this way.

3 Evaluation

3.1 Evaluation with Native Speakers

This evaluation measures the reliability of the distractors. Three native speakers of English are asked to solve the fill-in-the-blank quizzes generated by the system. For each method, 50 quizzes are generated from different sentences so that no sentence is shown twice. The source sentences for the quizzes are collected from VOA. The evaluation metric is the ratio of appropriate distractors:

RAD = N_AD / N_ALL

where N_ALL is the total number of quizzes and N_AD is the number of quizzes for which at least 2 judges agree on the correct answer. When at least 2 judges select option c (both options are correct), the distractors are considered inappropriate. The human evaluation shows that 98.3% of the distractors are reliable.

3.2 Evaluation with ESL Learners

This evaluation measures the validity of the distractors by examining the correlation between learners' accuracy on the generated quizzes and their TOEIC scores. The participants are 24 Japanese native speakers. DiscSimESL achieved a higher positive correlation (0.76).

4 Conclusion

The paper proposed a method to generate semantic distractors automatically for fill-in-the-blank quizzes.
The proposed methods employ discriminative models trained on error patterns extracted from an ESL corpus and can generate reliable distractors by taking the context of a given sentence into consideration.
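The error patterns referred to here are the confusion probabilities P(w_e | w_c) of Section 2.1. The sketch below shows one way they could be estimated and used to rank distractor candidates. It assumes a plain relative-frequency estimate over the observed (error, correction) pairs; the summary does not say how the authors normalized the counts. The function names and the toy pairs are made up for illustration and are not data from the paper.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, List, Tuple


def confusion_probabilities(pairs: Iterable[Tuple[str, str]]) -> Dict[str, Dict[str, float]]:
    """Estimate P(w_e | w_c) from (error, correction) pairs by relative frequency."""
    counts: Dict[str, Counter] = defaultdict(Counter)
    for error, correction in pairs:
        counts[correction][error] += 1
    probs: Dict[str, Dict[str, float]] = {}
    for correction, errors in counts.items():
        total = sum(errors.values())
        probs[correction] = {e: n / total for e, n in errors.items()}
    return probs


def top_distractors(probs: Dict[str, Dict[str, float]], target: str, k: int = 3) -> List[str]:
    """Return up to k misused words most often confused with `target`."""
    # Sort by descending probability, then alphabetically to break ties.
    ranked = sorted(probs.get(target, {}).items(), key=lambda kv: (-kv[1], kv[0]))
    return [word for word, _ in ranked[:k]]


# Made-up toy pairs in (error, correction) order.
pairs = [("see", "watch"), ("see", "watch"), ("look", "watch"), ("hear", "listen")]
probs = confusion_probabilities(pairs)
candidates = top_distractors(probs, "watch", k=2)
```

With these toy pairs, "see" ranks ahead of "look" as a distractor for "watch" because it accounts for two of the three observed confusions.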