Automated Collocation Suggestion for Japanese Second Language Learners
Lis Pereira, Erlyn Manguilimotan and Yuji Matsumoto

Article Structure

Abstract

This study addresses issues of Japanese language learning concerning word combinations (collocations).

Introduction

Automated grammatical error correction is an emerging topic in natural language processing (NLP).

Related Work

Collocation correction currently follows an approach similar to that used in article and preposition correction.

Topics

distributional similarity

Appears in 9 sentences as: Distributional Similarity (4) distributional similarity (5)
In Automated Collocation Suggestion for Japanese Second Language Learners
  1. similarity: 1) thesaurus-based word similarity, 2) distributional similarity and 3) confusion set derived from learner corpus.
    Page 3, “Related Work”
  2. Distributional Similarity : Thesaurus-based methods produce weak recall since many words, phrases and semantic connections are not covered by hand-built thesauri, especially for verbs and adjectives.
    Page 3, “Related Work”
  3. As an alternative, distributional similarity models are often used, since they give higher recall.
    Page 3, “Related Work”
  4. On the other hand, distributional similarity models tend to have lower precision (Jurafsky et al., 2009), because the candidate set is larger.
    Page 3, “Related Work”
  5. Setting up a threshold was necessary since the size of the candidate set generated when using Distributional Similarity methods may be quite large, affecting the system performance.
    Page 5, “Related Work”
  6. When computing Distributional Similarity , scores are also assigned to each candidate, thus, when we set up a threshold value n, we consider the list of n candidates with highest scores.
    Page 5, “Related Work”
  7. In order to improve the recall rate, we generated models M2-M6, which use distributional similarity (cosine similarity) and also use corpora other than Mainichi Shimbun corpus to minimize the domain gap problem between the learner’s vocabulary and the newspaper vocabulary found in the Mainichi Shimbun data.
    Page 5, “Related Work”
  8. In order to compare it with other distributional similarity metrics (Dice, KL-Divergence and Jensen-Shannon Divergence) and with the method that uses Lang-8 for generating the confusion set, we chose the model with the highest recall value as baseline, which is the one that uses BCCWJ and Lang-8 (M6), and generated other models (M7-M10).
    Page 5, “Related Work”
  9. The best MRR value among all the Distributional Similarity methods was obtained by Jensen-Shannon divergence.
    Page 5, “Related Work”
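Items 5 and 6 above describe keeping only the n highest-scoring candidates from the Distributional Similarity output. A minimal sketch of that thresholding step; the words and scores are invented for illustration:

```python
# Sketch of the thresholding described above: each candidate carries a
# similarity score, and only the n highest-scoring candidates are kept.
# The candidate words and scores below are invented for illustration.

def top_n_candidates(scored, n):
    """scored: dict mapping candidate word -> similarity score."""
    ranked = sorted(scored.items(), key=lambda kv: kv[1], reverse=True)
    return [word for word, _ in ranked[:n]]

scores = {"飲む": 0.82, "食べる": 0.75, "取る": 0.40, "見る": 0.12}
print(top_n_candidates(scores, 2))  # → ['飲む', '食べる']
```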

similarity measures

Appears in 7 sentences as: Similarity measures (1) similarity measures (6)
In Automated Collocation Suggestion for Japanese Second Language Learners
  1. In this work, we analyze various Japanese corpora using a number of collocation and word similarity measures to deduce and suggest the best collocations for Japanese second language learners.
    Page 1, “Introduction”
  2. In order to build a system that is more sensitive to constructions that are difficult for learners, we use word similarity measures that generate collocation candidates using a large Japanese language learner corpus.
    Page 1, “Introduction”
  3. similarity measures are used.
    Page 2, “Related Work”
  4. Our work follows the general approach, that is, uses similarity measures for generating the confusion set and association measures for ranking the best candidates.
    Page 2, “Related Work”
  5. Similarity measures are used to generate the collocation candidates that are later ranked using association measures.
    Page 2, “Related Work”
  6. Table 4 shows the ten models derived from combining different word similarity measures and the Weighted Dice measure as association measure, using different corpora.
    Page 5, “Related Work”
  7. Considering that the size of the candidate set generated by different word similarity measures varies considerably, we limit the size of the confusion set to 270 for verbs and 160 for nouns, which correspond to the maximum values of the confusion set size for verbs and nouns when using Lang-8 for generating the candidate set.
    Page 5, “Related Work”
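The two-stage pipeline described in this section (a similarity measure proposes a confusion set, capped at 270 verbs or 160 nouns, and an association measure then ranks the candidates) might be sketched as follows. The `weighted_dice` variant shown (log-weighted Dice) is an assumption for illustration, not necessarily the paper's exact measure, and all frequencies are invented:

```python
import math

MAX_CONFUSION = {"verb": 270, "noun": 160}  # caps stated in the text above

def build_confusion_set(word, pos, similarity, vocab):
    """Stage 1: propose candidates by word similarity, capped per POS."""
    scored = sorted(((similarity(word, c), c) for c in vocab if c != word),
                    reverse=True)
    return [c for _, c in scored[:MAX_CONFUSION[pos]]]

def weighted_dice(f_xy, f_x, f_y):
    """One common weighted-Dice variant: log2 of the joint frequency
    times the plain Dice coefficient (an assumption, not the paper's
    exact formula)."""
    if f_xy == 0:
        return 0.0
    return math.log2(f_xy) * (2 * f_xy) / (f_x + f_y)

def rank_collocations(noun, verbs, freq_pair, freq_word):
    """Stage 2: rank candidate verbs for a noun by association strength."""
    scored = [(weighted_dice(freq_pair.get((noun, v), 0),
                             freq_word[noun], freq_word[v]), v)
              for v in verbs]
    return [v for _, v in sorted(scored, reverse=True)]
```

The cap keeps the ranking stage comparable across similarity measures whose raw candidate lists differ greatly in size.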

co-occurrence

Appears in 6 sentences as: co-occurrence (6)
In Automated Collocation Suggestion for Japanese Second Language Learners
  1. Table 2 Context of a particular noun represented as a co-occurrence vector
    Page 3, “Related Work”
  2. Context is represented as co-occurrence vectors that are based on syntactic dependencies.
    Page 3, “Related Work”
  3. Table 3 Context of a particular verb represented as a co-occurrence vector
    Page 3, “Related Work”
  4. Table 2 and Table 3 show examples of part of co-occurrence vectors for the noun “日記 [diary]” and the verb “食べる [eat]”, respectively.
    Page 3, “Related Work”
  5. The numbers indicate the co-occurrence frequency in the BCCWJ corpus (Maekawa, 2008).
    Page 3, “Related Work”
  6. We computed the similarity between co-occurrence vectors using different metrics: Cosine Similarity, Dice coefficient (Curran, 2004), Kullback-Leibler (KL) divergence, also known as relative entropy (Kullback and Leibler, 1951), and the Jensen-Shannon divergence (Lee, 1999).
    Page 3, “Related Work”
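Item 6 names four vector-comparison metrics. A self-contained sketch over toy co-occurrence vectors; the counts, the second noun 手紙 [letter], the specific Dice variant, and the add-epsilon smoothing for the divergences are all assumptions for illustration, not values or formulas from the paper:

```python
import math

def cosine(p, q):
    """Cosine similarity between two sparse count vectors (dicts)."""
    dot = sum(v * q.get(k, 0) for k, v in p.items())
    norm_p = math.sqrt(sum(v * v for v in p.values()))
    norm_q = math.sqrt(sum(v * v for v in q.values()))
    return dot / (norm_p * norm_q) if norm_p and norm_q else 0.0

def dice(p, q):
    """A Dice variant over counts: 2 * shared mass / total mass."""
    shared = set(p) & set(q)
    num = 2 * sum(min(p[k], q[k]) for k in shared)
    den = sum(p.values()) + sum(q.values())
    return num / den if den else 0.0

def _normalize(p, keys, eps=1e-9):
    """Turn counts into a smoothed probability distribution over `keys`."""
    total = sum(p.get(k, 0) for k in keys) + eps * len(keys)
    return {k: (p.get(k, 0) + eps) / total for k in keys}

def kl(p, q):
    """Kullback-Leibler divergence D(P || Q), smoothed so Q > 0."""
    keys = set(p) | set(q)
    P, Q = _normalize(p, keys), _normalize(q, keys)
    return sum(P[k] * math.log(P[k] / Q[k]) for k in keys)

def jensen_shannon(p, q):
    """Jensen-Shannon divergence: symmetrized KL to the mixture M."""
    keys = set(p) | set(q)
    P, Q = _normalize(p, keys), _normalize(q, keys)
    M = {k: 0.5 * (P[k] + Q[k]) for k in keys}
    return (0.5 * sum(P[k] * math.log(P[k] / M[k]) for k in keys)
            + 0.5 * sum(Q[k] * math.log(Q[k] / M[k]) for k in keys))

# Toy co-occurrence vectors: verbs observed with each noun as "wo" object.
diary = {"書く": 12, "読む": 5, "つける": 8}   # 日記 [diary]
letter = {"書く": 20, "読む": 9, "送る": 4}    # 手紙 [letter] (hypothetical)
```

Note the two families point in opposite directions: higher cosine or Dice means more similar vectors, while lower KL or Jensen-Shannon divergence means more similar distributions.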

cosine similarity

Appears in 5 sentences as: Cosine Similarity (1) cosine similarity (4)
In Automated Collocation Suggestion for Japanese Second Language Learners
  1. We computed the similarity between co-occurrence vectors using different metrics: Cosine Similarity, Dice coefficient (Curran, 2004), Kullback-Leibler (KL) divergence, also known as relative entropy (Kullback and Leibler, 1951), and the Jensen-Shannon divergence (Lee, 1999).
    Page 3, “Related Work”
  2. One year data (1991) were used to extract the “noun wo verb” tuples to compute word similarity (using cosine similarity metric) and collocation scores.
    Page 4, “Related Work”
  3. These data are necessary to compute the word similarity (using cosine similarity metric) and collocation scores.
    Page 4, “Related Work”
  4. A) Year 2010 data, which contain 1,288,934 pairs of learner’s sentence and its correction, was used to: i) Compute word similarity (using cosine similarity metric) and collocation scores: We took out the learners’ sentences and used only the corrected sentences.
    Page 4, “Related Work”
  5. In order to improve the recall rate, we generated models M2-M6, which use distributional similarity (cosine similarity) and also use corpora other than Mainichi Shimbun corpus to minimize the domain gap problem between the learner’s vocabulary and the newspaper vocabulary found in the Mainichi Shimbun data.
    Page 5, “Related Work”
