Using Smaller Constituents Rather Than Sentences in Active Learning for Japanese Dependency Parsing
Sassano, Manabu and Kurohashi, Sadao

Article Structure

Abstract

We investigate active learning methods for Japanese dependency parsing.

Introduction

Reducing annotation cost is very important because supervised learning approaches, which have been successful in natural language processing, typically require a large number of labeled examples.

Active Learning

2.1 Pool-based Active Learning

Japanese Parsing

3.1 Syntactic Units

Active Learning for Parsing

Most active learning methods for parsing in previous work select sentences that seem likely to contribute to improving accuracy (Tang et al., 2002; Hwa, 2004; Baldridge and Osborne, 2004).

Active Learning for Japanese Dependency Parsing

In this section we describe the sample selection methods that we investigated.

Experimental Evaluation and Discussion

6.1 Corpus

Conclusion

We have investigated active learning methods for Japanese dependency parsing.

Topics

dependency parsing

Appears in 14 sentences as: Dependency Parsing (1), dependency parsing (14)
  1. We investigate active learning methods for Japanese dependency parsing.
    Page 1, “Abstract”
  2. Experimental results show that our proposed methods considerably improve the learning curve of Japanese dependency parsing.
    Page 1, “Abstract”
  3. We use Japanese dependency parsing as a target task in this study since a simple and efficient parsing algorithm has been proposed and, to our knowledge, active learning for Japanese dependency parsing has never been studied.
    Page 1, “Introduction”
  4. In Section 5 we describe our proposed methods and other methods of active learning for Japanese dependency parsing.
    Page 1, “Introduction”
  5. 3.3 Algorithm of Japanese Dependency Parsing
    Page 2, “Japanese Parsing”
  6. We use Sassano’s algorithm (Sassano, 2004) for Japanese dependency parsing.
    Page 2, “Japanese Parsing”
  7. Figure 3: Algorithm of Japanese dependency parsing
    Page 3, “Japanese Parsing”
  8. (footnote 4) We did not employ query-by-committee (QBC) (Seung et al., 1992), which is another important general framework of active learning, since the selection strategy with large margin classifiers (Section 2.2) is much simpler and seems more practical for active learning in Japanese dependency parsing with smaller constituents.
    Page 3, “Active Learning for Japanese Dependency Parsing”
  9. We set the degree of the kernels to 3 since cubic kernels with SVM have proved effective for Japanese dependency parsing (Kudo and Matsumoto, 2000; Kudo and Matsumoto, 2002).
    Page 5, “Experimental Evaluation and Discussion”
  10. There are features that have been commonly used for Japanese dependency parsing among related papers, e.g., (Kudo and Matsumoto, 2002; Sassano, 2004; Iwatate et al., 2008).
    Page 5, “Experimental Evaluation and Discussion”
  11. It is observed that active learning with large margin classifiers also works well for Sassano’s algorithm of Japanese dependency parsing.
    Page 5, “Experimental Evaluation and Discussion”


support vectors

Appears in 7 sentences as: support vectors (6), “Support vectors” (1), “support vectors” (1)
  1. Figure 11: Changes of number of support vectors in sentence-wise active learning
    Page 7, “Experimental Evaluation and Discussion”
  2. Figure 12: Changes of number of support vectors in chunk-wise active learning (MODSIMPLE)
    Page 7, “Experimental Evaluation and Discussion”
  3. Stopping Criteria It is known that the increment rate of the number of support vectors in SVM indicates saturation of accuracy improvement during iterations of active learning (Schohn and Cohn, 2000).
    Page 8, “Experimental Evaluation and Discussion”
  4. We plotted changes of the number of support vectors in the cases of both PASSIVE and MIN in Figure 11 and changes of the number of support vectors in the case of MODSIMPLE in Figure 12.
    Page 8, “Experimental Evaluation and Discussion”
  5. We observed that the increment rate of support vectors gradually gets smaller.
    Page 8, “Experimental Evaluation and Discussion”
  6. (footnote 7) Following (Freund and Schapire, 1999), we use the term “support vectors” for AP as well as SVM.
    Page 8, “Experimental Evaluation and Discussion”
  7. “Support vectors” of AP means the vectors that are selected in the training phase and contribute to the prediction.
    Page 8, “Experimental Evaluation and Discussion”


dependency relation

Appears in 6 sentences as: dependency relation (3), dependency relations (3)
  1. We propose active learning methods that use partial dependency relations in a given sentence for parsing and evaluate their effectiveness empirically.
    Page 1, “Abstract”
  2. When we use this algorithm with a machine learning-based classifier, function Dep() in Figure 3 uses the classifier to decide whether two bunsetsus have a dependency relation.
    Page 2, “Japanese Parsing”
  3. annotators would label either “D”, meaning the two bunsetsus have a dependency relation, or “O”, meaning they do not.
    Page 4, “Active Learning for Japanese Dependency Parsing”
  4. That is, “O” does not simply mean that the two bunsetsus do not have a dependency relation.
    Page 6, “Experimental Evaluation and Discussion”
  5. Issues on Assessing the Total Cost of Annotation In this paper, we assume that the annotation cost for each dependency relation is constant.
    Page 8, “Experimental Evaluation and Discussion”
  6. In addition, as far as we know, we are the first to propose active learning methods that use partial dependency relations in a given sentence for parsing, and we have evaluated their effectiveness.
    Page 9, “Conclusion”


perceptron

Appears in 6 sentences as: Perceptron (1), perceptron (5)
  1. 6.2 Averaged Perceptron
    Page 5, “Experimental Evaluation and Discussion”
  2. We used the averaged perceptron (AP) (Freund and Schapire, 1999) with polynomial kernels.
    Page 5, “Experimental Evaluation and Discussion”
  3. We found the best value of the epoch T of the averaged perceptron by using the development set.
    Page 5, “Experimental Evaluation and Discussion”
  4. We implemented a parser and a tool for the averaged perceptron in C++ and used them for experiments.
    Page 5, “Experimental Evaluation and Discussion”
  5. It is interesting to examine whether the observation for SVM is also useful for the “support vectors” (footnote 7) of the averaged perceptron.
    Page 8, “Experimental Evaluation and Discussion”
  6. It is observed that active learning of parsing with the averaged perceptron, which is one of the large margin classifiers, also works well for Japanese dependency analysis.
    Page 8, “Conclusion”


parsing algorithms

Appears in 5 sentences as: parsing algorithm (2), Parsing Algorithms (1), parsing algorithms (3)
  1. Section 3 describes the syntactic characteristics of Japanese and the parsing algorithm that we use.
    Page 1, “Introduction”
  2. If we sample smaller units rather than sentences, we have partially annotated sentences and have to use a parsing algorithm that can be trained from incompletely annotated sentences.
    Page 3, “Active Learning for Parsing”
  3. Applicability to Other Languages and Other Parsing Algorithms We discuss here whether or not the proposed methods and the experiments are useful for other languages and other parsing algorithms.
    Page 8, “Experimental Evaluation and Discussion”
  4. Although no one has reported applying (Sassano, 2004) to these languages so far, we believe that similar parsing algorithms will be applicable to them and that the discussion in this study would be useful.
    Page 8, “Experimental Evaluation and Discussion”
  5. Even though the use of syntactic constraints is limited, smaller constituents will still be useful for other parsing algorithms that use some deterministic methods with machine learning-based classifiers.
    Page 8, “Experimental Evaluation and Discussion”


human annotators

Appears in 4 sentences as: human annotators (4)
  1. (footnote 5) In our experiments human annotators do not give labels.
    Page 5, “Experimental Evaluation and Discussion”
  2. Figure 9 shows the same comparison in terms of required queries to human annotators.
    Page 6, “Experimental Evaluation and Discussion”
  3. Number of queries to human annotators
    Page 7, “Experimental Evaluation and Discussion”
  4. Figure 9: Comparison of MODSIMPLE and SYN in terms of the number of queries to human annotators
    Page 7, “Experimental Evaluation and Discussion”


SVM

Appears in 4 sentences as: SVM (4)
  1. We set the degree of the kernels to 3 since cubic kernels with SVM have proved effective for Japanese dependency parsing (Kudo and Matsumoto, 2000; Kudo and Matsumoto, 2002).
    Page 5, “Experimental Evaluation and Discussion”
  2. Stopping Criteria It is known that the increment rate of the number of support vectors in SVM indicates saturation of accuracy improvement during iterations of active learning (Schohn and Cohn, 2000).
    Page 8, “Experimental Evaluation and Discussion”
  3. It is interesting to examine whether the observation for SVM is also useful for the “support vectors” (footnote 7) of the averaged perceptron.
    Page 8, “Experimental Evaluation and Discussion”
  4. (footnote 7) Following (Freund and Schapire, 1999), we use the term “support vectors” for AP as well as SVM.
    Page 8, “Experimental Evaluation and Discussion”
