Head-driven Transition-based Parsing with Top-down Prediction
Hayashi, Katsuhiko and Watanabe, Taro and Asahara, Masayuki and Matsumoto, Yuji

Article Structure

Abstract

This paper presents a novel top-down head-driven parsing algorithm for data-driven projective dependency analysis.

Introduction

Transition-based parsing algorithms, such as shift-reduce algorithms (Nivre, 2004; Zhang and Clark, 2008), are widely used for dependency analysis because of their efficiency and comparatively good performance.
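As a concrete reference point for the shift-reduce family mentioned here, a minimal arc-standard parser (Nivre, 2004) can be sketched in a few lines: it applies a given action sequence to a stack and an input queue. The function name `parse` and the action labels are illustrative, not taken from the paper.

```python
def parse(n_words, actions):
    """Apply SHIFT / LEFT-ARC / RIGHT-ARC actions (arc-standard);
    return heads[i] = index of token i's head (-1 for the root)."""
    stack, queue = [], list(range(n_words))
    heads = [-1] * n_words
    for act in actions:
        if act == "SHIFT":            # move the next input token onto the stack
            stack.append(queue.pop(0))
        elif act == "LEFT-ARC":       # second-top becomes a dependent of top
            dep = stack.pop(-2)
            heads[dep] = stack[-1]
        elif act == "RIGHT-ARC":      # top becomes a dependent of second-top
            dep = stack.pop()
            heads[dep] = stack[-1]
    return heads

# "I saw her": token 1 ("saw") heads tokens 0 and 2
# parse(3, ["SHIFT", "SHIFT", "LEFT-ARC", "SHIFT", "RIGHT-ARC"]) -> [1, -1, 1]
```

In a data-driven parser the action at each step is chosen by a trained classifier rather than supplied in advance.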

Definition of Dependency Graph

A dependency graph is defined as follows.

Top-down Parsing Algorithm

Our proposed algorithm is a transition-based algorithm, which uses stack and queue data structures.

Correctness

To prove the correctness of the system in Figure 1 for the projective dependency graph, we use the proof strategy of Nivre (2008a).

Weighted Parsing Model

5.1 Stack-based Model

Time Complexity

Our proposed top-down algorithm has three kinds of actions which are scan, comp and predict.
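A back-of-the-envelope count behind the resulting bound (a sketch, assuming each token is scanned and completed exactly once while predict may fire up to n times per token; the function name is hypothetical):

```python
def worst_case_actions(n):
    """Upper bound on the number of actions for a length-n sentence."""
    scan = n          # each token is scanned exactly once
    comp = n          # each token is completed exactly once
    predict = n * n   # worst case: up to n predictions per token
    return scan + comp + predict

# worst_case_actions(10) -> 120; the n*n predict term dominates, giving O(n^2)
```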

Related Work

Alshawi (1996) proposed the head automaton, which recognizes an input sentence top-down.

Experiments

Experiments were performed on the English Penn Treebank data and the Chinese CoNLL-06 data.

Conclusion

This paper presents a novel head-driven parsing algorithm and empirically shows that it is as practical as other dependency parsing algorithms.

Topics

shift-reduce

Appears in 14 sentences as: shift-reduce (14)
In Head-driven Transition-based Parsing with Top-down Prediction
  1. This algorithm handles global structures, such as clause and coordination, better than shift-reduce or other bottom-up algorithms.
    Page 1, “Abstract”
  2. Transition-based parsing algorithms, such as shift-reduce algorithms (Nivre, 2004; Zhang and Clark, 2008), are widely used for dependency analysis because of the efficiency and comparatively good performance.
    Page 1, “Introduction”
  3. (2004) pointed out that the drawbacks of shift-reduce parser could be resolved by incorporating top-down information such as root finding.
    Page 1, “Introduction”
  4. This work presents an O(n²) top-down head-driven transition-based parsing algorithm which can parse complex structures that are not trivial for shift-reduce parsers.
    Page 1, “Introduction”
  5. Yamada and Matsumoto (2003) applied a shift-reduce algorithm to dependency analysis, which is known as arc-standard transition-based algorithm (Nivre, 2004).
    Page 6, “Related Work”
  6. length, comparing with top-down, 2nd-MST and shift-reduce parsers (beam size: 8, pred size: 5)
    Page 7, “Experiments”
  7. We also implemented and experimented with Huang and Sagae (2010)’s arc-standard shift-reduce parser.
    Page 7, “Experiments”
  8. We used an early-update version of the averaged perceptron algorithm (Collins and Roark, 2004) for training the shift-reduce and top-down parsers.
    Page 7, “Experiments”
  9. Table 2: Oracle score, choosing the highest-accuracy parse for each sentence on test data from the results of the top-down (beam 8, pred 5), shift-reduce (beam 8), and MST(2nd) parsers in Table 1.
    Page 7, “Experiments”
  10. After 25 iterations of perceptron training, we achieved 92.94 unlabeled accuracy for top-down parser with the FIRST function and 93.01 unlabeled accuracy for shift-reduce parser on development data by setting the beam size to 8 for both parsers and the prediction size to 5 in top-down parser.
    Page 7, “Experiments”
  11. On the other hand, top-down parser was less accurate than shift-reduce
    Page 7, “Experiments”


parsing algorithm

Appears in 10 sentences as: parsing algorithm (6) parsing algorithms (6)
In Head-driven Transition-based Parsing with Top-down Prediction
  1. This paper presents a novel top-down head-driven parsing algorithm for data-driven projective dependency analysis.
    Page 1, “Abstract”
  2. Experiments on the English Penn Treebank data and the Chinese CoNLL-06 data show that the proposed algorithm achieves comparable results with other data-driven dependency parsing algorithms .
    Page 1, “Abstract”
  3. Transition-based parsing algorithms, such as shift-reduce algorithms (Nivre, 2004; Zhang and Clark, 2008), are widely used for dependency analysis because of their efficiency and comparatively good performance.
    Page 1, “Introduction”
  4. This work presents an O(n²) top-down head-driven transition-based parsing algorithm which can parse complex structures that are not trivial for shift-reduce parsers.
    Page 1, “Introduction”
  5. Experimental results show that the proposed top-down parser achieves competitive results with other data-driven parsing algorithms .
    Page 1, “Introduction”
  6. Eisner and Satta (1999) showed that there is a cubic-time parsing algorithm on the formalism of the head automaton grammars, which are equivalently converted into split-head bilexical context-free grammars (SBCFGs) (McAllester, 1999; Johnson, 2007).
    Page 6, “Related Work”
  7. Although our proposed algorithm does not employ the formalism of SBCFGs, it creates left children before right children, implying that, like parsing algorithms on the SBCFGs, it does not have spurious ambiguities.
    Page 6, “Related Work”
  8. The head-corner parsing algorithm (Kay, 1989) creates a dependency tree top-down, and in this respect our algorithm has a similar spirit.
    Page 6, “Related Work”
  9. We compared the top-down parsing algorithm with other data-driven parsing algorithms in Table 1. The top-down parser achieved comparable unlabeled accuracy with the others, and outperformed them on the sentence complete rate.
    Page 7, “Experiments”
  10. This paper presents a novel head-driven parsing algorithm and empirically shows that it is as practical as other dependency parsing algorithms .
    Page 8, “Conclusion”


beam search

Appears in 7 sentences as: Beam Search (1) Beam search (1) beam search (5)
In Head-driven Transition-based Parsing with Top-down Prediction
  1. To improve parsing flexibility in deterministic parsing, our top-down parser uses a beam search algorithm with dynamic programming (Huang and Sagae, 2010).
    Page 1, “Introduction”
  2. Algorithm 1 Top-down Parsing with Beam Search 1: input W = n0, ..., nn
    Page 4, “Weighted Parsing Model”
  3. As mentioned in Section 1, we apply beam search and Huang and Sagae (2010)’s DP techniques to our top-down parser.
    Page 4, “Weighted Parsing Model”
  4. Algorithm 1 shows our beam search algorithm, in which the top b states are preserved in a buffer buf[i] at each step.
    Page 4, “Weighted Parsing Model”
  5. When we apply beam search to the top-down parser, then we no longer use El but V on predfl and
    Page 4, “Weighted Parsing Model”
  6. Beam search is performed based on the following linear order for two states p and p′ at the same step, which have (cfw, cin) and (c′fw, c′in), respectively:
    Page 6, “Weighted Parsing Model”
  7. As n² for prediction is the most dominant factor, the time complexity of the algorithm is O(n²) and that of the algorithm with beam search is O(n² × b).
    Page 6, “Time Complexity”
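The strategy in the items above can be sketched generically: at each step, expand every state in the current buffer and keep only the b highest-scoring successors. Here `expand` and `score` are placeholders for the parser's transition function and model score, not the paper's implementation.

```python
import heapq

def beam_search(init_state, expand, score, steps, b):
    """Keep the b best states in buf at each step; return the best final state."""
    buf = [init_state]
    for _ in range(steps):
        successors = [s for state in buf for s in expand(state)]
        buf = heapq.nlargest(b, successors, key=score)
    return max(buf, key=score)

# Toy run: states are bit tuples, score is their sum.
# beam_search((), lambda s: [s + (0,), s + (1,)], sum, steps=3, b=2) -> (1, 1, 1)
```

Combining this with Huang and Sagae (2010)'s dynamic programming additionally merges states that share the same relevant features, so each buffer entry stands for a set of derivations rather than a single one.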


beam size

Appears in 6 sentences as: beam size (7)
In Head-driven Transition-based Parsing with Top-down Prediction
  1. The complexity becomes O(n² × b) where b is the beam size .
    Page 1, “Introduction”
  2. length, comparing with top-down, 2nd-MST and shift-reduce parsers ( beam size : 8, pred size: 5)
    Page 7, “Experiments”
  3. During training, we fixed the prediction size and beam size to 5 and 16, respectively, judged by pre-
    Page 7, “Experiments”
  4. After 25 iterations of perceptron training, we achieved 92.94 unlabeled accuracy for top-down parser with the FIRST function and 93.01 unlabeled accuracy for shift-reduce parser on development data by setting the beam size to 8 for both parsers and the prediction size to 5 in top-down parser.
    Page 7, “Experiments”
  5. Following the English experiments, the shift-reduce parser was trained by setting the beam size to 16, and the top-down parser was trained with the beam size and the prediction size set to 16 and 5, respectively.
    Page 8, “Experiments”
  6. Table 3 shows the results on the Chinese test data when setting beam size to 8 for both parsers and prediction size to 5 in top-down parser.
    Page 8, “Experiments”


dependency parsing

Appears in 4 sentences as: dependency parser: (1) dependency parsing (3)
In Head-driven Transition-based Parsing with Top-down Prediction
  1. Experiments on the English Penn Treebank data and the Chinese CoNLL-06 data show that the proposed algorithm achieves comparable results with other data-driven dependency parsing algorithms.
    Page 1, “Abstract”
  2. The Earley prediction is tied to a particular grammar rule, but the proposed algorithm is data-driven, following the current trends of dependency parsing (Nivre, 2006; McDonald and Pereira, 2006; Koo et al., 2010).
    Page 1, “Introduction”
  3. We define the set FIRST(·) for our top-down dependency parser:
    Page 6, “Weighted Parsing Model”
  4. This paper presents a novel head-driven parsing algorithm and empirically shows that it is as practical as other dependency parsing algorithms.
    Page 8, “Conclusion”


dependency tree

Appears in 3 sentences as: dependency tree (3)
In Head-driven Transition-based Parsing with Top-down Prediction
  1. If a projective dependency graph is connected, we call it a dependency tree , and if not, a dependency forest.
    Page 2, “Definition of Dependency Graph”
  2. where t’ is a POS-tag, Tree is a correct dependency tree which exists in Corpus, the function lmdescendant(Tree, t’) returns the set of the leftmost descendant nodes ld of each node in Tree whose POS-tag is t’, and ld.t denotes the POS-tag of ld.
    Page 6, “Weighted Parsing Model”
  3. The head-corner parsing algorithm (Kay, 1989) creates a dependency tree top-down, and in this respect our algorithm has a similar spirit.
    Page 6, “Related Work”
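The lmdescendant function in item 2 can be sketched on a flat tree encoding (a heads array plus POS tags). The encoding and helper names are assumptions for illustration, not the paper's data structures.

```python
def in_subtree(heads, root, i):
    """True if token i lies in the subtree rooted at `root` (-1 = past the root)."""
    while i != -1:
        if i == root:
            return True
        i = heads[i]
    return False

def lmdescendant(heads, tags, t):
    """POS tags of the leftmost descendant of every node tagged t (cf. FIRST)."""
    result = set()
    for n in range(len(heads)):
        if tags[n] == t:
            lm = min(i for i in range(len(heads)) if in_subtree(heads, n, i))
            result.add(tags[lm])
    return result

# heads = [1, -1, 1], tags = ["PRP", "VBD", "PRP"] ("I saw her"):
# lmdescendant(heads, tags, "VBD") -> {"PRP"}
```

Collecting these tag sets over a treebank yields the FIRST function used to filter top-down predictions.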
