Optimal Head-Driven Parsing Complexity for Linear Context-Free Rewriting Systems
Crescenzi, Pierluigi and Gildea, Daniel and Marino, Andrea and Rossi, Gianluca and Satta, Giorgio

Article Structure

Abstract

We study the problem of finding the best head-driven parsing strategy for Linear Context—Free Rewriting System productions.

Introduction

Linear Context-Free Rewriting Systems (LCFRSs) (Vij ay-Shankar et al., 1987) constitute a very general grammatical formalism which subsumes context-free grammars (CFGs) and tree adjoining grammars (TAGS), as well as the synchronous context-free grammars (SCFGs) and synchronous tree adjoining grammars (STAGs) used as models in machine translation.1 LCFRSs retain the fundamental property of CFGs that grammar nonterminals rewrite independently, but allow nonterminals to generate discontinuous phrases, that is, to generate more than one span in the string being produced.

LCFRSs and parsing complexity

In this section we briefly introduce LCFRSs and define the problem of optimizing head-driven parsing complexity for these formalisms.

NP-completeness results

For an LCFRS production p, let H be its head nonterminal, and let A1, .

Concluding remarks

Head-driven strategies are important in parsing based on LCFRSs, both in order to allow statistical modeling of head-modifier dependencies and in order to generalize the Markovization of CFG parsers

Topics

time complexity

Appears in 13 sentences as: time complexity (14)
In Optimal Head-Driven Parsing Complexity for Linear Context-Free Rewriting Systems
  1. Whenever it is possible, binarization of LCFRS rules, or reduction of rank to two, is therefore important for parsing, as it reduces the time complexity needed for dynamic programming.
    Page 1, “Introduction”
  2. Gildea (2010) presents a related method for bina-rizing rules while keeping the time complexity of parsing as small as possible.
    Page 2, “Introduction”
  3. Binarization turns out to be possible with no penalty in time complexity, but, again, the factorization algorithm is exponential in the resulting time complexity .
    Page 2, “Introduction”
  4. Gildea (2011) shows that a polynomial time algorithm for factor-izing LCFRSs in order to minimize time complexity would imply an improved approximation algorithm for the well-studied graph-theoretic property known as treewidth.
    Page 2, “Introduction”
  5. However, whether the problem of fac-torizing LCFRSs in order to minimize time complexity is NP-hard is still an open question in the above works.
    Page 2, “Introduction”
  6. For SCFG, Satta and Peserico (2005) showed that the exponent in the time complexity of parsing algorithms must grow at least as fast as the square root of the rule rank, and Gildea and §tefankovic (2007) tightened this bound to be linear in the rank.
    Page 2, “Introduction”
  7. We examine the question of whether we can efficiently find the strategy that minimizes either the time complexity or the space complexity of parsing.
    Page 2, “Introduction”
  8. It can also be shown that, if a partial parse having fanout f is obtained by means of the combination of two partial parses with fanout f1 and f2, respectively, the resulting time complexity will be O(|w|f+f1+f2) (Seki et al., 1991; Gildea, 2010).
    Page 4, “LCFRSs and parsing complexity”
  9. AS an example, in the case of parsing based on CFGS, nonterminals as well as partial parses all have fanout one, resulting in the standard time complexity of O(|w|3) of dynamic programming methods.
    Page 4, “LCFRSs and parsing complexity”
  10. When parsing with TAGS, we have to manipulate objects with fanout two (in the worst case), resulting in time complexity of O(|w|6).
    Page 4, “LCFRSs and parsing complexity”
  11. Optimizing the parsing complexity for a production meanS finding a parsing strategy that results in minimum Space or time complexity .
    Page 4, “LCFRSs and parsing complexity”

See all papers in Proc. ACL 2011 that mention time complexity.

See all papers in Proc. ACL that mention time complexity.

Back to top.

binarization

Appears in 8 sentences as: Binarization (1) binarization (5) binarizes (1) binarizing (1)
In Optimal Head-Driven Parsing Complexity for Linear Context-Free Rewriting Systems
  1. Whenever it is possible, binarization of LCFRS rules, or reduction of rank to two, is therefore important for parsing, as it reduces the time complexity needed for dynamic programming.
    Page 1, “Introduction”
  2. This has lead to a number of binarization algorithms for LCFRSs, as well as factorization algorithms that factor rules into new rules with smaller rank, without necessarily reducing rank all the way to two.
    Page 1, “Introduction”
  3. Kuhlmann and Satta (2009) present an algorithm for binarizing certain LCFRS rules without increasing their fanout, and Sagot and Satta (2010) show how to reduce rank to the lowest value possible for LCFRS rules of fanout two, again without increasing fanout.
    Page 1, “Introduction”
  4. (2009) present an algorithm for binarization of LCFRSs while keeping fanout as small as possible.
    Page 2, “Introduction”
  5. Binarization turns out to be possible with no penalty in time complexity, but, again, the factorization algorithm is exponential in the resulting time complexity.
    Page 2, “Introduction”
  6. In this paper, we investigate the problem of rule binarization for LCFRSs in the context of head-driven parsing strategies.
    Page 2, “Introduction”
  7. The statistical LCFRS parser of Kallmeyer and Maier (2010) binarizes rules head-outward, and therefore adopts what we refer to as a head-driven strategy.
    Page 2, “Introduction”
  8. However, the binarization used by Kallmeyer and Maier
    Page 2, “Introduction”

See all papers in Proc. ACL 2011 that mention binarization.

See all papers in Proc. ACL that mention binarization.

Back to top.

dynamic programming

Appears in 4 sentences as: dynamic programming (4)
In Optimal Head-Driven Parsing Complexity for Linear Context-Free Rewriting Systems
  1. General algorithms for parsing LCFRSs build a dynamic programming chart of recognized nonterminals bottom-up, in a manner analogous to the CKY algorithm for CFGs (Hopcroft and Ullman, 1979), but with time and space complexity that are dependent on the rank and fanout of the grammar rules.
    Page 1, “Introduction”
  2. Whenever it is possible, binarization of LCFRS rules, or reduction of rank to two, is therefore important for parsing, as it reduces the time complexity needed for dynamic programming .
    Page 1, “Introduction”
  3. Existing parsing algorithms for LCFRSs exploit dynamic programming .
    Page 3, “LCFRSs and parsing complexity”
  4. AS an example, in the case of parsing based on CFGS, nonterminals as well as partial parses all have fanout one, resulting in the standard time complexity of O(|w|3) of dynamic programming methods.
    Page 4, “LCFRSs and parsing complexity”

See all papers in Proc. ACL 2011 that mention dynamic programming.

See all papers in Proc. ACL that mention dynamic programming.

Back to top.

parsing algorithms

Appears in 4 sentences as: parsing algorithm (1) parsing algorithms (3)
In Optimal Head-Driven Parsing Complexity for Linear Context-Free Rewriting Systems
  1. For SCFG, Satta and Peserico (2005) showed that the exponent in the time complexity of parsing algorithms must grow at least as fast as the square root of the rule rank, and Gildea and §tefankovic (2007) tightened this bound to be linear in the rank.
    Page 2, “Introduction”
  2. We provide what is to our knowledge the first NP-hardness result for a grammar factorization problem, which we hope will aid in understanding parsing algorithms in general.
    Page 2, “Introduction”
  3. Existing parsing algorithms for LCFRSs exploit dynamic programming.
    Page 3, “LCFRSs and parsing complexity”
  4. The maximum fanout f realized by a parsing strategy determines the space complexity of the parsing algorithm .
    Page 4, “LCFRSs and parsing complexity”

See all papers in Proc. ACL 2011 that mention parsing algorithms.

See all papers in Proc. ACL that mention parsing algorithms.

Back to top.

machine translation

Appears in 3 sentences as: machine translation (3)
In Optimal Head-Driven Parsing Complexity for Linear Context-Free Rewriting Systems
  1. Similar questions have arisen in the context of machine translation , as the SCFGs used to model translation are also instances of LCFRSs, as already mentioned.
    Page 2, “Introduction”
  2. Grammar factorization for synchronous models is an important component of current machine translation systems (Zhang et al., 2006), and algorithms for factorization have been studied by Gildea et al.
    Page 9, “Concluding remarks”
  3. These algorithms do not result in what we refer as head-driven strategies, although, as machine translation systems improve, lexicalized rules may become important in this setting as well.
    Page 9, “Concluding remarks”

See all papers in Proc. ACL 2011 that mention machine translation.

See all papers in Proc. ACL that mention machine translation.

Back to top.