Abstract | We also present a new transition-based dependency parsing algorithm that gives a complexity of O(n) for projective parsing and expected linear time for non-projective parsing.
Conclusion | Coupled with our new hybrid parsing algorithm, ADAGRAD, rich nonlocal features, and bootstrapping, our parser gives higher parsing accuracy than most other transition-based dependency parsers in multiple languages and shows faster parsing speed.
Experiments | Our greedy approach outperforms both Nivre (2009) and Fernández-González and Gómez-Rodríguez (2012), who use different non-projective parsing algorithms.
Introduction | We first present a new transition-based parsing algorithm that gives a complexity of O(n) for projective parsing and expected linear time for non-projective parsing.
Related work | Our parsing algorithm is most similar to that of Choi and Palmer (2011), who integrated our LEFT-REDUCE transition into Nivre’s list-based algorithm.
Related work | There are other transition-based dependency parsing algorithms that take a similar approach; Nivre (2009) integrated a SWAP transition into Nivre’s arc-standard algorithm (Nivre, 2004) and Fernández-González and Gómez-Rodríguez (2012) integrated a buffer transition into Nivre’s arc-eager algorithm to handle non-projectivity.
Transition-based dependency parsing | We introduce a transition-based dependency parsing algorithm that is a hybrid between Nivre’s arc-eager and list-based algorithms (Nivre, 2003; Nivre, 2008). |
Transition-based dependency parsing | Nivre’s arc-eager is a projective parsing algorithm with a complexity of O(n); Nivre’s list-based algorithm is a non-projective parsing algorithm with a complexity of O(n^2).
Transition-based dependency parsing | The parsing complexity of a transition-based dependency parsing algorithm is determined by the number of transitions performed with respect to the number of tokens in a sentence, say n (Kübler et al., 2009).
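To make the transition-counting argument concrete, here is a minimal Python sketch of a plain arc-standard system (for contrast; this is not the hybrid algorithm presented in the paper). The three-word sentence and its transition sequence are invented for illustration: a projective derivation for n words uses exactly 2n transitions, each doing constant work, hence O(n) parsing.

```python
def arc_standard(n_words, transitions):
    """Apply an arc-standard derivation over a stack and a buffer.

    Tokens are 1..n_words, with 0 as an artificial root.  Every transition
    does O(1) work and a full derivation uses exactly 2*n_words transitions,
    which is where the O(n) bound for projective transition-based parsing
    comes from.
    """
    stack, buffer = [0], list(range(1, n_words + 1))
    arcs = set()  # (head, dependent) pairs
    for t in transitions:
        if t == "SHIFT":            # move the next buffer token onto the stack
            stack.append(buffer.pop(0))
        elif t == "LEFT-ARC":       # stack top heads the item below it
            dep = stack.pop(-2)
            arcs.add((stack[-1], dep))
        elif t == "RIGHT-ARC":      # item below the top heads the top
            dep = stack.pop()
            arcs.add((stack[-1], dep))
    return arcs

# Invented example "He eats fish": token 2 heads 1 and 3, root 0 heads 2.
seq = ["SHIFT", "SHIFT", "LEFT-ARC", "SHIFT", "RIGHT-ARC", "RIGHT-ARC"]
arcs = arc_standard(3, seq)  # 2 * 3 = 6 transitions in total
```

The derivation length, not the grammar, dominates the cost, which is why transition systems with a bounded number of transitions per token parse in linear time.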
Dependency parsing | For certain factorizations, efficient parsing algorithms exist for solving Eq. |
Dependency parsing | We define the order of a part according to the number of dependencies it contains, with analogous terminology for factorizations and parsing algorithms.
Existing parsing algorithms | Our new third-order dependency parsers build on ideas from existing parsing algorithms.
Existing parsing algorithms | Since each derivation is defined by two fixed indices (the boundaries of the span) and a third free index (the split point), the parsing algorithm requires O(n^3) time and O(n^2) space (Eisner, 1996; McAllester, 1999).
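As a concrete sketch of this span-based bookkeeping, the following is a minimal, score-only version of Eisner's first-order dynamic program (backpointers and labels are omitted, and the 3x3 score matrix is invented for the example):

```python
NEG = float("-inf")

def eisner(score):
    """Max-scoring projective tree over tokens 0..n-1 (0 = artificial root).

    score[h][m] is the score of arc h -> m.  Four triangular charts index
    spans by two fixed endpoints plus a free split point, giving O(n^3)
    time and O(n^2) space.
    """
    n = len(score)
    # complete (c_*) and incomplete (i_*) spans, head on the left (r) / right (l)
    c_r = [[0 if s == t else NEG for t in range(n)] for s in range(n)]
    c_l = [[0 if s == t else NEG for t in range(n)] for s in range(n)]
    i_r = [[NEG] * n for _ in range(n)]
    i_l = [[NEG] * n for _ in range(n)]
    for width in range(1, n):
        for s in range(n - width):
            t = s + width
            # best way to join two complete halves at a free split point r
            best = max(c_r[s][r] + c_l[r + 1][t] for r in range(s, t))
            i_r[s][t] = best + score[s][t]   # add arc s -> t
            i_l[s][t] = best + score[t][s]   # add arc t -> s
            c_r[s][t] = max(i_r[s][r] + c_r[r][t] for r in range(s + 1, t + 1))
            c_l[s][t] = max(c_l[s][r] + i_l[r][t] for r in range(s, t))
    return c_r[0][n - 1]

# Invented toy scores over [root, w1, w2]; score[h][m] = score of arc h -> m.
toy = [[0, 5, 1],
       [0, 0, 4],
       [0, 2, 0]]
best = eisner(toy)
```

On this toy matrix the best projective tree is the chain 0 → 1 → 2 with score 5 + 4 = 9.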
Introduction | Dependency grammar has proven to be a very useful syntactic formalism, due in no small part to the development of efficient parsing algorithms (Eisner, 2000; McDonald et al., 2005b; McDonald and Pereira, 2006; Carreras, 2007), which can be leveraged for a wide variety of learning methods, such as feature-rich discriminative models (Lafferty et al., 2001; Collins, 2002; Taskar et al., 2003). |
Introduction | These parsing algorithms share an important characteristic: they factor dependency trees into sets of parts that have limited interactions. |
Introduction | A crucial limitation of factored parsing algorithms is that the associated parts are typically quite small, losing much of the contextual information within the dependency tree. |
Abstract | This paper presents a novel top-down head-driven parsing algorithm for data-driven projective dependency analysis. |
Abstract | Experiments on the English Penn Treebank data and the Chinese CoNLL-06 data show that the proposed algorithm achieves comparable results with other data-driven dependency parsing algorithms.
Conclusion | This paper presents a novel head-driven parsing algorithm and empirically shows that it is as practical as other dependency parsing algorithms.
Experiments | We compared the top-down parsing algorithm with other data-driven parsing algorithms in Table 1. The top-down parser achieved unlabeled accuracy comparable with the others, and outperformed them on the sentence complete rate.
Introduction | Transition-based parsing algorithms, such as shift-reduce algorithms (Nivre, 2004; Zhang and Clark, 2008), are widely used for dependency analysis because of their efficiency and comparatively good performance.
Introduction | This work presents an O(n^2) top-down head-driven transition-based parsing algorithm which can parse complex structures that are not trivial for shift-reduce parsers.
Introduction | Experimental results show that the proposed top-down parser achieves competitive results with other data-driven parsing algorithms.
Related Work | Eisner and Satta (1999) showed that there is a cubic-time parsing algorithm on the formalism of the head automaton grammars, which are equivalently converted into split-head bilexical context-free grammars (SBCFGs) (McAllester, 1999; Johnson, 2007). |
Related Work | Although our proposed algorithm does not employ the formalism of SBCFGs, it creates left children before right children, implying that, like parsing algorithms on the SBCFGs, it has no spurious ambiguities.
Related Work | The head-corner parsing algorithm (Kay, 1989) creates a dependency tree top-down, and in this respect our algorithm is similar in spirit to it.
Abstract | Our parser builds a discourse tree by applying an optimal parsing algorithm to probabilities inferred from two Conditional Random Fields: one for intra-sentential parsing and the other for multi-sentential parsing. |
Conclusion | In this paper, we have presented a novel discourse parser that applies an optimal parsing algorithm to probabilities inferred from two CRF models: one for intra-sentential parsing and the other for multi-sentential parsing. |
Introduction | Second, existing parsers apply greedy and suboptimal parsing algorithms to build the DT for a document. |
Our Discourse Parsing Framework | Both of our parsers have the same two components: a parsing model assigns a probability to every possible DT, and a parsing algorithm identifies the most probable DT among the candidate DTs in that scenario. |
Our Discourse Parsing Framework | While the two models are rather different, the same parsing algorithm is shared by the two modules. |
Our Discourse Parsing Framework | Before describing our parsing models and the parsing algorithm, we introduce some terminology that we will use throughout the paper.
Parsing Models and Parsing Algorithm | Once we obtain the probability of all possible DT constituents, the discourse sub-trees for the sentences are built by applying an optimal probabilistic parsing algorithm (Section 4.4) using one of the methods described in Section 5. |
Parsing Models and Parsing Algorithm | 4.4 Parsing Algorithm |
Parsing Models and Parsing Algorithm | Given the probability of all possible DT constituents in the intra-sentential and multi-sentential scenarios, the job of the parsing algorithm is to find the most probable DT for that scenario. |
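A minimal sketch of such a search over precomputed constituent probabilities might look as follows; the span probabilities, the three-EDU example, and the restriction to unlabeled binary constituents are all simplifying assumptions for illustration, not the paper's actual model:

```python
def best_tree_score(n_edus, prob):
    """Score of the most probable binary discourse tree over EDUs 0..n_edus-1.

    prob maps a span (i, j) (EDUs i..j-1 grouped into one constituent) to its
    probability; single-EDU leaves have probability 1, and a tree's score is
    assumed to be the product of its constituents' probabilities.  The DP fills
    best[(i, j)] bottom-up over span widths, CKY-style.
    """
    best = {(i, i + 1): 1.0 for i in range(n_edus)}
    for width in range(2, n_edus + 1):
        for i in range(n_edus - width + 1):
            j = i + width
            best[(i, j)] = prob.get((i, j), 0.0) * max(
                best[(i, k)] * best[(k, j)] for k in range(i + 1, j)
            )
    return best[(0, n_edus)]

# Invented constituent probabilities for three EDUs (spans are half-open [i, j)).
probs = {(0, 2): 0.8, (1, 3): 0.3, (0, 3): 0.9}
best = best_tree_score(3, probs)  # groups EDUs 0-1 first: 0.9 * 0.8 = 0.72
```

Keeping back-pointers at each max would recover the tree itself rather than just its score.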
Bottom-up tree-building | Therefore, our model incorporates the strengths of both HILDA and Joty et al.’s model, i.e., the efficiency of a greedy parsing algorithm, and the ability to incorporate sequential information with CRFs.
Introduction | However, as an optimal discourse parser, Joty et al.’s model is highly inefficient in practice, with respect to both their DCRF-based local classifiers, and their CKY-like bottom-up parsing algorithm.
Introduction | CKY parsing is a bottom-up parsing algorithm which searches all possible parsing paths by dynamic programming. |
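For reference, a CKY recognizer for a grammar in Chomsky normal form can be sketched in a few lines; the toy grammar below is invented for the example:

```python
def cky_recognize(words, lex, bins, start="S"):
    """CKY recognition for a CNF grammar: bottom-up dynamic programming
    over all spans, O(n^3 * |grammar|) time.

    lex:  set of (A, word) lexical rules A -> word.
    bins: set of (A, B, C) binary rules A -> B C.
    """
    n = len(words)
    chart = [[set() for _ in range(n + 1)] for _ in range(n + 1)]
    for i, w in enumerate(words):
        chart[i][i + 1] = {A for (A, word) in lex if word == w}
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            j = i + width
            for k in range(i + 1, j):            # free split point
                for (A, B, C) in bins:
                    if B in chart[i][k] and C in chart[k][j]:
                        chart[i][j].add(A)
    return start in chart[0][n]

# Invented toy grammar: S -> A B, A -> 'a', B -> 'b'.
LEX = {("A", "a"), ("B", "b")}
BIN = {("S", "A", "B")}
```

Replacing the sets with max-probability tables turns this recognizer into the optimal (Viterbi) parser that the discourse work builds on.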
Introduction | As a result of successfully avoiding the expensive non-greedy parsing algorithms, our discourse parser is very efficient in practice.
Related work | Their model assigns a probability to each possible constituent, and a CKY-like parsing algorithm finds the globally optimal discourse tree, given the computed probabilities. |
Related work | The inefficiency lies in both their DCRF-based joint model, on which inference is usually slow, and their CKY-like parsing algorithm, whose issue is more prominent.
Results and Discussion | gSVMFH in the second row is our implementation of HILDA’s greedy parsing algorithm using Feng and Hirst (2012)’s enhanced feature set. |
Results and Discussion | First, as shown in Table 2, the average number of sentences in a document is 26.11, which is already too large for optimal parsing models, e.g., the CKY-like parsing algorithm in jCRF, let alone the fact that the largest document contains several hundred EDUs and sentences.
Abstract | We define a new formalism, based on Sikkel’s parsing schemata for constituency parsers, that can be used to describe, analyze and compare dependency parsing algorithms.
Dependency parsing schemata | Some of the most popular dependency parsing algorithms, like that of Eisner (1996), work by connecting spans which can represent disconnected dependency graphs.
Dependency parsing schemata | 3 Some practical examples 3.1 Col96 (Collins, 96) One of the most straightforward projective dependency parsing strategies is the one described by Collins (1996), directly based on the CYK parsing algorithm.
Dependency parsing schemata | We have defined a variant of Sikkel’s parsing schemata formalism which allows us to represent dependency parsing algorithms in a simple, declarative way.
Introduction | The formalism of parsing schemata (Sikkel, 1997) is a useful tool for the study of constituency parsers since it provides formal, high-level descriptions of parsing algorithms that can be used to prove their formal properties (such as correctness), establish relations between them, derive new parsers from existing ones and obtain efficient implementations automatically (Gómez-Rodríguez et al., 2007).
Abstract | We introduce a shift-reduce parsing algorithm for phrase-based string-to-dependency translation. |
Introduction | While parsing algorithms can be used to parse partial translations in phrase-based decoding, the search space is significantly enlarged since there are exponentially many parse trees for exponentially many translations. |
Introduction | In this paper, we propose a shift-reduce parsing algorithm for phrase-based string-to-dependency translation. |
Introduction | 4. exploiting syntactic information: as the shift-reduce parsing algorithm generates target language dependency trees in decoding, dependency language models (Shen et al., 2008; Shen et al., 2010) can be used to encourage linguistically-motivated reordering. |
Conclusion | The parsing algorithms described in this paper are implemented in the Bolinas toolkit.
Parsing | Just as CKY deals with substrings [i, j] of the input, the HRG parsing algorithm deals with edge-induced subgraphs I of the input.
Parsing | This result, in turn, will allow us to bound the complexity of the parsing algorithm in terms of the treewidth of G. |
Parsing | We present the parsing algorithm as a deductive system (Shieber et al., 1995). |
Synchronous Parsing | In the synchronous parsing algorithm, passive items have the form [A, I, X, I′, X′] and active items have the form [A → R : R′, n, I, φ, I′, φ′].
Synchronous Parsing | The complexity of the parsing algorithm is again in O((3^d n)^(k+1)), where k is now the maximum treewidth of the dependency graph as defined in this section.
Effective PSCFG parsing | The recognition algorithm above can easily be turned into a parsing algorithm by letting an implementation keep track of which items were derived from which other items, as instantiations of the consequent and the antecedents, respectively, of the inference rule in figure 1. |
Effective PSCFG parsing | A probabilistic parsing algorithm that computes p_G([w1, w2]), defined in (1), can also be obtained from the recognition algorithm above, by associating each item with a probability.
Introduction | Computation of inside probabilities for PSCFGs is a well-known problem that can be solved using off-the-shelf algorithms that extend basic parsing algorithms . |
Introduction | This contrasts with the techniques proposed by Jelinek and Lafferty (1991) and by Stolcke (1995), which are extensions of parsing algorithms for probabilistic context-free grammars, and require considerably more involved proofs of correctness.
Prefix probabilities | Computing p_prefix([w1, w2]) directly using a generic probabilistic parsing algorithm for PSCFGs is difficult, due to the presence of epsilon rules and unit rules.
Prefix probabilities | apply generic probabilistic parsing algorithms for PSCFGs, such as the inside algorithm discussed in section 3, in order to compute the desired prefix probabilities for the source PSCFG Q. |
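For the plain (non-synchronous) probabilistic CFG case, the inside algorithm referred to here can be sketched as follows; the CNF restriction and the one-nonterminal toy grammar are simplifying assumptions, and the epsilon- and unit-rule complications discussed in the text are deliberately absent:

```python
from collections import defaultdict

def inside(words, lex_rules, bin_rules, start="S"):
    """Inside algorithm for a CNF PCFG: sum of the probabilities of all
    parses of `words`, computed bottom-up over spans, CKY-style.

    lex_rules: {(A, word): prob} for rules A -> word.
    bin_rules: {(A, B, C): prob} for rules A -> B C.
    """
    n = len(words)
    beta = defaultdict(float)  # (i, j, A) -> inside probability of words[i:j]
    for i, w in enumerate(words):
        for (A, word), p in lex_rules.items():
            if word == w:
                beta[(i, i + 1, A)] += p
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            j = i + width
            for (A, B, C), p in bin_rules.items():
                for k in range(i + 1, j):       # sum, not max: all derivations
                    beta[(i, j, A)] += p * beta[(i, k, B)] * beta[(k, j, C)]
    return beta[(0, n, start)]

# Invented toy grammar: S -> S S (0.4) | 'a' (0.6).
lex = {("S", "a"): 0.6}
bins = {("S", "S", "S"): 0.4}
```

Swapping the inner sum for a max turns this into the Viterbi parser; keeping the sum is what makes it compute string probabilities.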
Active Learning for Parsing | If we sample smaller units rather than sentences, we have partially annotated sentences and have to use a parsing algorithm that can be trained from incompletely annotated sentences. |
Experimental Evaluation and Discussion | Applicability to Other Languages and Other Parsing Algorithms We discuss here whether or not the proposed methods and the experiments are useful for other languages and other parsing algorithms.
Experimental Evaluation and Discussion | Although no one has reported application of (Sassano, 2004) to the languages so far, we believe that similar parsing algorithms will be applicable to them and the discussion in this study would be useful. |
Experimental Evaluation and Discussion | Even though the use of syntactic constraints is limited, smaller constituents will still be useful for other parsing algorithms that use some deterministic methods with machine learning-based classifiers. |
Introduction | Section 3 describes the syntactic characteristics of Japanese and the parsing algorithm that we use. |
Introduction | In order to obtain a polynomial parsing algorithm , we need to avoid this effect. |
Parsing algorithm | The algorithm can be easily converted into a parsing algorithm . |
Parsing algorithm | The basic idea is to use a parsing algorithm for TAG, and impose on-the-fly additional restrictions on the underlying derivation trees that are being constructed, in order to fulfill the definition of valid TT-MCTAG derivation. |
TT-MCTAG 3.1 Introduction to TT-MCTAG | is crucial for the polynomial parsing algorithm . |
TT-MCTAG 3.1 Introduction to TT-MCTAG | For our parsing algorithm , we want to avoid grouping the instances of elementary trees in a derivation tree into tuple instances. |
Introduction | For SCFG, Satta and Peserico (2005) showed that the exponent in the time complexity of parsing algorithms must grow at least as fast as the square root of the rule rank, and Gildea and Štefankovič (2007) tightened this bound to be linear in the rank.
Introduction | We provide what is to our knowledge the first NP-hardness result for a grammar factorization problem, which we hope will aid in understanding parsing algorithms in general. |
LCFRSs and parsing complexity | Existing parsing algorithms for LCFRSs exploit dynamic programming. |
LCFRSs and parsing complexity | The maximum fanout f realized by a parsing strategy determines the space complexity of the parsing algorithm.
Conclusion | In this paper, we have presented an efficient algorithm for deciding whether a dependency graph is 2-planar and a transition-based parsing algorithm that is provably correct for 2-planar dependency forests, neither of which existed in the literature before. |
Introduction | Although these proposals seem to have a very good fit with linguistic data, in the sense that they often cover 99% or more of the structures found in existing treebanks, the development of efficient parsing algorithms for these classes has met with more limited success. |
Introduction | This was originally proposed by Yli-Jyrä (2003) but has so far played a marginal role in the dependency parsing literature, because no algorithm was known for determining whether an arbitrary tree was m-planar, and no parsing algorithm existed for any constant value of m. The contribution of this paper is twofold.
Introduction | Secondly, we present a transition-based parsing algorithm for 2-planar dependency trees, developed in two steps. |
Bilingual subtree constraints | Due to the limitations of the parsing algorithm (McDonald and Pereira, 2006; Carreras, 2007), we only use bigram- and trigram-subtrees in our approach.
Dependency parsing | (2006), which is an extension of the projective parsing algorithm of Eisner (1996). |
Dependency parsing | To use richer second-order information, we also implement parent-child-grandchild features (Carreras, 2007) in the MST parsing algorithm.
Dependency parsing | The parsing algorithm chooses the tree with the highest score in a bottom-up fashion. |
Introduction | may sometimes appear too computationally expensive for high-end statistical machine translation, there are many alternative parsing algorithms that have seldom been explored in the machine translation literature.
Introduction | (2005b) present a quadratic-time dependency parsing algorithm that is just 0.7% less accurate than “full-fledged” chart parsing (which, in the case of dependency parsing, runs in time O(n^3) (Eisner, 1996)).
Introduction | A dependency structure is built as a byproduct of phrase-based decoding, without reliance on a dynamic-programming or chart parsing algorithm such as CKY or Earley.
Machine translation experiments | model score computed with the dependency parsing algorithm described in Section 2. |
Conclusion | We use an inside/outside parsing algorithm to get the Viterbi outside cost of bigram integrated states, which is used as an outside estimate for trigram integrated states.
Decoding to Maximize BLEU | Our algorithm is another dynamic programming decoding pass on the trigram forest, and is similar to the parsing algorithm for maximizing expected labelled recall presented by Goodman (1996). |
Multi-pass LM-Integrated Decoding | Charniak and Johnson (2005) use a PCFG to do a pass of inside-outside parsing to reduce the state space of a subsequent lexicalized n-best parsing algorithm to produce parses that are further re-ranked by a MaxEnt model. |
Multi-pass LM-Integrated Decoding | Klein and Manning (2003) describe an A* parsing framework for monolingual parsing and admissible outside estimates that are computed using the inside/outside parsing algorithm on simplified PCFGs compared to the original PCFG.
Experimental Framework | Table 1: LAS results with several parsing algorithms, Penn2Malt conversion (†: p < 0.05, ‡: p < 0.005).
Experimental Framework | Table 2: LAS results with several parsing algorithms in the LTH conversion (†: p < 0.05, ‡: p < 0.005).
Results | We can also conclude that automatically acquired clusters are especially effective with the MST parser in both treebank conversions, which suggests that the type of semantic information has a direct relation to the parsing algorithm.
Bk ← BESTS(X, Bk-1, W) | The parsing algorithm is greedy, exploring only a tiny fraction of the search space.
Easy-first dependency parsing | The easy-first dependency parsing algorithm (Goldberg and Elhadad, 2010) builds a dependency tree by performing two types of actions, LEFT(i) and RIGHT(i), on a list of subtree structures p1, …
Introduction | The easy-first dependency parsing algorithm (Goldberg and Elhadad, 2010) is attractive due to its good accuracy, fast speed and simplicity. |
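A sketch of the easy-first loop might look like this, under our reading of LEFT(i)/RIGHT(i) as attaching one of two adjacent pending structures to the other; the oracle-style scorer in the example is an invented stand-in for the trained model:

```python
def easy_first(n_words, score):
    """Easy-first parsing sketch: repeatedly take the highest-scoring
    attachment between adjacent pending structures until one root remains.

    score(action, head, dep) returns a number; in a real parser this is a
    trained classifier over rich features of the local neighbourhood.  Each
    of the n-1 iterations adds one arc, so n-1 actions build a full tree.
    """
    pending = list(range(n_words))   # roots of the partial structures
    arcs = set()                     # (head, dependent) pairs
    while len(pending) > 1:
        cands = []                   # (score, index to remove, head, dep)
        for i in range(len(pending) - 1):
            l, r = pending[i], pending[i + 1]
            cands.append((score("LEFT", r, l), i, r, l))       # p_i under p_i+1
            cands.append((score("RIGHT", l, r), i + 1, l, r))  # p_i+1 under p_i
        _, drop, hd, dep = max(cands)
        arcs.add((hd, dep))
        pending.pop(drop)            # the new dependent leaves the list
    return arcs

# Invented oracle scorer: reward arcs that appear in a hypothetical gold tree.
GOLD = {(0, 1), (0, 2)}
example_arcs = easy_first(3, lambda act, hd, dep: 1 if (hd, dep) in GOLD else 0)
```

The "easy-first" character comes from the global max over all pending positions: confident attachments anywhere in the sentence are made before uncertain ones, rather than proceeding strictly left to right.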
Abstract | Although finding the “minimal” latent tree is NP-hard in general, for the case of projective trees we find that it can be found using bilexical parsing algorithms.
Abstract | While this criterion is in general NP-hard (Desper and Gascuel, 2005), for projective trees we find that a bilexical parsing algorithm can be used to find an exact solution efficiently (Eisner and Satta, 1999). |
Abstract | However, if we restrict u to be in L1, as we do in the above, then maximizing over L1 can be solved using the bilexical parsing algorithm from Eisner and Satta (1999).
Decoding | The algorithm is an extension of the parsing algorithm of Eisner (1996), which is a modified version of the CKY chart parsing algorithm.
Decoding | The parsing algorithm independently parses the left and right dependents of a word and combines them later. |
Decoding | Because the parsing algorithm is in the bottom-up style, the nearer children are generated earlier than the farther ones of the same head. |
Abstract | This results in an asymptotic running time improvement for known parsing algorithms for this class.
Conclusion | Given the fact that fanout 1 bundles can be attached to any adjacent bundle in our factorization, we can show that our algorithm also optimizes time complexity for known tabular parsing algorithms for LCFRSs with fanout 2.
Introduction | From a theoretical perspective, the parsing problem for LCFRSs with f = 2 is NP-complete (Satta, 1992), and in known parsing algorithms the running time is exponentially affected by the rank r of the grammar.
Related Works | Both the graph-based (McDonald et al., 2005a; McDonald and Pereira, 2006; Carreras et al., 2006) and the transition-based (Yamada and Matsumoto, 2003; Nivre et al., 2006) parsing algorithms are related to our word-pair classification model. |
Word-Pair Classification Model | 2.3 Parsing Algorithm |
Word-Pair Classification Model | Algorithm 1 Dependency Parsing Algorithm.
Abstract | Given a particular grammar, the polynomial degree of efficient STAG parsing algorithms depends directly on the rank of the grammar: the maximum number of correspondences that appear within a single elementary structure. |
Introduction | Given a particular grammar, the polynomial degree of efficient STAG parsing algorithms depends directly on the rank of the grammar: the maximum number of correspondences that appear within a single elementary structure. |
Introduction | This is a critical minimization because k is the feature of the grammar that appears in the exponent of the complexity of parsing algorithms for STAG. |