Abstract | In this paper, we propose a novel method for semi-supervised learning of non-projective log-linear dependency parsers using directly expressed linguistic prior knowledge (e.g.
Generalized Expectation Criteria | In the following sections we apply GE to non-projective CRF dependency parsing.
Generalized Expectation Criteria | We first consider an arbitrarily structured conditional random field (Lafferty et al., 2001) p_θ(y|x). We describe the CRF for non-projective dependency parsing in Section 3.2.
Generalized Expectation Criteria | We now define a CRF p_θ(y|x) for unlabeled, non-projective dependency parsing.
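Generalized Expectation Criteria | To make the notation concrete, here is a minimal sketch of the log-linear form and of a GE term; the generic feature functions f_k, the constraint function G, and the target expectation G̃ are placeholders standing in for the paper's exact definitions:

```latex
% Log-linear (CRF) form:
p_\theta(y \mid x) = \frac{1}{Z_\theta(x)} \exp\Big( \sum_k \theta_k f_k(x, y) \Big),
\qquad
Z_\theta(x) = \sum_{y'} \exp\Big( \sum_k \theta_k f_k(x, y') \Big)

% A GE term penalizes the divergence between a target expectation \tilde{G}
% (specified by prior knowledge) and the model expectation over unlabeled data:
\mathcal{O}_{\mathrm{GE}}(\theta)
  = -\,\Delta\Big( \tilde{G},\; \mathbb{E}_{x}\, \mathbb{E}_{p_\theta(y \mid x)}\big[ G(x, y) \big] \Big)
```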
Introduction | This paper proposes a method for directly guiding the learning of dependency parsers with naturally encoded linguistic insights. |
Introduction | While a complete exploration of linguistic prior knowledge for dependency parsing is beyond the scope of this paper, we provide several promising demonstrations of the proposed method. |
Linguistic Prior Knowledge | This type of constraint was used in the development of a rule-based dependency parser (Debusmann et al., 2004).
Related Work | Smith and Eisner (2007) apply entropy regularization to dependency parsing.
Related Work | There are also a number of methods for unsupervised learning of dependency parsers.
Related Work | Klein and Manning (2004) use a carefully initialized and structured generative model (DMV) in conjunction with the EM algorithm to get the first positive results on unsupervised dependency parsing.
Abstract | McDonald et al. (2005b) formalized dependency parsing as a maximum spanning tree (MST) problem, which can be solved in quadratic time relative to the length of the sentence.
Abstract | They show that MST parsing is almost as accurate as cubic-time dependency parsing in the case of English, and that it is more accurate for free word-order languages.
Dependency parsing for machine translation | In this section, we review dependency parsing formulated as a maximum spanning tree problem (McDonald et al., 2005b), which can be solved in quadratic time, and then present its adaptation and novel application to phrase-based decoding. |
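Dependency parsing for machine translation | To illustrate the MST formulation concretely (a sketch under assumptions, not the authors' implementation), the snippet below scores every candidate head-modifier arc with a placeholder scorer and extracts the highest-scoring arborescence rooted at an artificial ROOT node; networkx's general Edmonds implementation stands in for the O(n²) dense-graph variant assumed in the paper.

```python
# Sketch of MST dependency parsing (in the style of McDonald et al., 2005b):
# score all directed arcs, then find the maximum spanning arborescence.
import random
import networkx as nx

def arc_score(head, mod):
    # Placeholder: a trained arc-factored model would compute w . f(head, mod, x).
    return random.Random(hash((head, mod))).random()

def mst_parse(words):
    """Return {modifier_index: head_index}; index 0 is the artificial ROOT."""
    n = len(words)
    G = nx.DiGraph()
    for h in range(n + 1):
        for m in range(1, n + 1):  # no arcs into ROOT
            if h != m:
                G.add_edge(h, m, weight=arc_score(h, m))
    # Edmonds' algorithm; every arborescence of G is rooted at 0
    # because ROOT has no incoming arcs.
    arb = nx.maximum_spanning_arborescence(G)
    return {m: h for h, m in arb.edges()}

print(mst_parse("John saw Mary".split()))
```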
Dependency parsing for machine translation | In the case of dependency parsing for Czech, the non-projective parser of McDonald et al. (2005b) even outperforms projective parsing, and was one of the top systems in the CoNLL-06 shared task on multilingual dependency parsing.
Dependency parsing for machine translation | 2.1 O(n²)-time dependency parsing for MT
Introduction | While deterministic parsers are often deemed inadequate for dealing with ambiguities of natural language, highly accurate O(n²) algorithms exist in the case of dependency parsing.
Introduction | McDonald et al. (2005b) present a quadratic-time dependency parsing algorithm that is just 0.7% less accurate than “full-fledged” chart parsing (which, in the case of dependency parsing, runs in O(n³) time (Eisner, 1996)).
Introduction | Most interestingly, the time complexity of non-projective dependency parsing remains quadratic as the order of the language model increases. |
Abstract | We formulate the problem of non-projective dependency parsing as a polynomial-sized integer linear program. |
Dependency Parsing | Let us first describe formally the set of legal dependency parse trees. |
Dependency Parsing | We define the set of legal dependency parse trees of x (denoted Y(x)) as the set of 0-arborescences of D, i.e., we admit each arborescence as a potential dependency tree.
Dependency Parsing | In this paper, we consider unlabeled dependency parsing, where only the backbone structure (i.e., the arcs without the labels depicted in Fig.
Dependency Parsing as an ILP | Riedel and Clarke (2006) proposed an ILP formulation for dependency parsing which refines the arc-factored model by imposing linguistically motivated “hard” constraints that forbid some arc configurations. |
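Dependency Parsing as an ILP | For concreteness, one standard polynomial-sized formulation of the arborescence constraint uses single-commodity flow; this is a generic sketch (arc scores s_a, arc indicators z_a, flow variables φ_a over arc set A), not necessarily the paper's exact program:

```latex
\max_{z,\,\phi} \;\; \sum_{a \in A} s_a z_a
\quad \text{s.t.} \quad
\sum_{a \in \delta^-(j)} z_a = 1 \;\; \forall j \neq 0
\qquad \text{(each word has exactly one head)}

\sum_{a \in \delta^-(j)} \phi_a - \sum_{a \in \delta^+(j)} \phi_a = 1 \;\; \forall j \neq 0
\qquad \text{(each word consumes one unit of flow from the root)}

0 \le \phi_a \le n\, z_a, \quad z_a \in \{0, 1\} \;\; \forall a \in A
\qquad \text{(flow only on selected arcs)}
```

The flow constraints force every word to be connected to the root 0, so the feasible z are exactly the 0-arborescences; with O(n²) arcs this yields O(n²) variables and constraints, hence polynomial size.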
Introduction | Much attention has recently been devoted to integer linear programming (ILP) formulations of NLP problems, with interesting results in applications like semantic role labeling (Roth and Yih, 2005; Punyakanok et al., 2004), dependency parsing (Riedel and Clarke, 2006), word alignment for machine translation (Lacoste-Julien et al., 2006), summarization (Clarke and Lapata, 2008), and coreference resolution (Denis and Baldridge, 2007), among others. |
Introduction | Riedel and Clarke (2006) cast dependency parsing as an ILP, but efficient formulations remain an open problem.
Abstract | We present a novel transition system for dependency parsing, which constructs arcs only between adjacent words but can parse arbitrary non-projective trees by swapping the order of words in the input.
Background Notions 2.1 Dependency Graphs and Trees | Following Nivre (2008a), we define a transition system for dependency parsing as a quadruple S = (C, T, c_s, C_t), where
Background Notions 2.1 Dependency Graphs and Trees | Figure 2: Transitions for dependency parsing.
Introduction | However, one problem that still has not found a satisfactory solution in data-driven dependency parsing is the treatment of discontinuous syntactic constructions, usually modeled by non-projective dependency trees.
Introduction | Current approaches to data-driven dependency parsing typically use one of two strategies to deal with non-projective trees (unless they ignore them completely). |
Introduction | In Section 2, we define the formal representations needed and introduce the framework of transition-based dependency parsing.
Transitions for Dependency Parsing | Having defined the set of configurations, including initial and terminal configurations, we will now focus on the transition set T required for dependency parsing.
Transitions for Dependency Parsing | 3.1 Projective Dependency Parsing |
Transitions for Dependency Parsing | The minimal transition set T_p for projective dependency parsing contains three transitions:
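Transitions for Dependency Parsing | A minimal sketch of such a three-transition projective system follows (labels and preconditions omitted; the hard-coded transition sequence stands in for a trained classifier or oracle):

```python
# Sketch of a projective transition system with three transitions:
# configurations are (stack, buffer, arcs); LEFT-ARC and RIGHT-ARC
# attach the two topmost stack items, SHIFT moves a word from the
# buffer to the stack.

def shift(stack, buffer, arcs):
    return stack + [buffer[0]], buffer[1:], arcs

def left_arc(stack, buffer, arcs):
    # the top of the stack becomes the head of the word beneath it
    i, j = stack[-2], stack[-1]
    return stack[:-2] + [j], buffer, arcs | {(j, i)}

def right_arc(stack, buffer, arcs):
    # the word beneath the top becomes the head of the top
    i, j = stack[-2], stack[-1]
    return stack[:-2] + [i], buffer, arcs | {(i, j)}

# Example: derive arcs for "John saw Mary" (token 0 = artificial root)
# with a fixed transition sequence.
config = ([0], [1, 2, 3], set())
for t in (shift, shift, left_arc, shift, right_arc, right_arc):
    config = t(*config)
print(config[2])  # arcs (2,1), (2,3), (0,2): "saw" heads the sentence
```

The non-projective extension described in this paper adds a SWAP transition that reorders the two topmost stack items; it is omitted here for brevity.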
Abstract | The dependency backbone of an HPSG analysis is used to provide general linguistic insights which, when combined with state-of-the-art statistical dependency parsing models, achieve performance improvements on out-of-domain tests.
Dependency Parsing with HPSG | In this section, we explore two possible applications of HPSG parsing to the syntactic dependency parsing task.
Dependency Parsing with HPSG | One is to extract the dependency backbone from the HPSG analyses of the sentences and directly convert it into the target representation; the other is to encode the HPSG outputs as additional features in the existing statistical dependency parsing models.
Dependency Parsing with HPSG | Besides directly using the dependency backbone of the HPSG output, we could also use it for building feature-based models of statistical dependency parsers.
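Dependency Parsing with HPSG | The feature-based route might look like the following sketch; the hpsg_backbone map and the feature names are hypothetical placeholders, not the paper's actual feature templates:

```python
# Sketch: add HPSG-backbone agreement features to an arc-factored model.
# `hpsg_backbone` maps each modifier index to its head index as extracted
# from the deep HPSG analysis (hypothetical helper, assumed given).

def arc_features(head, mod, words, hpsg_backbone):
    feats = {
        f"head_word={words[head]}": 1.0,
        f"mod_word={words[mod]}": 1.0,
        f"distance={head - mod}": 1.0,
    }
    # Backbone feature: does the HPSG analysis propose the same arc?
    agrees = hpsg_backbone.get(mod) == head
    feats[f"hpsg_backbone_agrees={agrees}"] = 1.0
    return feats
```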
Introduction | Syntactic dependency parsing has been attracting increasing research attention in recent years, partly due to its theory-neutral representation, but also thanks to its wide deployment in various NLP tasks (machine translation, textual entailment recognition, question answering, information extraction, etc.).
Introduction | In combination with machine learning methods, several statistical dependency parsing models have reached comparably high parsing accuracy (McDonald et al., 2005b; Nivre et al., 2007b).
Introduction | In the meantime, the successful series of CoNLL Shared Tasks since 2006 (Buchholz and Marsi, 2006; Nivre et al., 2007a; Surdeanu et al., 2008) has shown how easy it has become to train a statistical syntactic dependency parser, provided that an annotated treebank is available.
Parser Domain Adaptation | In recent years, two statistical dependency parsing systems, MaltParser (Nivre et al., 2007b) and MSTParser (McDonald et al., 2005b), representing different threads of research in data-driven machine learning approaches, have gained considerable attention for their state-of-the-art performance in open competitions such as the CoNLL Shared Tasks.
Parser Domain Adaptation | In addition, most previous work has focused on constituent-based parsing, while domain adaptation for dependency parsing has not been fully explored.
Parser Domain Adaptation | Figure 1: Different dependency parsing models and their combinations. |
Abstract | This paper proposes an approach to enhance dependency parsing in a language by using a translated treebank from another language. |
Introduction | Although supervised learning methods yield state-of-the-art results for inducing dependency parsers (McDonald et al., 2005; Hall et al., 2007), these methods often require a sufficiently large data set to reach a given parsing accuracy.
Introduction | Since different human languages and treebanks share common properties, dependency parsing in multiple languages can be made mutually beneficial.
Introduction | In this paper, we study how to improve dependency parsing by using (automatically) translated texts together with their transformed dependency information.
The Related Work | Even if the translation outputs are not as good as expected, a dependency parser for the
The Related Work | However, although it is not essentially different, we focus only on dependency parsing itself, while the parsing scheme in (Burkett and Klein, 2008) is based on a constituent representation.
Treebank Translation and Dependency Transformation | how a translated English treebank enhances a Chinese dependency parser.
Automatic Annotation Adaptation | This coincides with the stacking method for combining dependency parsers (Martins et al., 2008; Nivre and McDonald, 2008).
Automatic Annotation Adaptation | This is similar to feature design in discriminative dependency parsing (McDonald et al., 2005; Mc- |
Experiments | For example, currently, most Chinese constituency and dependency parsers are trained on some version of CTB, using its segmentation and POS tagging as the de facto standards.
Related Works | Co-training (Sarkar, 2001) and classifier combination (Nivre and McDonald, 2008) are two techniques for training improved dependency parsers.
Related Works | Classifier combination lets graph-based and transition-based dependency parsers utilize features extracted from each other's parsing results to obtain combined, enhanced parsers.
Approach | A parallel corpus is word-aligned using an alignment toolkit (Graça et al., 2009), and the source (English) side is parsed using a dependency parser (McDonald et al., 2005).
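Approach | A hedged sketch of the projection step this implies; the one-to-one alignment dictionary and the function name are simplifications introduced for illustration (real alignments are many-to-many and require conflict handling):

```python
# Sketch: project source-side dependency arcs to the target side through
# word alignments. `alignments` maps source indices to target indices
# (simplified to one-to-one); `src_heads` maps modifier -> head (0 = ROOT).

def project_dependencies(src_heads, alignments):
    tgt_heads = {}
    for mod, head in src_heads.items():
        if mod in alignments and (head == 0 or head in alignments):
            tgt_mod = alignments[mod]
            tgt_head = 0 if head == 0 else alignments[head]
            tgt_heads[tgt_mod] = tgt_head
    return tgt_heads  # partial tree: unaligned words receive no head

# e.g. project the arcs of "John saw Mary" through an identity alignment
print(project_dependencies({1: 2, 2: 0, 3: 2}, {1: 1, 2: 2, 3: 3}))
```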
Conclusion | In this paper, we proposed a novel and effective learning scheme for transferring dependency parses across bitext. |
Introduction | Recently, dependency parsing has gained popularity as a simpler, computationally more efficient alternative to constituency parsing and has spurred several supervised learning approaches (Eisner, 1996; Yamada and Matsumoto, 2003a; Nivre and Nilsson, 2005; McDonald et al., 2005) as well as unsupervised induction (Klein and Manning, 2004; Smith and Eisner, 2006). |
Related Work | In this volume, Druck et al. (2009) use this framework to train a dependency parser based on constraints stated as corpus-wide expected values of linguistic rules.
Analysis of reference compressions | In addition, sentence compression methods that depend strongly on syntactic parsers have two problems: ‘parse error’ and ‘decoding speed.’ 44% of the sentences output by a state-of-the-art Japanese dependency parser contain at least one error (Kudo and Matsumoto, 2005).
Conclusions | We showed that our method is about 4.3 times faster than Hori's method, which employs a dependency parser.
Experimental Evaluation | version of Hori's method which does not require the dependency parser.
Results and Discussion | Our method was about 4.3 times faster than Hori's method due to the latter's use of a dependency parser.
Features | In order to generate these features, we parse each sentence with the broad-coverage dependency parser MINIPAR (Lin, 1998).
Features | A dependency parse consists of a set of words and chunks (e.g. |
Implementation | Each sentence of this unstructured text is dependency-parsed by MINIPAR to produce a dependency graph.
Implementation | This chunking is restricted by the dependency parse of the sentence, however, in that chunks must be contiguous in the parse (i.e., no chunks across subtrees). |
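Implementation | The contiguity restriction can be checked directly on the parse: a candidate chunk is legal only if its words form a connected subgraph of the dependency tree. A minimal sketch, assuming a modifier-to-head map extracted from the MINIPAR graph (the helper name is ours):

```python
# Sketch: a chunk is "contiguous in the parse" if its words form a
# connected subgraph of the dependency tree, i.e. climbing from every
# chunk word without leaving the chunk reaches one common node.

def chunk_is_contiguous(chunk, heads):
    """chunk: set of token indices; heads: modifier -> head map (0 = ROOT)."""
    def highest_in_chunk(tok):
        # climb towards the root while staying inside the chunk
        while heads.get(tok) in chunk:
            tok = heads[tok]
        return tok
    tops = {highest_in_chunk(tok) for tok in chunk}
    return len(tops) == 1  # a single internal root means a connected subtree

heads = {1: 2, 2: 0, 3: 2, 4: 3}           # "John saw the movie"
print(chunk_is_contiguous({3, 4}, heads))  # True: "the movie" is one subtree
print(chunk_is_contiguous({1, 4}, heads))  # False: crosses subtrees
```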