Abstract | We propose the first joint model for word segmentation, POS tagging, and dependency parsing for Chinese. |
Abstract | Based on an extension of the incremental joint model for POS tagging and dependency parsing (Hatori et al., 2011), we propose an efficient character-based decoding method that can combine features from state-of-the-art segmentation, POS tagging, and dependency parsing models. |
Abstract | In experiments using the Chinese Treebank (CTB), we show that the accuracies of the three tasks can be improved significantly over the baseline models, particularly by 0.6% for POS tagging and 2.4% for dependency parsing.
Introduction | In addition, some researchers recently proposed a joint approach to Chinese POS tagging and dependency parsing (Li et al., 2011; Hatori et al., 2011); particularly, Hatori et al. |
Introduction | In this context, it is natural to consider further a question regarding the joint framework: how strongly do the tasks of word segmentation and dependency parsing interact? |
Introduction | Based on these observations, we aim at building a joint model that simultaneously processes word segmentation, POS tagging, and dependency parsing, trying to capture global interaction among
Related Works | Therefore, we place no restriction on the segmentation possibilities to consider, and we assess the full potential of the joint segmentation and dependency parsing model. |
Related Works | The incremental framework of our model is based on the joint POS tagging and dependency parsing model for Chinese (Hatori et al., 2011), which is an extension of the shift-reduce dependency parser with dynamic programming (Huang and Sagae, 2010). |
Abstract | We compare two parsing models for temporal dependency structures, and show that a deterministic non-projective dependency parser outperforms a graph-based maximum spanning tree parser, achieving labeled attachment accuracy of 0.647 and labeled tree edit distance of 0.596. |
Abstract | Our analysis of the dependency parser errors gives some insights into future research directions. |
Corpus Annotation | train a temporal dependency parsing model. |
Introduction | We then demonstrate how to train a timeline extraction system based on dependency parsing techniques instead of the pairwise classification approaches typical of prior work.
Introduction | We design a non-projective dependency parser for inferring timelines from text.
Introduction | The following sections first review some relevant prior work, then describe the corpus annotation and the dependency parsing algorithm, and finally present our evaluation results. |
Parsing Models | We consider two different approaches to learning a temporal dependency parser: a shift-reduce model (Nivre, 2008) and a graph-based model (McDonald et al., 2005).
Parsing Models | Shift-reduce dependency parsers start with an input queue of unlinked words, and link them into a tree by repeatedly choosing and performing actions like shifting a node to a stack, or popping two nodes from the stack and linking them. |
Parsing Models | Formally, a deterministic shift-reduce dependency parser is defined as (C, T, CF, INIT, TREE) where: |
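The snippets above describe the shift-reduce paradigm in general terms. As an illustration only (not the paper's exact (C, T, CF, INIT, TREE) formulation), the following minimal Python sketch implements arc-standard shift-reduce parsing, using a static oracle that consults a gold head vector in place of a learned action classifier; all names here are our own.

```python
def parse(words, gold_heads):
    """Recover arcs (head, dependent) with arc-standard shift-reduce,
    assuming gold_heads encodes a projective tree (root = 0)."""
    stack = [0]                                  # 0 is the artificial root
    queue = list(range(1, len(words) + 1))       # words are numbered 1..n
    arcs = set()

    def complete(node):
        # A node may only be reduced once all its dependents are attached.
        return all((node, d) in arcs
                   for d, h in enumerate(gold_heads, start=1) if h == node)

    while queue or len(stack) > 1:
        if len(stack) >= 2:
            top, below = stack[-1], stack[-2]
            if below != 0 and gold_heads[below - 1] == top and complete(below):
                arcs.add((top, below))           # LEFT-ARC: pop the dependent
                stack.pop(-2)
                continue
            if gold_heads[top - 1] == below and complete(top):
                arcs.add((below, top))           # RIGHT-ARC: pop the dependent
                stack.pop()
                continue
        stack.append(queue.pop(0))               # SHIFT the next input word
    return arcs
```

In a trained parser the oracle test at each step would be replaced by a classifier scoring the SHIFT/LEFT-ARC/RIGHT-ARC actions from configuration features.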
Related Work | improving the annotation approach to generate the fully connected timeline of a story, and improving the models for timeline extraction using dependency parsing techniques. |
Related Work | We employ methods from syntactic dependency parsing, adapting them to our task by including features typical of temporal relation labeling models.
Abstract | We apply substructure sharing to a dependency parser and part of speech tagger to obtain significant speedups, and further improve the accuracy of these tools through up-training. |
Incorporating Syntactic Structures | We first present our approach and then demonstrate its generality by applying it to a dependency parser and part of speech tagger. |
Incorporating Syntactic Structures | We now apply this algorithm to dependency parsing and POS tagging. |
Incorporating Syntactic Structures | 3.2 Dependency Parsing |
Introduction | We demonstrate our approach on a local Perceptron based part of speech tagger (Tsuruoka et al., 2011) and a shift reduce dependency parser (Sagae and Tsujii, 2007), yielding significantly faster tagging and parsing of ASR hypotheses. |
Up-Training | (2010) used up-training as a domain adaptation technique: a constituent parser, which is more robust to domain changes, was used to label a new domain, and a fast dependency parser
Abstract | In this paper, we present an approach to enriching high-order feature representations for graph-based dependency parsing models using a dependency language model and beam search.
Analysis | Dependency parsers tend to perform worse on heads which have many children. |
Conclusion | We have presented an approach to utilizing the dependency language model to improve graph-based dependency parsing.
Introduction | In recent years, many data-driven models have been proposed for dependency parsing (McDonald and Nivre, 2007).
Introduction | Among them, graph-based dependency parsing models have achieved state-of-the-art performance for a wide range of languages, as shown in recent CoNLL shared tasks
Introduction | In the graph-based models, dependency parsing is treated as a structured prediction problem in which the graphs are usually represented as factored structures. |
Related work | (2008) used a clustering algorithm to produce word clusters on a large amount of unannotated data and represented new features based on the clusters for dependency parsing models. |
Related work | (2009) proposed an approach that extracted partial tree structures from a large amount of data and used them as additional features to improve dependency parsing.
Related work | They extended a Semi-supervised Structured Conditional Model (SS-SCM) (Suzuki and Isozaki, 2008) to the dependency parsing problem and combined their method with the approach of Koo et al.
Dependency Parsing | Given an input sentence x = w_0 w_1 ... w_n and its POS tag sequence t = t_0 t_1 ... t_n, the goal of dependency parsing is to build a dependency tree as depicted in Figure 1, denoted by d = {(h, m, l) : 0 ≤ h ≤ n, 0 < m ≤ n, l ∈ L}, where (h, m, l) indicates a directed arc from the head word (also called father) w_h to the modifier (also called child or dependent) w_m with a dependency label l, and L is the label set.
Dependency Parsing | We omit the label l because we focus on unlabeled dependency parsing in the present paper. |
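Under the definition above, an unlabeled dependency tree over words 1..n can be stored as a head vector where heads[m-1] = h encodes the arc (h, m) and 0 is the artificial root. A small illustrative check (our own sketch, not code from the paper) verifies that such a vector is a well-formed tree:

```python
def is_tree(heads):
    """True iff heads[m-1] = h encodes a well-formed unlabeled
    dependency tree: each word 1..n has one head in 0..n (0 is the
    artificial root), no self-loops, and no cycles."""
    n = len(heads)
    for m, h in enumerate(heads, start=1):
        if not 0 <= h <= n or h == m:
            return False                 # head out of range, or self-loop
    for m in range(1, n + 1):
        seen, node = set(), m
        while node != 0:                 # follow head links up to the root
            if node in seen:
                return False             # cycle detected
            seen.add(node)
            node = heads[node - 1]
    return True
```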
Experiments and Analysis | (2011) show that a joint POS tagging and dependency parsing model can significantly improve parsing accuracy over a pipeline model. |
Introduction | CTB5 is converted to dependency structures following the standard practice of dependency parsing (Zhang and Clark, 2008b).
Related Work | These features are tailored to the dependency parsing problem. |
Related Work | Our approach is also intuitively related to stacked learning (SL), a machine learning framework that has recently been applied to dependency parsing to integrate two mainstream parsing models, i.e., graph-based and transition-based models (Nivre and McDonald, 2008; Martins et al., 2008).
Abstract | Experiments on the English Penn Treebank data and the Chinese CoNLL-06 data show that the proposed algorithm achieves comparable results with other data-driven dependency parsing algorithms. |
Conclusion | This paper presents a novel head-driven parsing algorithm and empirically shows that it is as practical as other dependency parsing algorithms. |
Introduction | The Earley prediction is tied to a particular grammar rule, but the proposed algorithm is data-driven, following the current trends of dependency parsing (Nivre, 2006; McDonald and Pereira, 2006; Koo et al., 2010).
Weighted Parsing Model | We define the set FIRST(·) for our top-down dependency parser:
Abstract | Besides using traditional dependency parsers, we also use the dependency structures transformed from PCFG trees and predicate-argument structures (PASs) which are generated by an HPSG parser and a CCG parser.
Experiments | These results lead us to argue that the robustness of deep syntactic parsers can be advantageous in SMT compared with traditional dependency parsers . |
Gaining Dependency Structures | 2.2 Dependency parsing |
Gaining Dependency Structures | Graph-based and transition-based models are the two predominant paradigms for data-driven dependency parsing.
In our dataset only 11% of Candidate Relations are valid. | grounding, the sentence 13k, and a given dependency parse qk of the sentence. |
In our dataset only 11% of Candidate Relations are valid. | The text component features gbc are computed over sentences and their dependency parses.
In our dataset only 11% of Candidate Relations are valid. | The Stanford parser (de Marneffe et al., 2006) was used to generate the dependency parse information for each sentence. |
Experiments | One major reason is that many features that are predictive for within-sentence instances are no longer applicable (e.g., Dependency parse features). |
Method | (2009), we extract the following three types of features: (1) pairs of words, one from S_L and one from S_R, as originally proposed by Marcu and Echihabi (2002); (2) dependency parse features in S_L, S_R, or both; and (3) syntactic production rules in S_L, S_R, or both.
Related work | (2009) attempted to recognize implicit discourse relations (discourse relations which are not signaled by explicit connectives) in PDTB by using four classes of features (contextual features, constituent parse features, dependency parse features, and lexical features) and explored their individual influence on performance.
Experiments | We used syntactic features (i.e., features obtained from the dependency parse tree of a sentence) and lexical features, and entity types, which essentially correspond to the ones developed by Mintz et al.
Knowledge-based Distant Supervision | Since two entities mentioned in a sentence do not always have a relation, we select entity pairs from a corpus when: (i) the path of the dependency parse tree between the corresponding two named entities in the sentence is no longer than 4 and (ii) the path does not contain a sentence-like boundary, such as a relative clause (Banko et al., 2007; Banko and Etzioni, 2008).
Wrong Label Reduction | We define a pattern as the entity types of an entity pair as well as the sequence of words on the path of the dependency parse tree from the first entity to the second one.
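The path-length filter and the pattern definition above can be sketched in a few lines of Python. This is a hypothetical illustration under our own conventions (head vector with root 0, 1-indexed word positions); the function and variable names are not from the paper.

```python
def dependency_path(heads, a, b):
    """Nodes on the tree path from word a to word b, where
    heads[m-1] is the head of word m and 0 is the artificial root."""
    def ancestors(n):
        chain = [n]
        while n != 0:
            n = heads[n - 1]
            chain.append(n)
        return chain
    up_a, up_b = ancestors(a), ancestors(b)
    common = next(n for n in up_a if n in up_b)      # lowest common ancestor
    return up_a[:up_a.index(common) + 1] + up_b[:up_b.index(common)][::-1]

def extract_pattern(words, heads, types, a, b, max_len=4):
    """Return (type_a, type_b, path_words), or None when the dependency
    path is longer than max_len arcs and the pair is discarded."""
    path = dependency_path(heads, a, b)
    if len(path) - 1 > max_len:
        return None                                  # filter (i) above
    inner = tuple(words[n - 1] for n in path[1:-1] if n != 0)
    return (types[a], types[b], inner)
```

For example, for "Obama was born in Hawaii" with heads [3, 3, 0, 3, 4], the path Obama → born → in → Hawaii yields the pattern (PERSON, LOCATION, ("born", "in")).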