Abstract | This paper presents experiments with WordNet semantic classes to improve dependency parsing.
Abstract | We study the effect of semantic classes in three dependency parsers, using two types of constituency-to-dependency conversions of the English Penn Treebank.
Experimental Framework | We have made use of three parsers representative of successful paradigms in dependency parsing.
Introduction | This work presents a set of experiments to investigate the use of lexical semantic information in dependency parsing of English. |
Introduction | We will apply different types of semantic information to three dependency parsers.
Introduction | dependency parsing?
Related work | (2011) successfully introduced WordNet classes in a dependency parser, obtaining improvements on the full PTB using gold POS tags, trying different combinations of semantic classes.
Related work | (2008) presented a semisupervised method for training dependency parsers, introducing features that incorporate word clusters automatically acquired from a large unannotated corpus.
Related work | They demonstrated its effectiveness in dependency parsing experiments on the PTB and the Prague Dependency Treebank. |
Abstract | We investigate active learning methods for Japanese dependency parsing.
Abstract | Experimental results show that our proposed methods considerably improve the learning curve of Japanese dependency parsing.
Active Learning for Japanese Dependency Parsing | 4We did not employ query-by-committee (QBC) (Seung et al., 1992), which is another important general framework of active learning, since the selection strategy with large margin classifiers (Section 2.2) is much simpler and seems more practical for active learning in Japanese dependency parsing with smaller constituents. |
Experimental Evaluation and Discussion | We set the degree of the kernels to 3 since cubic kernels with SVM have proved effective for Japanese dependency parsing (Kudo and Matsumoto, 2000; Kudo and Matsumoto, 2002). |
Experimental Evaluation and Discussion | There are features that have been commonly used for Japanese dependency parsing among related papers, e.g., (Kudo and Matsumoto, 2002; Sassano, 2004; Iwatate et al., 2008). |
Experimental Evaluation and Discussion | It is observed that active learning with large margin classifiers also works well for Sassano’s algorithm of Japanese dependency parsing.
Introduction | We use Japanese dependency parsing as a target task in this study since a simple and efficient parsing algorithm has been proposed for it and, to our knowledge, active learning for Japanese dependency parsing has never been studied.
Introduction | In Section 5 we describe our proposed methods and other active learning methods for Japanese dependency parsing.
Japanese Parsing | 3.3 Algorithm of Japanese Dependency Parsing |
Japanese Parsing | We use Sassano’s algorithm (Sassano, 2004) for Japanese dependency parsing.
Japanese Parsing | Figure 3: Algorithm of Japanese dependency parsing |
Abstract | We show relative error reductions of 7.0% over the second-order dependency parser of McDonald and Pereira (2006), 9.2% over the constituent parser of Petrov et al. |
Analysis | Results are for dependency parsing on the dev set for iters:5,training-k:1. |
Introduction | For dependency parsing, we augment the features in the second-order parser of McDonald and Pereira (2006).
Introduction | (2008) smooth the sparseness of lexical features in a discriminative dependency parser by using cluster-based word-senses as intermediate abstractions in |
Introduction | For the dependency case, we can integrate them into the dynamic programming of a base parser; we use the discriminatively-trained MST dependency parser (McDonald et al., 2005; McDonald and Pereira, 2006). |
Parsing Experiments | We first integrate our features into a dependency parser, where the integration is more natural and pushes all the way into the underlying dynamic program.
Parsing Experiments | 4.1 Dependency Parsing |
Parsing Experiments | For dependency parsing, we use the discriminatively-trained MSTParser, an implementation of the first and second order MST parsing models of McDonald et al.
Web-count Features | pairs, as is standard in the dependency parsing literature (see Figure 3). |
Abstract | Most previous studies of morphological disambiguation and dependency parsing have been pursued independently. |
Baselines | For dependency parsing, our baseline is a “pipeline” parser (§4.2) that infers syntax upon the output of the baseline tagger.
Baselines | 4.2 Baseline Dependency Parser |
Experimental Results | We compare the performance of the pipeline model (§4) and the joint model (§3) on morphological disambiguation and unlabeled dependency parsing . |
Experimental Results | 6.2 Dependency Parsing |
Introduction | To date, studies of morphological analysis and dependency parsing have been pursued more or less independently. |
Introduction | Morphological taggers disambiguate morphological attributes such as part-of-speech (POS) or case, without taking syntax into account (Hajič et al., 2001); dependency parsers commonly assume the “pipeline” approach, relying on morphological information as part of the input (Buchholz and Marsi, 2006; Nivre et al., 2007).
Introduction | 97% (Toutanova et al., 2003), and that of dependency parsing has reached the low nineties (Nivre et al., 2007). |
Previous Work | We know of only one previous attempt in data-driven dependency parsing for Latin (Bamman and Crane, 2008), with the goal of constructing a dynamic lexicon for a digital library. |
Previous Work | Parsing is performed using the usual pipeline approach, first with the TreeTagger analyzer (Schmid, 1994) and then with a state-of-the-art dependency parser (McDonald et al., 2005). |
Abstract | In this paper, we present a novel approach which incorporates the web-derived selectional preferences to improve statistical dependency parsing.
Abstract | Experiments show that web-scale data improves statistical dependency parsing, particularly for long dependency relationships.
Introduction | Dependency parsing is the task of building dependency links between words in a sentence; it has recently gained wide interest in the natural language processing community.
Introduction | With the availability of large-scale annotated corpora such as Penn Treebank (Marcus et al., 1993), it is easy to train a high-performance dependency parser using supervised learning methods. |
Introduction | However, current state-of-the-art statistical dependency parsers (McDonald et al., 2005; McDonald and Pereira, 2006; Hall et al., 2006) tend to have
Abstract | We propose the first joint model for word segmentation, POS tagging, and dependency parsing for Chinese. |
Abstract | Based on an extension of the incremental joint model for POS tagging and dependency parsing (Hatori et al., 2011), we propose an efficient character-based decoding method that can combine features from state-of-the-art segmentation, POS tagging, and dependency parsing models. |
Abstract | In experiments using the Chinese Treebank (CTB), we show that the accuracies of the three tasks can be improved significantly over the baseline models, particularly by 0.6% for POS tagging and 2.4% for dependency parsing . |
Introduction | In addition, some researchers recently proposed a joint approach to Chinese POS tagging and dependency parsing (Li et al., 2011; Hatori et al., 2011); particularly, Hatori et al. |
Introduction | In this context, it is natural to consider further a question regarding the joint framework: how strongly do the tasks of word segmentation and dependency parsing interact? |
Introduction | Based on these observations, we aim at building a joint model that simultaneously processes word segmentation, POS tagging, and dependency parsing, trying to capture global interaction among
Related Works | Therefore, we place no restriction on the segmentation possibilities to consider, and we assess the full potential of the joint segmentation and dependency parsing model. |
Related Works | The incremental framework of our model is based on the joint POS tagging and dependency parsing model for Chinese (Hatori et al., 2011), which is an extension of the shift-reduce dependency parser with dynamic programming (Huang and Sagae, 2010). |
Abstract | We compare two parsing models for temporal dependency structures, and show that a deterministic non-projective dependency parser outperforms a graph-based maximum spanning tree parser, achieving labeled attachment accuracy of 0.647 and labeled tree edit distance of 0.596. |
Abstract | Our analysis of the dependency parser errors gives some insights into future research directions. |
Corpus Annotation | train a temporal dependency parsing model. |
Introduction | We then demonstrate how to train a timeline extraction system based on dependency parsing techniques instead of the pairwise classification approaches typical of prior work.
Introduction | • We design a non-projective dependency parser for inferring timelines from text.
Introduction | The following sections first review some relevant prior work, then describe the corpus annotation and the dependency parsing algorithm, and finally present our evaluation results. |
Parsing Models | We consider two different approaches to learning a temporal dependency parser: a shift-reduce model (Nivre, 2008) and a graph-based model (McDonald et al., 2005).
Parsing Models | Shift-reduce dependency parsers start with an input queue of unlinked words, and link them into a tree by repeatedly choosing and performing actions like shifting a node to a stack, or popping two nodes from the stack and linking them. |
Parsing Models | Formally, a deterministic shift-reduce dependency parser is defined as (C, T, CF, INIT, TREE) where: |
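The components after "where:" are cut off in this excerpt, but the overall control loop of such a parser can still be sketched; `choose_action` below is a hypothetical stand-in for a trained classifier, and the action inventory follows the arc-standard convention rather than the exact system of the cited paper.

```python
# Minimal shift-reduce dependency parsing loop in the spirit of the
# (C, T, CF, INIT, TREE) definition above. Configurations are
# (stack, queue, arcs); `choose_action` is a hypothetical stand-in
# for a trained classifier.

def parse(words, choose_action):
    stack, queue, arcs = [], list(range(len(words))), []
    while queue or len(stack) > 1:
        action = choose_action(stack, queue)
        if action == "SHIFT":
            stack.append(queue.pop(0))
        elif action == "LEFT-ARC":    # top of stack heads the item below it
            dep = stack.pop(-2)
            arcs.append((stack[-1], dep))
        elif action == "RIGHT-ARC":   # item below the top heads the top
            dep = stack.pop()
            arcs.append((stack[-1], dep))
    return arcs                       # (head index, dependent index) pairs

def toy_policy(stack, queue):         # hypothetical: shift all, then reduce right
    return "SHIFT" if queue else "RIGHT-ARC"

arcs = parse(["She", "saw", "it"], toy_policy)
# → [(1, 2), (0, 1)]
```

A real parser would replace `toy_policy` with a classifier over configuration features, as the surrounding snippets describe.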
Related Work | improving the annotation approach to generate the fully connected timeline of a story, and improving the models for timeline extraction using dependency parsing techniques. |
Related Work | We employ methods from syntactic dependency parsing, adapting them to our task by including features typical of temporal relation labeling models.
Abstract | We apply substructure sharing to a dependency parser and part of speech tagger to obtain significant speedups, and further improve the accuracy of these tools through up-training. |
Incorporating Syntactic Structures | We first present our approach and then demonstrate its generality by applying it to a dependency parser and part of speech tagger. |
Incorporating Syntactic Structures | We now apply this algorithm to dependency parsing and POS tagging. |
Incorporating Syntactic Structures | 3.2 Dependency Parsing |
Introduction | We demonstrate our approach on a local Perceptron based part of speech tagger (Tsuruoka et al., 2011) and a shift reduce dependency parser (Sagae and Tsujii, 2007), yielding significantly faster tagging and parsing of ASR hypotheses. |
Up-Training | (2010) used up-training as a domain adaptation technique: a constituent parser, which is more robust to domain changes, was used to label a new domain, and a fast dependency parser
Abstract | We define a new formalism, based on Sikkel’s parsing schemata for constituency parsers, that can be used to describe, analyze and compare dependency parsing algorithms. |
Abstract | This abstraction allows us to establish clear relations between several existing projective dependency parsers and prove their correctness. |
Dependency parsing schemata | However, parsing schemata are not directly applicable to dependency parsing, since their formal framework is based on constituency trees.
Dependency parsing schemata | In spite of this problem, many of the dependency parsers described in the literature are constructive, in the sense that they proceed by combining smaller structures to form larger ones until they find a complete parse for the input sentence. |
Dependency parsing schemata | However, in order to define such a formalism we have to tackle some issues specific to dependency parsers: |
Introduction | Dependency parsing consists of finding the structure of a sentence as expressed by a set of directed links (dependencies) between words. |
Introduction | In addition to this, some dependency parsers are able to represent nonprojective structures, which is an important feature when parsing free word order languages in which discontinuous constituents are common. |
Introduction | However, since parsing schemata are defined as deduction systems over sets of constituency trees, they cannot be used to describe dependency parsers.
Abstract | In this paper, we combine easy-first dependency parsing and POS tagging algorithms with beam search and structured perceptron. |
Easy-first dependency parsing | The easy-first dependency parsing algorithm (Goldberg and Elhadad, 2010) builds a dependency tree by performing two types of actions LEFT(i) and RIGHT(i) to a list of subtree structures p1, …
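That attach-left/attach-right scheme can be sketched as follows; `score` is a hypothetical stand-in for the learned easy-first scoring model, and the list `pending` plays the role of the partial structures p1, …, pk.

```python
# Easy-first sketch: repeatedly perform the highest-scoring attachment
# between adjacent partial structures until one tree remains. `score`
# is a hypothetical stand-in for the learned scoring model.

def easy_first(words, score):
    pending = list(range(len(words)))        # indices of current roots
    arcs = []
    while len(pending) > 1:
        action, i = max(((a, j) for j in range(len(pending) - 1)
                         for a in ("LEFT", "RIGHT")),
                        key=lambda ai: score(ai[0], pending, ai[1]))
        if action == "LEFT":                 # pending[i] attaches under pending[i+1]
            arcs.append((pending[i + 1], pending.pop(i)))
        else:                                # pending[i+1] attaches under pending[i]
            arcs.append((pending[i], pending.pop(i + 1)))
    return arcs

def toy_score(action, pending, i):           # hypothetical: root everything at word 0
    return 1.0 if action == "RIGHT" and pending[i] == 0 else 0.0

arcs = easy_first(["a", "b", "c"], toy_score)
# → [(0, 1), (0, 2)]
```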
Introduction | The easy-first dependency parsing algorithm (Goldberg and Elhadad, 2010) is attractive due to its good accuracy, fast speed and simplicity. |
Introduction | However, to the best of our knowledge, no work in the literature has ever applied the two techniques to easy-first dependency parsing . |
Introduction | While applying beam-search is relatively straightforward, the main difficulty comes from combining easy-first dependency parsing with perceptron-based global learning. |
Training | As shown in (Goldberg and Nivre, 2012), most transition-based dependency parsers (Nivre et al., 2003; Huang and Sagae, 2010; Zhang and Clark, 2008) ignore spurious ambiguity by using a static oracle which maps a dependency tree to a single action sequence.
Training | Table 1: Feature templates for English dependency parsing.
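The static-oracle mapping mentioned above (a gold tree to one canonical action sequence) can be sketched for an arc-standard system; this is an illustrative reconstruction that assumes a projective gold tree, not the exact oracle of any cited parser.

```python
# Static-oracle sketch for an arc-standard system: map a gold tree
# (heads[i] = head of word i, -1 for the root) to its single canonical
# action sequence. Assumes a projective gold tree.

def static_oracle(heads):
    n = len(heads)
    n_deps = [heads.count(i) for i in range(n)]      # unattached gold dependents
    stack, buf, actions = [], list(range(n)), []
    while buf or len(stack) > 1:
        if len(stack) >= 2:
            s0, s1 = stack[-1], stack[-2]
            if heads[s1] == s0:                      # s1's head is on top
                actions.append("LEFT-ARC")
                stack.pop(-2)
                n_deps[s0] -= 1
                continue
            if heads[s0] == s1 and n_deps[s0] == 0:  # s0 has all its dependents
                actions.append("RIGHT-ARC")
                stack.pop()
                n_deps[s1] -= 1
                continue
        actions.append("SHIFT")
        stack.append(buf.pop(0))
    return actions

# "She saw it": word 1 is the root, words 0 and 2 depend on it
seq = static_oracle([1, -1, 1])
# → ['SHIFT', 'SHIFT', 'LEFT-ARC', 'SHIFT', 'RIGHT-ARC']
```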
Discussion | The performance of our methods depends not only on the quality of the induced tag sets but also on the performance of the dependency parser learned in Step 3 of Section 4.1. |
Discussion | Thus we split the 10,000 instances into the first 9,000 for training and the remaining 1,000 for testing, and then a dependency parser was learned in the same way as in Step 3.
Experiment | In the training process, the following steps are performed sequentially: preprocessing, inducing a POS tagset for a source language, training a POS tagger and a dependency parser, and training a forest-to-string MT model.
Experiment | The Japanese sentences are parsed using CaboCha (Kudo and Matsumoto, 2002), which generates dependency structures using a phrasal unit called a bunsetsu, rather than a word unit as in English or Chinese dependency parsing.
Experiment | Training a POS Tagger and a Dependency Parser |
Introduction | In recent years, syntax-based SMT has made promising progress by employing either dependency parsing (Lin, 2004; Ding and Palmer, 2005; Quirk et al., 2005; Shen et al., 2008; Mi and Liu, 2010) or constituency parsing (Huang et al., 2006; Liu et al., 2006; Galley et al., 2006; Mi and Huang, 2008; Zhang et al., 2008; Cohn and Blunsom, 2009; Liu et al., 2009; Mi and Liu, 2010; Zhang et al., 2011) on the source side, the target side, or both. |
Introduction | However, dependency parsing , which is a popular choice for Japanese, can incorporate only shallow syntactic information, i.e., POS tags, compared with the richer syntactic phrasal categories in constituency parsing. |
Abstract | We present a novel approach, called selectional branching, which uses confidence estimates to decide when to employ a beam, providing the accuracy of beam search at speeds close to a greedy transition-based dependency parsing approach. |
Abstract | We also present a new transition-based dependency parsing algorithm that gives a complexity of O(n) for projective parsing and an expected linear time speed for non-projective parsing.
Experiments | For English, we mostly adapt features from Zhang and Nivre (2011) who have shown state-of-the-art parsing accuracy for transition-based dependency parsing . |
Experiments | Bohnet and Nivre (2012)’s transition-based system jointly performs POS tagging and dependency parsing , which shows higher accuracy than ours. |
Introduction | Transition-based dependency parsing has gained considerable interest because it runs fast and performs accurately. |
Introduction | Greedy transition-based dependency parsing has been widely deployed because of its speed (Cer et al., 2010); however, state-of-the-art accuracies have been achieved by globally optimized parsers using beam search (Zhang and Clark, 2008; Huang and Sagae, 2010; Zhang and Nivre, 2011; Bohnet and Nivre, 2012).
Introduction | Coupled with dynamic programming, transition-based dependency parsing with beam search can be done very efficiently and gives significant improvement to parsing accuracy. |
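Beam search over transition sequences, as referenced above, can be sketched generically; `expand` (returning (next_state, action, local_score) triples), `is_final`, and the state representation are hypothetical placeholders for a scored transition system, and the dynamic-programming state merging of Huang and Sagae (2010) is omitted.

```python
# Generic beam-search sketch over transition sequences: keep the k
# highest-scoring partial derivations at each step. `expand`, `is_final`
# and the state representation are hypothetical placeholders.
import heapq

def beam_search(init_state, expand, is_final, k=4):
    beam = [(0.0, init_state, [])]           # (score, state, actions)
    while not all(is_final(s) for _, s, _ in beam):
        candidates = []
        for score, state, actions in beam:
            if is_final(state):
                candidates.append((score, state, actions))
                continue
            for nxt, act, step in expand(state):
                candidates.append((score + step, nxt, actions + [act]))
        beam = heapq.nlargest(k, candidates, key=lambda c: c[0])
    return max(beam, key=lambda c: c[0])

# Toy transition system: grow a string, with "a" scoring higher than "b".
def expand(s):
    return [(s + "a", "A", 1.0), (s + "b", "B", 0.5)]

score, state, actions = beam_search("", expand, lambda s: len(s) == 3, k=2)
# → state "aaa" via actions ['A', 'A', 'A'] with score 3.0
```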
Related work | There are other transition-based dependency parsing algorithms that take a similar approach; Nivre (2009) integrated a SWAP transition into Nivre’s arc-standard algorithm (Nivre, 2004) and Fernandez-Gonzalez and Gomez-Rodriguez (2012) integrated a buffer transition into Nivre’s arc-eager algorithm to handle non-projectivity. |
Related work | Our selectional branching method is most relevant to Zhang and Clark (2008) who introduced a transition-based dependency parsing model that uses beam search. |
Transition-based dependency parsing | We introduce a transition-based dependency parsing algorithm that is a hybrid between Nivre’s arc-eager and list-based algorithms (Nivre, 2003; Nivre, 2008). |
Transition-based dependency parsing | The parsing complexity of a transition-based dependency parsing algorithm is determined by the number of transitions performed with respect to the number of tokens in a sentence, say n (Kübler et al., 2009).
Abstract | This paper introduces a novel pre-ordering approach based on dependency parsing for Chinese-English SMT. |
Dependency-based Pre-ordering Rule Set | Figure 1 shows a constituent parse tree and its Stanford typed dependency parse tree for the same |
Dependency-based Pre-ordering Rule Set | As shown in the figure, the number of nodes in the dependency parse tree (i.e. |
Dependency-based Pre-ordering Rule Set | Because dependency parse trees are generally more concise than constituent ones, they can perform long-distance reorderings in a finer-grained way.
Introduction | Since dependency parsing is more concise than constituent parsing in describing sentences, some research has used dependency parsing in pre-ordering approaches for language pairs such as Arabic-English (Habash, 2007), and English-SOV languages (Xu et al., 2009; Katz-Brown et al., 2011). |
Introduction | In contrast, we propose a set of pre-ordering rules for dependency parsers.
Introduction | (2007) exist, it is almost impossible to automatically convert their rules into rules that are applicable to dependency parsers.
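One such dependency-based pre-ordering rule can be sketched as follows; the relation name, token format, and single-token move are purely illustrative (real rule sets operate on whole subtrees), not rules from the cited work.

```python
# Sketch of one hypothetical dependency-based pre-ordering rule for an
# SOV-like target: move each "dobj" dependent from after its head to
# immediately before it. Tokens are (index, word, head_index, relation);
# real rules move whole subtrees, this toy moves single tokens.

def preorder(tokens):
    words = {i: w for i, w, _, _ in tokens}
    order = [i for i, _, _, _ in tokens]
    for i, _, head, rel in tokens:
        if rel == "dobj" and order.index(i) > order.index(head):
            order.remove(i)                      # detach the dependent...
            order.insert(order.index(head), i)   # ...and place it pre-head
    return [words[i] for i in order]

tokens = [(0, "John", 1, "nsubj"), (1, "ate", -1, "root"),
          (2, "apples", 1, "dobj")]
reordered = preorder(tokens)
# → ['John', 'apples', 'ate']
```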
Abstract | In this paper, we investigate various strategies to predict both syntactic dependency parsing and contiguous multiword expression (MWE) recognition, testing them on the dependency version of French Treebank (Abeille and Barrier, 2004), as instantiated in the SPMRL Shared Task (Seddah et al., 2013). |
Architectures for MWE Analysis and Parsing | The architectures we investigated vary depending on whether the MWE status of sequences of tokens is predicted via dependency parsing or via an external tool (described in section 5), and this dichotomy applies both to structured MWEs and flat MWEs. |
Architectures for MWE Analysis and Parsing | • IRREG-BY-PARSER: the MWE status, flat topology and POS are all predicted via dependency parsing, using representations for training and parsing, with all information for irregular MWEs encoded in topology and labels (as for in vain in Figure 2).
Architectures for MWE Analysis and Parsing | • REG-BY-PARSER: all regular MWE information (topology, status, POS) is predicted via dependency parsing, using representations with all information for regular MWEs encoded in topology and labels (Figure 2).
Data: MWEs in Dependency Trees | (2013), who experimented with joint dependency parsing and light verb construction identification.
Data: MWEs in Dependency Trees | In some experiments, we make use of alternative representations, which we refer to later as “labeled representations”, in which the MWE features are incorporated in the dependency labels, so that MWE composition and/or the POS of the MWE are totally contained in the tree topology and labels, and thus predictable via dependency parsing.
Related work | It is a less language-specific system that reranks n-best dependency parses from 3 parsers, informed with features from predicted constituency trees. |
Use of external MWE resources | Both resources help to predict MWE-specific features (section 5.3) to guide the MWE-aware dependency parser.
Use of external MWE resources | MWE lexicons are exploited as sources of features for both the dependency parser and the external MWE analyzer. |
Use of external MWE resources | Flat MWE features: MWE information can be integrated as features to be used by the dependency parser . |
Approaches | A typical pipeline consists of a POS tagger, dependency parser , and semantic role labeler. |
Approaches | Brown clusters have been used to good effect for various NLP tasks such as named entity recognition (Miller et al., 2004) and dependency parsing (Koo et al., 2008; Spitkovsky et al., 2011). |
Approaches | Unlike syntactic dependency parsing, the graph is not required to be a tree, nor even a connected graph.
Introduction | • Simpler joint CRF for syntactic and semantic dependency parsing than previously reported.
Related Work | (2013) extend this idea by coupling predictions of a dependency parser with predictions from a semantic role labeler. |
Related Work | (2012) marginalize over latent syntactic dependency parses.
Related Work | Recent work in fully unsupervised dependency parsing has supplanted these methods with even higher accuracies (Spitkovsky et al., 2013) by arranging optimizers into networks that suggest informed restarts based on previously identified local optima.
Abstract | Accurate scoring of syntactic structures such as head-modifier arcs in dependency parsing typically requires rich, high-dimensional feature representations. |
Experimental Setup | Methods We compare our model to MST and Turbo parsers on non-projective dependency parsing . |
Introduction | Traditionally, parsing research has focused on modeling the direct connection between the features and the predicted syntactic relations such as head-modifier (arc) relations in dependency parsing . |
Introduction | We implement the low-rank factorization model in the context of first— and third-order dependency parsing . |
Problem Formulation | We will commence here by casting first-order dependency parsing as a tensor estimation problem. |
Problem Formulation | We will start by introducing the notation used in the paper, followed by a more formal description of our dependency parsing task. |
Problem Formulation | 3.2 Dependency Parsing |
Related Work | Selecting Features for Dependency Parsing A great deal of parsing research has been dedicated to feature engineering (Lazaridou et al., 2013; Marton et al., 2010; Marton et al., 2011). |
Related Work | Embedding for Dependency Parsing A lot of recent work has been done on mapping words into vector spaces (Collobert and Weston, 2008; Turian et al., 2010; Dhillon et al., 2011; Mikolov et al., 2013). |
Abstract | The state-of-the-art dependency parsing techniques, the Eisner algorithm and maximum spanning tree (MST) algorithm, are adopted to parse an optimal discourse dependency tree based on the arc-factored model and the large-margin learning techniques.
Abstract | Experiments show that our discourse dependency parsers achieve a competitive performance on text-level discourse parsing. |
Add arc <eC,ej> to GC with | Referring to the evaluation of syntactic dependency parsing, |
Discourse Dependency Parsing | The goal of discourse dependency parsing is to parse an optimal spanning tree from V×R×V_0.
Discourse Dependency Parsing | It is well known that projective dependency parsing can be handled with the Eisner algorithm (1996), which is based on bottom-up dynamic programming techniques with a time complexity of O(n³).
Discourse Dependency Parsing | Following the work of McDonald (2005b), we formalize discourse dependency parsing as searching for a maximum spanning tree (MST) in a directed graph. |
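The MST formulation can be illustrated by a brute-force search over head assignments for a toy input; real parsers use the Chu-Liu/Edmonds algorithm instead, and `score(h, d)` here is a hypothetical arc-scoring function.

```python
# Arc-factored MST sketch: choose a head for every non-root node so that
# the total arc score is maximal and the result is a tree rooted at the
# artificial node 0. Brute force over head assignments (toy inputs only;
# real parsers use Chu-Liu/Edmonds). `score(h, d)` is hypothetical.
from itertools import product

def is_tree(heads):                        # every node must reach the root
    for d in range(1, len(heads)):
        seen, h = set(), d
        while h != 0:
            if h in seen:                  # cycle: not a tree
                return False
            seen.add(h)
            h = heads[h]
    return True

def mst_parse(n, score):
    best, best_heads = float("-inf"), None
    for combo in product(range(n), repeat=n - 1):
        heads = (None,) + combo            # node 0 = artificial root
        if any(heads[d] == d for d in range(1, n)) or not is_tree(heads):
            continue
        total = sum(score(heads[d], d) for d in range(1, n))
        if total > best:
            best, best_heads = total, heads
    return best_heads

# Hypothetical scores favouring the arcs 0→2, 2→1 and 2→3.
score = lambda h, d: 2.0 if (h, d) in {(0, 2), (2, 1), (2, 3)} else 0.0
heads = mst_parse(4, score)
# → (None, 2, 0, 2)
```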
Discourse Dependency Structure and Tree Bank | To the best of our knowledge, we are the first to apply the dependency structure and introduce the dependency parsing techniques into discourse analysis. |
Discourse Dependency Structure and Tree Bank | To automatically conduct discourse dependency parsing , constructing a discourse dependency treebank is fundamental. |
Introduction | Dependency Parsing |
Introduction | Since dependency trees contain far fewer nodes and are on average simpler than constituency-based trees, current dependency parsers can have a relatively low computational complexity.
Introduction | In our work, we adopt the graph based dependency parsing techniques learned from large sets of annotated dependency trees. |
Abstract | This paper proposes a simple yet effective framework for semi-supervised dependency parsing at entire tree level, referred to as ambiguity-aware ensemble training. |
Abstract | With a conditional random field based probabilistic dependency parser, our training objective is to maximize mixed likelihood of labeled data and auto-parsed unlabeled data with ambiguous labelings.
Introduction | Supervised dependency parsing has made great progress during the past decade. |
Introduction | A few effective learning methods are also proposed for dependency parsing to implicitly utilize distributions on unlabeled data (Smith and Eisner, 2007; Wang et al., 2008; Suzuki et al., 2009). |
Introduction | However, these methods gain limited success in dependency parsing.
Abstract | We present a novel approach for inducing unsupervised dependency parsers for languages that have no labeled training data, but have translated text in a resource-rich language. |
Abstract | Our method can be used as a purely monolingual dependency parser, requiring no human translations for the test data, thus making it applicable to a wide range of resource-poor languages.
Introduction | In recent years, dependency parsing has gained universal interest due to its usefulness in a wide range of applications such as synonym generation (Shinyama et al., 2002), relation extraction (Nguyen et al., 2009) and machine translation (Katz-Brown et al., 2011; Xie et al., 2011).
Introduction | Several supervised dependency parsing algorithms (Nivre and Scholz, 2004; McDonald et al., 2005a; McDonald et al., 2005b; McDonald and Pereira, 2006; Carreras, 2007; Koo and Collins, 2010; Ma and Zhao, 2012; Zhang et al., 2013) have been proposed and achieved high parsing accuracies on several treebanks, due in large part to the availability of dependency treebanks in a number of languages (McDonald et al., 2013).
Introduction | (2011) proposed an approach for unsupervised dependency parsing with nonparallel multilingual guidance from one or more helper languages, in which parallel data is not used. |
Our Approach | The focus of this work is on building dependency parsers for target languages, assuming that an accurate English dependency parser and some parallel text between the two languages are available. |
Our Approach | The probabilistic model for dependency parsing defines a family of conditional probabilities p over all y given a sentence x, with a log-linear form:
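The log-linear form itself is cut off in this excerpt; conventionally it is p(y|x) = exp(w·f(x, y)) / Σ_y' exp(w·f(x, y')). A toy computation over an enumerated candidate set, with hypothetical candidates, feature function, and weights:

```python
# Toy log-linear distribution over an enumerated candidate set:
# p(y|x) = exp(w·f(x, y)) / sum over y' of exp(w·f(x, y')).
# The candidates, feature function, and weights are hypothetical.
import math

def log_linear(candidates, features, w):
    scores = {y: sum(wi * fi for wi, fi in zip(w, features(y)))
              for y in candidates}
    z = sum(math.exp(s) for s in scores.values())    # partition function
    return {y: math.exp(s) / z for y, s in scores.items()}

probs = log_linear(["tree_a", "tree_b"],
                   features=lambda y: [1.0, 1.0 if y == "tree_a" else 0.0],
                   w=[0.5, 1.0])
# probabilities sum to 1, and tree_a receives the higher probability
```

In a real parser the candidate set is the exponential space of trees, so the partition function is computed with dynamic programming (e.g. the matrix-tree theorem) rather than enumeration.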
Our Approach | One of the most common model training methods for supervised dependency parsers is maximum conditional likelihood estimation.
Experiments | There is a dual effect of the increase of the parameter k on our transition-based dependency parser.
GB-grounded GR Extraction | Allowing non-projective dependencies generally makes parsing either by graph-based or transition-based dependency parsing harder. |
Introduction | Different from popular shallow dependency parsing, which focuses on tree-shaped structures, our GR annotations are represented as general directed graphs that express not only local but also various long-distance dependencies, such as coordination, control/raising constructions, topicalization, relative clauses and many other complicated linguistic phenomena that go beyond shallow syntax (see Fig.
Introduction | Previous work on dependency parsing mainly focused on structures that can be represented in terms of directed trees. |
Transition-based GR Parsing | The availability of large-scale treebanks has contributed to the blossoming of statistical approaches to build accurate shallow constituency and dependency parsers.
Transition-based GR Parsing | In particular, the transition-based dependency parsing method is studied.
Transition-based GR Parsing | 3.1 Data-Driven Dependency Parsing |
Abstract | In this paper, we investigate the problem of character-level Chinese dependency parsing, building dependency trees over characters.
Abstract | Character-level information can benefit downstream applications by offering flexible granularities for word segmentation while improving word-level dependency parsing accuracies. |
Abstract | We present novel adaptations of two major shift-reduce dependency parsing algorithms to character-level parsing. |
Character-Level Dependency Tree | We differentiate intra-word dependencies and inter-word dependencies by the arc type, so that our work can be compared with conventional word segmentation, POS-tagging and dependency parsing pipelines under a canonical segmentation standard. |
Introduction | Such annotations enable dependency parsing on the character level, building dependency trees over Chinese characters. |
Introduction | Character-level dependency parsing is interesting in at least two aspects. |
Introduction | In this paper, we make an investigation of character-level Chinese dependency parsing using Zhang et al. |
Abstract | We present a simple and effective semi-supervised method for training dependency parsers.
Abstract | We demonstrate the effectiveness of the approach in a series of dependency parsing experiments on the Penn Treebank and Prague Dependency Treebank, and we show that the cluster-based features yield substantial gains in performance across a wide range of conditions. |
Background 2.1 Dependency parsing | Recent work (Buchholz and Marsi, 2006; Nivre et al., 2007) has focused on dependency parsing.
Background 2.1 Dependency parsing | Dependency parsing depends critically on predicting head-modifier relationships, which can be difficult due to the statistical sparsity of these word-to-word interactions. |
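One way such word-to-word sparsity is mitigated, in the spirit of the cluster-based approach this paper describes, is to back word-pair features off to pairs of hierarchical (Brown-style) cluster-id prefixes; the bit strings and template names below are hypothetical.

```python
# Sketch of cluster-augmented head-modifier features: alongside the
# sparse word-pair feature, add coarser features built from prefixes
# of hierarchical (Brown-style) cluster bit strings. The cluster ids
# and template names are hypothetical.

def arc_features(head, mod, clusters, prefixes=(4, 6)):
    feats = [f"w:{head}+{mod}"]                   # sparse lexical pair
    ch, cm = clusters.get(head, ""), clusters.get(mod, "")
    for p in prefixes:                            # coarser cluster backoffs
        feats.append(f"c{p}:{ch[:p]}+{cm[:p]}")
    return feats

clusters = {"bank": "110110", "river": "110010"}  # hypothetical cluster ids
feats = arc_features("bank", "river", clusters)
# → ['w:bank+river', 'c4:1101+1100', 'c6:110110+110010']
```

Shorter prefixes give coarser, denser features, so rare word pairs still fire informative cluster-pair features at training and test time.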
Background 2.1 Dependency parsing | In this paper, we take a part-factored structured classification approach to dependency parsing . |
Introduction | To demonstrate the effectiveness of our approach, we conduct experiments in dependency parsing, which has been the focus of much recent research—e.g., see work in the CoNLL shared tasks on dependency parsing (Buchholz and Marsi, 2006; Nivre et al., 2007). |
Introduction | However, our target task of dependency parsing involves more complex structured relationships than named-entity tagging; moreover, it is not at all clear that word clusters should have any relevance to syntactic structure. |
Introduction | Nevertheless, our experiments demonstrate that word clusters can be quite effective in dependency parsing applications. |
Abstract | We evaluate eight parsers (based on dependency parsing , phrase structure parsing, or deep parsing) using five different parse representations. |
Introduction | Parsing technologies have improved considerably in the past few years, and high-performance syntactic parsers are no longer limited to PCFG-based frameworks (Charniak, 2000; Klein and Manning, 2003; Charniak and Johnson, 2005; Petrov and Klein, 2007), but also include dependency parsers (McDonald and Pereira, 2006; Nivre and Nilsson, 2005; Sagae and Tsujii, 2007) and deep parsers (Kaplan et al., 2004; Clark and Curran, 2004; Miyao and Tsujii, 2008).
Introduction | In this paper, we present a comparative evaluation of syntactic parsers and their output representations based on different frameworks: dependency parsing , phrase structure parsing, and deep parsing. |
Syntactic Parsers and Their Representations | This paper focuses on eight representative parsers that are classified into three parsing frameworks: dependency parsing , phrase structure parsing, and deep parsing. |
Syntactic Parsers and Their Representations | 2.1 Dependency parsing |
Syntactic Parsers and Their Representations | Because the shared tasks of CoNLL-2006 and CoNLL-2007 focused on data-driven dependency parsing , it has recently been extensively studied in parsing research. |
Abstract | Previous studies of data-driven dependency parsing have shown that the distribution of parsing errors is correlated with theoretical properties of the models used for learning and inference.
Experiments | The data for the experiments are training and test sets for all thirteen languages from the CoNLL-X shared task on multilingual dependency parsing, with training sets ranging in size from 29,000 tokens (Slovene) to 1,249,000 tokens (Czech).
Experiments | The experimental results presented so far show that feature-based integration is a viable approach for improving the accuracy of both graph-based and transition-based models for dependency parsing , but they say very little about how the integration benefits |
Introduction | This is undoubtedly one of the reasons for the emergence of dependency parsers for a wide range of languages. |
Introduction | Practically all data-driven models that have been proposed for dependency parsing in recent years can be described as either graph-based or transition-based (McDonald and Nivre, 2007). |
Introduction | Both models have been used to achieve state-of-the-art accuracy for a wide range of languages, as shown in the CoNLL shared tasks on dependency parsing (Buchholz and Marsi, 2006; Nivre et al., 2007), but McDonald and Nivre (2007) showed that a detailed error analysis reveals important differences in the distribution of errors associated with the two models. |
Two Models for Dependency Parsing | This is a common constraint in many dependency parsing theories and their implementations. |
Two Models for Dependency Parsing | Graph-based dependency parsers parameterize a model over smaller substructures in order to search the space of valid dependency graphs and produce the most likely one. |
Two Models for Dependency Parsing | As a result, the dependency parsing problem is written: |
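The arc-factored decomposition that underlies such graph-based models scores a tree as the sum of its arc scores. A minimal sketch follows; the sentence and the score table are hypothetical illustrations, not taken from any of the papers quoted here:

```python
# Arc-factored dependency parsing: the score of a tree decomposes over
# its arcs, so Score(x, y) = sum of s(h, m) over each arc h -> m.
# Index 0 is the artificial root; words are 1-indexed.

def tree_score(heads, arc_score):
    """heads[m] is the head of word m (heads[0] is unused)."""
    return sum(arc_score[(heads[m], m)] for m in range(1, len(heads)))

# Hypothetical arc scores for "John saw Mary" (1=John, 2=saw, 3=Mary).
arc_score = {
    (0, 2): 10, (2, 1): 8, (2, 3): 7,   # correct tree: root->saw, saw->John, saw->Mary
    (0, 1): 2, (1, 2): 3, (3, 2): 1,
}

heads = [0, 2, 0, 2]                     # saw is the root's only child
print(tree_score(heads, arc_score))      # 25
```

Because the score factors over single arcs, the search for the best tree reduces to a maximum spanning arborescence problem over the arc-score graph.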
Abstract | We present a novel semi-supervised training algorithm for learning dependency parsers . |
Abstract | To demonstrate the benefits of this approach, we apply the technique to learning dependency parsers from combined labeled and unlabeled corpora. |
Dependency Parsing Model | This formulation is sufficiently general to capture most dependency parsing models, including probabilistic dependency models (Eisner, 1996; Wang et al., 2005) as well as non-probabilistic models (McDonald et al., 2005a). |
Efficient Optimization Strategy | This procedure works efficiently on the task of training a dependency parser . |
Experimental Results | algorithm for achieving a global optimum, we now investigate its effectiveness for dependency parsing . |
Experimental Results | We applied the resulting algorithm to learn dependency parsers for both English and Chinese. |
Introduction | Supervised learning algorithms still represent the state of the art approach for inferring dependency parsers from data (McDonald et al., 2005a; McDonald and Pereira, 2006; Wang et al., 2007). |
Introduction | As we will demonstrate below, this approach admits an efficient training procedure that can find a global minimum, and, perhaps surprisingly, can systematically improve the accuracy of supervised training approaches for learning dependency parsers . |
Introduction | ing semi-supervised convex objective to dependency parsing , and obtain significant improvement over the corresponding supervised structured SVM. |
Supervised Structured Large Margin Training | This approach corresponds to the training problem posed in (McDonald et al., 2005a) and has yielded the best published results for English dependency parsing . |
Abstract | In this paper, we propose a novel method for semi-supervised learning of non-projective log-linear dependency parsers using directly expressed linguistic prior knowledge (e.g. |
Generalized Expectation Criteria | In the following sections we apply GE to non-projective CRF dependency parsing . |
Generalized Expectation Criteria | We first consider an arbitrarily structured conditional random field (Lafferty et al., 2001) pλ(y|x). We describe the CRF for non-projective dependency parsing in Section 3.2.
Generalized Expectation Criteria | We now define a CRF pλ(y|x) for unlabeled, non-projective dependency parsing .
Introduction | This paper proposes a method for directly guiding the learning of dependency parsers with naturally encoded linguistic insights. |
Introduction | While a complete exploration of linguistic prior knowledge for dependency parsing is beyond the scope of this paper, we provide several promising demonstrations of the proposed method. |
Linguistic Prior Knowledge | This type of constraint was used in the development of a rule-based dependency parser (Debusmann et al., 2004).
Related Work | Smith and Eisner (2007) apply entropy regularization to dependency parsing . |
Related Work | There are also a number of methods for unsupervised learning of dependency parsers . |
Related Work | Klein and Manning (2004) use a carefully initialized and structured generative model (DMV) in conjunction with the EM algorithm to get the first positive results on unsupervised dependency parsing . |
Abstract | (2005b) formalized dependency parsing as a maximum spanning tree (MST) problem, which can be solved in quadratic time relative to the length of the sentence. |
Abstract | They show that MST parsing is almost as accurate as cubic-time dependency parsing in the case of English, and that it is more accurate with free word order languages. |
Dependency parsing for machine translation | In this section, we review dependency parsing formulated as a maximum spanning tree problem (McDonald et al., 2005b), which can be solved in quadratic time, and then present its adaptation and novel application to phrase-based decoding. |
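The MST formulation reviewed here seeks the highest-scoring arborescence rooted at an artificial root node; the Chu-Liu/Edmonds algorithm solves this in quadratic time. The brute-force sketch below (with hypothetical scores) only makes the objective explicit for a tiny sentence, and is not the quadratic-time algorithm itself:

```python
from itertools import product

def is_arborescence(heads):
    """heads[m] = head of node m (m = 1..n); node 0 is the root.
    Valid iff every node reaches 0 without hitting a cycle."""
    n = len(heads) - 1
    for m in range(1, n + 1):
        seen, cur = set(), m
        while cur != 0:
            if cur in seen:
                return False
            seen.add(cur)
            cur = heads[cur]
    return True

def mst_parse(n, score):
    """Exhaustive maximum spanning arborescence over n words
    (illustration only; Chu-Liu/Edmonds does this in O(n^2))."""
    best, best_heads = float("-inf"), None
    for assignment in product(range(0, n + 1), repeat=n):
        heads = [0] + list(assignment)
        if any(heads[m] == m for m in range(1, n + 1)):
            continue                     # no self-heads
        if not is_arborescence(heads):
            continue
        s = sum(score[(heads[m], m)] for m in range(1, n + 1))
        if s > best:
            best, best_heads = s, heads
    return best_heads, best

# Hypothetical scores for a 2-word sentence.
score = {(0, 1): 5, (0, 2): 1, (1, 2): 4, (2, 1): 3}
print(mst_parse(2, score))   # ([0, 0, 1], 9)
```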
Dependency parsing for machine translation | For Czech, the non-projective parser of McDonald et al. (2005b) even outperforms projective parsing, and was one of the top systems in the CoNLL-06 shared task on multilingual dependency parsing .
Dependency parsing for machine translation | 2.1 O(n²)-time dependency parsing for MT
Introduction | While deterministic parsers are often deemed inadequate for dealing with ambiguities of natural language, highly accurate O(n²) algorithms exist in the case of dependency parsing .
Introduction | (2005b) present a quadratic-time dependency parsing algorithm that is just 0.7% less accurate than “full-fledged” chart parsing (which, in the case of dependency parsing , runs in time O(n³) (Eisner, 1996)).
Introduction | Most interestingly, the time complexity of non-projective dependency parsing remains quadratic as the order of the language model increases. |
Abstract | In this paper we describe an intuitionistic method for dependency parsing , where a classifier is used to determine whether a pair of words forms a dependency edge. |
Abstract | Experiments show that the classifier trained on the projected classification instances significantly outperforms previous projected dependency parsers .
Abstract | More importantly, when this classifier is integrated into a maximum spanning tree (MST) dependency parser , a clear improvement is obtained over the MST baseline.
Introduction | Supervised dependency parsing achieves the state-of-the-art in recent years (McDonald et al., 2005a; McDonald and Pereira, 2006; Nivre et al., 2006). |
Introduction | Examples include unsupervised dependency parsing (Klein and Manning, 2004), which is based entirely on unannotated data, and semi-supervised dependency parsing (Koo et al., 2008), which is based on both annotated and unannotated data.
Introduction | Meanwhile, we propose an intuitionistic model for dependency parsing , which uses a classifier to determine whether a pair of words forms a dependency edge.
Abstract | We formulate the problem of non-projective dependency parsing as a polynomial-sized integer linear program. |
Dependency Parsing | Let us first describe formally the set of legal dependency parse trees. |
Dependency Parsing | We define the set of legal dependency parse trees of x (denoted Y(x)) as the set of 0-arborescences of D, i.e., we admit each arborescence as a potential dependency tree.
Dependency Parsing | In this paper, we consider unlabeled dependency parsing , where only the backbone structure (i.e., the arcs without the labels depicted in Fig.
Dependency Parsing as an ILP | Riedel and Clarke (2006) proposed an ILP formulation for dependency parsing which refines the arc-factored model by imposing linguistically motivated “hard” constraints that forbid some arc configurations. |
Introduction | Much attention has recently been devoted to integer linear programming (ILP) formulations of NLP problems, with interesting results in applications like semantic role labeling (Roth and Yih, 2005; Punyakanok et al., 2004), dependency parsing (Riedel and Clarke, 2006), word alignment for machine translation (Lacoste-Julien et al., 2006), summarization (Clarke and Lapata, 2008), and coreference resolution (Denis and Baldridge, 2007), among others. |
Introduction | Riedel and Clarke (2006) cast dependency parsing as an ILP, but efficient formulations remain an open problem.
Abstract | We present a novel transition system for dependency parsing , which constructs arcs only between adjacent words but can parse arbitrary non-projective trees by swapping the order of words in the input. |
Background Notions 2.1 Dependency Graphs and Trees | Following Nivre (2008a), we define a transition system for dependency parsing as a quadruple S = (C, T, cs, Ct), where |
Background Notions 2.1 Dependency Graphs and Trees | Figure 2: Transitions for dependency parsing ; Tp = |
Introduction | However, one problem that still has not found a satisfactory solution in data-driven dependency parsing is the treatment of discontinuous syntactic constructions, usually modeled by non-projective |
Introduction | Current approaches to data-driven dependency parsing typically use one of two strategies to deal with non-projective trees (unless they ignore them completely). |
Introduction | In Section 2, we define the formal representations needed and introduce the framework of transition-based dependency parsing . |
Transitions for Dependency Parsing | Having defined the set of configurations, including initial and terminal configurations, we will now focus on the transition set T required for dependency parsing . |
Transitions for Dependency Parsing | 3.1 Projective Dependency Parsing |
Transitions for Dependency Parsing | The minimal transition set Tp for projective dependency parsing contains three transitions: |
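Assuming the standard arc-standard instantiation of these three transitions (SHIFT, LEFT-ARC, RIGHT-ARC), a minimal parser driven by a static oracle can be sketched as follows; the oracle and the example sentence are illustrative stand-ins, not the paper's exact system:

```python
def arc_standard_parse(n, gold_heads):
    """Parse words 1..n with arc-standard transitions, using a static
    oracle derived from gold_heads (gold_heads[m] = head of word m,
    0 = artificial root). Returns the predicted head array."""
    stack, buffer = [0], list(range(1, n + 1))
    heads = [0] * (n + 1)
    attached = [0] * (n + 1)              # children attached so far
    n_children = [0] * (n + 1)
    for m in range(1, n + 1):
        n_children[gold_heads[m]] += 1

    while buffer or len(stack) > 1:
        if len(stack) >= 2:
            top, below = stack[-1], stack[-2]
            # LEFT-ARC: top is the head of the item below it
            if below != 0 and gold_heads[below] == top:
                heads[below] = top
                attached[top] += 1
                stack.pop(-2)
                continue
            # RIGHT-ARC: only once top has collected all its children
            if gold_heads[top] == below and attached[top] == n_children[top]:
                heads[top] = below
                attached[below] += 1
                stack.pop()
                continue
        stack.append(buffer.pop(0))       # SHIFT

    return heads

# "John saw Mary": saw (2) heads John (1) and Mary (3), root -> saw.
print(arc_standard_parse(3, [0, 2, 0, 2]))   # [0, 2, 0, 2]
```

The RIGHT-ARC guard (waiting until all children are attached) reflects the bottom-up nature of the arc-standard model: a node is reduced only after its own subtree is complete.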
Abstract | The dependency backbone of an HPSG analysis is used to provide general linguistic insights which, when combined with state-of-the-art statistical dependency parsing models, achieve performance improvements on out-of-domain tests.
Dependency Parsing with HPSG | In this section, we explore two possible applications of HPSG parsing to the syntactic dependency parsing task.
Dependency Parsing with HPSG | One is to extract a dependency backbone from the HPSG analyses of the sentences and directly convert it into the target representation; the other is to encode the HPSG outputs as additional features in the existing statistical dependency parsing models.
Dependency Parsing with HPSG | Besides directly using the dependency backbone of the HPSG output, we could also use it for building feature-based models of statistical dependency parsers .
Introduction | Syntactic dependency parsing has attracted increasing research attention in recent years, partially due to its theory-neutral representation, but also thanks to its wide deployment in various NLP tasks (machine translation, textual entailment recognition, question answering, information extraction, etc.).
Introduction | In combination with machine learning methods, several statistical dependency parsing models have reached comparably high parsing accuracy (McDonald et al., 2005b; Nivre et al., 2007b).
Introduction | In the meantime, the successful continuation of the CoNLL Shared Tasks since 2006 (Buchholz and Marsi, 2006; Nivre et al., 2007a; Surdeanu et al., 2008) has shown how easy it has become to train a statistical syntactic dependency parser provided that an annotated treebank is available.
Parser Domain Adaptation | In recent years, two statistical dependency parsing systems, MaltParser (Nivre et al., 2007b) and MSTParser (McDonald et al., 2005b), representing different threads of research in data-driven machine learning approaches, have gained wide attention for their state-of-the-art performance in open competitions such as the CoNLL Shared Tasks.
Parser Domain Adaptation | In addition, most previous work has focused on constituent-based parsing, while domain adaptation for dependency parsing has not been fully explored.
Parser Domain Adaptation | Figure 1: Different dependency parsing models and their combinations. |
Abstract | This paper proposes an approach to enhance dependency parsing in a language by using a translated treebank from another language. |
Introduction | Although supervised learning methods yield state-of-the-art results for dependency parser induction (McDonald et al., 2005; Hall et al., 2007), such methods often require a sufficiently large data set to reach a given parsing accuracy.
Introduction | Since different human languages and treebanks share common properties, dependency parsing in multiple languages can be made mutually beneficial.
Introduction | In this paper, we study how to improve dependency parsing by using (automatically) translated texts attached with transformed dependency information. |
The Related Work | Even if the translation outputs are not as good as expected, a dependency parser for the
The Related Work | However, although it is not essentially different, we focus only on dependency parsing itself, while the parsing scheme in (Burkett and Klein, 2008) is based on a constituent representation.
Treebank Translation and Dependency Transformation | how a translated English treebank enhances a Chinese dependency parser . |
Abstract | We present algorithms for higher-order dependency parsing that are “third-order” in the sense that they can evaluate substructures containing three dependencies, and “efficient” in the sense that they require only O(n⁴) time.
Conclusion | A second area for future work lies in applications of dependency parsing . |
Dependency parsing | For a sentence x, we define dependency parsing as a search for the highest-scoring analysis of x:
Existing parsing algorithms | Our new third-order dependency parsers build on ideas from existing parsing algorithms. |
Introduction | Consequently, recent work in dependency parsing has been restricted to applications of second-order parsers, the most powerful of which (Carreras, 2007) requires O(n⁴) time and O(n³) space, while being limited to second-order parts.
New third-order parsing algorithms | In this section we describe our new third-order dependency parsing algorithms. |
New third-order parsing algorithms | As a final note, the parsing algorithms described in this section fall into the category of projective dependency parsers , which forbid crossing dependencies. |
Parsing experiments | Following standard practice for higher-order dependency parsing (McDonald and Pereira, 2006; Carreras, 2007), Models 1 and 2 evaluate not only the relevant third-order parts, but also the lower-order parts that are implicit in their third-order factorizations. |
Parsing experiments | For example, Model 1 defines feature mappings for dependencies, siblings, grandchildren, and grand-siblings, so that the score of a dependency parse is given by: |
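A factored score of this kind can be sketched as a sum over part scores; the part-score functions below are hypothetical stand-ins for the feature mappings, not the paper's actual features:

```python
def model1_score(heads, s_dep, s_sib, s_gch):
    """Score a head array (heads[m] = head of word m, 0 = root) as a
    sum of dependency, adjacent-sibling, and grandchild part scores,
    mimicking the lower-order parts of a third-order factorization."""
    n = len(heads) - 1
    total = 0.0
    children = {h: [] for h in range(n + 1)}
    for m in range(1, n + 1):
        total += s_dep(heads[m], m)            # first-order parts
        children[heads[m]].append(m)
    for h, ms in children.items():
        for a, b in zip(ms, ms[1:]):
            total += s_sib(h, a, b)            # adjacent-sibling parts
        for m in ms:
            for c in children[m]:
                total += s_gch(h, m, c)        # grandchild parts
    return total

# Hypothetical constant part scores for "John saw Mary" (root -> saw).
total = model1_score([0, 2, 0, 2],
                     s_dep=lambda h, m: 1,
                     s_sib=lambda h, a, b: 10,
                     s_gch=lambda h, m, c: 100)
print(total)   # 213.0  (3 deps + 1 sibling pair + 2 grandchildren)
```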
Related work | Eisner (2000) defines dependency parsing models where each word has a set of possible “senses” and the parser recovers the best joint assignment of syntax and senses. |
Abstract | In this paper, we present an approach to enriching high—order feature representations for graph-based dependency parsing models using a dependency language model and beam search. |
Analysis | Dependency parsers tend to perform worse on heads which have many children. |
Conclusion | We have presented an approach to utilizing the dependency language model to improve graph-based dependency parsing . |
Introduction | In recent years, many data-driven models have been proposed for dependency parsing (McDonald and Nivre, 2007).
Introduction | Among them, graph-based dependency parsing models have achieved state-of-the-art performance for a wide range of languages, as shown in recent CoNLL shared tasks
Introduction | In the graph-based models, dependency parsing is treated as a structured prediction problem in which the graphs are usually represented as factored structures. |
Related work | (2008) used a clustering algorithm to produce word clusters on a large amount of unannotated data and represented new features based on the clusters for dependency parsing models. |
Related work | (2009) proposed an approach that extracted partial tree structures from a large amount of data and used them as the additional features to improve dependency parsing . |
Related work | They extended a Semi-supervised Structured Conditional Model (SS-SCM)(Suzuki and Isozaki, 2008) to the dependency parsing problem and combined their method with the approach of Koo et al. |
Abstract | Shift-reduce dependency parsers give comparable accuracies to their chart-based counterparts, yet the best shift-reduce constituent parsers still lag behind the state-of-the-art. |
Improved hypotheses comparison | Unlike dependency parsing , constituent parse trees for the same sentence can have different numbers of nodes, mainly due to the existence of unary nodes. |
Introduction | Various methods have been proposed to address the disadvantages of greedy local parsing, among which a framework of beam-search and global discriminative training have been shown effective for dependency parsing (Zhang and Clark, 2008; Huang and Sagae, 2010). |
Introduction | With the use of rich nonlocal features, transition-based dependency parsers achieve state-of-the-art accuracies that are comparable to the best graph-based parsers (Zhang and Nivre, 2011; Bohnet and Nivre, 2012).
Introduction | In addition, processing tens of sentences per second (Zhang and Nivre, 2011), these transition-based parsers can be a favorable choice for dependency parsing . |
Semi-supervised Parsing with Large Data | Word clusters are regarded as lexical intermediaries for dependency parsing (Koo et al., 2008) and POS tagging (Sun and Uszkoreit, 2012). |
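Cluster features of this kind are commonly derived from hierarchical (Brown-style) cluster bit strings by taking prefixes at several granularities; the word-to-bit-string mapping below is hypothetical:

```python
def cluster_features(word, clusters, prefix_lengths=(4, 6, 10)):
    """Return cluster-prefix features for a word, given a mapping from
    words to hierarchical cluster bit strings (hypothetical data)."""
    bits = clusters.get(word)
    if bits is None:
        return ["cluster=UNK"]
    return ["cluster%d=%s" % (k, bits[:k]) for k in prefix_lengths]

# Hypothetical Brown-style bit strings: similar words share prefixes.
clusters = {"bank": "0010110110", "river": "0010110111"}
print(cluster_features("bank", clusters))
# ['cluster4=0010', 'cluster6=001011', 'cluster10=0010110110']
```

Short prefixes act as coarse word classes and long prefixes as fine-grained ones, which is what lets the parser back off from sparse lexical features.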
Semi-supervised Parsing with Large Data | The idea of exploiting lexical dependency information from auto-parsed data has been explored before for dependency parsing (Chen et al., 2009) and constituent parsing (Zhu et al., 2012). |
Semi-supervised Parsing with Large Data | (2008) and is used as additional information for graph-based dependency parsing in Chen et al. |
Abstract | We present a novel transition-based, greedy dependency parser which implements a flexible mix of bottom-up and top-down strategies. |
Concluding Remarks | In the context of transition-based dependency parsers , right spines have also been exploited by Kitagawa and Tanaka-Ishii (2010) to decide where to attach the next word from the buffer.
Dependency Parser | Transition-based dependency parsers use a stack data structure, where each stack element is associated with a tree spanning some (contiguous) substring of the input string w.
Dependency Parser | We assume the reader is familiar with the formal framework of transition-based dependency parsing originally introduced by Nivre (2003); see Nivre (2008) for an introduction. |
Introduction | This development is probably due to many factors, such as the increased availability of dependency treebanks and the perceived usefulness of dependency structures as an interface to downstream applications, but a very important reason is also the high efficiency offered by dependency parsers , enabling web-scale parsing with high throughput. |
Introduction | However, while these parsers are capable of processing tens of thousands of tokens per second with the right choice of classifiers, they are also known to perform slightly below the state-of-the-art because of search errors and subsequent error propagation (McDonald and Nivre, 2007), and recent research on transition-based dependency parsing has therefore explored different ways of improving their accuracy. |
Model and Training | Standard transition-based dependency parsers are trained by associating each gold tree with a canonical complete computation. |
Model and Training | In the context of dependency parsing , the strategy of delaying arc construction when the current configuration is not informative is called the easy-first strategy, and has been first explored by Goldberg and Elhadad (2010). |
Static vs. Dynamic Parsing | In the context of dependency parsing , a parsing strategy is called purely bottom-up if every dependency h —> d is constructed only after all dependencies of the form d —> i have been constructed. |
Static vs. Dynamic Parsing | If we consider transition-based dependency parsing (Nivre, 2008), the purely bottom-up strategy is implemented by the arc-standard model of Nivre (2004). |
Catenae as semantic units | Fig. 1 shows a dependency parse that generates 21 catenae in total (using i for xi): 1, 2, 3, 4, 5, 6, 12, 23, 34, 45, 56, 123, 234, 345, 456, 1234, 2345, 3456, 12345, 23456, 123456.
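A catena is any set of words that forms a connected subgraph of the dependency tree; for a six-word chain this yields exactly the 21 segments enumerated above. A brute-force sketch (the chain structure is assumed for the example):

```python
from itertools import combinations

def catenae(heads):
    """Enumerate all catenae (connected subgraphs) of a dependency
    tree, given heads[m] = head of word m (0 = artificial root)."""
    n = len(heads) - 1
    result = []
    for size in range(1, n + 1):
        for nodes in combinations(range(1, n + 1), size):
            node_set = set(nodes)
            # A node set is connected iff exactly one of its nodes has
            # its head outside the set (that node is the subset root).
            outside = [m for m in nodes if heads[m] not in node_set]
            if len(outside) == 1:
                result.append(nodes)
    return result

# A 6-word chain: each word's head is the previous word (1 under root).
chain = [0, 0, 1, 2, 3, 4, 5]
print(len(catenae(chain)))   # 21
```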
Catenae as semantic units | This highlights the fact that a single dependency parse may only partially represent the ambiguous semantics of a query. |
Conclusion | We presented a flexible implementation of dependency paths for long queries in ad hoc IR that does not require dependency parsing of the collection.
Introduction | These approaches are motivated by the idea that sentence meaning can be flexibly captured by the syntactic and semantic relations between words, and encoded in dependency parse tree fragments. |
Selection method for catenae | We use a pseudo-projective joint dependency parsing and semantic role labelling system (Johansson and Nugues, 2008) to generate the dependency parse .
Selection method for catenae | However, any dependency parser may be applied instead. |
Abstract | This paper proposes a dependency parsing method that uses bilingual constraints to improve the accuracy of parsing bilingual texts (bitexts). |
Dependency parsing | For dependency parsing , there are two main types of parsing models (Nivre and McDonald, 2008; Nivre and Kübler, 2006): transition-based (Nivre, 2003; Yamada and Matsumoto, 2003) and graph-based (McDonald et al., 2005; Carreras, 2007).
Dependency parsing | Figure 3 shows an example of dependency parsing . |
Experiments | Table 2: Dependency parsing results of Chinese-source case |
Experiments | Table 3: Dependency parsing results of English-source case |
Introduction | This paper proposes a dependency parsing method, which uses the bilingual constraints that we call bilingual subtree constraints and statistics concerning the constraints estimated from large unlabeled monolingual corpora. |
Introduction | The result is used as additional features for the source side dependency parser . |
Introduction | Section 3 introduces the background of dependency parsing . |
Conclusions | Additionally, this work introduces a parser-centric view of normalization, in which the performance of the normalizer is directly tied to the performance of a downstream dependency parser . |
Conclusions | Using this metric, this work established that, when dependency parsing is the goal, typical word-to-word normalization approaches are insufficient. |
Discussion | The results in Section 5.2 establish a point that has often been assumed but, to the best of our knowledge, has never been explicitly shown: performing normalization is indeed beneficial to dependency parsing on informal text. |
Evaluation | The goal is to evaluate the framework in two aspects: (1) usefulness for downstream applications (specifically dependency parsing ), and (2) domain adaptability. |
Evaluation | We then run an off-the-shelf dependency parser on the gold standard normalized data to produce our gold standard parses. |
Evaluation | These results validate the hypothesis that simple word-to-word normalization is insufficient if the goal of normalization is to improve dependency parsing ; even if a system could produce perfect word-to-word normalization, it would produce lower quality parses than those produced by our approach. |
Introduction | To address this problem, this work introduces an evaluation metric that ties normalization performance directly to the performance of a downstream dependency parser . |
Abstract | Finding a class of structures that is rich enough for adequate linguistic representation yet restricted enough for efficient computational processing is an important problem for dependency parsing . |
Introduction | One of the unresolved issues in this area is the proper treatment of non-projective dependency trees, which seem to be required for an adequate representation of predicate-argument structure, but which undermine the efficiency of dependency parsing (Neuhaus and Broker, 1997; Buch-Kromann, 2006; McDonald and Satta, 2007). |
Introduction | This was originally proposed by Yli-Jyrä (2003) but has so far played a marginal role in the dependency parsing literature, because no algorithm was known for determining whether an arbitrary tree was m-planar, and no parsing algorithm existed for any constant value of m. The contribution of this paper is twofold.
Parsing 1-Planar Structures | In the transition-based framework of Nivre (2008), a deterministic dependency parser is defined by a nondeterministic transition system, specifying a set of elementary operations that can be executed during the parsing process, and an oracle that deterministically selects a single transition at each choice point of the parsing process. |
Parsing 1-Planar Structures | A transition system for dependency parsing is a quadruple S = (C, T, cs, Ct), where
Preliminaries | For reasons of computational efficiency, many dependency parsers are restricted to work with projective dependency structures, that is, forests in which the projection of each node corresponds to a contiguous substring of the input: |
Preliminaries | The concept of planarity on its own does not seem to be very relevant as an extension of projectivity for practical dependency parsing . |
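The projectivity condition described above (every node's projection corresponds to a contiguous substring) can be checked arc by arc; a minimal sketch using a standard head-array encoding (the example trees are hypothetical):

```python
def is_projective(heads):
    """Check projectivity: for every arc h -> m, every word strictly
    between h and m must be a descendant of h. heads[m] = head of
    word m, with 0 as the artificial root."""
    n = len(heads) - 1

    def dominates(h, m):
        while m != 0:
            if m == h:
                return True
            m = heads[m]
        return h == 0

    for m in range(1, n + 1):
        h = heads[m]
        lo, hi = min(h, m), max(h, m)
        for k in range(lo + 1, hi):
            if not dominates(h, k):
                return False
    return True

print(is_projective([0, 2, 0, 2]))       # True
print(is_projective([0, 3, 0, 2, 1]))    # False: arc 3 -> 1 crosses arc 0 -> 2
```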
Abstract | Even though the quality of unsupervised dependency parsers continues to improve, they often fail to recognize very basic dependencies.
Conclusions and Future Work | We proved that such prior knowledge about stop-probabilities, incorporated into the standard DMV model, significantly improves unsupervised dependency parsing ; since we are not aware of any other fully unsupervised dependency parser with a higher average attachment score over the CoNLL data, we claim a new state-of-the-art result.
Conclusions and Future Work | We suppose that many of the current works on unsupervised dependency parsers use gold POS tags only as a simplification of this task, and that the ultimate purpose of this effort is to develop a fully unsupervised induction of linguistic structure from raw texts that would be useful across many languages, domains, and applications. |
Introduction | The task of unsupervised dependency parsing (which strongly relates to the grammar induction task) has become popular in the last decade, and its quality has been greatly increasing during this period. |
Related Work | We have directly utilized the aforementioned criteria for dependency relations in unsupervised dependency parsing in our previous paper (Mareček and Žabokrtský, 2012).
Related Work | Dependency Model with Valence (DMV) has been the most popular approach to unsupervised dependency parsing in the recent years. |
Related Work | Other approaches to unsupervised dependency parsing were described e.g. |
Dependency Parsing | Given an input sentence x = w0w1...wn and its POS tag sequence t = t0t1...tn, the goal of dependency parsing is to build a dependency tree as depicted in Figure 1, denoted by d = {(h, m, l) : 0 ≤ h ≤ n, 0 < m ≤ n, l ∈ L}, where (h, m, l) indicates a directed arc from the head word (also called father) wh to the modifier (also called child or dependent) wm with a dependency label l, and L is the label set.
Dependency Parsing | We omit the label l because we focus on unlabeled dependency parsing in the present paper. |
Experiments and Analysis | (2011) show that a joint POS tagging and dependency parsing model can significantly improve parsing accuracy over a pipeline model. |
Introduction | CTB5 is converted to dependency structures following the standard practice of dependency parsing (Zhang and Clark, 2008b).
Related Work | These features are tailored to the dependency parsing problem. |
Related Work | Our approach is also intuitively related to stacked learning (SL), a machine learning framework that has recently been applied to dependency parsing to integrate two mainstream parsing models, i.e., graph-based and transition-based models (Nivre and McDonald, 2008; Martins et al., 2008).
Experiments | (2011), which additionally uses the Chinese Gigaword Corpus; Li ’11 denotes a generative model that can perform word segmentation, POS tagging and phrase-structure parsing jointly (Li, 2011); Li+ ’12 denotes a unified dependency parsing model that can perform joint word segmentation, POS tagging and dependency parsing (Li and Zhou, 2012); Li ’11 and Li+ ’12 exploited annotated morphological-level word structures for Chinese; Hatori+ ’12 denotes an incremental joint model for word segmentation, POS tagging and dependency parsing (Hatori et al., 2012); they use external dictionary resources including HowNet Word List and page names from the Chinese Wikipedia; Qian+ ’12 denotes a joint segmentation, POS tagging and parsing system using a unified framework for decoding, incorporating a word segmentation model, a POS tagging model and a phrase-structure parsing model together (Qian and Liu, 2012); their word segmentation model is a combination of character-based model and word-based model. |
Related Work | Zhao (2009) studied character-level dependencies for Chinese word segmentation by formalizing the segmentation task in a dependency parsing framework.
Related Work | Li and Zhou (2012) also exploited the morphological-level word structures for Chinese dependency parsing . |
Related Work | (2012) proposed the first joint model for word segmentation, POS tagging and dependency parsing.
Word Structures and Syntax Trees | They studied the influence of such morphology on Chinese dependency parsing (Li and Zhou, 2012).
Conclusion | A comparison of the parsing accuracy with previous works on Japanese dependency parsing and English CCG parsing indicates that our parser can analyze real-world Japanese texts fairly well and that there is room for improvement in disambiguation models. |
Evaluation | The integrated corpus is divided into training, development, and final test sets following the standard data split in previous works on Japanese dependency parsing (Kudo and Matsumoto, 2002). |
Evaluation | Following conventions in research on Japanese dependency parsing , gold morphological analysis results were input to a parser. |
Evaluation | Comparing the parser’s performance with previous works on Japanese dependency parsing is difficult as our figures are not directly comparable to theirs. |
Introduction | Syntactic parsing for Japanese has been dominated by a dependency-based pipeline in which chunk-based dependency parsing is applied and then semantic role labeling is performed on the dependencies (Sasano and Kurohashi, 2011; Kawahara and Kurohashi, 2011; Kudo and Matsumoto, 2002; Iida and Poesio, 2011; Hayashibe et al., 2011). |
Evaluation | The news articles have been processed with a tokenizer, a sentence splitter (Gillick and Favre, 2009), a part-of-speech tagger and dependency parser (Nivre, 2006), a co-reference resolution module (Haghighi and Klein, 2009) and an entity linker based on Wikipedia and Freebase (Milne and Witten, 2008).
Heuristics-based pattern extraction | (2013), who built well-formed relational patterns by extending minimum spanning trees (MST) which connect entity mentions in a dependency parse.
Introduction | Algorithm 1 HEURISTICEXTRACTOR(T, E): heuristically extract relational patterns for the dependency parse T and the set of entities E. 1: /* Global constants */ 2: global Vp, Vc, Np, Nc 3: Vc ← {subj, nsubj, nsubjpass, dobj, iobj, xcomp, acomp, expl, neg, aux, attr, prt} 5: Vp ← {xcomp}
Memory-based pattern extraction | To this end, we build a trie of dependency trees (which we call a tree-trie) by scanning all the dependency parses in the news training |
Memory-based pattern extraction | As a result, we are able to load a trie encoding 400M input dependency parses , 170M distinct nodes and 48M distinct sentence structures in under 10GB of RAM. |
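The tree-trie described in the excerpts above shares structure across dependency parses whose linearizations begin with the same prefix, which is how hundreds of millions of parses fit in memory. A minimal sketch of the idea; the linearization scheme, class name, and `"$end"` marker are illustrative assumptions, not the authors' implementation:

```python
# Minimal sketch of a "tree-trie": a trie over linearized dependency parses,
# so that parses sharing a prefix of their linearization share trie nodes.
# The (token, relative head offset, label) linearization is an assumption.

class TreeTrie:
    def __init__(self):
        self.root = {}
        self.num_parses = 0

    @staticmethod
    def linearize(parse):
        # parse: list of (token, head_index, label); head_index -1 = root.
        # Relative head offsets let structurally identical fragments share keys.
        return tuple((tok, head - i if head >= 0 else None, lab)
                     for i, (tok, head, lab) in enumerate(parse))

    def insert(self, parse):
        node = self.root
        for key in self.linearize(parse):
            node = node.setdefault(key, {})
        node["$end"] = node.get("$end", 0) + 1  # count complete parses ending here
        self.num_parses += 1

    def count_nodes(self, node=None):
        # Number of distinct trie nodes, i.e., shared parse fragments.
        node = self.root if node is None else node
        return sum(1 + self.count_nodes(child)
                   for k, child in node.items() if k != "$end")

trie = TreeTrie()
# Two parses sharing the prefix "the" with identical attachment:
trie.insert([("the", 1, "det"), ("dog", -1, "root")])
trie.insert([("the", 1, "det"), ("dog", 2, "nsubj"), ("barks", -1, "root")])
```

Here the shared `("the", +1, "det")` prefix is stored once, so the two parses need only four trie nodes in total.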
Pattern extraction by sentence compression | Instead, we chose to modify the method of Filippova and Altun (2013) because it relies on dependency parse trees and does not use any LM scoring. |
Abstract | Much of the recent work on dependency parsing has been focused on solving inherent combinatorial problems associated with rich scoring functions. |
Conclusions | Our model achieves the best results on the standard dependency parsing benchmark, outperforming parsing methods with elaborate inference procedures. |
Experimental Setup | We apply the Random Walk-based sampling method (see Section 3.2.2) for the standard dependency parsing task. |
Introduction | Dependency parsing is commonly cast as a maximization problem over a parameterized scoring function. |
Related Work | Earlier works on dependency parsing focused on inference with tractable scoring functions. |
Sampling-Based Dependency Parsing with Global Features | In this section, we introduce our novel sampling-based dependency parser which can incorporate |
Abstract | As an illustrative case, we study a generative model for dependency parsing . |
Experiments | All our experiments use the DMV for unsupervised dependency parsing of part-of-speech (POS) tag sequences. |
Introduction | We focus on the well-studied but unsolved task of unsupervised dependency parsing (i.e., depen- |
Related Work | Gimpel and Smith (2012) proposed a concave model for unsupervised dependency parsing using IBM Model 1. |
Related Work | Several integer linear programming (ILP) formulations of dependency parsing (Riedel and Clarke, 2006; Martins et al., 2009; Riedel et al., 2012) inspired our definition of grammar induction as an MP.
Related Work | For semi-supervised dependency parsing , Wang et al. |
Background and Motivation | (2011) successfully apply this idea to the transfer of dependency parsers , using part-of-speech tags as the shared representation of words. |
Evaluation | In the low-resource setting, we cannot always rely on the availability of an accurate dependency parser for the target language. |
Model Transfer | If a target language is poor in resources, one can obtain a dependency parser for the target language by means of cross-lingual model transfer (Zeman and Resnik, 2008). |
Related Work | Cross-lingual annotation projection (Yarowsky et al., 2001) approaches have been applied extensively to a variety of tasks, including POS tagging (Xi and Hwa, 2005; Das and Petrov, 2011), morphology segmentation (Snyder and Barzilay, 2008), verb classification (Merlo et al., 2002), mention detection (Zitouni and Florian, 2008), LFG parsing (Wroblewska and Frank, 2009), information extraction (Kim et al., 2010), SRL (Pado and Lapata, 2009; van der Plas et al., 2011; Annesi and Basili, 2010; Tonelli and Pianta, 2008), dependency parsing (Naseem et al., 2012; Ganchev et al., 2009; Smith and Eisner, 2009; Hwa et al., 2005) or temporal relation pre-
Results | Secondly, in the model transfer setup it is more important how closely the syntactic-semantic interface on the target side resembles that on the source side than how well it matches the “true” structure of the target language, and in this respect a transferred dependency parser may have an advantage over one trained on target-language data. |
Setup | With respect to the use of syntactic annotation we consider two options: using an existing dependency parser for the target language and obtaining one by means of cross-lingual transfer (see section 4.2). |
Coordination Structures in Treebanks | Some of the treebanks were downloaded individually from the web, but most of them came from previously published collections for dependency parsing campaigns: six languages from CoNLL-2006 (Buchholz and Marsi, 2006), seven languages from CoNLL-2007 (Nivre et al., 2007), two languages from CoNLL-2009 (Hajic and others, 2009), three languages from ICON-2010 (Husain et al., 2010). |
Introduction | In the last decade, dependency parsing has gradually received increasing attention.
Related work | MTT possesses a complex set of linguistic criteria for identifying the governor of a relation (see Mazziotta (2011) for an overview), which lead to MS. MS is preferred in the rule-based dependency parsing system of Lombardo and Lesmo (1998).
Related work | The primitive format used for CoNLL shared tasks is widely used in dependency parsing , but its weaknesses have already been pointed out (cf. |
Variations in representing coordination structures | Most state-of-the-art dependency parsers can produce labeled edges. |
Automatic Annotation Adaptation | This coincides with the stacking method for combining dependency parsers (Martins et al., 2008; Nivre and McDonald, 2008).
Automatic Annotation Adaptation | This is similar to feature design in discriminative dependency parsing (McDonald et al., 2005; Mc- |
Experiments | For example, currently, most Chinese constituency and dependency parsers are trained on some version of CTB, using its segmentation and POS tagging as the de facto standards.
Related Works | Co-training (Sarkar, 2001) and classifier combination (Nivre and McDonald, 2008) are two techniques for training improved dependency parsers.
Related Works | The classifier combination lets graph-based and transition-based dependency parsers utilize the features extracted from each other’s parsing results, to obtain combined, enhanced parsers.
Conclusion | However, the failure to uncover gains when searching across a variety of possible mechanisms for improvement, training procedures for embeddings, hyperparameter settings, tasks, and resource scenarios suggests that these gains (if they do exist) are extremely sensitive to these training conditions, and not nearly as accessible as they seem to be in dependency parsers.
Conclusion | Indeed, our results suggest a hypothesis that word embeddings are useful for dependency parsing (and perhaps other tasks) because they provide a level of syntactic abstraction which is explicitly annotated in constituency parses. |
Introduction | Dependency parsers have seen gains from distributional statistics in the form of discrete word clusters (Koo et al., 2008), and recent work (Bansal et al., 2014) suggests that similar gains can be derived from embeddings like the ones used in this paper. |
Introduction | The fact that word embedding features result in nontrivial gains for discriminative dependency parsing (Bansal et al., 2014), but do not appear to be effective for constituency parsing, points to an interesting structural difference between the two tasks. |
Introduction | We hypothesize that dependency parsers benefit from the introduction of features (like clusters and embeddings) that provide syntactic abstractions; but that constituency parsers already have access to such abstractions in the form of supervised preterminal tags. |
Experiments | A tweet-specific tokenizer (Gimpel et al., 2011) is employed, and the dependency parsing results are computed by Stanford Parser (Klein and Manning, 2003). |
Experiments | The POS tagging and dependency parsing results are not precise enough for the Twitter data, so these handcrafted rules are rarely matched. |
Introduction | (2011) combine the target-independent features (content and lexicon) and target-dependent features (rules based on the dependency parsing results) together in subjectivity classification and polarity classification for tweets. |
Our Approach | We use the dependency parsing results to find the words syntactically connected with the target of interest.
Our Approach | In Section 3.1, we show how to build a recursive structure for the target using the dependency parsing results.
Introduction | To solve the latter problem, we introduce an apparently novel O(|V|² log |V|) algorithm that is similar to the maximum spanning tree (MST) algorithms that are widely used for dependency parsing (McDonald et al., 2005).
Notation and Overview | The relation identification stage (§4) is similar to a graph-based dependency parser . |
Notation and Overview | Each stage is a discriminatively-trained linear structured predictor with rich features that make use of part-of-speech tagging, named entity tagging, and dependency parsing . |
Related Work | Our approach to relation identification is inspired by graph-based techniques for non-projective syntactic dependency parsing . |
Related Work | Minimum spanning tree algorithms—specifically, the optimum branching algorithm of Chu and Liu (1965) and Edmonds (1967)—were first used for dependency parsing by McDonald et al.
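The optimum-branching view of graph-based parsing mentioned above can be illustrated with a brute-force search over head assignments for a toy sentence. Real parsers use the efficient Chu-Liu/Edmonds algorithm; the exhaustive sketch below only shows what is being optimized, and all scores are made up for illustration:

```python
from itertools import product

def best_arborescence(scores):
    """Brute-force maximum spanning arborescence over words 1..n, root 0.

    scores[h][d] is the weight of arc h -> d. Feasible only for tiny n;
    real parsers use Chu-Liu/Edmonds (as in McDonald et al., 2005).
    """
    n = len(scores) - 1  # number of words, excluding the artificial root
    best, best_heads = float("-inf"), None
    for heads in product(range(n + 1), repeat=n):  # heads[d-1] = head of word d
        # Reject self-loops.
        if any(h == d + 1 for d, h in enumerate(heads)):
            continue
        # Keep only assignments where every word is reachable from the root,
        # i.e., the arcs form a spanning arborescence (no cycles).
        reached, frontier = {0}, [0]
        while frontier:
            h = frontier.pop()
            for d, hd in enumerate(heads, start=1):
                if hd == h and d not in reached:
                    reached.add(d)
                    frontier.append(d)
        if len(reached) != n + 1:
            continue
        total = sum(scores[h][d] for d, h in enumerate(heads, start=1))
        if total > best:
            best, best_heads = total, heads
    return best, best_heads

# Toy 2-word sentence: nodes ROOT, w1, w2, with hypothetical arc scores.
scores = [[0, 5, 1],   # arcs from ROOT
          [0, 0, 8],   # arcs from w1
          [0, 2, 0]]   # arcs from w2
```

On this toy input the chain ROOT → w1 → w2 (score 5 + 8 = 13) beats attaching both words directly to the root.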
Abstract | In order to constrain the exhaustive attachments of function words, we restrict them to bind only to nearby syntactic chunks yielded by a target dependency parser.
Composed Rule Extraction | In the English-to-Japanese translation test case of the present study, the target chunk set is yielded by a state-of-the-art Japanese dependency parser , Cabocha v0.535 (Kudo and Matsumoto, 2002). |
Conclusion | In order to avoid generating too large a derivation forest for a packed forest, we further used chunk-level information yielded by a target dependency parser . |
Introduction | In order to constrain the exhaustive attachments of function words, we further limit the function words to bind to their surrounding chunks yielded by a dependency parser . |
Related Research | Thus, we focus on the realignment of target function words to source tree fragments and use a dependency parser to limit the attachments of unaligned target words. |
Introduction | (c) Knowledge for dependency parsing
Introduction | Figure 1: Natural annotations for word segmentation and dependency parsing . |
Introduction | a Chinese phrase (meaning NLP), and it probably corresponds to a connected subgraph for dependency parsing . |
Knowledge in Natural Annotations | For dependency parsing , the subsequence P tends to form a connected dependency graph if it contains more than one word. |
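The connectedness property described above is easy to test on a dependency tree: a token span induces a connected subgraph exactly when only one token in the span has its head outside the span. A small sketch, where the head-array encoding is an illustrative assumption:

```python
def is_connected_subgraph(heads, span):
    """Check whether the tokens in `span` form a connected subgraph of the
    dependency tree given by `heads` (heads[k] = head of token k, -1 = root).

    In a tree, a node set is connected iff exactly one of its nodes has a
    parent outside the set (that node is the subtree-fragment's top).
    """
    span = set(span)
    external_heads = sum(1 for k in span if heads[k] not in span)
    return external_heads == 1

# "the quick fox jumps": the -> fox, quick -> fox, fox -> jumps, jumps = root
heads = [2, 2, 3, -1]
```

The span "the quick fox" (tokens 0-2) is connected, while the discontiguous pair {the, jumps} is not.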
Related Work | While compiling related work, we found work on dependency parsing (Spitkovsky et al., 2010) that utilized parsing constraints derived from hypertext annotations to improve unsupervised dependency grammar induction.
Approach | A parallel corpus is word-level aligned using an alignment toolkit (Graca et al., 2009) and the source (English) is parsed using a dependency parser (McDonald et al., 2005). |
Conclusion | In this paper, we proposed a novel and effective learning scheme for transferring dependency parses across bitext. |
Introduction | Recently, dependency parsing has gained popularity as a simpler, computationally more efficient alternative to constituency parsing and has spurred several supervised learning approaches (Eisner, 1996; Yamada and Matsumoto, 2003a; Nivre and Nilsson, 2005; McDonald et al., 2005) as well as unsupervised induction (Klein and Manning, 2004; Smith and Eisner, 2006). |
Related Work | In this volume (Druck et al., 2009) use this framework to train a dependency parser based on constraints stated as corpus-wide expected values of linguistic rules. |
Conclusion | For instance, Kawahara and Kurohashi (2006) improved accuracy of dependency parsing based on Japanese semantic frames automatically induced from a raw corpus. |
Our Approach | 1. apply dependency parsing to a raw corpus and extract predicate-argument structures for each verb from the automatic parses, |
Our Approach | We apply dependency parsing to a large raw corpus. |
Our Approach | Then, we extract predicate-argument structures from the dependency parses . |
Experiments | Gold dependency parses were approximated by running the Stanford dependency parser over reference compressions.
Introduction | Following an assumption often used in compression systems, the compressed output in this corpus is constructed by dropping tokens from the input sentence without any paraphrasing or reordering. A number of diverse approaches have been proposed for deletion-based sentence compression, including techniques that assemble the output text under an n-gram factorization over the input text (McDonald, 2006; Clarke and Lapata, 2008) or an arc factorization over input dependency parses (Filippova and Strube, 2008; Galanis and Androutsopoulos, 2010; Filippova and Altun, 2013).
Introduction | Maximum spanning tree algorithms, commonly used in non-projective dependency parsing (McDonald et al., 2005), are not easily adaptable to this task since the maximum-weight subtree is not necessarily a part of the maximum spanning tree. |
Multi-Structure Sentence Compression | C. In addition, we define bigram indicator variables y_ij ∈ {0, 1} to represent whether a particular order-preserving bigram ⟨t_i, t_j⟩ from S is present as a contiguous bigram in C, as well as dependency indicator variables z_ij ∈ {0, 1} corresponding to whether the dependency arc t_i → t_j is present in the dependency parse of C. The score for a given compression C can now be defined to factor over its tokens, n-grams and dependencies as follows.
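Once the indicator variables are fixed, the factored objective described above is a plain sum over tokens, bigrams and dependency arcs. The sketch below shows only this decomposition; the weights, names, and dictionary encoding are illustrative assumptions, not the paper's model:

```python
def compression_score(x, y, z, tok_w, bigram_w, dep_w):
    """Score of a compression that factors over tokens, bigrams and
    dependency arcs, as in the decomposition sketched above.

    x[i]      = 1 if token i is kept in the compression C
    y[(i, j)] = 1 if (t_i, t_j) is a contiguous bigram of C
    z[(i, j)] = 1 if arc t_i -> t_j is in C's dependency parse
    """
    score = sum(x[i] * tok_w.get(i, 0.0) for i in x)
    score += sum(v * bigram_w.get(ij, 0.0) for ij, v in y.items())
    score += sum(v * dep_w.get(ij, 0.0) for ij, v in z.items())
    return score

# Hypothetical compression keeping tokens 0, 2, 3 of a longer sentence:
x = {0: 1, 2: 1, 3: 1}
y = {(0, 2): 1, (2, 3): 1}   # bigrams formed after deleting token 1
z = {(0, 3): 1}              # one dependency arc retained
tok_w = {0: 1.0, 2: 0.5, 3: 1.5}
bigram_w = {(0, 2): 0.2, (2, 3): 0.8}
dep_w = {(0, 3): 2.0}
```

In the full model these indicators are ILP variables and consistency between the three structures is enforced by linear constraints; here they are simply given.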
Experiments | The grammar for ASP contains the annotated lexicon entries and grammar rules in Sections 02-21 of CCGbank, and additional semantic entries produced using a set of dependency parse heuristics. |
Experiments | These entries are instantiated using a set of dependency parse patterns, listed in an online appendix. These patterns are applied to the training corpus, heuristically identifying verbs, prepositions, and possessives that express relations, and nouns that express categories.
Experiments | This approach trains a semantic parser by combining distant semantic supervision with syntactic supervision from dependency parses . |
Experiments | Since our system uses an off-the-shelf dependency parser, and semantic representations are obtained from simple rule-based conversion from dependency trees, there will be only one (right or wrong) interpretation in the face of ambiguous sentences.
Generating On-the-fly Knowledge | For a TH pair, apply dependency parsing and coreference resolution. |
Generating On-the-fly Knowledge | Perform rule-based conversion from dependency parses to DCS trees, which are translated to statements on abstract denotations. |
The Idea | To obtain DCS trees from natural language, we use Stanford CoreNLP for dependency parsing (Socher et al., 2013), and convert Stanford dependencies to DCS trees by pattern matching on POS tags and dependency labels. Currently we use the following semantic roles: ARG, SUBJ, OBJ, IOBJ, TIME and MOD.
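The pattern-matching conversion described above maps dependency labels to the listed semantic roles. The partial mapping below is a hypothetical sketch of what such a rule table might look like, not the authors' actual rules:

```python
# Hypothetical mapping from Stanford dependency labels to the semantic
# roles listed above (ARG, SUBJ, OBJ, IOBJ, TIME, MOD). The specific
# label-to-role pairs are illustrative assumptions.
LABEL_TO_ROLE = {
    "nsubj": "SUBJ",
    "dobj": "OBJ",
    "iobj": "IOBJ",
    "tmod": "TIME",
    "amod": "MOD",
}

def to_semantic_role(dep_label):
    # Fall back to the generic ARG role for unmapped labels.
    return LABEL_TO_ROLE.get(dep_label, "ARG")
```

A real conversion would also condition on POS tags and tree context, as the excerpt notes.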
Experience Detection | The dependency parser is used to ensure a modal marker is indeed associated with the main predicate. |
Experience Detection | In order to make a distinction, we use the dependency parser and a named-entity recognizer (Finkel et al., 2005) that can recognize person pronouns and person names. |
Experience Detection | We used the dependency parser for extracting objective cases using the direct object relation. |
Lexicon Construction | For a POS and grammatical check of a candidate sentence, we used the Stanford POS tagger (Toutanova et al., 2003) and Stanford dependency parser (Klein and Manning, 2003). |
Analysis of reference compressions | In addition, sentence compression methods that strongly depend on syntactic parsers have two problems: ‘parse error’ and ‘decoding speed.’ 44% of sentences output by a state-of-the-art Japanese dependency parser contain at least one error (Kudo and Matsumoto, 2005). |
Conclusions | We showed that our method is about 4.3 times faster than Hori’s method, which employs a dependency parser.
Experimental Evaluation | sion of Hori’s method which does not require the dependency parser . |
Results and Discussion | Our method was about 4.3 times faster than Hori’s method due to the latter’s use of a dependency parser.
Adding Linguistic Knowledge to the Monte-Carlo Framework | Given sentence y_i and its dependency parse q_i, we model the distribution over predicate labels δ_i as:
Adding Linguistic Knowledge to the Monte-Carlo Framework | The feature function used for predicate labeling, on the other hand, operates only on a given sentence and its dependency parse.
Adding Linguistic Knowledge to the Monte-Carlo Framework | It computes features which are the Cartesian product of the candidate predicate label with word attributes such as type, part-of-speech tag, and dependency parse information.
Abstract | Besides using traditional dependency parsers, we also use the dependency structures transformed from PCFG trees and predicate-argument structures (PASs) which are generated by an HPSG parser and a CCG parser.
Experiments | These results lead us to argue that the robustness of deep syntactic parsers can be advantageous in SMT compared with traditional dependency parsers . |
Gaining Dependency Structures | 2.2 Dependency parsing |
Gaining Dependency Structures | Graph-based and transition-based are two predominant paradigms for data-driven dependency parsing . |
Features | In order to generate these features we parse each sentence with the broad-coverage dependency parser MINIPAR (Lin, 1998).
Features | A dependency parse consists of a set of words and chunks (e.g. |
Implementation | Each sentence of this unstructured text is dependency parsed by MINIPAR to produce a dependency graph. |
Implementation | This chunking is restricted by the dependency parse of the sentence, however, in that chunks must be contiguous in the parse (i.e., no chunks across subtrees). |
Headline generation | PREPROCESSDATA: We start by preprocessing all the news in the news collections with a standard NLP pipeline: tokenization and sentence boundary detection (Gillick, 2009), part-of-speech tagging, dependency parsing (Nivre, 2006), co-reference resolution (Haghighi and Klein, 2009) and entity linking based on Wikipedia and Freebase. |
Headline generation | Figure 1: Pattern extraction process from an annotated dependency parse . |
Headline generation | GETMENTIONNODES: Using the dependency parse T for a sentence s, we first identify the set of nodes M_i that mention the entities in E. If T does not contain exactly one mention of each target entity in E, then the sentence is ignored.
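The GETMENTIONNODES step above can be sketched with simple token-level matching; the function name comes from the excerpt, but the single-token matching logic here is an illustrative simplification (real mentions span multiple tokens and need coreference information):

```python
def get_mention_nodes(tokens, entities):
    """Sketch of GETMENTIONNODES: for each target entity, find the parse
    nodes mentioning it; return None (sentence ignored) unless each entity
    is mentioned exactly once. Exact token matching is a simplification."""
    mentions = []
    for ent in entities:
        hits = [i for i, tok in enumerate(tokens) if tok == ent]
        if len(hits) != 1:
            return None  # zero or multiple mentions: skip this sentence
        mentions.append(hits[0])
    return mentions
```

For `["Obama", "met", "Putin", "in", "Moscow"]` and entities `["Obama", "Putin"]` this yields the node indices 0 and 2, while a sentence with a repeated entity mention is skipped.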
Experiments | from dependency parse tree) along with computing similarity in semantic spaces (using WordNet) clearly produces an improvement in the summarization quality (+1.4 improvement in ROUGE-1 F-score).
Using the Framework | Instead, we model sentences using a structured representation, i.e., its syntax structure using dependency parse trees. |
Using the Framework | We first use a dependency parser (de Marneffe et al., 2006) to parse each sentence and extract the set of dependency relations associated with the sentence.
Using the Framework | This allows us to perform approximate matching of syntactic treelets obtained from the dependency parses using semantic (WordNet) similarity. |
Experiments | In this section, we evaluate the performance of the MST dependency parser (McDonald et al., 2005b) which is trained by our bilingually-guided model on 5 languages. |
Introduction | In past decades supervised methods achieved the state-of-the-art in constituency parsing (Collins, 2003; Charniak and Johnson, 2005; Petrov et al., 2006) and dependency parsing (McDonald et al., 2005a; McDonald et al., 2006; Nivre et al., 2006; Nivre et al., 2007; Koo and Collins, 2010).
Introduction | We evaluate the final automatically-induced dependency parsing model on 5 languages. |
Introduction | In the rest of the paper, we first describe the unsupervised dependency grammar induction framework in section 2 (where the unsupervised optimization objective is given), and introduce the bilingual projection method for dependency parsing in section 3 (where the projected optimization objective is given); Then in section 4 we present the bilingually-guided induction strategy for dependency grammar (where the two objectives above are jointly optimized, as shown in Figure 1). |
Abstract | Experiments on the English Penn Treebank data and the Chinese CoNLL-06 data show that the proposed algorithm achieves comparable results with other data-driven dependency parsing algorithms. |
Conclusion | This paper presents a novel head-driven parsing algorithm and empirically shows that it is as practical as other dependency parsing algorithms. |
Introduction | The Earley prediction is tied to a particular grammar rule, but the proposed algorithm is data-driven, following the current trends of dependency parsing (Nivre, 2006; McDonald and Pereira, 2006; Koo et al., 2010).
Weighted Parsing Model | We define the set FIRST(·) for our top-down dependency parser:
Introduction | They suffice to operate on well-formed structures and produce projective dependency parse trees. |
Introduction | This is often referred to as conflict in the shift-reduce dependency parsing literature (Huang et al., 2009). |
Introduction | Unfortunately, such an oracle turns out to be non-unique even for monolingual shift-reduce dependency parsing (Huang et al., 2009).
Experiments | The default parser in the experiments is a shift-reduce dependency parser (Nivre and Scholz, 2004). |
Experiments | We convert dependency parses to constituent trees by propagating the part-of-speech tags of the head words to the corresponding phrase structures. |
Experiments | Our fast deterministic dependency parser does not generate a packed forest. |
Experiments | We used syntactic features (i.e., features obtained from the dependency parse tree of a sentence) and lexical features, and entity types, which essentially correspond to the ones developed by Mintz et al.
Knowledge-based Distant Supervision | Since two entities mentioned in a sentence do not always have a relation, we select entity pairs from a corpus when: (i) the path of the dependency parse tree between the corresponding two named entities in the sentence is no longer than 4 and (ii) the path does not contain a sentence-like boundary, such as a relative clause (Banko et al., 2007; Banko and Etzioni, 2008).
Wrong Label Reduction | We define a pattern as the entity types of an entity pair as well as the sequence of words on the path of the dependency parse tree from the first entity to the second one.
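The path-based pattern and the length-4 filter described in these excerpts can be sketched as follows, treating dependency arcs as undirected so a path always exists in a tree; the head-array encoding is an assumption for illustration:

```python
from collections import deque

def dependency_path(heads, i, j):
    """Shortest path of token indices between tokens i and j in a dependency
    tree, treating arcs as undirected. heads[k] is the head of token k
    (-1 for the root). Illustrative sketch of path-based filtering."""
    adj = {k: set() for k in range(len(heads))}
    for k, h in enumerate(heads):
        if h >= 0:
            adj[k].add(h)
            adj[h].add(k)
    prev = {i: None}
    q = deque([i])
    while q:
        u = q.popleft()
        if u == j:
            path = []
            while u is not None:   # walk back through BFS predecessors
                path.append(u)
                u = prev[u]
            return path[::-1]
        for v in adj[u]:
            if v not in prev:
                prev[v] = u
                q.append(v)
    return None  # disconnected input (cannot happen in a well-formed tree)

# "Obama visited France": heads of [Obama, visited, France]
heads = [1, -1, 1]
path = dependency_path(heads, 0, 2)              # Obama -> visited -> France
keep = path is not None and len(path) - 1 <= 4   # the length-4 constraint
```

The words along `path` (plus the entity types) would then form the pattern used for wrong-label reduction.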
Abstract | We model the problem as a joint dependency parsing and semantic role labeling task. |
Introduction | We model our problem as a joint dependency parsing and role labeling task, assuming a Bayesian generative process. |
Problem Formulation | We formalize the learning problem as a dependency parsing and role labeling problem. |
Experimental Setup | In addition to gold standard dependency parses , the dataset also contains automatic parses obtained from the MaltParser (Nivre et al., 2007). |
Learning Setting | Given a dependency parse of a sentence, our system identifies argument instances and assigns them to clusters. |
Split-Merge Role Induction | Figure 1: A sample dependency parse with dependency labels SBJ (subject), OBJ (object), NMOD (nominal modifier), OPRD (object predicative complement), PRD (predicative complement), and IM (infinitive marker).
Experiments | Preprocessing of the ACE documents: We used the Stanford parser for syntactic and dependency parsing.
Experiments | (2008) for dependency parsing . |
Introduction | Given an entity pair and a sentence containing the pair, both approaches usually start with multiple level analyses of the sentence such as tokenization, partial or full syntactic parsing, and dependency parsing . |
In our dataset only 11% of Candidate Relations are valid. | grounding, the sentence s_k, and a given dependency parse q_k of the sentence.
In our dataset only 11% of Candidate Relations are valid. | The text component features φ_c are computed over sentences and their dependency parses.
In our dataset only 11% of Candidate Relations are valid. | The Stanford parser (de Marneffe et al., 2006) was used to generate the dependency parse information for each sentence. |
Generating summary from nested tree | After the document tree is obtained, we use a dependency parser to obtain the syntactic dependency trees of sentences. |
Introduction | We propose a method of summarizing a single document that utilizes dependency between sentences obtained from rhetorical structures and dependency between words obtained from a dependency parser . |
Introduction | The sentence tree is a tree that has words as nodes and head modifier relationships between words obtained by the dependency parser as edges. |
Conclusion | Our experiments on generating surveys for Question Answering and Dependency Parsing show how surveys generated using such context information along with citation sentences have higher quality than those built using citations alone. |
Impact on Survey Generation | that contains two sets of cited papers and corresponding citing sentences, one on Question Answering (QA) with 10 papers and the other on Dependency Parsing (DP) with 16 papers. |
Introduction | Buchholz and Marsi, “CoNLL-X Shared Task on Multilingual Dependency Parsing”, CoNLL 2006
Introduction | Research in dependency parsing — computational methods to predict such representations — has increased dramatically, due in large part to the availability of dependency treebanks in a number of languages. |
Introduction | In particular, the CoNLL shared tasks on dependency parsing have provided over twenty data sets in a standardized format (Buchholz and Marsi, 2006; Nivre et al., 2007).
Towards A Universal Treebank | We use the so-called basic dependencies (with punctuation included), where every dependency structure is a tree spanning all the input tokens, because this is the kind of representation that most available dependency parsers require. |
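The representational requirement above (every dependency structure is a tree spanning all input tokens) can be checked mechanically. This validator assumes a head-array encoding with -1 marking the root, which is an illustrative convention rather than the treebank's actual file format:

```python
def is_valid_dependency_tree(heads):
    """Verify that `heads` encodes a tree spanning all tokens: exactly one
    root (head -1), every head index in range with no self-loops, and no
    cycles when following head pointers up to the root."""
    n = len(heads)
    if sum(1 for h in heads if h == -1) != 1:
        return False  # zero or multiple roots
    if any((h != -1 and not (0 <= h < n)) or h == k
           for k, h in enumerate(heads)):
        return False  # out-of-range head or self-loop
    for k in range(n):
        seen = set()
        while k != -1:           # walk head pointers; must terminate at root
            if k in seen:
                return False     # cycle detected
            seen.add(k)
            k = heads[k]
    return True

# "the dog barks": the -> dog? No: the -> dog is [1, -1, 1] with "dog" as root.
```

Such a check is useful when converting constituency treebanks to dependencies, where conversion bugs typically surface as multiple roots or cycles.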
Architecture of BRAINSUP | The sentence generation process is based on morpho-syntactic patterns which we automatically discover from a corpus of dependency-parsed sentences P.
Conclusion | BRAINSUP makes heavy use of dependency-parsed data and statistics collected from dependency treebanks to ensure the grammaticality of the generated sentences, and to trim the search space while seeking the sentences that maximize the user satisfaction.
Evaluation | Dependency operators were learned by dependency parsing the British National Corpus.
Experiments | One major reason is that many features that are predictive for within-sentence instances are no longer applicable (e.g., dependency parse features).
Method | (2009), we extract the following three types of features: (1) pairs of words, one from SL and one from SR, as originally proposed by Marcu and Echihabi (2002); (2) dependency parse features in S L, S R, or both; and (3) syntactic production rules in S L, S R, or both. |
Related work | (2009) attempted to recognize implicit discourse relations (discourse relations which are not signaled by explicit connectives) in PDTB by using four classes of features — contextual features, constituent parse features, dependency parse features, and lexical features — and explored their individual influence on performance. |
Features | While spanning trees are familiar from non-projective dependency parsing, features based on the linear order of the words or on lexical identities or syntactic word classes, which are primary drivers for dependency parsing, are mostly uninformative for taxonomy induction.
Structured Taxonomy Induction | Note that finding taxonomy trees is a structurally identical problem to directed spanning trees (and thereby non-projective dependency parsing), for which belief propagation has previously been worked out in depth (Smith and Eisner, 2008).