Abstract | In this paper, we investigate the problem of character-level Chinese dependency parsing, building dependency trees over characters.
Abstract | Character-level information can benefit downstream applications by offering flexible granularities for word segmentation while improving word-level dependency parsing accuracies. |
Abstract | We present novel adaptations of two major shift-reduce dependency parsing algorithms to character-level parsing. |
Character-Level Dependency Tree | We differentiate intra-word dependencies and inter-word dependencies by the arc type, so that our work can be compared with conventional word segmentation, POS-tagging and dependency parsing pipelines under a canonical segmentation standard. |
Introduction | Such annotations enable dependency parsing on the character level, building dependency trees over Chinese characters. |
Introduction | Character-level dependency parsing is interesting in at least two aspects. |
Introduction | In this paper, we make an investigation of character-level Chinese dependency parsing using Zhang et al. |
Experiments | Increasing the parameter k has a dual effect on our transition-based dependency parser.
GB-grounded GR Extraction | Allowing non-projective dependencies generally makes parsing either by graph-based or transition-based dependency parsing harder. |
Introduction | Different from popular shallow dependency parsing, which focuses on tree-shaped structures, our GR annotations are represented as general directed graphs that express not only local but also various long-distance dependencies, such as coordination, control/raising constructions, topicalization, relative clauses and many other complicated linguistic phenomena that go beyond shallow syntax (see Fig.
Introduction | Previous work on dependency parsing mainly focused on structures that can be represented in terms of directed trees. |
Transition-based GR Parsing | The availability of large-scale treebanks has contributed to the blossoming of statistical approaches to build accurate shallow constituency and dependency parsers . |
Transition-based GR Parsing | In particular, the transition-based dependency parsing method is studied.
Transition-based GR Parsing | 3.1 Data-Driven Dependency Parsing |
Abstract | We present a novel approach for inducing unsupervised dependency parsers for languages that have no labeled training data, but have translated text in a resource-rich language. |
Abstract | Our method can be used as a purely monolingual dependency parser, requiring no human translations for the test data, thus making it applicable to a wide range of resource-poor languages.
Introduction | In recent years, dependency parsing has gained universal interest due to its usefulness in a wide range of applications such as synonym generation (Shinyama et al., 2002), relation extraction (Nguyen et al., 2009) and machine translation (Katz-Brown et al., 2011; Xie et al., 2011).
Introduction | Several supervised dependency parsing algorithms (Nivre and Scholz, 2004; McDonald et al., 2005a; McDonald et al., 2005b; McDonald and Pereira, 2006; Carreras, 2007; Koo and Collins, 2010; Ma and Zhao, 2012; Zhang et al., 2013) have been proposed and achieved high parsing accuracies on several treebanks, due in large part to the availability of dependency treebanks in a number of languages (McDonald et al., 2013).
Introduction | (2011) proposed an approach for unsupervised dependency parsing with nonparallel multilingual guidance from one or more helper languages, in which parallel data is not used. |
Our Approach | The focus of this work is on building dependency parsers for target languages, assuming that an accurate English dependency parser and some parallel text between the two languages are available. |
Our Approach | The probabilistic model for dependency parsing defines a family of conditional probabilities p(y|x) over all parse trees y given a sentence x, with a log-linear form.
Our Approach | One of the most common model training methods for supervised dependency parsers is maximum conditional likelihood estimation.
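The log-linear form and the maximum conditional likelihood objective referenced above are standard; the following is a reconstruction using an assumed feature function f, weight vector θ, and candidate-tree set 𝒴(x), not the paper's exact notation:

```latex
p(y \mid x; \theta) \;=\; \frac{\exp\big(\theta \cdot f(x, y)\big)}
                               {\sum_{y' \in \mathcal{Y}(x)} \exp\big(\theta \cdot f(x, y')\big)},
\qquad
L(\theta) \;=\; \sum_{i} \log p(y_i \mid x_i; \theta),
\qquad
\frac{\partial L}{\partial \theta}
  \;=\; \sum_{i} \Big( f(x_i, y_i) - \mathbb{E}_{p(y \mid x_i; \theta)}\, f(x_i, y) \Big).
```

The gradient is the familiar "observed minus expected" feature counts, which is what makes CRF-style training tractable whenever the expectation can be computed by dynamic programming over trees.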
Abstract | This paper proposes a simple yet effective framework for semi-supervised dependency parsing at entire tree level, referred to as ambiguity-aware ensemble training. |
Abstract | With a conditional random field based probabilistic dependency parser, our training objective is to maximize the mixed likelihood of labeled data and auto-parsed unlabeled data with ambiguous labelings.
Introduction | Supervised dependency parsing has made great progress during the past decade. |
Introduction | A few effective learning methods are also proposed for dependency parsing to implicitly utilize distributions on unlabeled data (Smith and Eisner, 2007; Wang et al., 2008; Suzuki et al., 2009). |
Introduction | However, these methods gain limited success in dependency parsing . |
Abstract | The state-of-the-art dependency parsing techniques, the Eisner algorithm and maximum spanning tree (MST) algorithm, are adopted to parse an optimal discourse dependency tree based on the arc-factored model and the large-margin learning techniques.
Abstract | Experiments show that our discourse dependency parsers achieve a competitive performance on text-level discourse parsing. |
Add arc <eC,ej> to GC with | Referring to the evaluation of syntactic dependency parsing, |
Discourse Dependency Parsing | The goal of discourse dependency parsing is to parse an optimal spanning tree from V × R × V_0.
Discourse Dependency Parsing | It is well known that projective dependency parsing can be handled with the Eisner (1996) algorithm, which is based on bottom-up dynamic programming with time complexity O(n³).
Discourse Dependency Parsing | Following the work of McDonald (2005b), we formalize discourse dependency parsing as searching for a maximum spanning tree (MST) in a directed graph. |
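Following McDonald et al., MST decoding is typically done with the Chu-Liu-Edmonds algorithm. Below is a minimal recursive sketch over a dense score matrix; this is an illustration, not the authors' implementation (an efficient version would use the O(n²) variant rather than naive contraction):

```python
def find_cycle(head):
    """Return one cycle in the head-pointer graph, or None."""
    n = len(head)
    color = [0] * n              # 0 = unvisited, 1 = on current path, 2 = done
    for start in range(n):
        if color[start]:
            continue
        path, v = [], start
        while v != -1 and color[v] == 0:
            color[v] = 1
            path.append(v)
            v = head[v]
        if v != -1 and color[v] == 1:   # walked back onto the current path
            return path[path.index(v):]
        for u in path:
            color[u] = 2
    return None

def chu_liu_edmonds(scores):
    """Maximum spanning arborescence over a dense score matrix.
    scores[h][d] is the score of arc h -> d; node 0 is the artificial root.
    Returns a head vector with head[0] == -1."""
    n = len(scores)
    # greedily pick the best incoming arc for every non-root node
    head = [-1] + [max((h for h in range(n) if h != d), key=lambda h: scores[h][d])
                   for d in range(1, n)]
    cycle = find_cycle(head)
    if cycle is None:
        return head
    cyc = set(cycle)
    rest = [v for v in range(n) if v not in cyc]      # root stays at index 0
    idx = {v: i for i, v in enumerate(rest)}
    c = len(rest)                                     # index of the contracted node
    cycle_score = sum(scores[head[v]][v] for v in cycle)
    new_scores = [[float('-inf')] * (c + 1) for _ in range(c + 1)]
    enter, leave = {}, {}
    for u in rest:
        for v in rest:
            if u != v:
                new_scores[idx[u]][idx[v]] = scores[u][v]
        # best arc from u into the cycle: it breaks the cycle arc into its entry node
        best = max(cyc, key=lambda v: scores[u][v] - scores[head[v]][v])
        new_scores[idx[u]][c] = cycle_score + scores[u][best] - scores[head[best]][best]
        enter[idx[u]] = best
    for v in rest:
        if v == 0:
            continue                                  # the root never takes a head
        best = max(cyc, key=lambda u: scores[u][v])
        new_scores[c][idx[v]] = scores[best][v]
        leave[idx[v]] = best
    sub = chu_liu_edmonds(new_scores)                 # solve the contracted graph
    out = head[:]                                     # cycle-internal arcs kept by default
    for d in range(1, c + 1):
        h = sub[d]
        if d == c:
            out[enter[h]] = rest[h]                   # arc entering the cycle
        elif h == c:
            out[rest[d]] = leave[d]                   # arc leaving the cycle
        else:
            out[rest[d]] = rest[h]
    return out
```

On a toy 3-token example whose greedy heads form a 2-cycle, the contraction correctly recovers the root-attached tree instead of the cycle.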
Discourse Dependency Structure and Tree Bank | To the best of our knowledge, we are the first to apply the dependency structure and introduce the dependency parsing techniques into discourse analysis. |
Discourse Dependency Structure and Tree Bank | To automatically conduct discourse dependency parsing , constructing a discourse dependency treebank is fundamental. |
Introduction | Dependency Parsing |
Introduction | Since dependency trees contain far fewer nodes and are on average simpler than constituency-based trees, current dependency parsers can have a relatively low computational complexity.
Introduction | In our work, we adopt the graph based dependency parsing techniques learned from large sets of annotated dependency trees. |
Abstract | Accurate scoring of syntactic structures such as head-modifier arcs in dependency parsing typically requires rich, high-dimensional feature representations. |
Experimental Setup | Methods We compare our model to MST and Turbo parsers on non-projective dependency parsing.
Introduction | Traditionally, parsing research has focused on modeling the direct connection between the features and the predicted syntactic relations such as head-modifier (arc) relations in dependency parsing . |
Introduction | We implement the low-rank factorization model in the context of first- and third-order dependency parsing.
Problem Formulation | We will commence here by casting first-order dependency parsing as a tensor estimation problem. |
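As a reconstruction (not quoted from the paper), a common low-rank tensor parameterization scores an arc h → m by combining head, modifier, and arc feature vectors through rank-r factor matrices U, V, W:

```latex
S_{\text{tensor}}(h \to m) \;=\; \sum_{i=1}^{r} \,[U\phi_h]_i \,[V\phi_m]_i \,[W\phi_{h,m}]_i
```

Here φ_h, φ_m, and φ_{h,m} are assumed sparse feature vectors for the head, the modifier, and the arc; in practice the full scoring function typically interpolates this tensor score with a traditional arc-factored feature score.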
Problem Formulation | We will start by introducing the notation used in the paper, followed by a more formal description of our dependency parsing task. |
Problem Formulation | 3.2 Dependency Parsing |
Related Work | Selecting Features for Dependency Parsing A great deal of parsing research has been dedicated to feature engineering (Lazaridou et al., 2013; Marton et al., 2010; Marton et al., 2011). |
Related Work | Embedding for Dependency Parsing A lot of recent work has been done on mapping words into vector spaces (Collobert and Weston, 2008; Turian et al., 2010; Dhillon et al., 2011; Mikolov et al., 2013). |
Approaches | A typical pipeline consists of a POS tagger, dependency parser, and semantic role labeler.
Approaches | Brown clusters have been used to good effect for various NLP tasks such as named entity recognition (Miller et al., 2004) and dependency parsing (Koo et al., 2008; Spitkovsky et al., 2011). |
Approaches | Unlike syntactic dependency parsing, the graph is not required to be a tree, nor even a connected graph.
Introduction | • Simpler joint CRF for syntactic and semantic dependency parsing than previously reported.
Related Work | (2013) extend this idea by coupling predictions of a dependency parser with predictions from a semantic role labeler. |
Related Work | (2012) marginalize over latent syntactic dependency parses.
Related Work | Recent work in fully unsupervised dependency parsing has supplanted these methods with even higher accuracies (Spitkovsky et al., 2013) by arranging optimizers into networks that suggest informed restarts based on previously identified local optima.
Abstract | In this paper, we investigate various strategies to predict both syntactic dependency parsing and contiguous multiword expression (MWE) recognition, testing them on the dependency version of French Treebank (Abeille and Barrier, 2004), as instantiated in the SPMRL Shared Task (Seddah et al., 2013). |
Architectures for MWE Analysis and Parsing | The architectures we investigated vary depending on whether the MWE status of sequences of tokens is predicted via dependency parsing or via an external tool (described in section 5), and this dichotomy applies both to structured MWEs and flat MWEs. |
Architectures for MWE Analysis and Parsing | • IRREG-BY-PARSER: the MWE status, flat topology and POS are all predicted via dependency parsing, using representations for training and parsing, with all information for irregular MWEs encoded in topology and labels (as for in vain in Figure 2).
Architectures for MWE Analysis and Parsing | • REG-BY-PARSER: all regular MWE information (topology, status, POS) is predicted via dependency parsing, using representations with all information for regular MWEs encoded in topology and labels (Figure 2).
Data: MWEs in Dependency Trees | (2013), who experimented with joint dependency parsing and light verb construction identification.
Data: MWEs in Dependency Trees | In some experiments, we make use of alternative representations, which we refer to later as “labeled representation”, in which the MWE features are incorporated into the dependency labels, so that MWE composition and/or the POS of the MWE is fully contained in the tree topology and labels, and thus predictable via dependency parsing.
Related work | It is a less language-specific system that reranks n-best dependency parses from 3 parsers, informed with features from predicted constituency trees. |
Use of external MWE resources | Both resources help to predict MWE-specific features (section 5.3) to guide the MWE-aware dependency parser.
Use of external MWE resources | MWE lexicons are exploited as sources of features for both the dependency parser and the external MWE analyzer. |
Use of external MWE resources | Flat MWE features: MWE information can be integrated as features to be used by the dependency parser.
Abstract | This paper introduces a novel pre-ordering approach based on dependency parsing for Chinese-English SMT. |
Dependency-based Pre-ordering Rule Set | Figure 1 shows a constituent parse tree and its Stanford typed dependency parse tree for the same sentence.
Dependency-based Pre-ordering Rule Set | As shown in the figure, the number of nodes in the dependency parse tree (i.e. |
Dependency-based Pre-ordering Rule Set | Because dependency parse trees are generally more concise than constituent ones, they can handle long-distance reorderings in a finer-grained way.
Introduction | Since dependency parsing is more concise than constituent parsing in describing sentences, some research has used dependency parsing in pre-ordering approaches for language pairs such as Arabic-English (Habash, 2007), and English-SOV languages (Xu et al., 2009; Katz-Brown et al., 2011). |
Introduction | In contrast, we propose a set of pre-ordering rules for dependency parsers.
Introduction | (2007) exist, it is almost impossible to automatically convert their rules into rules that are applicable to dependency parsers.
Abstract | This paper presents experiments with WordNet semantic classes to improve dependency parsing . |
Abstract | We study the effect of semantic classes in three dependency parsers , using two types of constituency-to-dependency conversions of the English Penn Treebank. |
Experimental Framework | We have made use of three parsers representative of successful paradigms in dependency parsing . |
Introduction | This work presents a set of experiments to investigate the use of lexical semantic information in dependency parsing of English. |
Introduction | We will apply different types of semantic information to three dependency parsers . |
Introduction | dependency parsing?
Related work | (2011) successfully introduced WordNet classes in a dependency parser, obtaining improvements on the full PTB using gold POS tags, trying different combinations of semantic classes.
Related work | (2008) presented a semisupervised method for training dependency parsers, introducing features that incorporate word clusters automatically acquired from a large unannotated corpus.
Related work | They demonstrated its effectiveness in dependency parsing experiments on the PTB and the Prague Dependency Treebank. |
Abstract | Much of the recent work on dependency parsing has been focused on solving inherent combinatorial problems associated with rich scoring functions. |
Conclusions | Our model achieves the best results on the standard dependency parsing benchmark, outperforming parsing methods with elaborate inference procedures. |
Experimental Setup | We apply the Random Walk-based sampling method (see Section 3.2.2) for the standard dependency parsing task. |
Introduction | Dependency parsing is commonly cast as a maximization problem over a parameterized scoring function. |
Related Work | Earlier work on dependency parsing focused on inference with tractable scoring functions.
Sampling-Based Dependency Parsing with Global Features | In this section, we introduce our novel sampling-based dependency parser which can incorporate |
Evaluation | The news articles have been processed with a tokenizer, a sentence splitter (Gillick and Favre, 2009), a part-of-speech tagger and dependency parser (Nivre, 2006), a co-reference resolution module (Haghighi and Klein, 2009) and an entity linker based on Wikipedia and Freebase (Milne and Witten, 2008).
Heuristics-based pattern extraction | (2013), who built well-formed relational patterns by extending minimum spanning trees (MST) which connect entity mentions in a dependency parse.
Introduction | Algorithm 1 HEURISTICEXTRACTOR(T, E): heuristically extract relational patterns for the dependency parse T and the set of entities E. 1: /* Global constants */ 2: global Vp, Vc, Np, Nc 3: Vc ← {subj, nsubj, nsubjpass, dobj, iobj, xcomp, acomp, expl, neg, aux, attr, prt} 4: Vp ← {xcomp}
Memory-based pattern extraction | To this end, we build a trie of dependency trees (which we call a tree-trie) by scanning all the dependency parses in the news training |
Memory-based pattern extraction | As a result, we are able to load a trie encoding 400M input dependency parses, 170M distinct nodes and 48M distinct sentence structures in under 10GB of RAM.
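The tree-trie itself is not specified in detail in the text above; the following is one plausible minimal sketch (the bracketed depth-first serialization and the node layout are our assumptions, not the authors' design) in which parses sharing a serialized prefix share trie nodes:

```python
class TrieNode:
    __slots__ = ("children", "count")

    def __init__(self):
        self.children = {}   # serialized symbol -> TrieNode
        self.count = 0       # number of parses ending exactly at this node


def canonical_sequence(tree, root=0):
    """Serialize a dependency tree depth-first into a bracketed symbol list.
    `tree` maps node id -> (token, deprel, list of child ids)."""
    seq = []

    def visit(nid):
        token, deprel, children = tree[nid]
        seq.append((token, deprel))
        seq.append("(")
        for child in children:
            visit(child)
        seq.append(")")

    visit(root)
    return seq


class TreeTrie:
    """Trie over serialized dependency trees; shared prefixes share nodes."""

    def __init__(self):
        self.root = TrieNode()
        self.nodes = 1

    def insert(self, tree):
        node = self.root
        for sym in canonical_sequence(tree):
            if sym not in node.children:
                node.children[sym] = TrieNode()
                self.nodes += 1
            node = node.children[sym]
        node.count += 1
```

Two parses that differ only in a later modifier reuse every trie node up to the point of divergence, which is what makes memory figures like "170M distinct nodes for 400M parses" plausible.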
Pattern extraction by sentence compression | Instead, we chose to modify the method of Filippova and Altun (2013) because it relies on dependency parse trees and does not use any LM scoring. |
Introduction | To solve the latter problem, we introduce an apparently novel O(|V|² log |V|) algorithm that is similar to the maximum spanning tree (MST) algorithms that are widely used for dependency parsing (McDonald et al., 2005).
Notation and Overview | The relation identification stage (§4) is similar to a graph-based dependency parser . |
Notation and Overview | Each stage is a discriminatively-trained linear structured predictor with rich features that make use of part-of-speech tagging, named entity tagging, and dependency parsing . |
Related Work | Our approach to relation identification is inspired by graph-based techniques for non-projective syntactic dependency parsing . |
Related Work | Minimum spanning tree algorithms—specifically, the optimum branching algorithm of Chu and Liu (1965) and Edmonds (1967)—were first used for dependency parsing by McDonald et al.
Conclusion | However, the failure to uncover gains when searching across a variety of possible mechanisms for improvement, training procedures for embeddings, hyperparameter settings, tasks, and resource scenarios suggests that these gains (if they do exist) are extremely sensitive to these training conditions, and not nearly as accessible as they seem to be in dependency parsers.
Conclusion | Indeed, our results suggest a hypothesis that word embeddings are useful for dependency parsing (and perhaps other tasks) because they provide a level of syntactic abstraction which is explicitly annotated in constituency parses. |
Introduction | Dependency parsers have seen gains from distributional statistics in the form of discrete word clusters (Koo et al., 2008), and recent work (Bansal et al., 2014) suggests that similar gains can be derived from embeddings like the ones used in this paper. |
Introduction | The fact that word embedding features result in nontrivial gains for discriminative dependency parsing (Bansal et al., 2014), but do not appear to be effective for constituency parsing, points to an interesting structural difference between the two tasks. |
Introduction | We hypothesize that dependency parsers benefit from the introduction of features (like clusters and embeddings) that provide syntactic abstractions; but that constituency parsers already have access to such abstractions in the form of supervised preterminal tags. |
Experiments | A tweet-specific tokenizer (Gimpel et al., 2011) is employed, and the dependency parsing results are computed by Stanford Parser (Klein and Manning, 2003). |
Experiments | The POS tagging and dependency parsing results are not precise enough for the Twitter data, so these handcrafted rules are rarely matched. |
Introduction | (2011) combine the target-independent features (content and lexicon) and target-dependent features (rules based on the dependency parsing results) together in subjectivity classification and polarity classification for tweets. |
Our Approach | We use the dependency parsing results to find the words syntactically connected with the target of interest.
Our Approach | In Section 3.1, we show how to build a recursive structure for the target using the dependency parsing results.
Experiments | Since our system uses an off-the-shelf dependency parser, and semantic representations are obtained from simple rule-based conversion from dependency trees, there will be only one (right or wrong) interpretation in the face of ambiguous sentences.
Generating On-the-fly Knowledge | For a TH pair, apply dependency parsing and coreference resolution. |
Generating On-the-fly Knowledge | Perform rule-based conversion from dependency parses to DCS trees, which are translated to statements on abstract denotations. |
The Idea | To obtain DCS trees from natural language, we use Stanford CoreNLP for dependency parsing (Socher et al., 2013), and convert Stanford dependencies to DCS trees by pattern matching on POS tags and dependency labels. Currently we use the following semantic roles: ARG, SUBJ, OBJ, IOBJ, TIME and MOD.
Experiments | The grammar for ASP contains the annotated lexicon entries and grammar rules in Sections 02-21 of CCGbank, and additional semantic entries produced using a set of dependency parse heuristics. |
Experiments | These entries are instantiated using a set of dependency parse patterns, listed in an online appendix. These patterns are applied to the training corpus, heuristically identifying verbs, prepositions, and possessives that express relations, and nouns that express categories.
Experiments | This approach trains a semantic parser by combining distant semantic supervision with syntactic supervision from dependency parses . |
Experiments | Gold dependency parses were approximated by running the Stanford dependency parser over reference compressions.
Introduction | Following an assumption often used in compression systems, the compressed output in this corpus is constructed by dropping tokens from the input sentence without any paraphrasing or reordering. A number of diverse approaches have been proposed for deletion-based sentence compression, including techniques that assemble the output text under an n-gram factorization over the input text (McDonald, 2006; Clarke and Lapata, 2008) or an arc factorization over input dependency parses (Filippova and Strube, 2008; Galanis and Androutsopoulos, 2010; Filippova and Altun, 2013).
Introduction | Maximum spanning tree algorithms, commonly used in non-projective dependency parsing (McDonald et al., 2005), are not easily adaptable to this task since the maximum-weight subtree is not necessarily a part of the maximum spanning tree. |
Multi-Structure Sentence Compression | C. In addition, we define bigram indicator variables y_ij ∈ {0, 1} to represent whether a particular order-preserving bigram ⟨t_i, t_j⟩ from S is present as a contiguous bigram in C, as well as dependency indicator variables z_ij ∈ {0, 1} corresponding to whether the dependency arc t_i → t_j is present in the dependency parse of C. The score for a given compression C can now be defined to factor over its tokens, n-grams and dependencies as follows.
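A hedged reconstruction of such a factored objective (the bigram indicators y_ij and dependency indicators z_ij follow the text above; the token indicators x_i, weight vectors θ, and feature maps f are assumed notation, not the paper's exact equation):

```latex
\text{score}(C) \;=\;
  \sum_{i} x_i \,\theta_{\text{tok}} \cdot f(t_i)
  \;+\; \sum_{i<j} y_{ij} \,\theta_{\text{bgr}} \cdot f(t_i, t_j)
  \;+\; \sum_{i,j} z_{ij} \,\theta_{\text{dep}} \cdot f(t_i \to t_j)
```

Because the three indicator families must describe the same output C, consistency constraints linking x, y, and z are what turn the maximization into a joint (e.g. ILP) decoding problem rather than three independent ones.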
Conclusion | For instance, Kawahara and Kurohashi (2006) improved accuracy of dependency parsing based on Japanese semantic frames automatically induced from a raw corpus. |
Our Approach | 1. apply dependency parsing to a raw corpus and extract predicate-argument structures for each verb from the automatic parses, |
Our Approach | We apply dependency parsing to a large raw corpus. |
Our Approach | Then, we extract predicate-argument structures from the dependency parses.
Generating summary from nested tree | After the document tree is obtained, we use a dependency parser to obtain the syntactic dependency trees of sentences. |
Introduction | We propose a method of summarizing a single document that utilizes dependency between sentences obtained from rhetorical structures and dependency between words obtained from a dependency parser . |
Introduction | The sentence tree is a tree that has words as nodes and head modifier relationships between words obtained by the dependency parser as edges. |
Features | While spanning trees are familiar from non-projective dependency parsing, features based on the linear order of the words or on lexical identities or syntactic word classes, which are primary drivers for dependency parsing, are mostly uninformative for taxonomy induction.
Structured Taxonomy Induction | Note that finding taxonomy trees is a structurally identical problem to directed spanning trees (and thereby non-projective dependency parsing), for which belief propagation has previously been worked out in depth (Smith and Eisner, 2008).