Abstract | We thus propose to combine the advantages of both, and present a novel constituency-to-dependency translation model, which uses constituency forests on the source side to direct the translation, and dependency trees on the target side (as a language model) to ensure grammaticality. |
Introduction | a novel constituency-to-dependency model, which uses constituency forests on the source side to direct translation, and dependency trees on the target side to guarantee grammaticality of the output. |
Introduction | Our new constituency-to-dependency model (Section 2) extracts rules from word-aligned pairs of source constituency forests and target dependency trees (Section 3), and translates source constituency forests into target dependency trees with a set of features (Section 4). |
Model | Figure 1 shows a word-aligned source constituency forest FC and target dependency tree De; our constituency-to-dependency translation model can be formalized as:
Model | 2.2 Dependency Trees on the Target Side |
Model | A dependency tree for a sentence represents each word and its syntactic dependents through directed arcs, as shown in the following examples. |
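As a concrete illustration of this representation (a sketch, not code from the excerpted paper; the words and indices are invented), a dependency tree is often stored as a head array, with one directed arc per word:

```python
# Minimal sketch: a dependency tree as a head array.
# heads[i] is the index of word i's syntactic head; index 0 is an
# artificial root whose own entry is unused.
words = ["<root>", "John", "saw", "a", "dog"]
heads = [None, 2, 0, 4, 2]   # John <- saw, saw <- root, a <- dog, dog <- saw

def dependents(h, heads):
    """Indices of the direct syntactic dependents of word h."""
    return [d for d, p in enumerate(heads) if p == h]

print(dependents(2, heads))   # dependents of "saw": [1, 4]
```

Each word has exactly one head, so the array encodes the full set of directed arcs.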
Abstract | In this paper, we present a transition system for 2-planar dependency trees — trees that can be decomposed into at most two planar graphs — and show that it can be used to implement a classifier-based parser that runs in linear time and outperforms a state-of-the-art transition-based parser on four data sets from the CoNLL-X shared task. |
Introduction | One of the unresolved issues in this area is the proper treatment of non-projective dependency trees, which seem to be required for an adequate representation of predicate-argument structure, but which undermine the efficiency of dependency parsing (Neuhaus and Bröker, 1997; Buch-Kromann, 2006; McDonald and Satta, 2007).
Introduction | (2009) have shown how well-nested dependency trees with bounded gap degree can be parsed in polynomial time, the best time complexity for lexicalized parsing of this class remains a prohibitive O(n^7), which makes its practical usefulness questionable.
Introduction | In this paper, we explore another characterization of mildly non-projective dependency trees based on the notion of multiplanarity. |
Preliminaries | Such a forest is called a dependency tree.
Preliminaries | Projective dependency trees correspond to the set of structures that can be induced from lexicalised context-free derivations (Kuhlmann, 2007; Gaifman, 1965).
Preliminaries | Like context-free grammars, projective dependency trees are not sufficient to represent all the linguistic phenomena observed in natural languages, but they have the advantage of being efficiently parsable: their parsing problem can be solved in cubic time with chart parsing techniques (Eisner, 1996; Gomez-Rodriguez et al., 2008), whereas for general non-projective dependency forests it is tractable only under strong independence assumptions (McDonald et al., 2005b; McDonald and Satta, 2007).
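The projectivity condition discussed here can be checked directly: for every arc (h, d), every word lying strictly between h and d must be a descendant of h. A small sketch under the head-array convention (0 denotes the artificial root; not code from the excerpted paper):

```python
def is_projective(heads):
    """heads[d] is the parent of word d (words are 1-based; the root's
    head is 0; heads[0] is an unused placeholder). Returns True iff
    every word between an arc's endpoints is a descendant of the head."""
    n = len(heads) - 1
    for d in range(1, n + 1):
        h = heads[d]
        lo, hi = min(h, d), max(h, d)
        for k in range(lo + 1, hi):
            a = k
            while a != 0 and a != h:   # walk from k toward the root
                a = heads[a]
            if a != h:                 # h is not on k's ancestor path
                return False
    return True
```

A tree with crossing arcs, e.g. 3→1 and 4→2, fails this test, matching the usual definition of non-projectivity.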
Boosting an MST Parser | As described previously, the score that a word-pair classifier assigns to a dependency tree can be factored over the candidate dependency edges of that tree.
Experiments | The constituent trees in the two treebanks are transformed to dependency trees according to the head-finding rules of Yamada and Matsumoto (2003). |
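The constituency-to-dependency transformation mentioned here works by head percolation: each constituent designates one child as its head, the head word percolates upward, and non-head children attach to it. A hedged sketch of the mechanism; the head table below is illustrative and is not the actual Yamada and Matsumoto (2003) rule set:

```python
# Toy head-finding table (illustrative only): for each constituent label,
# an ordered list of preferred head-child labels.
HEAD_RULES = {"S": ["VP"], "VP": ["VBD", "VB"], "NP": ["NN", "NNP"]}

def convert(tree, deps):
    """tree: (label, children) for nonterminals, (tag, word_index) for
    preterminals. Returns the head word index of `tree` and records
    (head, dependent) pairs in `deps`."""
    label, children = tree
    if isinstance(children, int):          # preterminal: (tag, word_index)
        return children
    child_heads = [convert(c, deps) for c in children]
    child_labels = [c[0] for c in children]
    head_pos = 0                           # default: leftmost child
    for cand in HEAD_RULES.get(label, []):
        if cand in child_labels:
            head_pos = child_labels.index(cand)
            break
    head = child_heads[head_pos]
    for i, h in enumerate(child_heads):
        if i != head_pos:
            deps.append((head, h))         # non-head children attach to head
    return head
```

For "(S (NP (NNP John)) (VP (VBD saw) (NP (NN dogs))))" this yields the root "saw" with "John" and "dogs" as its dependents.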
Experiments | For a dependency tree with n words, only n − 1 positive dependency instances can be extracted.
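The count follows because each of the n words except the root's dependent on the artificial root contributes exactly one word-to-word arc. A minimal sketch of the extraction (head-array convention assumed; not the paper's code):

```python
def positive_instances(heads):
    """heads[d] is the parent of word d (words 1-based; 0 denotes the
    artificial root; heads[0] is an unused placeholder). Each word-to-word
    arc yields one positive (head, dependent) instance, so a tree over
    n words yields n - 1 of them."""
    return [(h, d) for d, h in enumerate(heads) if d > 0 and h != 0]
```

For a 4-word tree this returns exactly 3 positive instances; all remaining word pairs serve as negative instances.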
Experiments | The English sentences are then parsed by an implementation of the second-order MST model of McDonald and Pereira (2006), which is trained on dependency trees extracted from the WSJ.
Related Works | Jiang and Liu (2009) use an alignment matrix and a dynamic programming search algorithm to obtain better projected dependency trees.
Related Works | Because of free translation, word alignment errors, and the divergence between the two languages, it is unreliable and less effective to project the dependency tree completely onto the target-language sentence.
Word-Pair Classification Model | y denotes the dependency tree for sentence x, and (i, j) ∈ y represents a dependency edge from word x_i to word x_j, where x_i is the parent of x_j.
Word-Pair Classification Model | Following the edge-based factorization method (Eisner, 1996), we factorize the score s(x, y) of a dependency tree into its dependency edges, and design a dynamic programming algorithm to search for the candidate parse with maximum score.
Word-Pair Classification Model | where y is searched over the set of well-formed dependency trees.
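The edge-based factorization described above can be sketched in a few lines (head-array convention assumed; the scoring function here is a toy stand-in for the word-pair classifier):

```python
def tree_score(heads, edge_score):
    """Edge-based factorization: the score of a dependency tree is the
    sum of the scores of its individual (head, dependent) edges.
    heads[d] is the parent of word d (1-based words, 0 = artificial root)."""
    return sum(edge_score(heads[d], d) for d in range(1, len(heads)))

# Toy edge scores standing in for classifier outputs.
toy_scores = {(2, 1): 1.0, (0, 2): 2.0, (4, 3): 0.5, (2, 4): 1.5}
print(tree_score([0, 2, 0, 4, 2], lambda h, d: toy_scores[(h, d)]))  # 5.0
```

The dynamic programming search then maximizes this sum over all well-formed trees rather than scoring a single given tree.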
Dependency parsing | A complete analysis of a sentence is given by a dependency tree: a set of dependencies that forms a rooted, directed tree spanning the words of the sentence.
Dependency parsing | Every dependency tree is rooted at a special “*” token, allowing the |
Dependency parsing | A common strategy, and one which forms the focus of this paper, is to factor each dependency tree into small parts, which can be scored in isolation. |
Existing parsing algorithms | The first type of parser we describe uses a “first-order” factorization, which decomposes a dependency tree into its individual dependencies. |
Introduction | These parsing algorithms share an important characteristic: they factor dependency trees into sets of parts that have limited interactions. |
Introduction | By exploiting the additional constraints arising from the factorization, maximizations or summations over the set of possible dependency trees can be performed efficiently and exactly. |
Introduction | A crucial limitation of factored parsing algorithms is that the associated parts are typically quite small, losing much of the contextual information within the dependency tree.
New third-order parsing algorithms | The first parser, Model 0, factors each dependency tree into a set of grandchild parts—pairs of dependencies connected head-to-tail. |
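The grandchild factorization just described can be made concrete: a grandchild part is a triple (g, h, d) such that both g → h and h → d are arcs of the tree. A sketch under the head-array convention (not the paper's implementation):

```python
def grandchild_parts(heads):
    """Factor a tree into grandchild parts: pairs of dependencies
    connected head-to-tail, i.e. triples (g, h, d) where g -> h and
    h -> d are both arcs. heads[d] is the parent of word d
    (1-based words; 0 denotes the artificial root)."""
    parts = []
    for d in range(1, len(heads)):
        h = heads[d]
        if h != 0:                      # h itself has a parent g
            parts.append((heads[h], h, d))
    return parts
```

Scoring each triple jointly, rather than each arc alone, is what gives the factorization its extra context.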
Background 2.1 Ontology Learning | It can be viewed as a structured prediction problem, where a semantic parse is formed by partitioning the input sentence (or a syntactic analysis such as a dependency tree) into meaning units and assigning each unit to the logical form representing an entity or relation (Figure 1).
Background 2.1 Ontology Learning | Bottom: a semantic parse consists of a partition of the dependency tree and an assignment of its parts. |
Background 2.1 Ontology Learning | Recently, we developed the USP system (Poon and Domingos, 2009), the first unsupervised approach to semantic parsing. USP takes dependency trees of sentences as input and first transforms them into quasi-logical forms (QLFs) by converting each node to a unary atom and each dependency edge to a binary atom (e.g., the node for "induces" becomes induces(e1) and the subject dependency becomes nsubj(e1, e2), where the ei's are Skolem constants indexed by the nodes).
Experiments | USP (Poon and Domingos, 2009) parses the input text using the Stanford dependency parser (Klein and Manning, 2003; de Marneffe et al., 2006), learns an MLN for semantic parsing from the dependency trees , and outputs this MLN and the MAP semantic parses of the input sentences. |
Unsupervised Ontology Induction with Markov Logic | Given the dependency tree T of a sentence, the conditional probability of a semantic parse L is given by P(L|T) ∝ exp(Σ_i w_i n_i(T, L)). The MAP semantic parse is simply arg max_L Σ_i w_i n_i(T, L).
Unsupervised Ontology Induction with Markov Logic | OntoUSP uses the same learning objective as USP, i.e., to find parameters θ that maximize the log-likelihood of observing the dependency trees T, summing out the unobserved semantic parses L:
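Written out, the model and objective sketched in these excerpts are (reconstructed from the surrounding description; θ collects the weights w_i and n_i are feature counts):

```latex
% Log-linear model over semantic parses L given dependency tree T:
P_\theta(L \mid T) \;\propto\; \exp\Big(\sum_i w_i\, n_i(T, L)\Big)

% Learning objective: maximize the log-likelihood of the observed
% dependency trees, summing out the latent semantic parses:
\theta^{*} \;=\; \arg\max_{\theta} \sum_{T} \log \sum_{L} P_\theta(T, L)
```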
Experiments: Ranking Paraphrases | We use a vector model based on dependency trees obtained from parsing the English Gigaword corpus (LDC2003T05). |
Experiments: Ranking Paraphrases | We modify the dependency trees by “folding” prepositions into the edge labels to make the relation between a head word and the head noun of a prepositional phrase explicit. |
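The "folding" step described above collapses the two-edge pattern head → preposition → noun into a single labeled edge. A hedged sketch using Stanford-style labels (prep/pobj); the paper's exact label inventory may differ:

```python
def fold_prepositions(tokens, edges):
    """tokens[i] is word i; edges are (head, label, dep) triples over
    word indices. Collapse head -prep-> p -pobj-> n into a single edge
    head -prep_<p>-> n, making the head/noun relation explicit."""
    preps = {d: h for h, lab, d in edges if lab == "prep"}
    out = []
    for h, lab, d in edges:
        if lab == "prep":
            continue                       # dropped; replaced below
        if lab == "pobj" and h in preps:
            out.append((preps[h], "prep_" + tokens[h], d))
        else:
            out.append((h, lab, d))
    return out

# "sat on mat": sat -prep-> on -pobj-> mat  becomes  sat -prep_on-> mat
tokens = ["<root>", "sat", "on", "mat"]
edges = [(0, "root", 1), (1, "prep", 2), (2, "pobj", 3)]
print(fold_prepositions(tokens, edges))  # [(0, 'root', 1), (1, 'prep_on', 3)]
```

After folding, the head word and the head noun of the prepositional phrase are directly connected, as the excerpt describes.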
Related Work | Figure 1: Co-occurrence graph of a small sample corpus of dependency trees.
The model | Figure 1 shows the co-occurrence graph of a small sample corpus of dependency trees: Words are represented as nodes in the graph, possible dependency relations between them are drawn as labeled edges, with weights corresponding to the observed frequencies.
The model | In the simplest case, the weight would denote the frequency, in a corpus of dependency trees, of w occurring together with w' in relation r. In the experiments reported below, we use pointwise mutual information (Church and Hanks, 1990) instead, as it proved superior to
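The PMI reweighting mentioned here replaces raw (word, relation, word') counts with an association score. A standard formulation as a sketch (the paper's exact estimation details may differ), treating (r, w') as the context of w:

```python
import math
from collections import Counter

def pmi(triples):
    """triples: a list of observed (w, r, w2) co-occurrences. Returns a
    dict mapping each distinct triple to its pointwise mutual information
    log( p(w, c) / (p(w) * p(c)) ) with context c = (r, w2), all
    probabilities estimated by relative frequency."""
    n = len(triples)
    w_count = Counter(w for w, r, w2 in triples)
    c_count = Counter((r, w2) for w, r, w2 in triples)
    j_count = Counter(triples)
    return {t: math.log(j_count[t] * n / (w_count[t[0]] * c_count[(t[1], t[2])]))
            for t in j_count}
```

High-frequency but uninformative co-occurrences get scores near (or below) zero, which is why PMI often beats raw frequency as an edge weight.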
Analysis Scheme | Specifically, we assume MINIPAR-style (Lin, 1993) dependency trees where nodes represent text expressions and edges represent the syntactic relations between them. |
Analysis Scheme | Dependency trees are a popular choice in RTE since they offer a fairly semantics-oriented account of the sentence structure that can still be constructed robustly. |
Integrating Discourse References into Entailment Recognition | Transformations create revised trees that cover previously uncovered target components in H. The output of each transformation, T1, consists of copies of the components used to construct it, and is appended to the discourse forest, which includes the dependency trees of all sentences and their generated consequents.
Integrating Discourse References into Entailment Recognition | We assume that we have access to a dependency tree for H, a dependency forest for T and its discourse context, as well as the output of a perfect discourse processor, i.e., a complete set of both coreference and bridging relations, including the type of bridging relation (e.g., part-of, cause).