Experiment and Results | It consists of a set of unordered labelled syntactic dependency trees whose nodes are labelled with word forms, part-of-speech categories, partial morphosyntactic information such as tense and number, and, in some cases, a sense tag identifier. |
Experiment and Results | The chunking was performed by retrieving from the Penn Treebank (PTB), for each phrase type, the yields of the constituents of that type and by using the alignment between words and dependency tree nodes provided by the organisers of the SR Task. |
Experiment and Results | Using this chunked data, we then ran the generator on the corresponding SR Task dependency trees and stored separately the input dependency trees for which generation succeeded and those for which generation failed.
Introduction | Dependency Trees |
Introduction | For instance, when generating sentences from dependency trees, as was proposed recently in the Generation Challenge Surface Realisation Task (SR Task; Belz et al., 2011), it would be useful to be able to apply error mining to the input trees to find the most likely causes of generation failure.
Introduction | We adapt an existing algorithm for tree mining which we then use to mine the Generation Challenge dependency trees and identify the most likely causes of generation failure. |
Mining Dependency Trees | First, dependency trees are converted to Breadth-First Canonical Form, whereby lexicographic order can apply to the word forms labelling tree nodes, to their part of speech, to their dependency relation, or to any combination thereof.³
Mining Dependency Trees | ³For convenience, the dependency relation labelling the edges of dependency trees is brought down to the daughter node of the edge.
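The canonicalisation step above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the nested-dict tree encoding, the `#` end-of-siblings marker, and the choice to sort on node labels alone (rather than word form, POS, or dependency relation) are all assumptions.

```python
from collections import deque

def bfcf_encoding(tree):
    """Breadth-first canonical encoding of an unordered labelled tree.

    `tree` is a hypothetical nested dict {"label": str, "children": [...]}.
    Siblings are ordered by a recursive canonical key, so two isomorphic
    unordered trees yield the same string."""
    def canon(node):
        # Canonical key: (label, sorted canonical keys of children).
        return (node["label"], sorted(canon(c) for c in node["children"]))

    def order(node):
        # Sort every node's children into canonical order, in place.
        node["children"].sort(key=canon)
        for c in node["children"]:
            order(c)

    order(tree)
    out, q = [], deque([tree])
    while q:
        n = q.popleft()
        out.append(n["label"])
        q.extend(n["children"])
        out.append("#")  # end-of-siblings marker
    return " ".join(out)

# The same unordered tree with siblings listed in two different orders:
t1 = {"label": "ate", "children": [{"label": "apple", "children": []},
                                   {"label": "John", "children": []}]}
t2 = {"label": "ate", "children": [{"label": "John", "children": []},
                                   {"label": "apple", "children": []}]}
```

Because the encoding is canonical, `bfcf_encoding(t1)` and `bfcf_encoding(t2)` are identical, which is what lets subtree counts be aggregated during mining.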
Mining Trees | In the next section, we will show how to modify this algorithm to mine for errors in dependency trees.
Abstract | We annotate a corpus of children’s stories with temporal dependency trees, achieving agreement (Krippendorff’s Alpha) of 0.856 on the event words, 0.822 on the links between events, and 0.700 on the ordering relation labels.
Evaluations | For temporal dependency trees, we assume each operation costs 1.0.
Evaluations | It has been argued that graph-based models like the maximum spanning tree parser should be able to produce more globally consistent and correct dependency trees, yet we do not observe that here.
Introduction | The temporal language in a text often fails to specify a total ordering over all the events, so we annotate the timelines as temporal dependency structures, where each event is a node in the dependency tree, and each edge between nodes represents a temporal ordering relation such as BEFORE, AFTER, OVERLAP or IDENTITY.
Introduction | We construct an evaluation corpus by annotating such temporal dependency trees over a set of children’s stories.
Introduction | • We propose a new approach to characterizing temporal structure via dependency trees.
Parsing Models | w is a sequence of event words, and π ∈ Π is a dependency tree π = (V, E) where:
Parsing Models | • TREE ∈ (C_F → Π) is a function that extracts a dependency tree π from a final parser state c_F
Parsing Models | • c = (L1, L2, Q, E) is a parser configuration, where L1 and L2 are lists for temporary storage, Q is the queue of input words, and E is the set of identified edges of the dependency tree.
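A parser configuration of this shape can be sketched as a small data structure. This is an illustrative skeleton only: the `shift` and `arc` transitions, the edge encoding, and all names are assumptions, not the paper's actual transition system.

```python
from collections import deque
from dataclasses import dataclass, field

@dataclass
class Configuration:
    """Sketch of a parser state c = (L1, L2, Q, E)."""
    L1: list = field(default_factory=list)   # temporary storage
    L2: list = field(default_factory=list)   # temporary storage
    Q: deque = field(default_factory=deque)  # queue of input words
    E: set = field(default_factory=set)      # identified edges

    def shift(self):
        # Move the front of the input queue onto L1.
        self.L1.append(self.Q.popleft())

    def arc(self, head, dep, label):
        # Record a labelled edge, e.g. a temporal relation like BEFORE.
        self.E.add((head, dep, label))

c = Configuration(Q=deque(["fell", "broke"]))
c.shift()
c.arc("fell", "broke", "BEFORE")
```

Parsing terminates when Q is empty and the edge set E, read off the final state, forms the output dependency tree.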
Gaining Dependency Structures | 2.1 Dependency tree |
Gaining Dependency Structures | We follow the definition of dependency graph and dependency tree as given in (McDonald and Nivre, 2011). |
Gaining Dependency Structures | A dependency graph G for sentence s is called a dependency tree when it satisfies: (1) the nodes cover all the words in s besides the ROOT; (2) each node has one and only one head (word) with a determined syntactic role; and (3) the ROOT of the graph is reachable from all other nodes.
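The three conditions can be checked mechanically. The sketch below is an illustration under assumed conventions: words are indexed 1..n, ROOT is index 0, and a head map (which by construction enforces the single-head condition (2)) encodes the graph.

```python
def is_dependency_tree(n_words, heads):
    """Check the tree conditions for a candidate dependency graph.

    `heads` maps each word index 1..n_words to its single head
    (0 denotes ROOT). Using a dict already enforces condition (2);
    condition (1) is coverage of all words, and condition (3) is
    checked by walking head links upward until ROOT is reached."""
    if set(heads) != set(range(1, n_words + 1)):  # condition (1)
        return False
    for node in heads:
        seen, cur = set(), node
        while cur != 0:          # condition (3): climb toward ROOT
            if cur in seen:      # a cycle means ROOT is unreachable
                return False
            seen.add(cur)
            cur = heads[cur]
    return True
```

For example, `{1: 2, 2: 0, 3: 2}` (word 2 heads words 1 and 3, and attaches to ROOT) is a tree, while `{1: 2, 2: 1, 3: 0}` contains a cycle between words 1 and 2 and is rejected.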
Introduction | For example, using the constituent-to-dependency conversion approach proposed by Johansson and Nugues (2007), we can easily yield dependency trees from PCFG style trees.
Discussion on Related Work | (2010) also mentioned their method yielded no improvement when applied to dependency trees in their initial experiments. |
Discussion on Related Work | Genzel (2010) dealt with the data sparseness problem by using a window heuristic, and learned reordering pattern sequences from dependency trees.
Experiments | For English, we train a dependency parser as in (Nivre and Scholz, 2004) on the WSJ portion of the Penn Treebank, which is converted to dependency trees using the Stanford Parser (Marneffe et al., 2006).
Ranking Model Training | ²In our experiments, there are nodes with more than 10 children for English dependency trees.
Ranking Model Training | For the English-to-Japanese task, we extract features from the Stanford English Dependency Tree (Marneffe et al., 2006), including lexicons, part-of-speech tags, dependency labels, punctuation and tree distance between head and dependent.
Ranking Model Training | For the Japanese-to-English task, we use a chunk-based Japanese dependency tree (Kudo and Matsumoto, 2002).
Word Reordering as Syntax Tree Node Ranking | We use children to denote direct descendants of tree nodes for constituent trees; while for dependency trees, children of a node include not only all direct dependents, but also the head word itself.
Word Reordering as Syntax Tree Node Ranking | The constituent tree is shown above the source sentence; arrows below the source sentence show head-dependent arcs for the dependency tree; word alignment links are lines without arrows between the source and target sentences.
Word Reordering as Syntax Tree Node Ranking | For example, consider the node rooted at trying in the dependency tree in Figure 1. |
Dependency language model | (2008), to score entire dependency trees.
Dependency language model | Let y be a dependency tree for x and H be a set that includes the words that have at least one dependent.
Dependency language model | For a dependency tree, we calculate the probability as follows:
Implementation Details | Given the dependency trees, we estimate the probability distribution by relative frequency:
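Relative-frequency estimation of a dependency language model can be sketched as below. This is a simplified illustration: it conditions only on the head word (the actual DLM conditions on richer contexts), and the pair-list tree encoding is an assumption.

```python
from collections import Counter

def estimate_dlm(trees):
    """Relative-frequency sketch for a dependency LM.

    Each tree is given as a list of (head, dependent) word pairs.
    Returns P(dep | head) = count(head, dep) / count(head)."""
    pair, head = Counter(), Counter()
    for tree in trees:
        for h, d in tree:
            pair[(h, d)] += 1
            head[h] += 1
    return {(h, d): c / head[h] for (h, d), c in pair.items()}

# Two toy trees over "John ate (an) apple":
probs = estimate_dlm([[("ate", "John"), ("ate", "apple")],
                      [("ate", "John")]])
```

Here "ate" heads three dependents overall, so P(John | ate) = 2/3 and P(apple | ate) = 1/3; in practice smoothing would be applied to unseen pairs.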
Introduction | The basic idea behind this is that we use the DLM to evaluate whether a dependency tree is valid (McDonald and Nivre, 2007)
Introduction | The parsing model searches for the final dependency trees by considering both the original scores and the scores of the DLM.
Parsing with dependency language model | Let T(G_x) be the set of all the subgraphs of G_x that are valid dependency trees (McDonald and Nivre, 2007) for sentence x.
Parsing with dependency language model | The formulation defines the score of a dependency tree y ∈ T(G_x) to be the sum of the edge scores,
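Edge-factored scoring is simple to state in code. The sketch below is a generic illustration of score(y) = Σ s(h, m) over the arcs of y; the edge set and score table are made-up examples, not from the paper.

```python
def tree_score(edges, edge_score):
    """Edge-factored score of a dependency tree:
    the sum of s(h, m) over all head-modifier arcs (h, m) in the tree."""
    return sum(edge_score(h, m) for h, m in edges)

# Toy edge scores for the tree root -> ate -> {John, apple}:
s = {("root", "ate"): 2.0, ("ate", "John"): 1.5, ("ate", "apple"): 0.5}
total = tree_score(s.keys(), lambda h, m: s[(h, m)])  # 2.0 + 1.5 + 0.5
```

Because the score decomposes over edges, the decoder can search for the best tree efficiently (e.g. with maximum spanning tree or dynamic-programming algorithms) instead of enumerating T(G_x).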
Introduction | Figure 1 shows two dependency trees for the sentence “the camera is great” in the camera domain and the sentence “the movie is excellent” in the movie domain, respectively. |
Introduction | Figure 1: Examples of dependency tree structure. |
Introduction | More specifically, we use the shortest path between a topic word and a sentiment word in the corresponding dependency tree to denote the relation between them. |
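Extracting the shortest path between two words in a dependency tree is a small graph search. The sketch below treats head-dependent arcs as undirected and uses breadth-first search; the edge-list encoding and example sentence are assumptions for illustration.

```python
from collections import deque

def shortest_path(edges, src, dst):
    """BFS shortest path between two words in a dependency tree,
    treating (head, dependent) arcs as undirected edges."""
    adj = {}
    for h, d in edges:
        adj.setdefault(h, []).append(d)
        adj.setdefault(d, []).append(h)
    prev, q = {src: None}, deque([src])
    while q:
        node = q.popleft()
        if node == dst:
            path = []            # reconstruct by walking back-pointers
            while node is not None:
                path.append(node)
                node = prev[node]
            return path[::-1]
        for nb in adj.get(node, []):
            if nb not in prev:
                prev[nb] = node
                q.append(nb)
    return None

# Toy tree for "the camera is great": great -> {camera, is}, camera -> the
edges = [("great", "camera"), ("great", "is"), ("camera", "the")]
path = shortest_path(edges, "camera", "great")
```

Here the topic word "camera" attaches directly to the sentiment word "great", so the path is just the two words; the path's words and labels can then serve as the relation representation.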
Definition of Dependency Graph | If a projective dependency graph is connected, we call it a dependency tree , and if not, a dependency forest. |
Related Work | The head-corner parsing algorithm (Kay, 1989) creates dependency trees top-down, and in this respect our algorithm is similar in spirit to it.
Weighted Parsing Model | where t′ is a POS tag, Tree is a correct dependency tree which exists in Corpus, the function lmdescendant(Tree, t′) returns the set of the leftmost descendant nodes ld of each node in Tree whose POS tag is t′, and ld.t denotes the POS tag of ld.
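A function of this shape can be sketched as follows. The encoding is an assumption: nodes are listed in left-to-right word order, so "leftmost" is taken to mean the descendant with the smallest word index, and the dict-based tree is illustrative.

```python
def lmdescendant(tree, tag):
    """Sketch of lmdescendant(Tree, t'): for each node, find its
    leftmost descendant whose POS tag is `tag` (None if absent).

    `tree` is a list of nodes in left-to-right word order, each a
    dict with a "tag" and a "children" list of node indices."""
    def descendants(i):
        out = []
        for c in tree[i]["children"]:
            out.append(c)
            out.extend(descendants(c))
        return out

    def leftmost(i):
        cands = [j for j in descendants(i) if tree[j]["tag"] == tag]
        return min(cands) if cands else None  # smallest word index

    return {i: leftmost(i) for i in range(len(tree))}

# "the dog barked": the(DT) <- dog(NN) <- barked(VB)
tree = [{"tag": "DT", "children": []},
        {"tag": "NN", "children": [0]},
        {"tag": "VB", "children": [1]}]
result = lmdescendant(tree, "DT")
```

For this toy tree, "the" has no descendants, while both "dog" and "barked" have the DT node as their leftmost DT descendant.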
Dependency Parsing | Given an input sentence x = w0 w1 ... wn and its POS tag sequence t = t0 t1 ... tn, the goal of dependency parsing is to build a dependency tree as depicted in Figure 1, denoted by d = {(h, m, l) : 0 ≤ h ≤ n, 0 < m ≤ n, l ∈ L}, where (h, m, l) indicates a directed arc from the head word (also called father) w_h to the modifier (also called child or dependent) w_m with a dependency label l, and L is the label set.
Dependency Parsing | To guarantee the efficiency of the decoding algorithms, the score of a dependency tree is factored into the scores of some small parts (subtrees). |
Dependency Parsing with QG Features | During both the training and test phases, the target parser is guided by the source annotations, and the score of a target dependency tree becomes
Experiments | The first column is the dependency parser with supervised training only, and the last column is the constituent parser (after converting its output to dependency trees).
Incorporating Syntactic Structures | Dependency trees are built by processing the words left-to-right and the classifier assigns a distribution over the actions at each step. |
Syntactic Language Models | The baseline score b(w, a) can be a feature, yielding the dot product notation: S(w, a) = ⟨a, Φ(a, w, s1, ..., sm)⟩. Our LM uses features from the dependency tree and part-of-speech (POS) tag sequence.
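A linear score of this form is just an inner product between a weight vector and a feature vector. The sketch below uses sparse dicts keyed by feature name; the feature names and values are invented for illustration and are not the paper's feature set.

```python
def score(a, phi):
    """Dot-product sketch of S(w, a) = <a, Phi(...)>.

    `a` maps feature names to learned weights; `phi` maps the
    features fired on the current word and states to their values.
    Features absent from `a` contribute weight 0."""
    return sum(a.get(f, 0.0) * v for f, v in phi.items())

# Hypothetical weights and fired features (illustrative names):
weights = {"head_pos=VB": 0.5, "dep_label=nsubj": 1.0}
feats = {"head_pos=VB": 1.0, "dep_label=nsubj": 1.0, "unseen": 1.0}
s = score(weights, feats)  # 0.5 + 1.0 + 0.0
```

The sparse representation means only fired features need to be enumerated, which is what makes feature-rich syntactic LMs tractable during decoding.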