Conclusion | We also discussed the effectiveness of sentence subtree selection that did not restrict subtrees to rooted ones. |
Experiment | Rooted sentence subtree only selects rooted sentence subtrees.
Experiment | As we can see, subtree selection only selected important subtrees that did not include the parser’s root, e.g., purpose-clauses and that-clauses. |
Generating summary from nested tree | In particular, we extract a rooted document subtree from the document tree, and sentence subtrees from sentence trees in the document tree. |
Generating summary from nested tree | to extract non-rooted sentence subtrees, as we previously mentioned.
Generating summary from nested tree | Constraints (6)-(10) allow the model to extract subtrees that have an arbitrary root node. |
Introduction | Our method jointly utilizes relations between sentences and relations between words, and extracts a rooted document subtree from a document tree whose nodes are arbitrary subtrees of the sentence tree. |
Related work | However, these studies have only extracted rooted subtrees from sentences. |
Related work | The method of Filippova and Strube (2008) allows the model to extract non-rooted subtrees in sentence compression tasks that compress a single sentence with a given compression ratio. |
Introduction | Therefore, it runs in linear time and can take advantage of arbitrarily complex structural features from already constructed subtrees.
Joint POS Tagging and Parsing with Nonlocal Features | Assuming an input sentence contains n words, in order to reach a terminal state, the initial state requires n sh-x actions to consume all words in β, and n − 1 rl/rr-x actions to construct a complete parse tree by consuming all the subtrees in σ.
Joint POS Tagging and Parsing with Nonlocal Features | One advantage of transition-based constituent parsing is that it is capable of incorporating arbitrarily complex structural features from the already constructed subtrees in σ and unprocessed words in β.
Joint POS Tagging and Parsing with Nonlocal Features | Instead, we attempt to extract nonlocal features from newly constructed subtrees during the decoding process as they become incrementally available and score newly generated parser states with them. |
Transition-based Constituent Parsing | A parser state s ∈ S is defined as a tuple s = (σ, β), where σ is a stack which is maintained to hold partial subtrees that are already constructed, and β is a queue which is used for storing word-POS pairs that remain unprocessed.
Transition-based Constituent Parsing | • REDUCE-BINARY-{L/R}-X (rl/rr-x): pop the top two subtrees from σ, combine them into a new tree with a node labeled with X, then push the new subtree back onto σ.
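The shift and binary-reduce actions above can be sketched in a few lines. This is a minimal illustrative sketch, not the papers' implementations: trees are nested tuples (label, children...), and the function names are assumptions.

```python
# Minimal sketch of the shift / reduce-binary transitions described above.
# Trees are (label, children...) tuples; names are illustrative only.

def shift(stack, queue):
    """sh-x: move the front word-POS pair from the queue onto the stack."""
    stack.append(queue.pop(0))

def reduce_binary(stack, label):
    """rl/rr-X: pop the top two subtrees, combine them under a node
    labeled X, then push the new subtree back onto the stack."""
    right = stack.pop()
    left = stack.pop()
    stack.append((label, left, right))

# A sentence of n words needs n shifts and n - 1 binary reductions
# before a single complete parse tree remains on the stack.
stack, queue = [], [("I", "PRP"), ("eat", "VBP"), ("apples", "NNS")]
shift(stack, queue)
shift(stack, queue)
shift(stack, queue)
reduce_binary(stack, "VP")   # combine "eat" and "apples"
reduce_binary(stack, "S")    # combine "I" and the VP
assert len(stack) == 1       # one complete tree left
```

With n = 3 words, the trace above uses exactly n shifts and n − 1 reductions, matching the action count stated earlier.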
Shift-Reduce with Beam-Search | Following Zhang and Clark (2011), we define each item in the parser as a pair (s, q), where q is a queue of remaining input, consisting of words and a set of possible lexical categories for each word (with q0 being the front word), and s is the stack that holds subtrees s0, s1, … (with s0 at the top).
Shift-Reduce with Beam-Search | Subtrees on the stack are partial derivations.
The Dependency Model | If |s| > 0 and the subtrees on s can lead to a correct derivation of G using further actions, we say s is a partial-realization of G, denoted as s ∼ G. And we define s ∼ G for |s| = 0.
The Dependency Model | 2a; then a stack containing the two subtrees in Fig. |
The Dependency Model | 3a is a partial-realization, while a stack containing the three subtrees in Fig. |
Introduction | They define distributions over the trees specified by a context-free grammar, but unlike probabilistic context-free grammars, they “learn” distributions over the possible subtrees of a user-specified set of “adapted” nonterminals. |
Introduction | set of parameters, if the set of possible subtrees of the adapted nonterminals is infinite). |
Introduction | Informally, Adaptor Grammars can be viewed as caching entire subtrees of the adapted nonterminals. |
Word segmentation with Adaptor Grammars | Because Word is an adapted nonterminal, the adaptor grammar memoises Word subtrees, which corresponds to learning the phone sequences for the words of the language.
Supervised Dependency Parsing | Under the graph-based model, the score of a dependency tree is factored into the scores of small subtrees.
Supervised Dependency Parsing | Figure 2: Two types of scoring subtrees in our second-order graph-based parsers. |
Supervised Dependency Parsing | We adopt the second-order graph-based dependency parsing model of McDonald and Pereira (2006) as our core parser, which incorporates features from the two kinds of subtrees in Fig. |
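The factorization idea can be illustrated concretely. Below is a hedged sketch of scoring a dependency tree as a sum over small subtree parts (first-order arcs plus adjacent-sibling pairs), in the spirit of second-order graph-based parsing; the tree encoding and the toy scoring functions are assumptions for illustration, not the McDonald and Pereira (2006) feature model.

```python
# Hedged sketch: a dependency tree's score as a sum over small subtree
# parts. heads[i-1] is the head of word i (0 denotes the artificial root);
# the score tables are illustrative assumptions.

def tree_score(heads, arc_score, sib_score):
    """Sum first-order arc scores plus adjacent-sibling subtree scores."""
    total = 0.0
    children = {}
    for dep, head in enumerate(heads, start=1):
        total += arc_score(head, dep)
        children.setdefault(head, []).append(dep)
    # second-order parts: consecutive modifiers of the same head
    for head, deps in children.items():
        for s1, s2 in zip(deps, deps[1:]):
            total += sib_score(head, s1, s2)
    return total

# toy scoring functions
arc = lambda h, d: 1.0
sib = lambda h, s1, s2: 0.5
# word 2 heads words 1 and 3; the root heads word 2
print(tree_score([2, 0, 2], arc, sib))  # 3 arcs + 1 sibling pair = 3.5
```

Because the score decomposes over these local subtrees, the best tree can be found with dynamic programming rather than by enumerating whole trees.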
Generating from the KBGen Knowledge-Base | First, the subtrees whose root node is indexed with an entity variable are extracted. |
Generating from the KBGen Knowledge-Base | Second, the subtrees capturing relations between variables are extracted. |
Generating from the KBGen Knowledge-Base | The minimal tree containing all and only the dependent variables D(X) of a variable X is then extracted and associated with the set of literals Φ such that Φ = {R(Y, Z) | (Y = X ∧ Z ∈ D(X)) ∨ (Y, Z ∈ D(X))}. This procedure extracts the subtrees relating the argument variables of a semantic functor such as an event or a role, e.g., a tree describing a verb and its arguments as shown in the top
Experiments | General setup: In order to test the accuracy of structured prediction on medium-sized full-domain taxonomies, we extracted from WordNet 3.0 all bottomed-out full subtrees which had a tree-height of 3 (i.e., 4 nodes from root to leaf), and contained (10, 50] terms. This gives us 761 non-overlapping trees, which we partition into
Experiments | We tried this training regimen as different from that of the general setup (which contains only bottomed-out subtrees), so as to match the animal test tree, which is of depth 12 and has intermediate nodes from higher up in WordNet.
Experiments | For scaling the 2nd-order sibling model, one can use approximations, e.g., pruning the set of sibling factors based on 1st-order link marginals, or a hierarchical coarse-to-fine approach based on taxonomy induction on subtrees, or a greedy approach of adding a few sibling factors at a time.
Introduction | Note that question decomposition only operates on the original question and question spans covered by complete dependency subtrees.
Introduction | • h_syntax_subtree(·), which counts the number of spans in Q that are (1) converted to formal triples whose predicates are not Null, and (2) covered by complete dependency subtrees at the same time.
Introduction | The underlying intuition is that, dependency subtrees of Q should be treated as units for question translation. |
Bottom-up tree-building | Rather, each relation node R_j attempts to model the relation of one single constituent U_j, by taking U_j's left and right subtrees U_jL and U_jR as its first-layer nodes; if U_j is a single EDU, then the first-layer node of R_j is simply U_j, and R_j is a special relation symbol LEAF.
Conclusions | In future work, we wish to further explore the idea of post-editing, since currently we use only the depth of the subtrees as upper-level information. |
Features | Substructure features: The root node of the left and right discourse subtrees of each unit. |
Our Discourse-Based Measures | In the present work, we use the convolution TK defined in (Collins and Duffy, 2001), which efficiently calculates the number of common subtrees in two trees. |
Our Discourse-Based Measures | Note that this kernel was originally designed for syntactic parsing, where the subtrees are subject to the constraint that their nodes are taken with either all or none of the children. |
Our Discourse-Based Measures | the nuclearity and the relations, in order to allow the tree kernel to give partial credit to subtrees that differ in labels but match in their skeletons. |
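The common-subtree counting behind the convolution tree kernel can be sketched compactly. This is an illustrative, undecayed version of the Collins and Duffy (2001) recursion, with the all-or-none-children constraint noted above; the tree encoding (label, children...) and function names are assumptions, and leaf tuples stand for preterminals whose word is left implicit.

```python
# Sketch of the convolution tree kernel K(T1, T2): the number of common
# subtree fragments, where a fragment takes either all or none of a
# node's children. Illustrative version without the usual decay factor.

def kernel(t1, t2):
    """Sum the common-fragment counts over all node pairs."""
    return sum(_common(a, b) for a in _collect(t1) for b in _collect(t2))

def _collect(t, acc=None):
    acc = [] if acc is None else acc
    acc.append(t)
    for child in t[1:]:
        _collect(child, acc)
    return acc

def _common(n1, n2):
    # different labels or different productions share no fragments
    if n1[0] != n2[0] or len(n1) != len(n2):
        return 0
    if len(n1) == 1:                      # matching preterminals
        return 1
    if any(c1[0] != c2[0] for c1, c2 in zip(n1[1:], n2[1:])):
        return 0
    prod = 1
    for c1, c2 in zip(n1[1:], n2[1:]):
        prod *= 1 + _common(c1, c2)       # stop at child, or expand it
    return prod

T1 = ("S", ("NP", ("D",), ("N",)), ("VP", ("V",)))
print(kernel(T1, T1))  # identical trees share 24 fragments
```

The product form implements the all-or-none constraint: at each node we either cut below a child (the 1) or keep expanding it, but never keep a strict subset of the children.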
Discourse Dependency Parsing | The algorithm begins by initializing all length-one subtrees to a score of 0.0. |
Discourse Dependency Parsing | over all the internal indices (i ≤ q ≤ j) in the span, and calculating the value of merging the two subtrees and adding one new arc.
Discourse Dependency Parsing | This algorithm considers all the possible subtrees.
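The span-merging dynamic program described above can be sketched in the style of Eisner's algorithm for projective dependency parsing: length-one subtrees start at 0.0, and each larger span is scored by trying every internal split point, merging the two subtrees, and adding one new arc. The arc-score matrix and function name below are assumptions for illustration; only the best score is computed, not backpointers.

```python
# Hedged sketch of an Eisner-style span-merging DP. score[h][d] is the
# assumed score of an arc from head h to dependent d; index 0 is the
# artificial root.

def eisner_best_score(score):
    n = len(score)                       # number of tokens incl. root
    NEG = float("-inf")
    # comp/inc[i][j][dir]: complete and incomplete spans over i..j;
    # dir 0 = head on the right (j), dir 1 = head on the left (i)
    comp = [[[NEG, NEG] for _ in range(n)] for _ in range(n)]
    inc = [[[NEG, NEG] for _ in range(n)] for _ in range(n)]
    for i in range(n):                   # length-one subtrees start at 0.0
        comp[i][i][0] = comp[i][i][1] = 0.0
    for length in range(1, n):
        for i in range(n - length):
            j = i + length
            # merge two complete subtrees and add one new arc
            best = max(comp[i][q][1] + comp[q + 1][j][0]
                       for q in range(i, j))
            inc[i][j][0] = best + score[j][i]   # new arc j -> i
            inc[i][j][1] = best + score[i][j]   # new arc i -> j
            comp[i][j][0] = max(comp[i][q][0] + inc[q][j][0]
                                for q in range(i, j))
            comp[i][j][1] = max(inc[i][q][1] + comp[q][j][1]
                                for q in range(i + 1, j + 1))
    return comp[0][n - 1][1]             # best tree rooted at token 0

# root + 2 words: best tree is 0 -> 1 (2) and 1 -> 2 (3)
print(eisner_best_score([[0, 2, 1], [0, 0, 3], [0, 1, 0]]))  # 5.0
```

Because every span is built from exactly two smaller complete subtrees plus one arc, the DP implicitly considers all possible projective subtrees in O(n^3) time.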
Introduction | However, while the former problem can be solved efficiently using the dynamic programming approach of McDonald (2006), there are no efficient algorithms to recover maximum weighted non-projective subtrees in a general directed graph. |
Multi-Structure Sentence Compression | 2.4 Dependency subtrees |
Multi-Structure Sentence Compression | In addition, to avoid producing multiple disconnected subtrees , only one dependency is permitted to attach to the ROOT pseudo-token. |