Dependency-based Pre-ordering Rule Set | Figure 1 shows a constituent parse tree and its Stanford typed dependency parse tree for the same sentence.
Dependency-based Pre-ordering Rule Set | The number of nodes in the dependency parse tree (i.e., 9) is much smaller than that in its corresponding constituent parse tree.
Experiments | First, we converted the constituent parse trees in the results of the Berkeley Parser into dependency parse trees by employing a tool in the Stanford Parser (Klein and Manning, 2003). |
Experiments | In our opinion, the reason for this large decrease was that the dependency parse trees were more concise than the constituent parse trees in describing sentences, and they could also describe reordering at the sentence level in a finer-grained way.
Experiments | In contrast, the constituent parse trees were more redundant and they needed more nodes to conduct long-distance reordering. |
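Experiments | The constituent-to-dependency conversion mentioned above (performed with the Stanford Parser's converter) is based on head percolation: each constituent picks a head child, and every non-head child's head word becomes a dependent of it. The following is a toy sketch of that idea; the head-rule table and the tuple-based tree encoding are illustrative assumptions, not the Stanford converter's actual rules.

```python
# Toy head-percolation table: constituent label -> preferred head child label.
# (Illustrative only; real converters use much richer rule sets.)
HEAD_RULES = {"S": "VP", "VP": "VBZ", "NP": "NN"}

def to_dependencies(tree, deps=None):
    """Convert a constituent tree to dependency arcs.

    tree: nested tuples (label, child, ...); leaves are word strings.
    Returns (head_word, list of (head, dependent) arcs).
    """
    if deps is None:
        deps = []
    if isinstance(tree, str):           # leaf: a word heads itself
        return tree, deps
    label, children = tree[0], tree[1:]
    # Recursively find the head word of each child (arcs accumulate in deps).
    heads = [to_dependencies(child, deps)[0] for child in children]
    # Choose the head child via the rule table, defaulting to the first child.
    wanted = HEAD_RULES.get(label)
    idx = next((i for i, c in enumerate(children)
                if not isinstance(c, str) and c[0] == wanted), 0)
    head = heads[idx]
    for i, child_head in enumerate(heads):
        if i != idx:
            deps.append((head, child_head))   # arc: head -> dependent
    return head, deps

# Example: ("S", ("NP", "dogs"), ("VP", "bark")) yields head "bark"
# with the single arc ("bark", "dogs").
```

Note how the dependency output needs only one node per word, which is the conciseness advantage the passage above refers to.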
Introduction | Syntax-based pre-ordering employing constituent parsing has demonstrated its effectiveness in many language pairs, such as English-French (Xia and McCord, 2004), German-English (Collins et al., 2005), Chinese-English (Wang et al., 2007; Zhang et al., 2008), and English-Japanese (Lee et al., 2010).
Introduction | Since dependency parsing is more concise than constituent parsing in describing sentences, some research has used dependency parsing in pre-ordering approaches for language pairs such as Arabic-English (Habash, 2007), and English-SOV languages (Xu et al., 2009; Katz-Brown et al., 2011). |
Introduction | They created a set of pre-ordering rules for constituent parsers for Chinese-English PBSMT. |
Abstract | We propose three improvements to address the drawbacks of state-of-the-art transition-based constituent parsers.
Introduction | Constituent parsing is one of the most fundamental tasks in Natural Language Processing (NLP). |
Introduction | Transition-based constituent parsing (Sagae and Lavie, 2005; Wang et al., 2006; Zhang and Clark, 2009) is an attractive alternative. |
Introduction | However, there is still room for improvement in these state-of-the-art transition-based constituent parsers.
Transition-based Constituent Parsing | This section describes the transition-based constituent parsing model, which is the basis of Section 3 and the baseline model in Section 4. |
Transition-based Constituent Parsing | 2.1 Transition-based Constituent Parsing Model |
Transition-based Constituent Parsing | A transition-based constituent parsing model is a quadruple C = (S, T, s0, St), where S is a set of parser states (sometimes called configurations), T is a finite set of actions, s0 is an initialization function that maps each input sentence to a unique initial state, and St ⊆ S is a set of terminal states.
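Transition-based Constituent Parsing | The quadruple C = (S, T, s0, St) can be illustrated with a minimal shift-reduce sketch. This is a hedged illustration under assumed names (`shift`, `reduce_unary`, `reduce_binary`), not the paper's implementation; real parsers use a richer action set and a scoring model to choose among actions.

```python
from typing import List, NamedTuple

class State(NamedTuple):
    """A parser state (configuration) in S: a stack of partial
    constituents plus a buffer of unread input words."""
    stack: tuple
    buffer: tuple

def s0(sentence: List[str]) -> State:
    """Initialization function: map a sentence to its unique initial state."""
    return State(stack=(), buffer=tuple(sentence))

# Actions in T, each mapping a state to a successor state:

def shift(state: State) -> State:
    """SHIFT: move the front of the buffer onto the stack."""
    return State(state.stack + (state.buffer[0],), state.buffer[1:])

def reduce_unary(state: State, label: str) -> State:
    """UNARY-X: wrap the stack top in a new constituent labelled X."""
    return State(state.stack[:-1] + ((label, state.stack[-1]),), state.buffer)

def reduce_binary(state: State, label: str) -> State:
    """REDUCE-X: combine the two stack-top items under label X."""
    left, right = state.stack[-2], state.stack[-1]
    return State(state.stack[:-2] + ((label, left, right),), state.buffer)

def is_terminal(state: State) -> bool:
    """Membership in St: buffer exhausted, one finished tree on the stack."""
    return not state.buffer and len(state.stack) == 1
```

Parsing a sentence then amounts to applying a sequence of actions from s0 until a terminal state in St is reached; the model's job is to score those action sequences.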
Abstract | Do continuous word embeddings encode any useful information for constituency parsing?
Conclusion | It is important to emphasize that these results do not argue against the use of continuous representations in a parser’s state space, nor argue more generally that constituency parsers cannot possibly benefit from word embeddings. |
Conclusion | Indeed, our results suggest the hypothesis that word embeddings are useful for dependency parsing (and perhaps other tasks) because they provide a level of syntactic abstraction that is explicitly annotated in constituency parses.
Introduction | This paper investigates a variety of ways in which word embeddings might augment a constituency parser with a discrete state space. |
Introduction | It has been less clear how (and indeed whether) word embeddings in and of themselves are useful for constituency parsing.
Introduction | The fact that word embedding features result in nontrivial gains for discriminative dependency parsing (Bansal et al., 2014), but do not appear to be effective for constituency parsing, points to an interesting structural difference between the two tasks.
Three possible benefits of word embeddings | We are interested in the question of whether a state-of-the-art discrete-variable constituency parser can be improved with word embeddings, and, more precisely, what aspect (or aspects) of the parser can be altered to make effective use of embeddings. |
Ambiguity-aware Ensemble Training | To construct parse forests for unlabeled data, we employ three diverse parsers, i.e., our baseline GParser, a transition-based parser (ZPar) (Zhang and Nivre, 2011), and a generative constituent parser (Berkeley Parser) (Petrov and Klein, 2007).
Conclusions | For future work, among other possible extensions, we would like to see how our approach performs when employing more diverse parsers to compose the parse forest of higher quality for the unlabeled data, such as the easy-first nondirectional dependency parser (Goldberg and Elhadad, 2010) and other constituent parsers (Collins and Koo, 2005; Charniak and Johnson, 2005; Finkel et al., 2008). |
Experiments and Analysis | We believe the reason is that, being a generative model designed for constituent parsing, the Berkeley Parser differs more from discriminative dependency parsers and can therefore provide more divergent syntactic structures.
Introduction | Although it works well for constituent parsing (McClosky et al., 2006; Huang and Harper, 2009), self-training has been shown to be unsuccessful for dependency parsing (Spreyer and Kuhn, 2009).
Introduction | To construct a parse forest on unlabeled data, we employ three supervised parsers based on different paradigms: our baseline graph-based dependency parser, a transition-based dependency parser (Zhang and Nivre, 2011), and a generative constituent parser (Petrov and Klein, 2007).
Introduction | We first employ a generative constituent parser for semi-supervised dependency parsing. |
Supervised Dependency Parsing | Instead, we build a log-linear CRF-based dependency parser, which is similar to the CRF-based constituent parser of Finkel et al. |
Abstract | We propose a spectral approach for unsupervised constituent parsing that comes with theoretical guarantees on latent structure recovery. |
Abstract | More specifically, we approach unsupervised constituent parsing from the perspective of structure learning as opposed to parameter learning. |
Abstract | This undirected latent tree is then directed via a direction mapping to give the final constituent parse.
Abstract | On the SPMRL 2013 multilingual constituency parsing shared task (Seddah et al., 2013), our system outperforms the top single parser system of Bjorkelund et al. |
Conclusion | To date, the most successful constituency parsers have largely been generative, and operate by refining the grammar either manually or automatically so that relevant information is available locally to each parsing decision. |
Conclusion | We build up a small set of feature templates as part of a discriminative constituency parser and outperform the Berkeley parser on a wide range of languages. |