Abstract | Conventional n-best reranking techniques often suffer from the limited scope of the n-best list, which rules out many potentially good alternatives. |
Abstract | We instead propose forest reranking, a method that reranks a packed forest of exponentially many parses. |
Abstract | Our final result, an F-score of 91.7, outperforms both 50-best and 100-best reranking baselines, and is better than any previously reported system trained on the Treebank. |
Introduction | Discriminative reranking has become a popular technique for many NLP problems, in particular, parsing (Collins, 2000) and machine translation (Shen et al., 2005). |
Introduction | Typically, this method first generates a list of top-n candidates from a baseline system, and then reranks this n-best list with arbitrary features that are either impossible or intractable to compute within the baseline system. |
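The two-step recipe above (generate n-best candidates, then rescore them with arbitrary features) can be sketched as a linear reranker. This is a minimal illustration, not the papers' actual models; the feature names and weights are hypothetical.

```python
# Minimal n-best reranking sketch: a baseline parser proposes candidate
# parses with sparse feature vectors, and a linear model picks the best.
def rerank(candidates, weights):
    """candidates: list of (parse, feature_dict); weights: feature -> float."""
    def score(features):
        return sum(weights.get(f, 0.0) * v for f, v in features.items())
    return max(candidates, key=lambda c: score(c[1]))

# Toy 2-best list; "log_prob" and "right_branching" are made-up features.
nbest = [
    ("(S (NP ...) (VP ...))", {"log_prob": -2.0, "right_branching": 1.0}),
    ("(S (NP ...) (NP ...))", {"log_prob": -1.5, "right_branching": 0.0}),
]
weights = {"log_prob": 1.0, "right_branching": 0.8}
best_parse, _ = rerank(nbest, weights)
```

The first candidate scores -2.0 + 0.8 = -1.2, beating the second's -1.5, so the reranker overrides the baseline's probability ordering; features outside the baseline model (here, the hypothetical `right_branching`) are exactly what the n-best list makes tractable.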
Packed Forests as Hypergraphs | Such a Treebank-style forest is easier to work with for reranking, since many features can be directly expressed in it. |
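A packed forest can be viewed as a hypergraph: each node is a labeled span, and each hyperedge rewrites a node into its child nodes, so shared subtrees are stored once while the forest encodes exponentially many parses. The representation below is a minimal assumed sketch (the `Node`/`Hyperedge` names are mine, not from the paper); counting the trees a node derives shows the packing.

```python
from collections import namedtuple

Node = namedtuple("Node", "label span")            # e.g. ("NP", (0, 2))
Hyperedge = namedtuple("Hyperedge", "head tails")  # head node -> child nodes

def count_trees(node, edges_by_head, memo=None):
    """Distinct trees rooted at `node`: sum over hyperedges, product over tails."""
    if memo is None:
        memo = {}
    if node in memo:
        return memo[node]
    edges = edges_by_head.get(node, [])
    if not edges:                                  # terminal node
        memo[node] = 1
        return 1
    total = 0
    for e in edges:
        prod = 1
        for t in e.tails:
            prod *= count_trees(t, edges_by_head, memo)
        total += prod
    memo[node] = total
    return total

# Toy forest: the VP over span (1, 3) has two competing analyses,
# so a single S node packs 2 distinct trees.
S, NP1, VP13 = Node("S", (0, 3)), Node("NP", (0, 1)), Node("VP", (1, 3))
V, NP2 = Node("V", (1, 2)), Node("NP", (2, 3))
VP12, PP = Node("VP", (1, 2)), Node("PP", (2, 3))
edges = {
    S: [Hyperedge(S, (NP1, VP13))],
    VP13: [Hyperedge(VP13, (V, NP2)), Hyperedge(VP13, (VP12, PP))],
}
num_parses = count_trees(S, edges)
```

Because ambiguity multiplies through the product, a few ambiguous spans yield exponentially many parses, which is why reranking over the forest can reach candidates far outside any fixed n-best list.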
Evaluation Methodology | [Table header: parsers compared — KSDEP, RERANK, NO-RERANK, BERKELEY, STANFORD, ENJU, ENJU-GENIA] |
Evaluation Methodology | For the other parsers, we input the concatenation of WSJ and GENIA for the retraining, while the reranker of RERANK was not retrained due to its cost. |
Evaluation Methodology | Since the parsers other than NO-RERANK and RERANK require an external POS tagger, a WSJ-trained POS tagger is used with WSJ-trained parsers, and geniatagger (Tsuruoka et al., 2005) is used with GENIA-retrained parsers. |
Experiments | Among these parsers, RERANK performed slightly better than the others, although the difference in F-score is small while it incurs a much higher parsing cost. |
Experiments | Retraining yielded only slight improvements for RERANK, BERKELEY, and STANFORD, while larger improvements were observed for MST, KSDEP, NO-RERANK, and ENJU. |
Syntactic Parsers and Their Representations | RERANK Charniak and Johnson (2005)’s reranking parser. |
Syntactic Parsers and Their Representations | The reranker of this parser receives n-best parse results from NO-RERANK, and selects the most likely result by using a maximum entropy model with manually engineered features. |
Related Work | (2006), who applied a reranked parser to a large unsupervised corpus in order to obtain additional training data for the parser; this self-training approach was shown to be quite effective in practice. |
Related Work | However, their approach depends on the usage of a high-quality parse reranker, whereas the method described here simply augments the features of an existing parser. |
Related Work | Note that our two approaches are compatible in that we could also design a reranker and apply self-training techniques on top of the cluster-based features. |
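The self-training recipe discussed above can be sketched as a loop: parse an unlabeled corpus with the current reranked parser, treat the selected parses as extra (noisy) training data, and retrain. Everything here is a hypothetical stand-in (`train`, `parse_nbest`, `rerank` are injected callables, not any real parser's API), shown only to make the data flow concrete.

```python
# Self-training sketch: augment gold data with automatically reranked parses.
def self_train(labeled, unlabeled, train, parse_nbest, rerank, rounds=1):
    """labeled: list of (sentence, gold_parse); unlabeled: list of sentences."""
    data = list(labeled)
    model = train(data)
    for _ in range(rounds):
        # Parse the unlabeled sentences and keep the reranker's top choice.
        auto = [(s, rerank(parse_nbest(model, s))) for s in unlabeled]
        data = list(labeled) + auto      # treat reranked parses as if gold
        model = train(data)
    return model

# Toy demo with trivial stand-in components: the "model" is just the
# training-set size, so one round grows it from 1 to 2 examples.
model = self_train(
    labeled=[("a", "GOLD_A")],
    unlabeled=["b"],
    train=len,
    parse_nbest=lambda m, s: [s.upper()],
    rerank=lambda nbest: nbest[0],
)
```

The loop makes the compatibility point explicit: the reranker supplies the automatic labels, so a stronger reranker (e.g. one with cluster-based features) directly improves the self-training signal.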