A Latent Variable Parser | For our experiments, we used the latent variable-based Berkeley parser (Petrov et al., 2006). |
A Latent Variable Parser | The Berkeley parser automates the process of finding such distinctions. |
A Latent Variable Parser | The Berkeley parser has been applied to the TuBa-D/Z corpus in the constituent parsing shared task of the ACL-2008 Workshop on Parsing German (Petrov and Klein, 2008), achieving F1-measures of 85.10% and 83.18% with and without gold-standard POS tags, respectively. |
Abstract | We report the results of topological field parsing of German using the unlexicalized, latent variable-based Berkeley parser (Petrov et al., 2006). Without any language- or model-dependent adaptation, we achieve state-of-the-art results on the TuBa-D/Z corpus and on a modified NEGRA corpus that has been automatically annotated with topological fields (Becker and Frank, 2002). |
Introduction | To facilitate comparison with previous work, we also conducted experiments on a modified NEGRA corpus that has been automatically annotated with topological fields (Becker and Frank, 2002), and found that the Berkeley parser outperforms the method described in that work. |
Introduction | This model includes several enhancements, which are also found in the Berkeley parser. |
Introduction | DTR is comparable to the idea of latent variable grammars on which the Berkeley parser is based, in that both consider the observed treebank to be less than ideal and both attempt to refine it by splitting and merging nonterminals. |
Annotations | While we do not do as well as the Berkeley parser, we will see in Section 6 that our parser does a substantially better job of generalizing to other languages. |
Other Languages | We show that this is indeed the case: on nine languages, our system is competitive with or better than the Berkeley parser, which is the best single
Other Languages | We compare to the Berkeley parser (Petrov and Klein, 2007) as well as two variants. |
Other Languages | (2013) (Berkeley-Rep), which is their best single parser. The “Replaced” system modifies the Berkeley parser by replacing rare words with morphological descriptors of those words, computed using language-specific modules that have been handcrafted for individual languages or are trained with additional annotation layers in the treebanks that we do not exploit. |
Experiments and Analysis | Default parameter settings are used for training ZPar and the Berkeley Parser. |
Experiments and Analysis | For the Berkeley Parser, following the manual, we use the model after 5 split-merge iterations to avoid over-fitting the training data. |
Experiments and Analysis | The phrase-structure outputs of Berkeley Parser are converted into dependency structures using the same head-finding rules. |
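Head-rule-based conversion of this kind can be sketched as follows; the head-rule table and the tiny example tree are illustrative assumptions, not the actual rules used in these experiments:

```python
# Sketch of converting a phrase-structure tree to dependency arcs
# with head-finding rules: each constituent's lexical head is found
# by a per-label priority list over child labels, and every non-head
# child's head word attaches to it.

HEAD_RULES = {          # parent label -> child labels, in priority order
    "S": ["VP", "NP"],
    "VP": ["VBD", "VB", "VP", "NP"],
    "NP": ["NN", "NNS", "NP"],
}

def head_word(tree):
    """Return the lexical head (word) of a (label, children) tree."""
    label, children = tree
    if isinstance(children, str):          # leaf: (POS, word)
        return children
    for want in HEAD_RULES.get(label, []):
        for child in children:
            if child[0] == want:
                return head_word(child)
    return head_word(children[-1])         # fallback: rightmost child

def dependencies(tree, deps=None):
    """Collect (dependent_word, head_word) arcs from the tree."""
    if deps is None:
        deps = []
    label, children = tree
    if isinstance(children, str):
        return deps
    h = head_word(tree)
    for child in children:
        if head_word(child) != h:
            deps.append((head_word(child), h))
        dependencies(child, deps)
    return deps

tree = ("S", [("NP", [("NN", "John")]),
              ("VP", [("VBD", "saw"), ("NP", [("NN", "Mary")])])])
print(dependencies(tree))   # [('John', 'saw'), ('Mary', 'saw')]
```

Because both parsers' outputs are converted with the same rule table, any conversion bias affects them equally, which is what makes the resulting dependency comparison fair.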
Analysis | Default: we ran the Berkeley parser in its default ‘fast’ mode; the output k-best lists are ordered by max-rule-score. |
Analysis | Table 3: Parsing results for reranking 50-best lists of the Berkeley parser (Dev is WSJ section 22 and Test is WSJ section 23, all lengths). |
Introduction | For example, in the Berkeley parser (Petrov et al., 2006), about 20% of the errors are prepositional phrase attachment errors as in Figure 1, where a preposition-headed (IN) phrase was assigned an incorrect parent in the implied dependency tree. |
Introduction | Here, the Berkeley parser (solid blue edges) incorrectly attaches from debt to the noun phrase $ 30 billion whereas the correct attachment (dashed gold edges) is to the verb raising. |
Introduction | Figure 1: A PP attachment error in the parse output of the Berkeley parser (on the Penn Treebank). |
Parsing Experiments | We also evaluate the utility of web-scale features on top of a state-of-the-art constituent parser — the Berkeley parser (Petrov et al., 2006), an unlexicalized phrase-structure parser. |
Parsing Experiments | Our baseline system is the Berkeley parser, from which we obtain k-best lists for the development set (WSJ section 22) and test set (WSJ section 23) using a grammar trained on all the training data (WSJ sections 2-21). To get k-best lists for the training set, we use 3-fold jackknifing where we train a grammar
Parsing Experiments | Table 2: Oracle F1-scores for k-best lists output by the Berkeley parser for English WSJ parsing (Dev is section 22 and Test is section 23, all lengths). |
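The 3-fold jackknifing procedure mentioned above can be sketched roughly as follows; `train` and `parse_kbest` are hypothetical stand-ins for the real grammar-training and k-best parsing routines, not the actual Berkeley parser interface:

```python
# Sketch of 3-fold jackknifing: each fold of the training set is
# parsed by a model trained on the other two folds, so the k-best
# lists used for reranker training are never produced by a model
# that has seen those sentences.

def jackknife(sentences, k_folds=3, train=None, parse_kbest=None):
    folds = [sentences[i::k_folds] for i in range(k_folds)]
    kbest_lists = {}
    for i, held_out in enumerate(folds):
        # Train on everything except the held-out fold.
        train_data = [s for j, f in enumerate(folds) if j != i for s in f]
        model = train(train_data)
        for sent in held_out:
            kbest_lists[sent] = parse_kbest(model, sent)
    return kbest_lists

# Toy demonstration with dummy routines: the "model" is simply the
# set of sentences it was trained on, and the "k-best list" is a
# flag saying whether the model has seen the sentence.
train = lambda data: set(data)
parse_kbest = lambda model, s: [s in model]
out = jackknife(["a", "b", "c"], 3, train, parse_kbest)
print(out)   # every sentence is parsed by a model that never saw it
```

The same pattern extends to any number of folds; the key invariant is that `held_out` and `train_data` are always disjoint.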
Abstract | We show that the conversion is extremely difficult to perform, but are able to fairly compare the parsers on a representative subset of the PTB test section, obtaining results for the CCG parser that are statistically no different to those for the Berkeley parser . |
Conclusion | One question that is often asked of the CCG parsing work is “Why not convert back into the PTB representation and perform a Parseval evaluation?” By showing how difficult the conversion is, we believe that we have finally answered this question, as well as demonstrating comparable performance with the Berkeley parser. |
Evaluation | The Berkeley parser (Petrov and Klein, 2007) provides performance close to the state-of-the-art for the PTB parsing task, with reported F-scores of around 90%. |
Evaluation | As can be seen from the scores, these sentences form a slightly easier subset than the full section 00, but this is a subset which can be used for a fair comparison against the Berkeley parser, since the conversion process is not lossy for this subset. |
Evaluation | We compare the CCG parser to the Berkeley parser using the accurate mode of the Berkeley parser, together with the model supplied with the publicly available version. |
Introduction | The PTB parser we use for comparison is the publicly available Berkeley parser (Petrov and Klein, 2007). |
Experimental setup | We use the Maryland implementation of the Berkeley parser as our baseline for the kernel-smoothed lexicon, and the Maryland featured parser as our baseline for the embedding-featured lexicon. For all experiments, we use 50-dimensional word embeddings. |
Experimental setup | For each training corpus size we also choose a different setting of the number of splitting iterations over which the Berkeley parser is run; for 300 sentences this is two splits, and for |
Parser extensions | For the experiments in this paper, we will use the Berkeley parser (Petrov and Klein, 2007) and the related Maryland parser (Huang and Harper, 2011). |
Parser extensions | The Berkeley parser induces a latent, state-split PCFG in which each symbol V of the (observed) X-bar grammar is refined into a set of more specific symbols {V1, V2, ...}.
Parser extensions | First, these parsers are among the best in the literature, with a test performance of 90.7 F1 for the baseline Berkeley parser on the Wall Street Journal corpus (compared to 90.4 for Socher et al. |
Three possible benefits of word embeddings | These are precisely the kinds of distinctions between determiners for which state-splitting in the Berkeley parser has been shown to be useful (Petrov and Klein, 2007), and existing work (Mikolov et al., 2013b) has observed that such regular embedding structure extends to many other parts of speech. |
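As a rough illustration of the state-splitting idea mentioned above (not the Berkeley parser's actual implementation, which alternates splitting with merging and EM re-estimation), one split step over binary rules might look like:

```python
# Sketch of one split step of a latent-variable (state-split) PCFG:
# every symbol is refined into two subsymbols, and each rule's
# probability mass is spread evenly over the 4 refined expansions of
# each refined parent, with a tiny random jitter to break symmetry
# so that later EM iterations can specialize the subsymbols.
import random

def split_grammar(rules, seed=0):
    """rules: {(parent, left, right): prob} -> refined grammar."""
    rng = random.Random(seed)
    split = {}
    for (a, b, c), p in rules.items():
        for a_i in range(2):
            for b_i in range(2):
                for c_i in range(2):
                    jitter = 1.0 + rng.uniform(-0.01, 0.01)
                    split[(f"{a}_{a_i}", f"{b}_{b_i}", f"{c}_{c_i}")] = \
                        p / 4 * jitter
    return split

sg = split_grammar({("NP", "DT", "NN"): 1.0})
print(len(sg))   # 8 refined versions of the one original rule
```

Note that for each refined parent (say `NP_0`) the four refined expansions still sum to approximately the original rule probability, so the refined grammar remains (nearly) properly normalized before re-estimation.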
Experiments | The Berkeley Parser (Petrov et al., 2006) was employed for parsing the Chinese sentences. |
Experiments | For training the Berkeley Parser, we used the Chinese Treebank (CTB) 7.0. |
Experiments | We conducted our dependency-based pre-ordering experiments on the Berkeley Parser and the Mate Parser (Bohnet, 2010), which were shown to be the two best parsers for Stanford typed dependencies (Che et al., 2012). |
Capturing Syntagmatic Relations via Constituency Parsing | On the whole, the Berkeley parser processes IV words slightly better than our tagger, but processes OOV words significantly worse. |
Capturing Syntagmatic Relations via Constituency Parsing | The numbers in this table clearly show that the main weakness of the Berkeley parser is its predictive power on OOV words. |
Combining Both | We still use a Bagging model to integrate the discriminative tagger and the Berkeley parser. |
Combining Both | 11 indicate that the parsing accuracy of the Berkeley parser can be improved simply by supplying the Berkeley parser with the POS Bagging results. |
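The POS Bagging step referred to above can be sketched as a per-token majority vote over the component taggers' outputs; the tag sequences below are made up for illustration:

```python
# Sketch of POS Bagging: several taggers (or one tagger trained on
# bootstrap resamples of the data) each tag the same sentence, and
# the per-token majority vote becomes the final tag sequence that is
# then fed to the parser.
from collections import Counter

def bag_tags(tag_sequences):
    """tag_sequences: list of equal-length tag lists, one per tagger."""
    voted = []
    for token_tags in zip(*tag_sequences):
        tag, _ = Counter(token_tags).most_common(1)[0]
        voted.append(tag)
    return voted

runs = [["NN", "VB", "DT"],
        ["NN", "NN", "DT"],
        ["NN", "VB", "JJ"]]
print(bag_tags(runs))   # -> ['NN', 'VB', 'DT']
```

An odd number of taggers avoids two-way ties; with an even number, `Counter.most_common` falls back to first-insertion order, which implicitly prefers the first tagger.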
Introduction | We then present a comparative study of our tagger and the Berkeley parser, and show that the combination of the two models can significantly improve tagging accuracy. |
Conclusion | The resulting supervised parser outperforms the Berkeley parser, a state-of-the-art chart parser, in both accuracy and speed. |
Experiments | From the results we can see that our extended parser (baseline + padding + supervised features) outperforms the Berkeley parser by 0.3% on English, and is comparable with the Berkeley parser on Chinese (0.1% lower). |
Experiments | Specifically, the semi-supervised parser is 7 times faster than the Berkeley parser. |
Introduction | On standard evaluations using both the Penn Treebank and the Penn Chinese Treebank, our parser gave higher accuracies than the Berkeley parser (Petrov and Klein, 2007), a state-of-the-art chart parser. |
Introduction | In addition, our parser runs at over 89 sentences per second, which is 14 times faster than the Berkeley parser, and is the fastest that we are aware of for phrase-structure parsing. |
Analysis and Discussion | Given that with the more refined SVM ranker, the Berkeley parser worked nearly as well as all three parsers together using the complete feature set, the prospects for future work on a more realistic scenario using the OpenCCG parser in an SVM ranker for self-monitoring now appear much more promising, either using OpenCCG’s reimplementation of Hockenmaier & Steedman’s generative CCG model, or using the Berkeley parser trained on OpenCCG’s enhanced version of the CCGbank, along the lines of Fowler and Penn (2010). |
Reranking with SVMs 4.1 Methods | Finally, since the Berkeley parser yielded the best results on its own, we also tested models using all the feature classes but only using this parser by itself. |
Reranking with SVMs 4.1 Methods | Somewhat surprisingly, the Berkeley parser did as well as all three parsers using just the overall precision and recall features, but not quite as well using all features. |
Simple Reranking | We chose the Berkeley parser (Petrov et al., 2006), Brown parser (Charniak and Johnson, 2005) and Stanford parser (Klein and Manning, 2003) to parse the realizations generated by the
Simple Reranking | Simple ranking of the generative model’s n-best realizations with the Berkeley parser raised the BLEU score from 85.55 to 86.07, well below the averaged perceptron model’s BLEU score of 87.93. |
Abstract | We demonstrate that our method is faster than coarse-to-fine pruning, exemplified in both the Charniak and Berkeley parsers, by empirically comparing our parser to the Berkeley parser using the same grammar and under identical operating conditions. |
Conclusion and Future Work | We run the Berkeley parser with the default search parameterization to achieve the fastest possible parsing time. |
Conclusion and Future Work | Using this framework, we have shown that we can decrease parsing time by 65% over a standard beam-search without any loss in accuracy, and parse significantly faster than both the Berkeley parser and Chart Constraints. |
Results | Both our parser and the Berkeley parser are written in Java, both are run with Viterbi decoding, and both parse with the same grammar, so a direct comparison of speed and accuracy is fair. |
Experiments | We used the human-annotated parses for the sentences in the Penn Treebank, but parsed the Gigaword and BLLIP sentences with the Berkeley Parser. |
Experiments | PCFG-LA: The Berkeley Parser in language model mode. |
Experiments | We use signatures generated by the Berkeley Parser. |
Conclusion | The Berkeley parser’s grammars—by virtue of being unlexicalized—can be applied uniformly to all parse items. |
Introduction | Their system uses a grammar based on the Berkeley parser (Petrov and Klein, 2007) (which is particularly amenable to GPU processing), “compiling” the grammar into a sequence of GPU kernels that are applied densely to every item in the parse chart. |
Minimum Bayes risk parsing | It is of course important to verify the correctness of our system; one easy way to do so is to examine parsing accuracy, as compared to the original Berkeley parser. |
Minimum Bayes risk parsing | These results are nearly identical to the Berkeley parser’s most comparable numbers: 89.8 for Viterbi, and 90.9 for their “Max-Rule-Sum” MBR algorithm. |