Building the Resource | We then revised the rules with the aim of increasing both precision and recall . |
Evaluation | To obtain reliable estimates of both precision and recall, we decided to draw two different samples: (1) a sample of lemma pairs drawn from the induced derivational families, on which we estimate precision (P-sample), and (2) a sample of lemma pairs drawn from the set of possibly derivationally related lemma pairs, on which we estimate recall (R-sample).
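A minimal sketch of this two-sample estimation scheme, with hypothetical function names and data structures (not from the paper): precision is estimated by judging pairs drawn from the induced families, recall by checking how many known related pairs the resource recovers.

```python
# Hypothetical sketch of the two-sample estimation described above.

def estimate_precision(p_sample, is_truly_related):
    """p_sample: lemma pairs drawn from the induced derivational families.
    is_truly_related(pair) -> True if human judges deem the pair related."""
    correct = sum(1 for pair in p_sample if is_truly_related(pair))
    return correct / len(p_sample)

def estimate_recall(r_sample, induced_pairs):
    """r_sample: truly related lemma pairs drawn from the candidate set.
    induced_pairs: set of lemma pairs grouped together by the resource."""
    found = sum(1 for pair in r_sample if pair in induced_pairs)
    return found / len(r_sample)
```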
Evaluation | Table 5: Precision and recall on test samples |
Introduction | We conduct a thorough evaluation of the induced derivational families both regarding precision and recall . |
Results | We omit the F1 score because it is unclear how to combine precision and recall estimates obtained from different samples.
Results | The string distance-based approaches achieve more balanced precision and recall scores. |
Results | Note that for these methods, precision and recall can be traded off against each other by varying the number of clusters; we chose the number of clusters by optimizing the F1 score on the calibration and validation sets.
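A minimal sketch of this kind of model selection, assuming a generic evaluate(k) callback (hypothetical, not the paper's code): sweep candidate cluster counts and keep the one maximizing F1 on the held-out data.

```python
# Hypothetical sketch: choose the number of clusters by F1 on held-out data.

def f1(p, r):
    return 2 * p * r / (p + r) if p + r > 0 else 0.0

def select_num_clusters(candidate_ks, evaluate):
    """evaluate(k) -> (precision, recall) on calibration/validation data."""
    return max(candidate_ks, key=lambda k: f1(*evaluate(k)))
```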
Abstract | We attempt to tease apart the effects that this simple but effective modification has on alignment precision and recall tradeoffs, and how rare and common words are affected across several language pairs. |
Word alignment results | Consequently, in addition to AER, we focus on precision and recall . |
Word alignment results | Figure 3 shows the change in precision and recall with the amount of provided training data for the Hansards corpus. |
Word alignment results | We see that agreement constraints improve both precision and recall when we |
Evaluation of lexical similarity in context | Figure 3 shows the influence of the threshold used to select relevant pairs, plotting the precision and recall of the pairs kept at each threshold value, evaluated against the human annotation of relevance in context.
Evaluation of lexical similarity in context | If one wants to optimize the F-score (the harmonic mean of precision and recall) when extracting relevant pairs, the optimal point is at .24, for a threshold of .22 on Lin’s score.
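For reference, the F-score meant here is the standard balanced harmonic mean of precision P and recall R:

```latex
F = \frac{2PR}{P + R}
```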
Experiments: predicting relevance in context | Figure 3: Precision and recall on relevant links with respect to a threshold on the similarity measure (Lin’s score) |
Experiments: predicting relevance in context | We have seen that the relevant/not relevant classification is very imbalanced, biased towards the “not relevant” category (about 11%/89%), so we applied methods designed to counterbalance this, and will focus on the precision and recall of the predicted relevant links.
Experiments: predicting relevance in context | Other popular methods (maximum entropy, SVM) have shown a slightly inferior combined F-score, although their precision and recall vary more widely.
Introduction | Therefore, to develop a more nuanced self-monitoring reranker that is more robust to such parsing mistakes, we trained an SVM using dependency precision and recall features for all three parses, their n-best parsing results, and per-label precision and recall for each type of dependency, together with the realizer’s normalized perceptron model score as a feature. |
Reranking with SVMs 4.1 Methods | precision and recall: labeled and unlabeled precision and recall for each parser’s best parse
Reranking with SVMs 4.1 Methods | per-label precision and recall (dep): precision and recall for each type of dependency obtained from each parser’s best parse (using zero if not defined, for lack of predicted or gold dependencies with a given label)
Reranking with SVMs 4.1 Methods | n-best precision and recall (nbest): labeled and unlabeled precision and recall for each parser’s top five parses, along with the same features for the most accurate of these parses
Conclusion | Table 6: Comparison of grammar/lexicon observed in the model tagging vs. gold tagging in terms of precision and recall measures for supertagging on CCG-TUT.
Experiments | Precision and recall of grammar and lexicon. |
Experiments | Table 3: Comparison of grammar/lexicon observed in the model tagging vs. gold tagging in terms of precision and recall measures for supertagging on CCGbank data. |
Experiments | We can obtain a more-fine grained understanding of how the models differ by considering the precision and recall values for the grammars and lexicons of the different models, given in Table 3. |
Abstract | Experimental results show our method can achieve high precision and recall in measure word generation. |
Conclusion and Future Work | Experimental results show that our method not only achieves high precision and recall for generating measure words, but also improves the quality of English-to-Chinese SMT systems. |
Experiments | Table 3 and Table 4 show the precision and recall of our measure word generation method. |
Experiments | In addition to precision and recall , we also evaluate the Bleu score (Papineni et al., 2002) changes before and after applying our measure word generation method to the SMT output. |
Experiments | Table 6 and Table 7 show the precision and recall when using different features. |
Conclusion and Future Work | For experience detection, the performance was very promising, close to 92% in precision and recall when all the features were used.
Experience Detection | We not only compared our results with the baseline in terms of precision and recall but also |
Experience Detection | The performance for the best case with all the features included is very promising, close to 92% precision and recall.
Experience Detection | In order to see the effect of including individual features in the feature set, precision and recall were measured after eliminating a particular feature from the full set. |
Lexicon Construction | Note that the precision and recall are macro-averaged values across the two classes, activity and state. |
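A minimal sketch of macro-averaging over the two classes (names are illustrative, not the paper's code): the per-class precisions and recalls are averaged without weighting by class frequency.

```python
# Hypothetical sketch: macro-averaged precision and recall over two classes.

def macro_average(per_class_pr):
    """per_class_pr: {'activity': (p, r), 'state': (p, r)}"""
    precisions, recalls = zip(*per_class_pr.values())
    return sum(precisions) / len(precisions), sum(recalls) / len(recalls)
```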
Experimental Setup | Aggregate Extraction: Let A_e be the set of extracted relations for any of the systems; we compute aggregate precision and recall by comparing A_e with A.
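A minimal sketch of this set-based aggregate scoring (identifiers are illustrative): precision is the fraction of extracted relations found in the reference set, recall the fraction of reference relations that were extracted.

```python
# Hypothetical sketch of aggregate extraction scoring over relation sets.

def aggregate_pr(extracted, reference):
    """extracted, reference: sets of relation tuples."""
    correct = len(extracted & reference)
    precision = correct / len(extracted) if extracted else 0.0
    recall = correct / len(reference) if reference else 0.0
    return precision, recall
```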
Experimental Setup | We then report precision and recall for each system on this set of sampled sentences. |
Experiments | Since the data contains an unbalanced number of instances of each relation, we also report precision and recall for each of the ten most frequent relations. |
Experiments | Table 1 presents this approximate precision and recall for MULTIR on each of the relations, along with statistics we computed to measure the quality of the weak supervision. |
Introduction | use supervised learning of relation-specific examples, which can achieve high precision and recall . |
Related Work | While they offer high precision and recall , these methods are unlikely to scale to the thousands of relations found in text on the Web. |
Conclusion | We also present results on the precision and recall tradeoffs inherent in this task. |
Experiments | We note that previous work achieves higher ancestor precision, while our approach achieves a more even balance between precision and recall . |
Experiments | Of course, precision and recall should both ideally be high, even if some applications weigh one over the other. |
Introduction | We note that our approach falls at a different point in the space of performance tradeoffs from past work: by producing complete, highly articulated trees, we naturally see a more even balance between precision and recall, while past work generally focused on precision.
Introduction | While different applications will value precision and recall differently, and past work was often intentionally precision-focused, it is certainly the case that an ideal solution would maximize both.
Conclusion | Many researchers are trying to use IE to create large-scale knowledge bases from natural language text on the Web, but existing relation-specific techniques do not scale to the thousands of relations encoded in Web text — while relation-independent techniques suffer from lower precision and recall , and do not canonicalize the relations. |
Extraction with Lexicons | We expect that lists with higher similarity are more likely to contain phrases which are related to our seeds; hence, by varying the similarity threshold one may produce lexicons representing different compromises between lexicon precision and recall . |
Introduction | Open extraction is more scalable, but has lower precision and recall . |
Related Work | Open IE, self-supervised learning of unlexicalized, relation-independent extractors (Banko et al., 2007), is a more scalable approach, but suffers from lower precision and recall , and doesn’t canonicalize the relations. |
Related Work | The goal of set expansion techniques is to generate high precision sets of related items; hence, these techniques are evaluated based on lexicon precision and recall . |
Abstract | In this work, the problem of extracting phrase translations is formulated as an information retrieval process implemented with a log-linear model aiming for balanced precision and recall.
Conclusions | In this paper, the problem of extracting phrase translations is formulated as an information retrieval process implemented with a log-linear model aiming for balanced precision and recall.
Discussions | The generic phrase training algorithm follows an information retrieval perspective as in (Venugopal et al., 2003) but aims to improve both precision and recall with the trainable log-linear model. |
Discussions | It implies a balancing process between precision and recall . |
Introduction | As in information retrieval, precision and recall issues need to be addressed with a right balance for building a phrase translation table. |
Abstract | This paper presents WOE, an open IE system which improves dramatically on TextRunner’s precision and recall . |
Abstract | WOE can operate in two modes: when restricted to POS tag features, it runs as quickly as TextRunner, but when set to use dependency-parse features its precision and recall rise even higher.
Introduction | high precision and recall , they are limited by the availability of training data and are unlikely to scale to the thousands of relations found in text on the Web. |
Introduction | WOE can operate in two modes: when restricted to shallow features like part-of-speech (POS) tags, it runs as quickly as TextRunner, but when set to use dependency-parse features its precision and recall rise even higher.
Results and Discussion | Both precision and recall are improved with two exceptions: recall of B³ decreases from line 2 to 3 and from 15 to 16.
Results and Discussion | In contrast to F1, there is no consistent trend for precision and recall . |
Results and Discussion | But this higher variability for precision and recall is to be expected since every system trades the two measures off differently. |
Results | We present the results in terms of F-score only for simplicity; we then conduct an error analysis that examines precision and recall . |
Results | We consider the performance in terms of precision and recall in addition to F-score — see Table 7 (a). |
Results | Overall, there is no major tradeoff between precision and recall across the different settings, although we can observe the following: (i) adding more training data helps precision more than recall, by over three times (compare the last two columns in Table 7 (a)); and (ii) the best setting has slightly lower precision than all features but much better recall (compare columns 4 and 5 in Table 7 (a)).
Introduction | do not report (NR) separate values for precision and recall on this dataset. |
Introduction | Differences in both precision and recall between the baseline and the other systems are statistically significant at p < 0.01 using the two-tailed Fisher’s exact test. |
Introduction | Differences in both precision and recall between the baseline and the Span-HMM systems are statistically significant at p < 0.01 using the two-tailed Fisher’s exact test. |
Coreference Subtask Analysis | The MUC scoring algorithm (Vilain et al., 1995) computes the F1 score (harmonic mean) of precision and recall based on the identification of unique coreference links.
Coreference Subtask Analysis | The B³ algorithm (Bagga and Baldwin, 1998) computes a precision and recall score for each CE:
Coreference Subtask Analysis | Precision and recall for a set of documents are computed as the mean over all CEs in the documents and the F1 score of precision and recall is reported. |
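A minimal sketch of B-cubed scoring as described above, assuming every CE appears in both the response and the key (handling of twinless mentions is omitted):

```python
# Hypothetical sketch of B-cubed: per-mention precision/recall, then averaged.

def b_cubed(response, key):
    """response, key: dicts mapping each mention (CE) to the set of mentions
    in its cluster (each cluster set contains the mention itself)."""
    mentions = list(key)
    p = sum(len(response[m] & key[m]) / len(response[m]) for m in mentions)
    r = sum(len(response[m] & key[m]) / len(key[m]) for m in mentions)
    return p / len(mentions), r / len(mentions)
```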
Conclusion | The resulting predictions improve the precision and recall of both alignment links and extracted phrase pairs in Chinese-English experiments.
Experimental Results | The bidirectional model improves both precision and recall relative to all heuristic combination techniques, including grow-diag-final (Koehn et al., 2003). |
Experimental Results | As our model only provides small improvements in alignment precision and recall for the union combiner, the magnitude of the BLEU improvement is not surprising. |
Experiments and Results | It seems that the model gets quickly saturated in terms of incorporating new information and therefore precision and recall do not drastically change for increasing dataset sizes. |
Experiments and Results | For this reason we broke down the summary measures of precision and recall into their original components: true/false positive (TP/FP) and negative (TN/FN) counts, presented in the 2 × 2 contingency table of Figure 1.
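For reference, precision and recall recover from those four counts as follows; note that true negatives enter neither measure, which is exactly why the full contingency table is more informative here.

```python
# Precision and recall from the 2x2 contingency counts.

def pr_from_counts(tp, fp, fn, tn):
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # tn is deliberately unused: true negatives affect neither measure.
    return precision, recall
```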
Experiments and Results | The optimal solution applying µ* = 0.38 is more balanced between precision and recall and
Experiments | In figure 5 we compare the precision and recall of LDA-SP against the top two performing systems described by Pantel et al. |
Experiments | We find that LDA-SP achieves both higher precision and recall than ISP.IIM-∨.
Experiments | Figure 5: Precision and recall on the inference filtering task. |
CD | While the first level of constituent analysis has high precision and recall on NPs, the second level often does well finding prepositional phrases (PPs), especially in WSJ; see Table 7.
Phrasal punctuation revisited | The table shows absolute improvement (+) or decline (−) in precision and recall when phrasal punctuation is removed from the data.
Tasks and Benchmark | It measures precision and recall on constituents produced by a parser as compared to gold standard constituents. |
BLEU and PORT | 2.2.1 Precision and Recall |
BLEU and PORT | To combine precision and recall , we tried four averaging methods: arithmetic (A), geometric (G), harmonic (H), and quadratic (Q) mean. |
BLEU and PORT | We chose the quadratic mean to combine precision and recall , as follows: |
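The formula itself is elided above; for reference, the standard unweighted forms of the four candidate means of precision P and recall R are as follows (the paper's actual combination may introduce weights):

```latex
A = \frac{P + R}{2}, \qquad
G = \sqrt{PR}, \qquad
H = \frac{2PR}{P + R}, \qquad
Q = \sqrt{\frac{P^2 + R^2}{2}}
```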
Extracting Conversational Networks from Literature | The precision and recall of our method for detecting conversations are shown in Table 2.
Extracting Conversational Networks from Literature | To calculate precision and recall for the two baseline social networks, we set a threshold t to derive a binary prediction from the continuous edge weights.
Extracting Conversational Networks from Literature | The precision and recall values shown for the baselines in Table 2 represent the highest performance we achieved by varying t between 0 and 1 (maximizing F-measure over t).
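A minimal sketch of this baseline sweep (data structures are hypothetical): binarize the weighted edges at each threshold t in [0, 1] and keep the t with the best F-measure against the gold network.

```python
# Hypothetical sketch: pick the edge-weight threshold maximizing F-measure.

def best_threshold(edge_weights, gold_edges, steps=100):
    """edge_weights: {edge: weight in [0, 1]}; gold_edges: set of edges."""
    best = (0.0, 0.0, 0.0, 0.0)  # (f, t, precision, recall)
    for i in range(steps + 1):
        t = i / steps
        predicted = {e for e, w in edge_weights.items() if w >= t}
        correct = len(predicted & gold_edges)
        p = correct / len(predicted) if predicted else 0.0
        r = correct / len(gold_edges) if gold_edges else 0.0
        f = 2 * p * r / (p + r) if p + r else 0.0
        best = max(best, (f, t, p, r))
    return best
```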
Fitting the Model | We also measure the quality of the two observed grammars/dictionaries by computing their precision and recall against the grammar/dictionary we observe in the gold tagging. We find that precision of the observed grammar increases from 0.73 (EM) to 0.94 (IP+EM).
Fitting the Model | Figure 6: Comparison of observed grammars from the model tagging vs. gold tagging in terms of precision and recall measures. |
Restarts and More Data | Figure 7: Comparison of observed dictionaries from the model tagging vs. gold tagging in terms of precision and recall measures. |
Comparative Evaluation | MENTA is the closest system to ours, obtaining slightly higher precision and recall . |
Comparative Evaluation | Notably, however, MENTA outputs the first WordNet sense of entity for 13.17% of all the given answers, which, despite being correct and accounted for in precision and recall, is uninformative.
Phase 1: Inducing the Page Taxonomy | Not only does our taxonomy show high precision and recall in extracting ambiguous hypernyms, it also disambiguates more than 3/4 of the hypernyms with high precision. |
Dependency-based evaluation | Overlap between the candidate bag and the reference bag is calculated in the form of precision, recall, and the f-measure (with precision and recall equally weighted). |
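A minimal sketch of this bag-overlap scoring (illustrative, not the evaluation tool itself): the bags are multisets of dependencies, so the overlap is a multiset intersection.

```python
# Hypothetical sketch: precision/recall/F over bags (multisets) of dependencies.
from collections import Counter

def bag_overlap_prf(candidate, reference):
    """candidate, reference: lists of dependency triples."""
    cand, ref = Counter(candidate), Counter(reference)
    overlap = sum((cand & ref).values())  # size of the multiset intersection
    p = overlap / sum(cand.values()) if cand else 0.0
    r = overlap / sum(ref.values()) if ref else 0.0
    f = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f
```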
Discussion and future work | Much like the inverse relation of precision and recall , changes and additions that improve a metric’s correlation with human scores for model summaries often weaken the correlation for system summaries, and vice versa. |
Lexical-Functional Grammar and the LFG parser | (2004) obtains high precision and recall rates. |
Experiments | Figure 4: Labeled precision and recall relative to dependency length when the normal-form model is used.
Experiments | Table 1 shows the accuracies of all parsers on the development set, in terms of labeled precision and recall over the predicate-argument dependencies in CCGBank. |
Experiments | To probe this further we compare labeled precision and recall relative to dependency length, as measured by the distance between the two words in a dependency, grouped into bins of 5 values. |