Conclusions | At an optimal decision threshold, however, both yielded a similar f-measure result. |
Experiments and Results | Since both algorithms show different behaviour with increasing experience and PROMODES-H yields a higher f-measure across all datasets, we will investigate in the next experiments how these differences manifest themselves at the boundary level. |
Abstract | Our model outperforms a GIZA++ Model-4 baseline by 6.3 points in F-measure, yielding a 1.1 BLEU score increase over a state-of-the-art syntax-based machine translation system. |
Conclusion | We treat word alignment as a parsing problem, and by taking advantage of English syntax and the hypergraph structure of our search algorithm, we report significant increases in both F-measure and BLEU score over standard baselines in use by most state-of-the-art MT systems today. |
Discriminative training | Note that Equation 2 is equivalent to maximizing the sum of the F-measure and model score of y: |
Experiments | These plots show the current F-measure on the training set as time passes. |
Experiments | The first three columns of Table 2 show the balanced F-measure, Precision, and Recall of our alignments versus the two GIZA++ Model-4 baselines. |
Word Alignment as a Hypergraph | (1) that using the structure of 1-best English syntactic parse trees is a reasonable way to frame and drive our search, and (2) that F-measure approximately decomposes over hyperedges. |
Introduction | Neither recall nor f-measure is reported. |
Introduction | The feature whose omission causes the biggest drop in f-measure is set aside as a strong feature. |
Introduction | From the ranked list of features f1 to fn we evaluate increasingly extended feature sets f1..fi for i = 2..n. We select the feature set that yields the best balanced performance, at 45.7% precision and 53.6% f-measure. |
Conclusion | Comparing with TextRunner, WOEPOS runs at the same speed, but achieves an F-measure which is between 18% and 34% greater on three corpora; WOEparse achieves an F-measure which is between 72% and 91% higher than that of TextRunner, but runs about 30 times slower due to the time required for parsing. |
Experiments | Figure 3: WOEPOS achieves an F-measure which is between 18% and 34% better than TextRunner’s. |
Experiments | Figure 4: WOEparse’s F-measure decreases more slowly with sentence length than WOEPOS and TextRunner, due to its better handling of difficult sentences using parser features. |
Introduction | Compared with TextRunner (the state of the art) on three corpora, WOE yields an F-measure improvement of between 72% and 91%, generalizing well beyond Wikipedia. |
Wikipedia-based Open IE | As shown in the experiments on three corpora, WOEparse achieves an F-measure which is between 72% and 91% greater than TextRunner’s. |
Wikipedia-based Open IE | As shown in the experiments, WOEPOS achieves an F-measure between 18% and 34% higher than TextRunner’s on three corpora, mainly due to an increase in precision. |
Extracting Conversational Networks from Literature | Table 2: Precision, recall, and F-measure of three methods for detecting bilateral conversations in literary texts. |
Extracting Conversational Networks from Literature | The precision and recall values shown for the baselines in Table 2 represent the highest performance we achieved by varying t between 0 and 1 (maximizing F-measure over 25). |
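The baseline tuning described above, varying a threshold t between 0 and 1 and keeping the value that maximizes F-measure, can be sketched as follows. This is a hypothetical illustration, not the paper's code; `score_fn` and the toy precision/recall curve are invented for the example.

```python
# Hypothetical sketch of a threshold sweep: vary t over [0, 1] and keep
# the value that maximizes balanced F-measure. score_fn(t) is assumed to
# return (precision, recall) at threshold t; the toy curve below is
# illustrative only.

def f_measure(precision: float, recall: float) -> float:
    """Balanced F-measure: harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

def best_threshold(score_fn, thresholds):
    """Return the (t, F) pair with the highest F-measure."""
    best_t, best_f = None, -1.0
    for t in thresholds:
        p, r = score_fn(t)
        f = f_measure(p, r)
        if f > best_f:
            best_t, best_f = t, f
    return best_t, best_f

# Toy example: precision rises and recall falls linearly with t.
t, f = best_threshold(lambda t: (t, 1.0 - t),
                      [i / 100 for i in range(101)])  # → (0.5, 0.5)
```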
Extracting Conversational Networks from Literature | Both baselines performed significantly worse in precision and F-measure than our quoted speech adjacency method for detecting conversations. |
Experiments | F-Measure (F): the harmonic mean of purity and inverse purity. |
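The definition above (F as the harmonic mean of purity and inverse purity) can be written as a one-line computation; the input values here are illustrative, not results from the evaluation.

```python
# Minimal sketch of the clustering F-measure defined above: the harmonic
# mean of purity and inverse purity. Inputs are illustrative values.

def harmonic_f(purity: float, inverse_purity: float) -> float:
    """Harmonic mean of purity and inverse purity."""
    if purity + inverse_purity == 0:
        return 0.0
    return 2 * purity * inverse_purity / (purity + inverse_purity)

f = harmonic_f(0.8, 0.6)  # ≈ 0.686
```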
Experiments | We use F-measure as the primary measure, following WePS1 and WePS2. |
Experiments | The F-Measure vs. 1 on three data sets |
Substructure Spaces for BTKs | The coefficients Oi for the composite kernel are tuned with respect to F-measure (F) on the development set of the HIT corpus. |
Substructure Spaces for BTKs | Those thresholds are also tuned on the development set of the HIT corpus with respect to F-measure. |
Substructure Spaces for BTKs | The evaluation is conducted by means of Precision (P), Recall (R) and F-measure (F). |
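The three measures named above are standard and can be computed from true-positive, false-positive, and false-negative counts. A minimal sketch, with illustrative counts rather than figures from the HIT corpus evaluation:

```python
# Hedged sketch of Precision (P), Recall (R), and F-measure (F) from
# raw counts; tp/fp/fn values below are invented for illustration.

def precision_recall_f(tp: int, fp: int, fn: int):
    """Return (P, R, F) computed from tp/fp/fn counts."""
    p = tp / (tp + fp) if tp + fp else 0.0
    r = tp / (tp + fn) if tp + fn else 0.0
    f = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f

p, r, f = precision_recall_f(tp=80, fp=20, fn=40)  # P=0.8, R≈0.667, F≈0.727
```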