Abstract | Results show that grounded language models improve perplexity and word error rate over text-based language models, and further, support video information retrieval better than human-generated speech transcriptions. |
Evaluation | We evaluate our grounded language modeling approach using three metrics: perplexity, word error rate, and precision on an information retrieval task.
Evaluation | 4.2 Word Accuracy and Error Rate |
Evaluation | Word error rate (WER) is a normalized measure of the number of word insertions, substitutions, and deletions required to transform the output transcription of an ASR system to a human-generated gold-standard transcription of the same utterance.
Introduction | Results indicate improved performance using three metrics: perplexity, word error rate, and precision on an information retrieval task.
Abstract | In comparison with a state-of-the-art syllabification system, we reduce the syllabification word error rate for English by 33%. |
Introduction | With this approach, we reduce the error rate for English by 33%, relative to the best existing system. |
L2P Performance | In English, perfect syllabification produces a relative error reduction of 10.6%, and our model captures over half of the possible improvement, reducing the error rate by 6.0%. |
L2P Performance | Although perfect syllabification reduces their L2P relative error rate by 18%, they find that their learned model actually increases the error rate.
L2P Performance | For Dutch, perfect syllabification reduces the relative L2P error rate by 17.5%; we realize over 70% of the available improvement with our syllabification model, reducing the relative error rate by 12.4%. |
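The relative error rate reductions quoted in these comparisons (33%, 10.6%, 17.5%, 12.4%, and so on) all follow the same arithmetic; a minimal sketch, with an illustrative function name not taken from any of the cited papers:

```python
def relative_error_reduction(baseline_err, new_err):
    """Relative error rate reduction, in percent:
    100 * (baseline - new) / baseline."""
    return 100.0 * (baseline_err - new_err) / baseline_err

# Cutting a 30% error rate down to 20% is a 33.3% relative reduction,
# even though the absolute reduction is only 10 points.
print(round(relative_error_reduction(30.0, 20.0), 1))
```

This is why a system can report a large relative reduction while the absolute error rates of the two systems remain close.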
Syllabification Experiments | Syllable break error rate (SB ER) captures the incorrect tags that cause an error in syllabification. |
Syllabification Experiments | Table 1 presents the word accuracy and syllable break error rate achieved by each of our tag sets on both the CELEX and NETtalk datasets. |
Syllabification Experiments | Overall, our best tag set lowers the error rate by one-third, relative to SbA’s performance. |
Abstract | This paper analyzes a variety of lexical, prosodic, and disfluency factors to determine which are likely to increase ASR error rates.
Abstract | (3) Although our results are based on output from a system with speaker adaptation, speaker differences are a major factor influencing error rates, and the effects of features such as frequency, pitch, and intensity may vary between speakers.
Data | The standard measure of error used in ASR is word error rate (WER), computed as 100(I + D + S) / R, where I, D, and S are the number of insertions, deletions, and substitutions found by aligning the ASR hypotheses with the reference transcriptions, and R is the number of reference words.
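The WER formula above can be made concrete with a small Levenshtein alignment over word tokens; a minimal sketch, not the implementation used by any of the cited systems:

```python
def wer(reference, hypothesis):
    """Word error rate: 100 * (I + D + S) / R, where R = len(reference),
    computed via Levenshtein alignment over word tokens."""
    ref, hyp = reference.split(), hypothesis.split()
    # d[i][j] = min edit operations turning ref[:i] into hyp[:j]
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i                      # i deletions
    for j in range(len(hyp) + 1):
        d[0][j] = j                      # j insertions
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            sub = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,        # deletion
                          d[i][j - 1] + 1,        # insertion
                          d[i - 1][j - 1] + sub)  # substitution / match
    return 100.0 * d[len(ref)][len(hyp)] / len(ref)

# One deleted word out of a six-word reference: 100 * 1/6
print(wer("the cat sat on the mat", "the cat sat on mat"))
```

Note that WER can exceed 100% when the hypothesis contains many insertions, since the count of edits is normalized by the reference length only.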
Data | Since we wish to know what features of a reference word increase the probability of an error, we need a way to measure the errors attributable to individual words — an individual word error rate (IWER). |
Introduction | Previous work on recognition of spontaneous monologues and dialogues has shown that infrequent words are more likely to be misrecognized (Fosler-Lussier and Morgan, 1999; Shinozaki and Furui, 2001) and that fast speech increases error rates (Siegler and Stern, 1995; Fosler-Lussier and Morgan, 1999; Shinozaki and Furui, 2001).
Introduction | Siegler and Stern (1995) and Shinozaki and Furui (2001) also found higher error rates in very slow speech. |
Introduction | Word length (in phones) has also been found to be a useful predictor of higher error rates (Shinozaki and Furui, 2001). |
Abstract | We report a significant reduction in word error rate compared to a state-of-the-art baseline system. |
Experiments | For a given test set we could then compare the word error rate of the baseline system with that of the extended system employing the grammar-based language model. |
Experiments | exceptionally high baseline word error rate.
Experiments | These classes are interviews (a word error rate of 36.1%), sports reports (28.4%) and press conferences (25.7%). |
Language Model 2.1 The General Approach | The influence of N on the word error rate is discussed in the results section. |
Abstract | We create new training sets for English and Dutch from the CELEX European lexical resource, and achieve error rates for English of less than 0.1% for correctly allowed hyphens, and less than 0.01% for Dutch. |
Abstract | Experiments show that both the Knuth/Liang method and a leading current commercial alternative have error rates several times higher for both languages.
Experimental design | In order to measure accuracy, we compute the confusion matrix for each method, and from this we compute error rates.
Experimental design | We report both word-level and letter-level error rates.
Experimental design | The word-level error rate is the fraction of words on which a method makes at least one mistake. |
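The word-level error rate defined here, together with its letter-level counterpart, can be sketched as follows; the data layout (one label string per word, one label per letter position, as in hyphenation decisions) is a hypothetical illustration, not the paper's representation:

```python
def word_level_error_rate(gold, pred):
    """Fraction of words on which the method makes at least one mistake.
    gold, pred: parallel lists of per-word label strings."""
    wrong = sum(1 for g, p in zip(gold, pred) if g != p)
    return wrong / len(gold)

def letter_level_error_rate(gold, pred):
    """Fraction of individual letter positions labelled incorrectly."""
    errors = total = 0
    for g, p in zip(gold, pred):
        errors += sum(1 for a, b in zip(g, p) if a != b)
        total += len(g)
    return errors / total

# Two words; the second has one wrong letter decision out of three.
gold = ["0100", "001"]
pred = ["0100", "011"]
print(word_level_error_rate(gold, pred))    # half the words have an error
print(letter_level_error_rate(gold, pred))  # 1 of 7 letter positions wrong
```

The two rates diverge exactly when errors cluster: a method whose mistakes pile up on a few hard words looks worse at the letter level than one that spreads single mistakes across many words.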
Experimental results | Figure 1 shows how the error rate is affected by increasing the CRF probability threshold for each language. |
Experimental results | Figure 1 shows confidence intervals for the error rates . |
Experimental results | All differences between rows in Table 2 are significant, with one exception: the serious error rates for PATGEN and TALO are not statistically significantly different.
History of automated hyphenation | The lowest per-letter test error rate reported is about 2%. |
Getting Humans in the Loop | Figure 6: The relative error rate (using round 0 as a baseline) of the best Mechanical Turk user session for each of the four numbers of topics. |
Simulation Experiment | The lower the classification error rate, the better the model has captured the structure of the corpus.
Simulation Experiment | While Null sees no constraints, it serves as an upper baseline for the error rate (lower error being better) but shows the effect of additional inference. |
Simulation Experiment | All Full is a lower baseline for the error rate since it both sees the constraints at the beginning and also runs for the maximum number of total iterations. |
Conclusions | Experiments have shown that this randomized language model can be combined with entropy pruning to achieve further memory reductions; that error rates occurring in practice are much lower than those predicted by theoretical analysis due to the use of runtime sanity checks; and that the same translation quality as a lossless language model representation can be achieved when using 12 ‘error’ bits, resulting in approx. |
Experiments | Section (3) analyzed the theoretical error rate; here, we measure error rates in practice when retrieving n-grams for approx. |
Experiments | The error rates for bigrams are close to their expected values. |
Perfect Hash-based Language Models | There is a tradeoff between space and error rate since the larger B is, the lower the probability of a false positive. |
Perfect Hash-based Language Models | For example, if |V| is 128 then taking B = 1024 gives an error rate of e = 128/1024 = 0.125, with each entry in A using ⌈log2 1024⌉ = 10 bits.
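The worked example above (|V| = 128, B = 1024) can be checked in a couple of lines; `fingerprint_params` is an illustrative name, not from the paper:

```python
import math

def fingerprint_params(value_range, codomain):
    """Error rate and per-entry storage for a lossy fingerprint scheme:
    a false positive happens when a random codomain value collides,
    so e = |V| / B, and each entry in A needs ceil(log2 B) bits."""
    error_rate = value_range / codomain
    bits_per_entry = math.ceil(math.log2(codomain))
    return error_rate, bits_per_entry

e, bits = fingerprint_params(value_range=128, codomain=1024)
print(e, bits)  # 0.125 10, matching the worked example
```

This makes the space/error tradeoff in the surrounding text concrete: doubling B halves the false-positive rate but costs one extra bit per stored entry.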
Perfect Hash-based Language Models | Querying each of these arrays for each n-gram requested would be inefficient and inflate the error rate since a false positive could occur on each individual array. |
Scaling Language Models | The space required in such a lossy encoding depends only on the range of values associated with the n-grams and the desired error rate, i.e.
Conclusion | The evaluation results showed that the proposed method significantly improved word alignment, achieving an absolute error rate reduction of 29% on multi-word alignment. |
Experiments on Phrase-Based SMT | And Koehn's implementation of minimum error rate training (Och, 2003) is used to tune the feature weights on the development set. |
Experiments on Word Alignment | For multi-word alignments, our methods significantly outperform the baseline method in terms of both precision and recall, achieving up to 18% absolute error rate reduction. |
Experiments on Word Alignment | With CM-3, the error rate of the multi-word alignment results is further reduced.
Experiments on Word Alignment | We can see that WA-1 achieves a lower alignment error rate as compared to the baseline method, since the performance of the improved one-directional alignment method is better than that of GIZA++.
Introduction | The evaluation results show that the proposed method in this paper significantly improves multi-word alignment, achieving an absolute error rate reduction of 29%.
Conclusion | The CRF achieves a lower error rate on the tagging task, but the RNN-trained model is better for the translation task.
Experiments | Several experiments were run to find suitable values for the hyperparameters p1 and p2; we choose the model with the lowest error rate on the validation corpus for the translation experiments.
Experiments | The error rate of the chosen model on the test corpus (the test corpus in Table 2) is 25.75% for the token error rate and 69.39% for the sequence error rate.
Experiments | The RNN has a token error rate of 27.31% and a sentence error rate of 77.00% over the test corpus in Table 2. |
Translation System Overview | The model scaling factors λ_1^M are trained with Minimum Error Rate Training (MERT).
Experimental evaluation | It is hard to quote the verbatim word error rate of the recognizer, because this would require a careful and time-consuming manual transcription of the test set. |
Experimental evaluation | Using the alignment we compute precision and recall for section headings and punctuation marks as well as the overall token error rate.
Experimental evaluation | It should be noted that the error rate derived in this way is not comparable to the word error rates usually reported in speech recognition research.
Probabilistic model | The decision rule (1) minimizes the document error rate . |
Transformation based learning | This method iteratively improves the match (as measured by token error rate) of a collection of corresponding source and target token sequences by positing and applying a sequence of substitution rules.
Abstract | In experiments on a subset of the Switchboard conversational speech corpus, our models thus far improve classification error rates from a previously published result of 29.1% to about 15%. |
Experiments | We measure performance by error rate (ER), the proportion of test examples predicted incorrectly. |
Experiments | We can see that, by adding just the Levenshtein distance, the error rate drops significantly.
Experiments | Table 2: Lexical access error rates (ER) on the same data split as in (Livescu and Glass, 2004; Jyothi et al., 2011). |
Introduction | For generative models, phonetic error rate of generated pronunciations (Venkataramani and Byrne, 2001) and
Abstract | Overall, our system substantially outperforms state-of-the-art solutions for this task, achieving a 31% relative reduction in word error rate over the leading commercial system for historical transcription, and a 47% relative reduction over Tesseract, Google’s open source OCR system. |
Experiments | We evaluate the output of our system and the baseline systems using two metrics: character error rate (CER) and word error rate (WER). |
Experiments | Table 1: We evaluate the predicted transcriptions in terms of both character error rate (CER) and word error rate (WER), and report macro-averages across documents. |
Introduction | For example, even state-of-the-art OCR systems produce word error rates of over 50% on the documents shown in Figure 1. Unsurprisingly, such error rates are too high for many research projects (Arlitsch and Herbert, 2004; Shoemaker, 2005; Holley, 2010).
Learning | On document (a), which exhibits noisy typesetting, our system achieves a word error rate (WER) of 25.2. |
Experiments | One possible reason is that we only used n-best derivations instead of all possible derivations for minimum error rate training. |
Extended Minimum Error Rate Training | Minimum error rate training (Och, 2003) is widely used to optimize feature weights for a linear model (Och and Ney, 2002). |
Extended Minimum Error Rate Training | The key idea of MERT is to tune one feature weight at a time to minimize the error rate while keeping the others fixed.
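The one-weight-at-a-time idea is a form of coordinate descent. The sketch below substitutes a crude grid search for Och's exact line search along each coordinate (which exploits the piecewise-linear shape of the error surface), so it only illustrates the outer loop, not the real algorithm; all names are illustrative:

```python
def coordinate_mert(weights, error_fn, grid, passes=3):
    """Simplified sketch of MERT's outer loop: repeatedly tune one
    feature weight to minimize error while keeping the others fixed.

    error_fn(weights) scores a full weight vector; grid is the set of
    candidate values tried per coordinate (a stand-in for the exact
    line search of Och, 2003)."""
    weights = list(weights)
    for _ in range(passes):
        for k in range(len(weights)):
            best_v, best_err = weights[k], error_fn(weights)
            for v in grid:
                trial = weights[:k] + [v] + weights[k + 1:]
                err = error_fn(trial)
                if err < best_err:   # keep the best value for this axis
                    best_v, best_err = v, err
            weights[k] = best_v
    return weights

# Toy error surface with a unique minimum at w = (1, -2).
toy_err = lambda w: (w[0] - 1) ** 2 + (w[1] + 2) ** 2
print(coordinate_mert([0.0, 0.0], toy_err, [-3, -2, -1, 0, 1, 2, 3]))
```

Because each coordinate is optimized greedily, the procedure can stall in a local minimum of a non-convex error surface, which is why real MERT is typically restarted from multiple initial weight vectors.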
Extended Minimum Error Rate Training | Unfortunately, minimum error rate training cannot be directly used to optimize feature weights of max-translation decoding because Eq. |
Introduction | 0 As multiple derivations are used for finding optimal translations, we extend the minimum error rate training (MERT) algorithm (Och, 2003) to tune feature weights with respect to BLEU score for max-translation decoding (Section 4). |
Experimental Setup | A word error rate (WER) of 31% on the SM dataset was observed. |
Experimental Setup | However, due to substantial amount of speech recognition errors in our data, the POS error rate (resulting from the combined errors of ASR and automated POS tagger) is expected to be higher. |
Related Work | Automatic recognition of nonnative speakers’ spontaneous speech is a challenging task as evidenced by the error rate of the state-of-the-art systems.
Related Work | For instance, Chen and Zechner (2011) reported a 50.5% word error rate (WER) and Yoon and Bhat (2012) reported a 30% WER in the recognition of ESL students’ spoken responses. |
Related Work | These high error rates at the recognition stage negatively affect the subsequent stages of the speech scoring system in general, and in particular, during a deep syntactic analysis, which operates on a long sequence of words as its context. |
Abstract | We achieve 10.1% and 11.4% reduction in recognition word error rate (WER) relative to a standard baseline system on typewritten and handwritten Arabic respectively. |
Discriminative Reranking for OCR | The loss is computed as the Word Error Rate (WER) of the |
Discriminative Reranking for OCR | During training, the weights are updated according to the Margin-Infused Relaxed Algorithm (MIRA), whenever the highest scoring hypothesis differs from the hypothesis with the lowest error rate.
Experiments | We note that a small number of hypotheses per list is sufficient for RankSVM to obtain good performance, but increasing it further seems to increase the error rate.
Abstract | We present experimental results for heavily pruned backoff n-gram models, and demonstrate perplexity and word error rate reductions when used with various baseline smoothing methods. |
Experimental results | We now look at the impacts on system performance we can achieve with these new models, and whether the perplexity differences that we observe translate to real error rate reductions.
Experimental results | Word error rate (WER) |
Experimental results | The perplexity reductions that were achieved for these models do translate to real word error rate reductions at both stages of between 0.5 and 0.9 percent absolute. |
Abstract | A linear model is defined over derivations, and minimum error rate training is used to tune feature weights based on a set of question-answer pairs. |
Introduction | Derivations generated during such a translation procedure are modeled by a linear model, and minimum error rate training (MERT) (Och, 2003) is used to tune feature weights based on a set of question-answer pairs. |
Introduction | Given a set of question-answer pairs {Q_i, A_i} as the development (dev) set, we use the minimum error rate training (MERT) (Och, 2003) algorithm to tune the feature weights λ in our proposed model.
Introduction | A lower disambiguation error rate results in better relation expressions. |
Conclusion | However, our model’s performance today correlates strongly with an orthogonal accuracy metric, word error rate , on unseen data. |
Conclusion | This suggests that “retry rate” is a reasonable offline quality metric, to be considered in context among other metrics and traditional evaluation based on word error rate . |
Introduction | In particular, we seek to measure and minimize the word error rate (WER) of a system, with a WER of zero indicating perfect transcription. |
Prediction task | We do not have retry annotations for this larger set, but we have transcriptions for the first member of each query pair, enabling us to calculate the word error rate (WER) of each query’s recognition hypothesis, and thus obtain ground truth for half of our retry definition. |
Abstract | The experimental results demonstrate that our model is able to significantly outperform the state-of-the-art coherence model by Barzilay and Lapata (2005), reducing the error rate of the previous approach by an average of 29% over three data sets against human upper bounds. |
Experiments | For the combined model, the error rates are significantly reduced in all three data sets. |
Experiments | The average error rate reductions against 100% are 9.57% for the full model and 26.37% for the combined model. |
Experiments | If we compute the average error rate reductions against the human upper bounds (rather than an oracular 100%), the average error rate reduction for the full model is 29% and that for the combined model is 73%. |
Abstract | Evaluation results show a 44% reduction in the error rate relative to the best prior systems, averaging over all metrics, and up to 61% reduction in the error rate on grammaticality judgments. |
Conclusions | Our system achieved a 44% reduction in the error rate relative to both the Heilman and Smith, and the Lindberg et al. |
Results | As seen in Table 4, our results represent a 44% reduction in the error rate relative to Heilman and Smith on the average rating over all metrics, and as high as 61% reduction in the error rate on grammaticality judgments. |
Results | Interestingly, our system again achieved a 44% reduction in the error rate when averaging over all metrics, just as it did in the Heilman and Smith comparison. |
Abstract | The experimental results show that 1) linguistic features alone outperform word posterior probability based confidence estimation in error detection; and 2) linguistic features can further provide complementary information when combined with word confidence scores, which collectively reduce the classification error rate by 18.52% and improve the F measure by 16.37%.
Experiments | To determine the true class of a word in a generated translation hypothesis, we follow (Blatz et al., 2003) to use the word error rate (WER). |
Experiments | To evaluate the overall performance of error detection, we use the commonly used classification error rate (CER) metric to evaluate our classifiers.
SMT System | For minimum error rate tuning (Och, 2003), we use NIST MT-02 as the development set for the translation task. |
Experiments and results | Table 2: Percentage of positive samples, and averaged error rate for positive (P) and negative (N) samples for the first 20 iterations using the agreement-based and our confidence labeling methods. |
Experiments and results | Table 2 shows the percentage of the positive samples added for the first 20 iterations, and the average labeling error rate of those samples for the self-labeled positive and negative classes for two methods. |
Experiments and results | The agreement-based random selection added more negative samples that also have higher error rate than the positive samples. |
Conditional Random Fields | Results are reported in terms of phoneme error rates or tag error rates on the test set. |
Conditional Random Fields | Table 1: Features jointly testing label pairs and the observation are useful (error rates and feature counts).
Conditional Random Fields | Table 3: Error rates of the three regularizers on the Nettalk task. |
Abstract | We compare the performance of this model with that achieved using manual and automatic transcripts, and find that this new approach is roughly equivalent to having access to ASR transcripts with word error rates in the 33-37% range without actually having to do the ASR, plus it better handles utterances with out-of-vocabulary words.
Experimental results | Since ASR performance can vary greatly as we discussed above, we compare our system against automatic transcripts having word error rates of 12.6%, 20.9%, 29.2%, and 35.5% on the same speech source. |
Experimental results | [Figure residue: panels plotting ROUGE score against word error rate (0 to 0.5) for summary lengths Len=20% and Len=30%, with random baselines Rand=0.324, 0.340, 0.389, and 0.402.]
Experimental setup | These transcripts contain a word error rate of 12.6%, which is comparable to the best accuracies obtained in the literature on this data set. |
Abstract | Different well-defined approaches have been proposed, but the problem remains far from being solved: the best systems achieve an 11% Word Error Rate.
Abstract | Evaluated in French by 10-fold-cross validation, the system achieves a 9.3% Word Error Rate and a 0.83 BLEU score. |
Evaluation | The system was evaluated in terms of BLEU score (Papineni et al., 2001), Word Error Rate (WER) and Sentence Error Rate (SER). |
Comparison to BabySRL | Error rates: Initial .36; Trained .11; Initial (given 2 args) .66; Trained (given 2 args) .13; 2008 arg-arg position .65; 2008 arg-verb position 0; 2009 arg-arg position .82; 2009 arg-verb position .63
Comparison to BabySRL | The model presented in this paper does not share this restriction, so the raw error rate for this model is presented in the first two lines; the error rate once this additional restriction is imposed is given in the second two lines. |
Comparison to BabySRL | The 1-1 role bias error rate (before training) of the model presented in this paper is comparable to that of Connor et al. |
Introduction | Fraser and Marcu (2007) note that none of the tens of papers published over the last five years has shown that significant decreases in alignment error rate (AER) result in significant increases in translation performance. |
Introduction | After presenting the models and the algorithm in Sections 2 and 3, in Section 4 we examine how the new alignments differ from standard models, and find that the new method consistently improves word alignment performance, measured either as alignment error rate or weighted F—score. |
Word alignment results | (2008) show that alignment error rate (Och and Ney, 2003) can be improved with agreement constraints. |
Experiments | Although we did not examine the accuracy of real tasks in this paper, there is an interesting report that the word error rate of language models follows a power law with respect to perplexity (Klakow and Peters, 2002). |
Experiments | Thus, we conjecture that the word error rate also has a similar tendency as perplexity with respect to the reduced vocabulary size. |
Introduction | Each of these studies experimentally discusses tradeoff relationships between the size of the reduced corpus/model and its performance measured by perplexity, word error rate, and other factors.
Abstract | We present an adaptive translation quality estimation (QE) method to predict the human-targeted translation error rate (HTER) for a document-specific machine translation model. |
Introduction | In this paper we propose an adaptive quality estimation that predicts sentence-level human-targeted translation error rate (HTER) (Snover et al., 2006) for a document-specific MT post-editing system. |
Static MT Quality Estimation | score or translation error rate of the translated sentences or documents based on a set of features. |
Abstract | Minimum Error Rate Training (MERT) and Minimum Bayes-Risk (MBR) decoding are used in most current state-of-the-art Statistical Machine Translation (SMT) systems.
Introduction | Two popular techniques that incorporate the error criterion are Minimum Error Rate Training (MERT) (Och, 2003) and Minimum Bayes-Risk (MBR) decoding (Kumar and Byrne, 2004). |
Minimum Bayes-Risk Decoding | This reranking can be done for any sentence-level loss function such as BLEU (Papineni et al., 2001), Word Error Rate, or Position-independent Error Rate.
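MBR reranking of an n-best list under such a sentence-level loss can be sketched as follows; `mbr_rerank` and the toy position-mismatch loss are illustrative, not taken from the cited work:

```python
def mbr_rerank(hypotheses, scores, loss):
    """Minimum Bayes-Risk reranking of an n-best list.

    Picks the hypothesis h minimizing the expected loss
    sum_j p(h_j) * loss(h, h_j), approximating the posterior p by
    the normalized n-best scores. `loss` is any sentence-level loss,
    e.g. 1 - BLEU or a word error rate."""
    z = sum(scores)
    posterior = [s / z for s in scores]

    def risk(h):
        return sum(p * loss(h, hj) for p, hj in zip(posterior, hypotheses))

    return min(hypotheses, key=risk)

# Toy loss: fraction of word positions that disagree.
def mismatch(h, r):
    hw, rw = h.split(), r.split()
    return sum(a != b for a, b in zip(hw, rw)) / len(rw)

hyps = ["a b c", "a b d", "x y z"]
print(mbr_rerank(hyps, [0.4, 0.4, 0.2], mismatch))
```

Unlike MAP decoding, which simply keeps the top-scoring hypothesis, MBR can prefer a hypothesis that is close to many other probable hypotheses, which is why it tends to help when the n-best list contains clusters of near-duplicates.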
Amount of Information in Semantic Roles Inventory | Table 2: Percent Error rate reduction (ERR) across role labelling sets in three tasks in Zapirain et al. |
Amount of Information in Semantic Roles Inventory | (2008) and calculate the reduction in error rate based on this differential baseline for the two annotation schemes. |
Amount of Information in Semantic Roles Inventory | VerbNet has better role generalising ability overall as its reduction in error rate is greater than PropBank (first line of Table 2), but it is more degraded by lack of verb information (second and third lines of Table 2). |
Experimental Results | Table 2: Alignment error rate results for the bidirectional model versus the baseline directional models. |
Experimental Results | First, we measure alignment error rate (AER), which compares the pro- |
Experimental Results | The translation model weights were tuned for both the baseline and bidirectional alignments using lattice-based minimum error rate training (Kumar et al., 2009). |
Abstract | Using a new multimodal dataset consisting of sentiment annotated utterances extracted from video reviews, we show that multimodal sentiment analysis can be effectively performed, and that the joint use of visual, acoustic, and linguistic modalities can lead to error rate reductions of up to 10.5% as compared to the best performing individual modality. |
Conclusions | Our experiments show that sentiment annotation of utterance-level visual datastreams can be effectively performed, and that the use of multiple modalities can lead to error rate reductions of up to 10.5% as compared to the use of one modality at a time. |
Discussion | Compared to the best individual classifier, the relative error rate reduction obtained with the trimodal classifier is 10.5%. |
Experiment | We judged the quality of the decoding by measuring the percentage of characters in the cipher alphabet that were correctly guessed, and also the word error rate of the plaintext generated by our solution. |
Experiment | We did not count the accuracy or word error rate for unfinished ciphers. |
Results | We can also observe that even when there are errors (e.g., in the size 1000 cipher), the word error rate is very small. |
Multitask Quality Estimation 4.1 Experimental Setup | Shown above are the training mean baseline μ, single-task learning approaches, and multitask learning models, with the columns showing macro average error rates over all three response values.
Multitask Quality Estimation 4.1 Experimental Setup | Note that here error rates are measured over all of the three annotators’ judgements, and consequently are higher than those measured against their average response in Table 1. |
Multitask Quality Estimation 4.1 Experimental Setup | To test this, we trained single-task, pooled and multitask models on randomly sub-sampled training sets of different sizes, and plot their error rates in Figure 1. |
Abstract | We propose a discriminative ITG pruning framework using Minimum Error Rate Training and various features from previous work on ITG alignment. |
Evaluation | That is, the first evaluation metric, pruning error rate (henceforth PER), measures how many correct E-spans are discarded. |
The DPDI Framework | Parameter training of DPDI is based on Minimum Error Rate Training (MERT) (Och, 2003), a Widely used method in SMT. |
Discussion and Conclusions | First, we can see from the results that several systems appear better when evaluating on a correlation measure like Pearson’s ρ, while others appear better when analyzing error rate.
Discussion and Conclusions | Evaluating with a correlative measure yields predictably poor results, but evaluating the error rate indicates that it is comparable to (or better than) the more intelligent BOW metrics. |
Results | However, as the perceptron is designed to minimize error rate , this may not reflect an optimal objective when seeking to detect matches. |
Background | In this work, Minimum Error Rate Training (MERT) proposed by Och (2003) is used to estimate feature weights λ over a series of training samples.
Background | As the weighted BLEU is used to measure the translation accuracy on the training set, the error rate is defined to be: |
Background | The diversity is measured in terms of the Translation Error Rate (TER) metric proposed in (Snover et al., 2006). |