Index of papers in Proc. ACL 2013 that mention
  • error rate
Feng, Minwei and Peter, Jan-Thorsten and Ney, Hermann
Conclusion
The CRF achieves a lower error rate on the tagging task, but the RNN-trained model is better for the translation task.
Experiments
Several experiments were run to find suitable hyperparameters p1 and p2; we choose the model with the lowest error rate on the validation corpus for the translation experiments.
Experiments
The error rate of the chosen model on the test corpus (the test corpus in Table 2) is 25.75% for the token error rate and 69.39% for the sequence error rate.
Experiments
The RNN has a token error rate of 27.31% and a sentence error rate of 77.00% over the test corpus in Table 2.
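For concreteness, the two tagging metrics reported above can be computed along these lines; this is a minimal illustrative sketch, not the authors' code, and all names are made up:

```python
# Illustrative sketch (not from the paper): token and sequence error
# rates for a tagging task, given gold and predicted tag sequences.
def error_rates(gold_seqs, pred_seqs):
    token_err = token_total = seq_err = 0
    for gold, pred in zip(gold_seqs, pred_seqs):
        token_err += sum(g != p for g, p in zip(gold, pred))
        token_total += len(gold)
        seq_err += int(gold != pred)  # one wrong tag makes the whole sequence wrong
    return token_err / token_total, seq_err / len(gold_seqs)
```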
Translation System Overview
The model scaling factors λ_1^M are trained with Minimum Error Rate Training (MERT).
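For reference, MERT (Och, 2003) tunes the scaling factors of a log-linear translation model to directly minimize a corpus-level error measure. Schematically, using the standard notation rather than anything quoted from this paper:

$$\hat{\lambda}_1^M = \operatorname*{argmin}_{\lambda_1^M} \sum_{s=1}^{S} \mathrm{Err}\big(\mathbf{e}_s^{\mathrm{ref}},\, \hat{\mathbf{e}}(\mathbf{f}_s; \lambda_1^M)\big), \qquad \hat{\mathbf{e}}(\mathbf{f}_s; \lambda_1^M) = \operatorname*{argmax}_{\mathbf{e}} \sum_{m=1}^{M} \lambda_m h_m(\mathbf{e}, \mathbf{f}_s)$$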
error rate is mentioned in 7 sentences in this paper.
Berg-Kirkpatrick, Taylor and Durrett, Greg and Klein, Dan
Abstract
Overall, our system substantially outperforms state-of-the-art solutions for this task, achieving a 31% relative reduction in word error rate over the leading commercial system for historical transcription, and a 47% relative reduction over Tesseract, Google’s open source OCR system.
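(A relative reduction is measured against the baseline's own error rate: relative reduction = (e_baseline − e_system) / e_baseline. With illustrative numbers, not the paper's, a baseline WER of 50% brought down to 34.5% is a (50 − 34.5) / 50 = 31% relative reduction.)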
Experiments
We evaluate the output of our system and the baseline systems using two metrics: character error rate (CER) and word error rate (WER).
Experiments
Table 1: We evaluate the predicted transcriptions in terms of both character error rate (CER) and word error rate (WER), and report macro-averages across documents.
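Both metrics are edit-distance based: WER is the Levenshtein distance between hypothesis and reference over word tokens, normalized by the reference length, and CER is the same computation over characters. A minimal sketch, illustrative rather than the authors' evaluation code:

```python
# Illustrative sketch: word error rate as normalized Levenshtein distance.
# For character error rate, pass sequences of characters instead of words.
def edit_distance(ref, hyp):
    # Single-row dynamic-programming Levenshtein distance.
    d = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        prev, d[0] = d[0], i
        for j, h in enumerate(hyp, 1):
            prev, d[j] = d[j], min(d[j] + 1,         # deletion
                                   d[j - 1] + 1,     # insertion
                                   prev + (r != h))  # substitution
    return d[-1]

def wer(reference, hypothesis):
    ref_words = reference.split()
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)
```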
Introduction
For example, even state-of-the-art OCR systems produce word error rates of over 50% on the documents shown in Figure 1. Unsurprisingly, such error rates are too high for many research projects (Arlitsch and Herbert, 2004; Shoemaker, 2005; Holley, 2010).
Learning
On document (a), which exhibits noisy typesetting, our system achieves a word error rate (WER) of 25.2.
error rate is mentioned in 5 sentences in this paper.
Roark, Brian and Allauzen, Cyril and Riley, Michael
Abstract
We present experimental results for heavily pruned backoff n-gram models, and demonstrate perplexity and word error rate reductions when used with various baseline smoothing methods.
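For reference, the perplexity of a model p over held-out text w_1 … w_N is PPL = exp(−(1/N) Σ_i log p(w_i | w_1 … w_{i−1})); lower perplexity frequently, though not automatically, translates into lower word error rate, which is what the experiments below test.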
Experimental results
We now look at the impacts on system performance we can achieve with these new models, and whether the perplexity differences that we observe translate to real error rate reductions.
Experimental results
Word error rate (WER)
Experimental results
The perplexity reductions that were achieved for these models do translate to real word error rate reductions at both stages of between 0.5 and 0.9 percent absolute.
error rate is mentioned in 4 sentences in this paper.
Tomeh, Nadi and Habash, Nizar and Roth, Ryan and Farra, Noura and Dasigi, Pradeep and Diab, Mona
Abstract
We achieve 10.1% and 11.4% reduction in recognition word error rate (WER) relative to a standard baseline system on typewritten and handwritten Arabic respectively.
Discriminative Reranking for OCR
The loss is computed as the Word Error Rate (WER) of the hypothesis.
Discriminative Reranking for OCR
During training, the weights are updated according to the Margin-Infused Relaxed Algorithm (MIRA), whenever the highest scoring hypothesis differs from the hypothesis with the lowest error rate.
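Schematically, the 1-best MIRA update in this kind of reranking setup has the standard closed form (a generic formulation, not quoted from the paper):

$$\mathbf{w} \leftarrow \mathbf{w} + \tau\,\Delta\phi, \qquad \tau = \min\!\left(C,\ \frac{\ell(\hat{y}) - \mathbf{w}^{\top}\Delta\phi}{\lVert \Delta\phi \rVert^{2}}\right), \qquad \Delta\phi = \phi(y^{*}) - \phi(\hat{y}),$$

where y* is the lowest-WER (oracle) hypothesis, ŷ the highest-scoring one, ℓ(ŷ) its error relative to the oracle, and C a cap on the step size.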
Experiments
We note that a small number of hypotheses per list is sufficient for RankSVM to obtain good performance, but increasing it further seems to increase the error rate.
error rate is mentioned in 4 sentences in this paper.
Cohn, Trevor and Specia, Lucia
Multitask Quality Estimation 4.1 Experimental Setup
Shown above are the training-mean baseline, the single-task learning approaches, and the multitask learning models, with the columns showing macro-averaged error rates over all three response values.
Multitask Quality Estimation 4.1 Experimental Setup
Note that here error rates are measured over all three annotators' judgements, and consequently are higher than those measured against their average response in Table 1.
Multitask Quality Estimation 4.1 Experimental Setup
To test this, we trained single-task, pooled and multitask models on randomly sub-sampled training sets of different sizes, and plot their error rates in Figure 1.
error rate is mentioned in 3 sentences in this paper.
Perez-Rosas, Veronica and Mihalcea, Rada and Morency, Louis-Philippe
Abstract
Using a new multimodal dataset consisting of sentiment annotated utterances extracted from video reviews, we show that multimodal sentiment analysis can be effectively performed, and that the joint use of visual, acoustic, and linguistic modalities can lead to error rate reductions of up to 10.5% as compared to the best performing individual modality.
Conclusions
Our experiments show that sentiment annotation of utterance-level visual datastreams can be effectively performed, and that the use of multiple modalities can lead to error rate reductions of up to 10.5% as compared to the use of one modality at a time.
Discussion
Compared to the best individual classifier, the relative error rate reduction obtained with the trimodal classifier is 10.5%.
error rate is mentioned in 3 sentences in this paper.