Index of papers in Proc. ACL 2008 that mention
  • NIST
Voorhees, Ellen M.
The Three-way Decision Task
The answer key for the three-way decision task was developed at the National Institute of Standards and Technology (NIST) using annotators who had experience as TREC and DUC assessors.
The Three-way Decision Task
NIST assessors annotated all 800 entailment pairs in the test set, with each pair independently annotated by two different assessors.
The Three-way Decision Task
The three-way answer key was formed by keeping exactly the same set of YES answers as in the two-way key (regardless of the NIST annotations) and having NIST staff adjudicate assessor differences on the remainder.
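The key-construction step described above is easy to make concrete. A minimal Python sketch follows, assuming the keys and annotations are plain dictionaries; the function name, label strings, and the adjudicate callback (standing in for NIST staff) are illustrative assumptions, not NIST's actual tooling.

```python
def build_three_way_key(two_way_key, annot_a, annot_b, adjudicate):
    """Assemble a three-way key from a two-way key plus paired annotations.

    two_way_key: pair_id -> "YES" | "NO"
    annot_a, annot_b: pair_id -> "YES" | "NO" | "UNKNOWN" (one per assessor)
    adjudicate: callable resolving disagreements (stands in for NIST staff)
    """
    key = {}
    for pair_id, two_way_answer in two_way_key.items():
        if two_way_answer == "YES":
            # YES answers are carried over from the two-way key unchanged,
            # regardless of what the assessors annotated.
            key[pair_id] = "YES"
        elif annot_a[pair_id] == annot_b[pair_id]:
            key[pair_id] = annot_a[pair_id]      # assessors agree
        else:
            key[pair_id] = adjudicate(pair_id)   # assessors differ
    return key
```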
NIST is mentioned in 17 sentences in this paper.
Chan, Yee Seng and Ng, Hwee Tou
Metric Design Considerations
To evaluate our metric, we conduct experiments on datasets from the ACL-07 MT workshop and the NIST MT 2003 dataset.
Metric Design Considerations
Table 4: Correlations on the NIST MT 2003 dataset.
Metric Design Considerations
5.2 NIST MT 2003 Dataset
NIST is mentioned in 7 sentences in this paper.
Li, Zhifei and Yarowsky, David
Abstract
We integrate our method into a state-of-the-art baseline translation system and show that it consistently improves the performance of the baseline system on various NIST MT test sets.
Conclusions
We integrate our method into a state-of-the-art phrase-based baseline translation system, i.e., Moses (Koehn et al., 2007), and show that the integrated system consistently improves the performance of the baseline system on various NIST machine translation test sets.
Experimental Results
We compile a parallel dataset which consists of various corpora distributed by the Linguistic Data Consortium (LDC) for NIST MT evaluation.
Experimental Results
4.5.2 BLEU on NIST MT Test Sets
Experimental Results
Table 7 reports the results on various NIST MT test sets.
Introduction
We carry out experiments on a state-of-the-art SMT system, i.e., Moses (Koehn et al., 2007), and show that the abbreviation translations consistently improve the translation performance (in terms of BLEU (Papineni et al., 2002)) on various NIST MT test sets.
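Several snippets above report gains "in terms of BLEU (Papineni et al., 2002)". For readers unfamiliar with the metric, here is a minimal single-reference, sentence-level sketch: the brevity penalty times the geometric mean of modified n-gram precisions. Real evaluations pool counts over an entire test set and allow multiple references, so treat this as illustration only.

```python
import math
from collections import Counter

def bleu(candidate, reference, max_n=4):
    """Minimal single-reference, sentence-level BLEU sketch."""
    precisions = []
    for n in range(1, max_n + 1):
        cand = Counter(tuple(candidate[i:i + n])
                       for i in range(len(candidate) - n + 1))
        ref = Counter(tuple(reference[i:i + n])
                      for i in range(len(reference) - n + 1))
        # Clip each candidate n-gram count by its count in the reference.
        overlap = sum(min(c, ref[g]) for g, c in cand.items())
        total = max(sum(cand.values()), 1)
        precisions.append(max(overlap, 1e-9) / total)  # smooth zero counts
    # Brevity penalty: punish candidates shorter than the reference.
    bp = min(1.0, math.exp(1 - len(reference) / max(len(candidate), 1)))
    return bp * math.exp(sum(math.log(p) for p in precisions) / max_n)
```

An exact match scores 1.0, e.g. `bleu("the cat sat down".split(), "the cat sat down".split())`.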
NIST is mentioned in 6 sentences in this paper.
Shen, Libin and Xu, Jinxi and Weischedel, Ralph
Conclusions and Future Work
Our string-to-dependency system generates 80% fewer rules, and achieves a 1.48-point improvement in BLEU and a 2.53-point improvement in TER on the decoding output on the NIST 04 Chinese-English evaluation set.
Experiments
We used part of the NIST 2006 Chinese-English large track data as well as some LDC corpora collected for the DARPA GALE program (LDC2005E83, LDC2006E34 and LDC2006G05) as our bilingual training data.
Experiments
We tuned the weights on NIST MT05 and tested on MT04.
Introduction
For example, Chiang (2007) showed that the Hiero system achieved about a 1 to 3 point improvement in BLEU on the NIST 03/04/05 Chinese-English evaluation sets compared to a state-of-the-art phrasal system.
Introduction
Our string-to-dependency decoder shows a 1.48-point improvement in BLEU and a 2.53-point improvement in TER on the NIST 04 Chinese-English MT evaluation set.
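The TER numbers quoted above are error rates, so a "point improvement" means the score went down. A rough sketch follows, with the caveat that it omits TER's block-shift edit operation (Snover et al., 2006) and so reduces to word error rate; the official implementation also normalizes by the average reference length across multiple references.

```python
def ter_no_shifts(hypothesis, reference):
    """Approximate TER without block shifts: word-level edit distance
    (substitutions + insertions + deletions) divided by the reference
    length, as a percentage. Lower is better."""
    n = len(reference)
    prev = list(range(n + 1))  # distance from the empty hypothesis prefix
    for i in range(1, len(hypothesis) + 1):
        cur = [i] + [0] * n
        for j in range(1, n + 1):
            cost = 0 if hypothesis[i - 1] == reference[j - 1] else 1
            cur[j] = min(prev[j] + 1,         # delete hypothesis word
                         cur[j - 1] + 1,      # insert reference word
                         prev[j - 1] + cost)  # substitute or match
        prev = cur
    return 100.0 * prev[n] / max(n, 1)
```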
NIST is mentioned in 5 sentences in this paper.
Avramidis, Eleftherios and Koehn, Philipp
Experiments
Results were evaluated with both BLEU (Papineni et al., 2001) and NIST metrics (NIST, 2002).
Experiments
          BLEU                NIST
set       devtest   test07    devtest   test07
baseline  18.13     18.05     5.218     5.279
person    18.16     18.17     5.224     5.316
Experiments
The NIST metric clearly shows a significant improvement, because it mostly measures difficult n-gram matches (e.g., due to the long-distance rules we have been dealing with).
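The claim that the NIST metric "mostly measures difficult n-gram matches" comes from its information weighting (NIST, 2002): the rarer an n-gram is in the references, the more a match is worth. Below is a minimal sketch of those weights for n >= 2, assuming counts drawn from a single reference; the official mteval script pools counts across all reference translations.

```python
import math
from collections import Counter

def nist_info_weights(reference_tokens, n=2):
    """info(w1..wn) = log2(count(w1..wn-1) / count(w1..wn)), for n >= 2.
    A rare continuation of a common history gets a high weight, so
    matching it contributes more to the NIST score than a frequent,
    'easy' n-gram does."""
    ngrams = Counter(tuple(reference_tokens[i:i + n])
                     for i in range(len(reference_tokens) - n + 1))
    histories = Counter(tuple(reference_tokens[i:i + n - 1])
                        for i in range(len(reference_tokens) - n + 2))
    return {g: math.log2(histories[g[:-1]] / count)
            for g, count in ngrams.items()}
```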
NIST is mentioned in 4 sentences in this paper.
Talbot, David and Brants, Thorsten
Experiments
We run an improved version of our 2006 NIST MT Evaluation entry for the Arabic-English “Unlimited” data track. The language model is the same one as in the previous section.
Experiments
We use MT04 data for system development, with MT05 data and MT06 (“NIST” subset) data for blind testing.
Experiments
Overall, our baseline results compare favorably to those reported on the NIST MT06 web site.
NIST is mentioned in 4 sentences in this paper.
Zhang, Min and Jiang, Hongfei and Aw, Aiti and Li, Haizhou and Tan, Chew Lim and Li, Sheng
Abstract
Experimental results on the NIST MT-2005 Chinese-English translation task show that our method statistically significantly outperforms the baseline systems.
Conclusions and Future Work
The experimental results on the NIST MT-2005 Chinese-English translation task demonstrate the effectiveness of the proposed model.
Experiments
We used sentences with fewer than 50 characters from the NIST MT-2002 test set as our development set and the NIST MT-2005 test set as our test set.
Introduction
Experimental results on the NIST MT-2005 Chinese-English translation task show that our method significantly outperforms Moses (Koehn et al., 2007), a state-of-the-art phrase-based SMT system, and other linguistically syntax-based methods, such as SCFG-based and STSG-based methods (Zhang et al., 2007).
NIST is mentioned in 4 sentences in this paper.