Experiments of Grammar Formalism Conversion | Table 4: Results of the generative parser on the development set, when trained with various weightings of the CTB training set and CDTPS.
Experiments of Parsing | We used a standard split of CTB for performance evaluation: articles 1-270 and 400-1151 as the training set, articles 301-325 as the development set, and articles 271-300 as the test set.
Experiments of Parsing | We tried the corpus weighting method when combining CDTPS with the CTB training set (abbreviated as CTB for simplicity) as training data, gradually increasing the weight of CTB (among 1, 2, 5, 10, 20, and 50) to optimize parsing performance on the development set.
Experiments of Parsing | Table 4 presents the results of the generative parser with various weights of CTB on the development set.
Our Two-Step Solution | The number of removed trees will be determined by cross validation on the development set.
Our Two-Step Solution | The value of λ will be tuned by cross validation on the development set.
Our Two-Step Solution | Corpus weighting, with the weight tuned on the development set, is exactly such an approach; it is the one we use for parsing on homogeneous treebanks in this paper.
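The corpus-weighting scheme above can be sketched as a search over duplication factors for the in-domain treebank. This is a minimal illustrative sketch, not the papers' actual code: `eval_fn` (standing in for training a parser and scoring F1 on the development set) and the toy data are assumptions; the weight grid follows the values quoted above.

```python
def weighted_training_data(in_domain, converted, w):
    """Replicate the in-domain treebank (e.g. CTB) w times, then append
    the converted treebank (e.g. CDTPS)."""
    return in_domain * w + converted

def select_corpus_weight(in_domain, converted, dev, eval_fn,
                         weights=(1, 2, 5, 10, 20, 50)):
    """Try each candidate weight and keep the one whose combined corpus
    gets the best score from eval_fn on the development set."""
    return max(
        weights,
        key=lambda w: eval_fn(weighted_training_data(in_domain, converted, w), dev),
    )
```

In practice `eval_fn` would train the parser on the weighted corpus and return its F1 on the development set; here any scoring callable can be plugged in.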
Discussion | In this paper, we have described how MERT can be employed to estimate the weights for the linear loss function to maximize BLEU on a development set.
Experiments | Our development set (dev) consists of the NIST 2005 eval set; we use this set for optimizing MBR parameters. |
Experiments | MERT is then performed to optimize the BLEU score on a development set; for MERT, we use 40 random initial parameters as well as parameters computed using corpus-based statistics (Tromble et al., 2008).
Experiments | We select the MBR scaling factor (Tromble et al., 2008) based on the development set; it is set to 0.1, 0.01, 0.5, 0.2, 0.5, and 1.0 for the aren-phrase, aren-hier, aren-samt, zhen-phrase, zhen-hier, and zhen-samt systems, respectively.
Introduction | We employ MERT to select these weights by optimizing BLEU score on a development set.
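Selecting linear feature weights by optimizing a metric on a development set can be sketched as reranking n-best lists under candidate weight vectors. This is a deliberately simplified stand-in for MERT: it uses random restarts instead of Och's line search, and exact-match accuracy instead of BLEU, so that the sketch stays self-contained; all names (`rerank`, `tune_weights`, the dict keys) are hypothetical.

```python
import random

def rerank(weights, nbest):
    """Return the candidate with the highest linear-model score."""
    return max(nbest, key=lambda c: sum(w * f for w, f in zip(weights, c["features"])))

def tune_weights(dev_nbests, refs, dim, restarts=40, seed=0):
    """Search over random weight vectors (40 restarts, echoing the
    '40 random initial parameters' above) and keep the vector whose
    reranked dev-set output scores best under the metric."""
    rng = random.Random(seed)

    def dev_score(weights):
        hyps = [rerank(weights, nb)["text"] for nb in dev_nbests]
        # exact-match accuracy stands in for BLEU in this sketch
        return sum(h == r for h, r in zip(hyps, refs)) / len(refs)

    starts = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(restarts)]
    return max(starts, key=dev_score)
```

A real MERT implementation would additionally trace the piecewise-linear error surface along search directions; the dev-set-driven selection loop is the part shown here.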
Experimental Setup | In our experiments, the development set contains 200 sentences and the test set contains 500 sentences, both of which are randomly selected from the human translations of the 2008 NIST Open Machine Translation Evaluation: Chinese-to-English Task.
Statistical Paraphrase Generation | = c_dev(+r) / c_dev(r), where c_dev(r) is the total number of unit replacements in the generated paraphrases on the development set.
Statistical Paraphrase Generation | Replacement rate (rr): rr measures the paraphrase degree on the development set, i.e., the percentage of words that are paraphrased.
Statistical Paraphrase Generation | We define rr as: rr = w_dev(r) / w_dev(s), where w_dev(r) is the total number of words in the replaced units on the development set, and w_dev(s) is the number of words of all sentences on the development set.
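Both development-set statistics above are plain ratios over counts, so they can be computed directly. The excerpt elides the name of the first metric (`c_dev(+r) / c_dev(r)`); `replacement_precision` is used here as a hypothetical label for it, and all function names are illustrative.

```python
def replacement_precision(correct_replacements, total_replacements):
    """c_dev(+r) / c_dev(r): the fraction of unit replacements on the
    development set that are judged correct (hypothetical name)."""
    return correct_replacements / total_replacements

def replacement_rate(words_in_replaced_units, total_words):
    """rr = w_dev(r) / w_dev(s): the fraction of development-set words
    that fall inside replaced (paraphrased) units."""
    return words_in_replaced_units / total_words
```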
Experiments | To empirically investigate the parameter λ and the convergence of our algorithm aPLSA, we generated five more data sets as the development sets.
Experiments | The detailed descriptions of these five development sets, namely tune1 to tune5, are listed in Table 1 as well.
Experiments | We have done some experiments on the development sets to investigate how different values of λ affect the performance of aPLSA.
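Tuning λ against several development sets, as described above, amounts to sweeping a grid of candidate values and keeping the one with the best average score. A minimal sketch, assuming a hypothetical `eval_fn` that trains/evaluates the model for a given λ on one tuning set:

```python
def tune_lambda(lambdas, tune_sets, eval_fn):
    """Return the λ with the best score averaged over the tuning sets."""
    def avg_score(lam):
        return sum(eval_fn(lam, t) for t in tune_sets) / len(tune_sets)
    return max(lambdas, key=avg_score)
```

The same loop works for any scalar hyperparameter; only the grid and the evaluation callable change.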
Experiments | Unlike the other two corpora, the data-splitting convention of People's Daily does not reserve a development set, so in the following experiments we simply choose the model after 7 iterations when training on this corpus.
Experiments | Table 4: Error analysis for Joint S&T on the development set of CTB.
Experiments | To obtain further information about what kinds of errors are alleviated by annotation adaptation, we conduct an initial error analysis for Joint S&T on the development set of CTB.
Experiments | Note that the development set was only used for evaluating the trained model to obtain the optimal values of tunable parameters. |
Experiments | For the baseline policy, we varied r in the range [1, 5] and found that setting r = 3 yielded the best performance on the development set for both the small and large training corpus experiments.
Experiments | Optimal balances were selected using the development set.
Policies for correct path selection | In our experiments, the optimal threshold value r is selected by evaluating the performance of joint word segmentation and POS tagging on the development set.
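The threshold selection described here (varying an integer r over a small range and keeping the value with the best development-set score) can be sketched in a few lines; `eval_fn`, standing in for joint segmentation and POS-tagging evaluation on the development set, is an assumed placeholder.

```python
def tune_threshold(eval_fn, low=1, high=5):
    """Exhaustively try each integer threshold r in [low, high] and
    return (best_r, best_dev_score)."""
    best_r = max(range(low, high + 1), key=eval_fn)
    return best_r, eval_fn(best_r)
```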
Experiments | For Chinese-English-Spanish translation, we used the development set (devset3) released for the pivot task as the test set, which contains 506 source sentences, with 7 reference translations in English and Spanish. |
Experiments | To tune parameters of our systems, we created a development set of 1,000 sentences taken from the training sets, with 3 reference translations in both English and Spanish.
Experiments | This development set is also used to train the regression learning model. |
Argument Mapping Model | Given these features with gold standard parses, our argument mapping model can predict entire argument mappings with an accuracy rate of 87.96% on the test set, and 87.70% on the development set.
Identification and Labeling Models | All classifiers were trained for 500 iterations of L-BFGS training, a quasi-Newton method from the numerical optimization literature (Liu and Nocedal, 1989), using Zhang Le's maxent toolkit. To prevent overfitting we used Gaussian priors with global variances of 1 and 5 for the identifier and labeler, respectively. The Gaussian priors were determined empirically by testing on the development set.
Identification and Labeling Models | The size of the window was determined experimentally on the development set; we use the same window sizes throughout.
Experiments | The number of iterations was determined by experiments on the development set . |
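Choosing the number of training iterations on the development set, as in the line above, amounts to tracking which iteration scores best and keeping that model. A hedged sketch with placeholder callables (`train_step` performing one pass over the training data, `eval_on_dev` scoring the current model):

```python
def best_iteration(train_step, eval_on_dev, max_iters):
    """Run iterative training and return the iteration (and score) that
    performed best on the development set."""
    best_iter, best_score, state = 0, float("-inf"), None
    for i in range(1, max_iters + 1):
        state = train_step(state)      # one training pass
        score = eval_on_dev(state)
        if score > best_score:
            best_iter, best_score = i, score
    return best_iter, best_score
```

The same pattern covers the earlier remark about picking "the model after 7 iterations" when no development set is reserved: with a development set, the stopping point is selected rather than fixed.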
Experiments | Tuning the parameter settings on the development set, we found that parameterized categories, binarization, and including punctuation gave the best F1 performance.
Experiments | While experimenting with the development set of TüBa-D/Z, we noticed that the parser sometimes returns parses in which paired punctuation (e.g.
Co-training strategy for prosodic event detection | Data split: Development set: 20, 1,356, 2,275; Labeled set L: 5, 347, 573; Unlabeled set U: 1,027, 77,207, 129,305; speakers: f2b, f3b, m2b, m3b, m4b.
Conclusions | In our experiments, we used some labeled data as a development set to estimate parameters.
Experiments and results | Among the labeled data, 102 utterances from the f1a and m1b speakers are used for testing; 20 utterances randomly chosen from f2b, f3b, m2b, m3b, and m4b are used as the development set to optimize parameters such as λ and the confidence-level threshold; 5 utterances are used as the initial training set L; and the rest of the data is used as the unlabeled set U, which has 1,027 unlabeled utterances (we removed the human labels for the co-training experiments).
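The co-training data split described above (a speaker-held-out test set, then a random development set, a small labeled seed set L, and the remainder as unlabeled U) can be sketched as follows. The speaker IDs and set sizes come from the excerpt; the function name, dict keys, and fixed seed are illustrative assumptions.

```python
import random

def split_corpus(utterances, test_speakers=("f1a", "m1b"),
                 dev_size=20, labeled_size=5, seed=0):
    """Hold out test speakers, then randomly carve the remaining
    utterances into dev / seed-labeled L / unlabeled U."""
    test = [u for u in utterances if u["speaker"] in test_speakers]
    rest = [u for u in utterances if u["speaker"] not in test_speakers]
    rng = random.Random(seed)
    rng.shuffle(rest)
    dev = rest[:dev_size]
    labeled = rest[dev_size:dev_size + labeled_size]
    unlabeled = rest[dev_size + labeled_size:]   # labels would be discarded here
    return test, dev, labeled, unlabeled
```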