Related Work | The training and development sets were released in full to task participants. |
Related Work | However, we were unable to download all the training and development sets because some tweets were deleted or not available due to modified authorization status. |
Related Work | The tradeoff parameter of ReEmb (Labutov and Lipson, 2013) is tuned on the development set of SemEval 2013. |
Experimental Setup | For efficiency, we limit the sentence length to 70 tokens in training and development sets. |
Experimental Setup | This gives a 99% pruning recall on the CATiB development set. |
Experimental Setup | After pruning, we tune the regularization parameter λ ∈ {0.1, 0.01, 0.001} on development sets for different languages. |
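A tuning step of this kind can be sketched as follows. This is a minimal illustration, not the authors' code; the function names and the toy scoring function are assumptions, and only the candidate grid {0.1, 0.01, 0.001} comes from the sentence above.

```python
# Hypothetical sketch: pick the regularization strength that maximizes
# accuracy on a held-out development set.

def tune_regularization(train_fn, eval_fn, dev_set, grid=(0.1, 0.01, 0.001)):
    """train_fn(reg) returns a model; eval_fn(model, dev_set) returns accuracy."""
    best_reg, best_acc = None, float("-inf")
    for reg in grid:
        model = train_fn(reg)
        acc = eval_fn(model, dev_set)
        if acc > best_acc:
            best_reg, best_acc = reg, acc
    return best_reg, best_acc

# Toy usage: a "model" that is just its regularization value, and a dev
# "accuracy" that happens to peak at 0.01.
best, acc = tune_regularization(
    train_fn=lambda reg: reg,
    eval_fn=lambda model, dev: 1.0 - abs(model - 0.01),
    dev_set=None,
)
```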
Sampling-Based Dependency Parsing with Global Features | 2In our work we choose α = 0.003, which gives a 98.9% oracle POS tagging accuracy on the CATiB development set. |
Experiments | For this experiment, all models were estimated from the training set and evaluated on the development set. |
Experiments | For test set evaluations, we trained on the combination of the training and development sets (§2), to maximize the amount of training data for the final experiments. |
Experiments | 12We selected a threshold for binarization from a grid of 1001 points from 1 to 4 that maximized the accuracy of binarized predictions from a model trained on the training set and evaluated on the binarized development set. |
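The threshold search just described can be sketched as below. Only the grid (1001 evenly spaced points in [1, 4]) and the selection criterion come from the sentence; the function and the toy data are illustrative assumptions.

```python
# Hypothetical sketch: sweep 1001 evenly spaced thresholds in [1, 4] and
# keep the one whose binarized predictions best match the dev labels.

def select_threshold(scores, gold_binary, lo=1.0, hi=4.0, n_points=1001):
    """scores: model predictions on the dev set; gold_binary: 0/1 labels."""
    step = (hi - lo) / (n_points - 1)
    best_t, best_acc = lo, -1.0
    for i in range(n_points):
        t = lo + i * step
        preds = [1 if s >= t else 0 for s in scores]
        acc = sum(p == g for p, g in zip(preds, gold_binary)) / len(gold_binary)
        if acc > best_acc:
            best_t, best_acc = t, acc
    return best_t, best_acc

# Toy dev set: any threshold in (2.5, 3.1] separates the labels perfectly.
t, acc = select_threshold([1.2, 2.5, 3.1, 3.9], [0, 0, 1, 1])
```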
System Description | On 10 preliminary runs with the development set, this variance |
System Description | Table 1: Pearson’s r on the development set, for our full system and variations excluding each feature type. |
Experiment | We conducted experiments on the Penn Chinese Treebank (CTB) version 5.1 (Xue et al., 2005): Articles 001-270 and 400-1151 were used as the training set, Articles 301-325 were used as the development set, and Articles 271-300 were used |
Experiment | We tuned the optimal number of iterations of the perceptron training algorithm on the development set. |
Experiment | We trained these three systems on the training set and evaluated them on the development set . |
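Selecting the iteration count on a development set, as described above, amounts to early stopping: evaluate after each epoch and keep the epoch with the best dev accuracy. The sketch below assumes the per-epoch accuracies are already computed; the function name and the numbers are illustrative.

```python
# Hypothetical sketch: choose the perceptron iteration count by tracking
# development-set accuracy after each training epoch.

def best_iteration(dev_acc_per_epoch):
    """Return the (1-based) epoch with the highest development accuracy."""
    best_epoch = max(range(len(dev_acc_per_epoch)),
                     key=lambda i: dev_acc_per_epoch[i])
    return best_epoch + 1

# E.g. accuracy rises, peaks at epoch 3, then degrades (overfitting):
chosen = best_iteration([0.80, 0.84, 0.86, 0.85, 0.83])
```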
Empirical evaluation | We tuned the L1 regularization strength, developed features, and ran analysis experiments on the development set (averaging across random splits). |
Empirical evaluation | To further examine this, we ran BCFL13 on the development set, allowing it to use only predicates from logical forms suggested by our logical form construction step. |
Empirical evaluation | This improved oracle accuracy on the development set to 64.5%, but accuracy was 32.2%. |
Results | the English development set as a function of number of training iterations with two different beam sizes, 20 and 100, over the local and nonlocal feature sets. |
Results | In Figure 4 we compare early update with LaSO and delayed LaSO on the English development set. |
Results | Table 1 displays the differences in F-measures and CoNLL average between the local and nonlocal systems when applied to the development sets for each language. |
Dependency-based Pre-ordering Rule Set | 4. Conduct primary experiments using the same training set and development set as the experiments described in Section 3. |
Dependency-based Pre-ordering Rule Set | In the primary experiments, we tested the effectiveness of the candidate rules and filtered out those that did not work, based on the BLEU scores on the development set. |
Experiments | Our development set was the official NIST MT evaluation data from 2002 to 2005, consisting of 4476 Chinese-English sentence pairs. |
Experiments | The evaluation set contained 200 sentences randomly selected from the development set . |
Error Analysis | We sampled 100 errors randomly from all errors made by our final model (trained on all three datasets with domain adaptation and additional features) on the ARZ development set; see Table 4. |
Error Analysis | Table 4: Counts of error categories (out of 100 randomly sampled ARZ development set errors). |
Error Analysis | One example of this distinction that appeared in the development set is the pair of spellings for mawḍūʿī “my topic”. |
Experiments | F1 scores provide a more informative assessment of performance than word-level or character-level accuracy scores, as over 80% of tokens in the development sets consist of only one segment, with an average of one segmentation every 4.7 tokens (or one every 20.4 characters). |
Experiments | Table 1 contains results on the development set for the model of Green and DeNero and our improvements. |
Regularization Improves Topic Models | We split each dataset into a training fold (70%), a development fold (15%), and a test fold (15%): the training data are used to fit models; the development set is used to select parameters (anchor threshold M, document prior α, regularization weight λ); and final results are reported on the test fold. |
Regularization Improves Topic Models | We select α using grid search on the development set. |
Regularization Improves Topic Models | 4.1 Grid Search for Parameters on Development Set |
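A multi-parameter grid search of the kind described above can be sketched with a Cartesian product over the candidate values. The parameter names M, α, and λ come from the sentences above; the grids and the toy scoring function are assumptions for illustration.

```python
# Hypothetical sketch: exhaustive grid search over three hyperparameters,
# keeping the combination with the best development-fold score.
from itertools import product

def grid_search(score_fn, M_grid, alpha_grid, lam_grid):
    """score_fn(M, alpha, lam) returns a dev-fold score (higher is better)."""
    return max(product(M_grid, alpha_grid, lam_grid),
               key=lambda params: score_fn(*params))

# Toy score peaking at M=100, alpha=0.1, lam=0.0:
best = grid_search(
    score_fn=lambda M, a, lam: -(M - 100) ** 2 - (a - 0.1) ** 2 - lam,
    M_grid=[50, 100, 200],
    alpha_grid=[0.01, 0.1, 1.0],
    lam_grid=[0.0, 0.5, 1.0],
)
```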
Annotations | Table 2: Results for the Penn Treebank development set, sentences of length ≤ 40, for different annotation schemes implemented on top of the X-bar grammar. |
Features | Table 1 shows the results of incrementally building up our feature set on the Penn Treebank development set . |
Other Languages | (2013) only report results on the development set for the Berkeley-Rep model; however, the task organizers also use a version of the Berkeley parser provided with parts of speech from high-quality POS taggers for each language (Berkeley-Tags). |
Other Languages | On the development set, we outperform the Berkeley parser and match the performance of the Berkeley-Rep parser. |
Experiments | While emails and weblogs are used as the development sets, reviews, news groups and Yahoo! Answers are used as the final test sets. |
Experiments | All these parameters are selected according to the averaged accuracy on the development set . |
Experiments | Experimental results under the 4 combined settings on the development sets are illustrated in Figures 2, 3 and 4, where the |
Citation Extraction Data | There are 660 citations in the development set and 367 citations in the test set. |
Citation Extraction Data | We then use the development set to learn the penalties for the soft constraints, using the perceptron algorithm described in section 3.1. |
Citation Extraction Data | We instantiate constraints from each template in section 5.1, iterating over all possible labels that contain a B prefix at any level in the hierarchy and pruning all constraints with imp(c) < 2.75 calculated on the development set. |
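The pruning step just described can be sketched as a simple filter. Only the cutoff 2.75 and the keep/prune criterion come from the sentence; `imp` is a stand-in for the importance score computed on the development set, and the toy scores are assumptions.

```python
# Hypothetical sketch: prune candidate constraints whose development-set
# importance score falls below the cutoff (2.75 above).

def prune_constraints(candidates, imp, cutoff=2.75):
    """Keep only constraints c with imp(c) >= cutoff."""
    return [c for c in candidates if imp(c) >= cutoff]

# Toy importance scores for three candidate constraints:
imp_scores = {"c1": 5.0, "c2": 2.75, "c3": 1.2}
kept = prune_constraints(["c1", "c2", "c3"], imp_scores.get)
```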
Soft Constraints in Dual Decomposition | We found it beneficial, though it is not theoretically necessary, to learn the constraints on a held-out development set, separately from the other model parameters, as during training most constraints are satisfied due to overfitting, which leads to an underestimation of the relevant penalties. |
Experimental Setup | For parameter estimation, we tune all parameters (utterance selection and path ranking) exhaustively with 0.1 intervals using our development set. |
Phrasal Query Abstraction Framework | The parameters α and β are tuned on a development set and sum to 1. |
Phrasal Query Abstraction Framework | We estimate the percentage of the retrieved utterances based on the development set . |
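Because the two weights sum to 1, tuning them reduces to a one-dimensional sweep over α with β = 1 − α (at 0.1 intervals, matching the granularity mentioned earlier in this section). The sketch below is illustrative; the function name and the toy scoring function are assumptions.

```python
# Hypothetical sketch: tune two interpolation weights that sum to 1 by
# sweeping alpha over [0, 1] in steps of 1/steps, with beta = 1 - alpha.

def tune_interpolation(score_fn, steps=10):
    """score_fn(alpha, beta) returns a dev-set score (higher is better)."""
    candidates = [i / steps for i in range(steps + 1)]
    best_alpha = max(candidates, key=lambda a: score_fn(a, 1.0 - a))
    return best_alpha, 1.0 - best_alpha

# Toy score peaking at alpha = 0.7:
alpha, beta = tune_interpolation(lambda a, b: -(a - 0.7) ** 2)
```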
Experiments | The beam size was tuned on the development set, and a value of 128 was found to achieve a reasonable balance of accuracy and speed; hence this value was used for all experiments. |
Experiments | dependency length on the development set. |
Experiments | Table 1 shows the accuracies of all parsers on the development set, in terms of labeled precision and recall over the predicate-argument dependencies in CCGBank. |
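One common way to realize an accuracy/speed trade-off like the beam-size tuning above is to pick the smallest beam whose dev accuracy is within a tolerance of the best observed, since smaller beams decode faster. This is a hypothetical sketch; the selection rule, the tolerance, and the numbers are all assumptions, not the authors' procedure.

```python
# Hypothetical sketch: prefer the smallest beam size whose development
# accuracy is within `tolerance` of the best observed (smaller = faster).

def pick_beam_size(results, tolerance=0.001):
    """results: {beam_size: dev_accuracy}."""
    best_acc = max(results.values())
    near_best = [b for b, acc in results.items() if acc >= best_acc - tolerance]
    return min(near_best)

# Illustrative numbers where accuracy saturates around beam 128:
dev_acc = {16: 0.902, 32: 0.908, 64: 0.911, 128: 0.9125, 256: 0.9126}
chosen_beam = pick_beam_size(dev_acc)
```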
Experiments | We used the NIST MT03 evaluation test data as our development set , and the NIST MT05 as the test set. |
Experiments | Table 4: Experiment results of the sense-based translation model (STM) with lexicon and sense features extracted from a window of size varying from ±5 to ±15 words on the development set. |
Experiments | Our first group of experiments was conducted to investigate the impact of the window size k on translation performance in terms of BLEU on the development set. |
Evaluation | We sampled data from the training and development set of the Persian dependency treebank (Rasooli et al., 2013) to create a comparable seventh dataset in Persian. |
Evaluation | ∞ is the upper-bound OOV reduction for our expansion model: for each word in the development set, we ask if our model, without any vocabulary size restriction at all, could generate it. |
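The oracle check described above can be sketched as a coverage count over the development vocabulary. The function name, the membership test, and the toy words are assumptions for illustration; only the "can the unrestricted model generate this dev word?" criterion comes from the sentence.

```python
# Hypothetical sketch: upper-bound OOV reduction as the fraction of
# development-set words the unrestricted expansion model could generate.

def oov_reduction_upper_bound(dev_words, can_generate):
    """can_generate(word) -> bool for the model with no vocabulary limit."""
    covered = sum(1 for w in dev_words if can_generate(w))
    return covered / len(dev_words)

# Toy "model" that can generate any word of length <= 6:
frac = oov_reduction_upper_bound(
    ["kitap", "kitaplar", "ev", "evde"],
    can_generate=lambda w: len(w) <= 6,
)
```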
Evaluation | Table 5: Results from running a handcrafted Turkish morphological analyzer (Oflazer, 1996) on different expansions and on the development set. |
Expected BLEU Training | 1We tuned λM+1 on the development set but found that λM+1 = 1 resulted in faster training and equal accuracy. |
Expected BLEU Training | We fix θ and re-optimize λ in the presence of the recurrent neural network model using Minimum Error Rate Training (Och, 2003) on the development set (§5). |
Experiments | either lattices or the unique 100-best output of the phrase-based decoder and reestimate the log-linear weights by running a further iteration of MERT on the n-best list of the development set, augmented by scores corresponding to the neural network models. |