Index of papers in Proc. ACL 2011 that mention
  • development set
Bodenstab, Nathan and Dunlop, Aaron and Hall, Keith and Roark, Brian
Beam-Width Prediction
Figure 3 is a visual representation of beam-width prediction on a single sentence of the development set using the Berkeley latent-variable grammar and Boundary FOM.
Conclusion and Future Work
Table 1: Section 22 development set results for CYK and Beam-Search (Beam) parsing using the Berkeley latent-variable grammar.
Open/Closed Cell Classification
We found the value A = 102 to give the best performance on our development set , and we use this value in all of our experiments.
Open/Closed Cell Classification
Figures 2a and 2b compare the pruned charts of Chart Constraints and Constituent Closure for a single sentence in the development set .
Open/Closed Cell Classification
We compare the effectiveness of Constituent Closure, Complete Closure, and Chart Constraints, by decreasing the percentage of chart cells closed until accuracy over all sentences in our development set start to decline.
Results
In Table 1 we present the accuracy and parse time for three baseline parsers on the development set : exhaustive CYK parsing, beam-search parsing using only the inside score BC), and beam-search parsing using the Boundary FOM.
development set is mentioned in 7 sentences in this paper.
Topics mentioned in this paper:
Sun, Ang and Grishman, Ralph and Sekine, Satoshi
Cluster Feature Selection
Our main idea is to learn the best set of prefix lengths, perhaps through the validation of their effectiveness on a development set of data.
Cluster Feature Selection
Because this method does not need validation on the development set , it is the laziest but the fastest method for selecting clusters.
Cluster Feature Selection
Exhaustive Search (ES): ES works by trying every possible combination of the set I and picking the one that works the best for the development set .
Experiments
The set of prefix lengths that worked the best for the development set was chosen to select clusters.
Experiments
It was interesting that ES did not always outperform the two statistical methods which might be because of its overfitting to the development set .
Experiments
For the semi-supervised system, each test fold was the same one used in the baseline and the other 4 folds were further split into a training set and a development set in a ratio of 7:3 for selecting clusters.
development set is mentioned in 7 sentences in this paper.
Topics mentioned in this paper:
Auli, Michael and Lopez, Adam
Experiments
We measured the evolving accuracy of the models on the development set (Figure 4).
Oracle Parsing
The reason that using the gold-standard supertags doesn’t result in 100% oracle parsing accuracy is that some of the development set parses cannot be constructed by the learned grammar.
Oracle Parsing
riety of fixed beam settings (Figure l), considering only the subset of our development set which could be parsed with all beam settings.
development set is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Lin, Hui and Bilmes, Jeff
Experiments
We used DUC-03 as our development set , and tested on DUC-04 data.
Experiments
DUC-03 was used as development set .
Experiments
Figure 1 illustrates how ROUGE-1 scores change when 04 and K vary on the development set (DUC-03).
development set is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Wang, Dong and Liu, Yang
Experiments
The 18 conversations annotated by all 3 annotators are used as test set, and the rest of 70 conversations are used as development set to tune the parameters (determining the best combination weights).
Experiments
From the development set , we used the grid search method to obtain the best combination weights for the two summarization methods.
Experiments
In the sentence-ranking method, the best parameters found on the development set are Asim = 0, Are; 2 0.3, Agent 2 0.3, Alen = 0.4.
development set is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Wing, Benjamin and Baldridge, Jason
Data
Running on the full data set is time-consuming, so development was done on a subset of about 80,000 articles (19.9 million tokens) as a training set and 500 articles as a development set .
Experiments
For both Wikipedia and Twitter, preliminary experiments on the development set were run to plot the prediction error for each method for each level of resolution, and the optimal resolution for each method was chosen for obtaining test results.
Experiments
We recomputed the distributions using several values for both parameters and evaluated on the development set .
development set is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Zollmann, Andreas and Vogel, Stephan
Experiments
We evaluate our approach by comparing translation quality, as evaluated by the IBM-BLEU (Papineni et al., 2002) metric on the NIST Chinese-to-English translation task using MT04 as development set to train the model parameters A, and MTOS, MT06 and MT08 as test sets.
Experiments
We therefore choose N merely based on development set performance.
Experiments
Unfortunately, variance in development set BLEU scores tends to be higher than test set scores, despite of SAMT MERT’s inbuilt algorithms to overcome local optima, such as random restarts and zeroing-out.
development set is mentioned in 3 sentences in this paper.
Topics mentioned in this paper: