Index of papers in Proc. ACL 2008 that mention
  • feature set
Koo, Terry and Carreras, Xavier and Collins, Michael
Experiments
In order to evaluate the effectiveness of the cluster-based feature sets, we conducted dependency parsing experiments in English and Czech.
Experiments
In our English experiments, we tested eight different parsing configurations, representing all possible choices between baseline or cluster-based feature sets, first-order (Eisner, 2000) or second-order (Carreras, 2007) factorizations, and labeled or unlabeled parsing.
Experiments
Second, note that the parsers using cluster-based feature sets consistently outperform the models using the baseline features, regardless of model order or label usage.
Feature design
The feature sets we used are similar to other feature sets in the literature (McDonald et al., 2005a; Carreras, 2007), so we will not attempt to give an exhaustive description of the features in this section.
Feature design
In our experiments, we employed two different feature sets: a baseline feature set which draws upon “normal” information sources such as word forms and parts of speech, and a cluster-based feature set that also uses information derived from the Brown cluster hierarchy.
Feature design
Our first-order baseline feature set is similar to the feature set of McDonald et al.
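The cluster-based features described above are built from prefixes of the Brown-cluster bit strings, layered on top of baseline word/POS features. The Python fragment below is a minimal illustrative sketch of that idea, not Koo et al.'s implementation; every feature template and name in it is an assumption.

# Sketch: baseline vs. cluster-based features for a head-modifier arc.
# Cluster bit-string prefixes of several lengths serve as word
# representations at different granularities.

def baseline_features(head, mod):
    """Baseline features from word forms and part-of-speech tags."""
    return [
        f"hw={head['word']}|mw={mod['word']}",
        f"hp={head['pos']}|mp={mod['pos']}",
        f"hw={head['word']}|mp={mod['pos']}",
    ]

def cluster_features(head, mod, clusters, prefix_lengths=(4, 6)):
    """Cluster-based features from Brown bit-string prefixes."""
    hc = clusters.get(head["word"], "UNK")
    mc = clusters.get(mod["word"], "UNK")
    feats = []
    for k in prefix_lengths:
        feats.append(f"hc{k}={hc[:k]}|mc{k}={mc[:k]}")
        feats.append(f"hc{k}={hc[:k]}|mp={mod['pos']}")
    return feats

# Toy usage with bit strings from a hypothetical Brown hierarchy:
clusters = {"bank": "0010110", "loan": "0010111"}
head, mod = {"word": "bank", "pos": "NN"}, {"word": "loan", "pos": "NN"}
print(baseline_features(head, mod) + cluster_features(head, mod, clusters))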
feature set is mentioned in 14 sentences in this paper.
Li, Jianguo and Brew, Chris
Experiment Setup 4.1 Corpus
We evaluate six different feature sets for their effectiveness in AVC: SCF, DR, CO, ACO, SCF+CO, and JOANIS07.
Experiment Setup 4.1 Corpus
The other four feature sets include both syntactic and lexical information.
Experiment Setup 4.1 Corpus
JOANIS07: We use the feature set proposed in Joanis et al.
Introduction
We develop feature sets that combine syntactic and lexical information, which are in principle useful for any Levin-style verb classification.
Introduction
We test the general applicability and scalability of each feature set to the distinctions among 48 verb classes involving 1,300 verbs, which is, to our knowledge, by far the largest investigation of English verb classification.
Introduction
To preview our results, a feature set that combines both syntactic information and lexical information works much better than either of them used alone.
Machine Learning Method
We construct a semantic space with each feature set.
Machine Learning Method
Except for JOANIS07, which contains only 224 features, all the other feature sets lead to a very high-dimensional space.
Related Work
The deeper linguistic analysis allows their feature set to cover a variety of indicators of verb semantics, beyond that of frame information.
feature set is mentioned in 37 sentences in this paper.
Polifroni, Joseph and Walker, Marilyn
Experiment Two
We derive two types of feature sets from the responses: features derived from each user model and features derived from attributes of the query/response pair itself.
Experiment Two
The five feature sets for the user model are:
Experiment Two
0 allUtz’lz’ty: 12 features consisting of the high, low, and average utility scores from the previous three feature sets .
feature set is mentioned in 6 sentences in this paper.
Surdeanu, Mihai and Ciaramita, Massimiliano and Zaragoza, Hugo
Approach
To answer the second research objective we will analyze the contribution of the proposed feature set to this function.
Approach
For completeness, we also include in the feature set the value of the tf-idf similarity measure.
Experiments
Feature Set | MRR | P@1
Experiments
The algorithm incrementally adds to the feature set the feature that provides the highest MRR improvement in the development partition.
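The greedy loop described above is straightforward to sketch. In the fragment below, evaluate_mrr is a hypothetical stand-in that trains a ranker with a given feature set and returns its MRR on the development partition; this is an illustration, not the authors' code.

def greedy_forward_selection(candidates, evaluate_mrr):
    """Repeatedly add the feature whose inclusion most improves
    development-set MRR; stop when no remaining feature helps."""
    selected, best_mrr = [], evaluate_mrr([])
    while candidates:
        gains = {f: evaluate_mrr(selected + [f]) for f in candidates}
        best_feat = max(gains, key=gains.get)
        if gains[best_feat] <= best_mrr:
            break  # no remaining feature improves MRR
        selected.append(best_feat)
        best_mrr = gains[best_feat]
        candidates.remove(best_feat)
    return selected, best_mrr

# Toy usage: pretend MRR is simply the share of "good" features chosen.
good = {"tfidf", "overlap"}
feats, mrr = greedy_forward_selection(
    ["tfidf", "length", "overlap"],
    lambda fs: len(good.intersection(fs)) / 10.0,
)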
Related Work
This approach allowed us to perform a systematic feature analysis on a large-scale real-world corpus and a comprehensive feature set.
Related Work
Our model uses a larger feature set that includes correlation and transformation-based features and five different content representations.
feature set is mentioned in 6 sentences in this paper.
Blunsom, Phil and Cohn, Trevor and Osborne, Miles
Challenges for Discriminative SMT
This problem of over-fitting is exacerbated in discriminative models with large, expressive feature sets.
Challenges for Discriminative SMT
Learning with a large feature set requires many training examples and typically many iterations of a solver during training.
Evaluation
To do this we use our own implementation of Hiero (Chiang, 2007), with the same grammar but with the traditional generative feature set trained in a linear model with minimum BLEU training.
Evaluation
The feature set includes: a trigram language model (lm) trained
Evaluation
The relative scores confirm that our model, with its minimalist feature set, achieves comparable performance to the standard feature set without the language model.
feature set is mentioned in 5 sentences in this paper.
Bartlett, Susan and Kondrak, Grzegorz and Cherry, Colin
Syllabification Experiments
In this section, we will discuss the results of our best emission feature set (five-gram features with a context window of eleven letters) on held-out unseen test sets.
Syllabification with Structured SVMs
With SVM-HMM, the crux of the task is to create a tag scheme and feature set that produce good results.
Syllabification with Structured SVMs
After experimenting with the development set, we decided to include in our feature set a window of eleven characters around the focus character, five on either side.
Syllabification with Structured SVMs
As is apparent from Figure 2, we see a substantial improvement by adding bigrams to our feature set.
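As a rough illustration of this feature set, the sketch below builds an eleven-letter window around the focus character and adds character bigrams over the same window; the padding symbol and feature names are assumptions, not the authors' exact scheme.

def emission_features(word, i, window=5):
    """Features for the i-th letter of word: an eleven-letter window
    (five on either side) plus character bigrams within it."""
    padded = "#" * window + word + "#" * window
    ctx = padded[i : i + 2 * window + 1]  # window centred on word[i]
    feats = [f"c[{j - window}]={ch}" for j, ch in enumerate(ctx)]
    feats += [f"b[{j - window}]={ctx[j:j + 2]}" for j in range(len(ctx) - 1)]
    return feats

print(emission_features("syllable", 3))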
feature set is mentioned in 4 sentences in this paper.
Elsner, Micha and Charniak, Eugene
Future Work
We are also interested to see how well this feature set performs on speech data, as in (Aoki et al., 2003).
Related Work
They motivate a richer feature set, which, however, does not yet appear to be implemented.
Related Work
(2005) adds word repetition to their feature set.
Related Work
Our feature set incorporates information which has proven useful in meeting segmentation (Galley et al., 2003) and the task of detecting addressees of a specific utterance in a meeting (Jovanovic et al., 2006).
feature set is mentioned in 4 sentences in this paper.
Espinosa, Dominic and White, Michael and Mehay, Dennis
Conclusion
Finally, further efforts to engineer a grammar suitable for realization from the CCGbank should provide richer feature sets, which, as our feature ablation study suggests, are useful for boosting hypertagging performance, hence for finding better and more complete realizations.
Results and Discussion
The whole feature set was found in feature ablation testing on the development set to outperform all other feature subsets significantly (p < 2.2 × 10^-16).
Results and Discussion
The full feature set outperforms all others significantly (p < 2.2 × 10^-16).
Results and Discussion
The results for the full feature set on Sections 00 and 23 are outlined in Table 2.
feature set is mentioned in 4 sentences in this paper.
Saha, Sujan Kumar and Mitra, Pabitra and Sarkar, Sudeshna
Maximum Entropy Based Model for Hindi NER
In Table 2 we have shown the accuracy values for a few feature sets.
Maximum Entropy Based Model for Hindi NER
Again, when wi-2 and wi+2 are removed from the feature set (i.e.
Maximum Entropy Based Model for Hindi NER
When suffix, prefix, and digit information are added to the feature set, the f-value increases to 74.26.
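For illustration, the sketch below shows the kind of context-word, affix, and digit features being added to or removed from the feature set in this ablation; the exact templates in the paper may differ, and all names here are hypothetical.

import re

def ner_features(words, i, use_context2=True, use_affixes=True, use_digit=True):
    """Features for the i-th token: surrounding words (optionally out to
    positions i-2 and i+2), suffixes/prefixes, and digit information."""
    w = words[i]
    feats = [f"w0={w}"]
    if i > 0:
        feats.append(f"w-1={words[i - 1]}")
    if i + 1 < len(words):
        feats.append(f"w+1={words[i + 1]}")
    if use_context2:
        if i > 1:
            feats.append(f"w-2={words[i - 2]}")
        if i + 2 < len(words):
            feats.append(f"w+2={words[i + 2]}")
    if use_affixes:
        feats += [f"suf={w[-k:]}" for k in (1, 2, 3) if len(w) > k]
        feats += [f"pre={w[:k]}" for k in (1, 2, 3) if len(w) > k]
    if use_digit and re.search(r"\d", w):
        feats.append("has_digit")
    return feats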
feature set is mentioned in 4 sentences in this paper.
Veale, Tony and Hao, Yanfen and Li, Guofu
Empirical Evaluation: Simile-derived Representations
Suspecting that a noisy feature set had contributed to the apparent drop in performance, these authors then proceed to apply a variety of noise filters to reduce the set of feature values to 51,345, which in turn leads to an improved cluster purity measure of 62.7%.
Empirical Evaluation: Simile-derived Representations
In experiment 2, we see a similar ratio of feature quantities before filtering; after some initial filtering, Almuhareb and Poesio reduce their feature set to just under 10 times the size of the simile-derived feature set.
Empirical Evaluation: Simile-derived Representations
First, the feature representations do not need to be hand-filtered and noise-free to be effective; we see from the above results that the raw values extracted from the simile pattern prove slightly more effective than filtered feature sets used by Almuhareb and Poesio.
Related Work
As noted by the latter authors, this results in a much smaller yet more diagnostic feature set for each concept.
feature set is mentioned in 4 sentences in this paper.
Haghighi, Aria and Liang, Percy and Berg-Kirkpatrick, Taylor and Klein, Dan
Experimental Setup
Table 1: Performance of EDITDIST and our model with various feature sets on EN-ES-W. See section 5.
Experimental Setup
We will use MCCA (for matching CCA) to denote our model using the optimal feature set (see section 5.3).
Introduction
As an example of the performance of the system, in English-Spanish induction with our best feature set, using corpora derived from topically similar but nonparallel sources, the system obtains 89.0% precision at 33% recall.
feature set is mentioned in 3 sentences in this paper.
Huang, Liang
Experiments
Our feature set is summarized in Table 2, which closely follows Charniak and Johnson (2005), except that we excluded the nonlocal features Edges, NGram, and CoPar, and simplified Rule and NGramTree features, since they were too complicated to compute. We also added four unlexicalized local features from Collins (2000) to cope with data sparsity.
Experiments
features in the updated version. However, our initial experiments show that, even with this much simpler feature set, our 50-best reranker performed as well as theirs (both with an F-score of 91.4; see Tables 3 and 4).
Experiments
This result confirms that our feature set design is appropriate, and the averaged perceptron learner is a reasonable candidate for reranking.
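A compact sketch of averaged-perceptron training for k-best reranking, the learner named above; the feature extractor, oracle indices, and epoch count are stand-ins, and this is not the paper's implementation.

from collections import defaultdict

def train_reranker(kbest_lists, oracles, features, epochs=5):
    """kbest_lists[i] is a list of candidate parses, oracles[i] the index
    of the best one; features(c) maps a candidate to {name: value}."""
    w = defaultdict(float)      # current weights
    total = defaultdict(float)  # running sum of weights for averaging
    steps = 0
    for _ in range(epochs):
        for cands, oracle in zip(kbest_lists, oracles):
            scores = [sum(w[f] * v for f, v in features(c).items())
                      for c in cands]
            pred = max(range(len(cands)), key=scores.__getitem__)
            if pred != oracle:  # standard update toward the oracle parse
                for f, v in features(cands[oracle]).items():
                    w[f] += v
                for f, v in features(cands[pred]).items():
                    w[f] -= v
            steps += 1
            for f, v in w.items():
                total[f] += v
    return {f: v / steps for f, v in total.items()}

Averaging the weight vector over all updates, rather than keeping only the final weights, is the standard choice that makes the perceptron stable enough for reranking.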
feature set is mentioned in 3 sentences in this paper.