Index of papers in Proc. ACL 2013 that mention
  • feature set
Eidelman, Vladimir and Marton, Yuval and Resnik, Philip
Abstract
We evaluate our optimizer on Chinese-English and Arabic-English translation tasks, each with small and large feature sets, and show that our learner is able to achieve significant improvements of 1.2-2 BLEU and 1.7-4.3 TER on average over state-of-the-art optimizers with the large feature set.
Experiments
To evaluate the advantage of explicitly accounting for the spread of the data, we conducted several experiments on two Chinese-English translation test sets, using two different feature sets in each.
Experiments
We selected the bound step size D, based on performance on a held-out dev set, to be 0.01 for the basic feature set and 0.1 for the sparse feature set.
Experiments
4.2 Feature Sets
Introduction
Chinese-English translation experiments show that our algorithm, RM, significantly outperforms strong state-of-the-art optimizers, in both a basic feature setting and a high-dimensional (sparse) feature space (§4).
Learning in SMT
The instability of MERT in larger feature sets (Foster and Kuhn, 2009; Hopkins and May, 2011) has motivated many alternative tuning methods for SMT.
feature set is mentioned in 25 sentences in this paper.
Green, Spence and Wang, Sida and Cer, Daniel and Manning, Christopher D.
Abstract
We present a fast and scalable online method for tuning statistical machine translation models with large feature sets.
Adaptive Online Algorithms
When we have a large feature set and therefore want to tune on a large data set, batch methods are infeasible.
Adaptive Online MT
For example, simple indicator features like lexicalized reordering classes are potentially useful yet bloat the feature set and, in the worst case, can negatively impact
Experiments
To the dense features we add three high-dimensional “sparse” feature sets.
Experiments
The primary baseline is the dense feature set tuned with MERT (Och, 2003).
Experiments
with the PT feature set.
feature set is mentioned in 14 sentences in this paper.
Varga, István and Sano, Motoki and Torisawa, Kentaro and Hashimoto, Chikara and Ohtake, Kiyonori and Kawai, Takao and Oh, Jong-Hoon and De Saeger, Stijn
Experiments
In both experiments we observed that the performance drops when excitation polarities and trouble expressions are removed from the feature set.
Experiments
PROPOSED-*: The proposed method without the feature set denoted by “*”.
Experiments
PROPOSED-*: The proposed method without the feature set denoted by “*”.
Problem Report and Aid Message Recognizers
The feature set given to the SVMs is summarized in the top part of Table 2.
Problem Report and Aid Message Recognizers
Note that we used a common feature set for both the problem report recognizer and aid message recognizer and that it is categorized into several types: features concerning trouble expressions (TR), excitation polarity (EX), their combination (TREX1) and word sentiment polarity (WSP), features expressing morphological and syntactic structures of nuclei and their context surrounding problem/aid nuclei (MSA), features concerning semantic word classes (SWC) appearing in nuclei and their context, request phrases, such as “Please help us”, appearing in tweets (REQ), and geographical locations in tweets recognized by our location recognizer (GL).
Problem Report and Aid Message Recognizers
We also attempted to represent nucleus template IDs, noun IDs and their combinations directly in our feature set to capture typical templates fre-
Problem-Aid Match Recognizer
Here also we attempted to capture typical or frequent matches of nuclei using template and noun IDs and their combinations, but we did not observe any improvement so we omit them from the feature set.
Problem-Aid Match Recognizer
The bottom part of Table 2 summarizes the additional feature set; some of these features are described below in more detail.
feature set is mentioned in 9 sentences in this paper.
Ferschke, Oliver and Gurevych, Iryna and Rittberger, Marc
Evaluation and Discussion
The SVMs achieve a similar cross-validated performance on all feature sets containing ngrams, showing only minor improvements for individual flaws when adding non-lexical features.
Evaluation and Discussion
Table 6 shows the performance of the SVMs with RBF kernel on each dataset using the NGRAM feature set.
Evaluation and Discussion
Classifiers using the NONGRAM feature set achieved average F1-scores below 0.50 on all datasets.
Experiments
We selected a subset of these features for our experiments and grouped them into four feature sets in order to determine how well different combinations of features perform in the task.
Experiments
Table 4: Feature sets used in the experiments
feature set is mentioned in 7 sentences in this paper.
Metallinou, Angeliki and Bohus, Dan and Williams, Jason
Generative state tracking
In DIS-CDYN1, we use the original feature set, ignoring the problem described above (so that the general features contribute no information), resulting in M + K weights.
Generative state tracking
The analysis of various feature sets indicates that the ASR/SLU error correlation (confusion) features yield the largest improvement — c.f.
Generative state tracking
feature set be compared to b in Table 3.
Introduction
importance of different feature sets for this task, and measure the amount of data required to reliably train our model.
feature set is mentioned in 7 sentences in this paper.
Yancheva, Maria and Rudzicz, Frank
Discussion and future work
[Figure 3 legend: Feature Set (LIWC, Syntactic)]
Discussion and future work
Figure 3: Effect of feature set choice on cross-validation accuracy.
Discussion and future work
2012; Almela et al., 2012; Fornaciari and Poesio, 2012), our results suggest that the set of syntactic features presented here performs significantly better than the LIWC feature set on our data, and across seven out of the eight experiments based on age groups and verbosity of transcriptions.
Related Work
Descriptions of the data (section 3) and feature sets (section 4) precede experimental results (section 5) and the concluding discussion (section 6).
feature set is mentioned in 7 sentences in this paper.
Kozareva, Zornitsa
Conclusion
We have conducted exhaustive evaluation with multiple machine learning classifiers and different feature sets spanning from lexical information to psychological categories developed by Tausczik and Pennebaker (2010).
Task A: Polarity Classification
We studied the influence of unigrams, bigrams and a combination of the two, and saw that the best performing feature set consists of the combination of unigrams and bigrams.
Task A: Polarity Classification
For each information source (metaphor, context, source, target and their combinations), we built a separate n-gram feature set and model, which was evaluated on 10-fold cross validation.
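(Illustrative sketch, not from the paper: the snippet below shows one common way to build a separate unigram+bigram feature set and model per information source and score it with cross-validation, assuming scikit-learn; the example texts, labels, and the 2-fold split are hypothetical placeholders, where the paper reports 10-fold cross-validation.)

    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import cross_val_score
    from sklearn.pipeline import make_pipeline

    # Hypothetical texts per information source and hypothetical polarity labels.
    sources = {
        "metaphor": ["sweet victory for the city", "a bitter pill to swallow",
                     "drowning in paperwork", "a bright spot in the report"],
        "context": ["the crowd cheered loudly", "residents filed complaints",
                    "officials ignored the warnings", "volunteers praised the plan"],
    }
    labels = [1, 0, 0, 1]

    for name, texts in sources.items():
        model = make_pipeline(
            CountVectorizer(ngram_range=(1, 2)),   # unigram + bigram features
            LogisticRegression(max_iter=1000),
        )
        scores = cross_val_score(model, texts, labels, cv=2)  # cv=10 with real data
        print(name, scores.mean())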
Task A: Polarity Classification
We have used different feature sets and information sources to solve the task.
Task B: Valence Prediction
We have studied different feature sets and information sources to solve the task.
feature set is mentioned in 5 sentences in this paper.
Oh, Jong-Hoon and Torisawa, Kentaro and Hashimoto, Chikara and Sano, Motoki and De Saeger, Stijn and Ohtake, Kiyonori
Causal Relations for Why-QA
We used the three types of feature sets in Table 3 for training the CRFs, where j is in the range i − 4 ≤ j ≤ i + 4 for the current position i in a causal relation candidate.
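(Illustrative sketch, not the authors' code: a generic extractor of the kind of window feature the sentence above describes, taking token features from positions j with i − 4 ≤ j ≤ i + 4 around the current position i; the function name and example sentence are hypothetical.)

    def window_features(tokens, i, width=4):
        """Collect token features from positions j in [i - width, i + width]."""
        feats = {}
        for j in range(max(0, i - width), min(len(tokens), i + width + 1)):
            feats[f"w[{j - i:+d}]={tokens[j]}"] = 1
        return feats

    tokens = "heavy rain caused the flood because the river overflowed".split()
    print(window_features(tokens, 4))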
Causal Relations for Why-QA
More detailed information concerning the configurations of all the nouns in all the candidates of an appropriate causal relation (including their cause parts) and the question is encoded into our feature sets ef1–ef4 in Table 4, and the final judgment is done by our re-ranker.
Experiments
We evaluated the performance when we removed one of the three types of features (ALL-“MORPH”, ALL-“SYNTACTIC” and ALL-“C-MARKER”) and compared the results in these settings with the one when all the feature sets were used (ALL).
Experiments
We confirmed that all the feature sets improved the performance, and we got the best performance when using all of them.
feature set is mentioned in 4 sentences in this paper.
Scheible, Christian and Schütze, Hinrich
Distant Supervision
However, we did not find a cumulative effect (line 8) of the two feature sets.
Features
We refer to these feature sets as CoreLex (CX) and VerbNet (VN) features and to their combination as semantic features (SEM).
Features
feature set is referred to as named entities (NE).
Features
We refer to this feature set as sequential features (SQ).
feature set is mentioned in 4 sentences in this paper.
Wang, Aobo and Kan, Min-Yen
Experiment
To compare our joint inference versus other learning models, we also employed a decision tree (DT) learner, equipped with the same feature set as our FCRF.
Experiment
Both models take the whole feature set described in Section 2.3.
Experiment
3.4.3 Feature set evaluation
feature set is mentioned in 4 sentences in this paper.
Darwish, Kareem
Related Work
(2007) used a maximum entropy classifier trained on a feature set that includes the use of gazetteers and a stop-word list, appearance of an NE in the training set, leading and trailing word bigrams, and the tag of the previous word.
Related Work
(2008), they examined the same feature set on the Automatic Content Extraction (ACE) datasets using CRF
Related Work
Abdul-Hamid and Darwish (2010) used a simplified feature set that relied primarily on character level features, namely leading and trailing letters in a word.
feature set is mentioned in 3 sentences in this paper.
He, Hua and Barbosa, Denilson and Kondrak, Grzegorz
Features
The principal feature sets are listed in Table 2, together with an indication whether they are novel or have been used in previous work.
Speaker Identification
Table 2: Principal feature sets.
Speaker Identification
subsequently we add three more feature sets that represent the following neighboring utterances: n − 2, n − 1 and n + 1. Informally, the features of the utterances n − 1 and n + 1 encode the first observation, while the features representing the utterance n − 2 encode the second observation.
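(Illustrative sketch, not the authors' implementation: one generic way to augment utterance n's feature vector with prefixed copies of the features of utterances n − 2, n − 1 and n + 1; the helper and feature names below are hypothetical.)

    def with_neighbors(all_feats, n):
        """Return utterance n's features plus prefixed copies from n-2, n-1, n+1."""
        combined = dict(all_feats[n])
        for offset in (-2, -1, 1):
            i = n + offset
            if 0 <= i < len(all_feats):
                combined.update({f"nbr{offset:+d}_{k}": v
                                 for k, v in all_feats[i].items()})
        return combined

    utterances = [  # hypothetical per-utterance feature dicts
        {"has_speaker_verb": 1},
        {"quote_length": 7},
        {"mentions_character": 1},
        {"quote_length": 3},
    ]
    print(with_neighbors(utterances, 2))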
feature set is mentioned in 3 sentences in this paper.
Mukherjee, Arjun and Liu, Bing
Empirical Evaluation
To compare classification performance, we use two feature sets: (i) standard word + POS 1-4 grams and (ii) AD-expressions from §5.
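(Illustrative sketch, not the authors' code: one generic way to build combined word and POS 1-4 gram features for a sentence; the tokens and POS tags below are hypothetical.)

    from itertools import chain

    def ngrams(seq, n_max=4):
        """All contiguous n-grams of seq for n = 1..n_max, joined with '_'."""
        return ["_".join(seq[i:i + n])
                for n in range(1, n_max + 1)
                for i in range(len(seq) - n + 1)]

    tokens = ["I", "completely", "disagree", "with", "that"]
    pos_tags = ["PRP", "RB", "VBP", "IN", "DT"]   # hypothetical POS tags

    features = set(chain(ngrams(tokens), ("POS_" + g for g in ngrams(pos_tags))))
    print(sorted(features)[:10])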
Empirical Evaluation
Predicting agreeing arguing nature is harder than that of disagreeing across all feature settings.
Empirical Evaluation
Using the discovered AD-expressions (Table 6, last row) as features renders a statistically significant (see Table 6 caption) improvement over other baseline feature settings.
feature set is mentioned in 3 sentences in this paper.
Radziszewski, Adam
CRF and features
The work describes a feature set proposed for this task, which includes word forms in a local window, values of grammatical class, gender, number and case, tests for agreement on number, gender and case, as well as simple tests for letter case.
CRF and features
We took this feature set as a starting point.
CRF and features
The final feature set includes the following
feature set is mentioned in 3 sentences in this paper.