Index of papers in Proc. ACL 2013 that mention
  • SVM
Fukumoto, Fumiyo and Suzuki, Yoshimi and Matsuyoshi, Suguru
Abstract
We applied an error detection and correction technique to the results of positive and negative documents classified by Support Vector Machines (SVM).
Framework of the System
As error candidates, we focus on support vectors (SVs) extracted from the training documents by SVM.
Framework of the System
Training by SVM is performed to find the optimal hyperplane consisting of SVs, and only the SVs affect the performance.
Framework of the System
We set these selected documents to negative training documents (N1), and apply SVM to learn classifiers.
Introduction
uses soft-margin SVM as the underlying classifiers (Liu et al., 2003).
Introduction
They reported that the results were comparable to the current state-of-the-art biased SVM method.
Introduction
Like much previous work on semi-supervised ML, we apply SVM to the positive and unlabeled data, and add the classification results to the training data.
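As an illustration of the self-training loop described in this snippet, the following is a minimal Python sketch, not the authors' implementation: a linear SVM is trained on positives plus provisional negatives, and its predictions on the unlabeled documents are fed back into the training labels. X_pos, X_unlabeled, and n_iters are assumed placeholder names.

```python
# Minimal sketch (not the authors' implementation) of SVM-based self-training
# on positive and unlabeled data: treat unlabeled documents as provisional
# negatives, then repeatedly relabel them with the learned classifier.
# X_pos and X_unlabeled are assumed to be document feature matrices.
import numpy as np
from sklearn.svm import LinearSVC

def self_train(X_pos, X_unlabeled, n_iters=3):
    X = np.vstack([X_pos, X_unlabeled])
    y = np.concatenate([np.ones(len(X_pos)), -np.ones(len(X_unlabeled))])
    clf = LinearSVC()
    for _ in range(n_iters):
        clf.fit(X, y)                               # assumes both classes stay present
        y[len(X_pos):] = clf.predict(X_unlabeled)   # add predictions to the training labels
        y[:len(X_pos)] = 1                          # known positives stay positive
    return clf
```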
SVM is mentioned in 23 sentences in this paper.
Topics mentioned in this paper:
Sarioglu, Efsun and Yadav, Kabir and Choi, Hyeong-Ah
Background
Support vector machines (SVM) is a popular classification algorithm that attempts to find a decision boundary between classes that is the farthest from any point in the training dataset.
Background
Given labeled training data (x_t, y_t), t = 1, ..., N, where x_t ∈ ℝ^M and y_t ∈ {1, −1}, SVM tries to find a separating hyperplane with the maximum margin (Platt, 1998).
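The maximum-margin formulation quoted above corresponds to the standard soft-margin SVM primal; written out in textbook notation (not notation taken from the paper):

```latex
\min_{w, b, \xi}\; \tfrac{1}{2}\lVert w \rVert^2 + C \sum_{t=1}^{N} \xi_t
\quad \text{s.t.}\quad y_t\bigl(w^{\top} x_t + b\bigr) \ge 1 - \xi_t,\;\; \xi_t \ge 0,\;\; t = 1, \dots, N
```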
Experiments
SVM was chosen as the classification algorithm as it was shown that it performs well in text classification tasks (Joachims, 1998; Yang and Liu, 1999) and it is robust to overfitting (Sebastiani, 2002).
Experiments
Accordingly, the raw text of the reports and topic vectors are compiled into individual files with their corresponding outcomes in ARFF format and then classified with SVM.
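A hedged sketch of this pipeline in Python rather than Weka: load an ARFF file and cross-validate a linear SVM on it. The file name "reports.arff" and the class attribute name "outcome" are placeholders, and the non-class attributes are assumed to be numeric.

```python
# Sketch in Python rather than Weka: load an ARFF file of topic vectors with
# their outcomes and cross-validate a linear SVM. "reports.arff" and the
# "outcome" attribute are placeholder names; non-class attributes are assumed
# to be numeric.
import numpy as np
from scipy.io import arff
from sklearn.model_selection import cross_val_score
from sklearn.svm import LinearSVC

data, meta = arff.loadarff("reports.arff")
y = np.array([v.decode() for v in data["outcome"]])              # nominal class labels
feature_names = [n for n in meta.names() if n != "outcome"]
X = np.array([[float(row[n]) for n in feature_names] for row in data])

print(cross_val_score(LinearSVC(), X, y, cv=10).mean())
```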
Related Work
…tor classification results with SVM; however, (Sriurai, 2011) uses a fixed number of topics, whereas we evaluated different numbers of topics since typically this is not known in advance.
Results
Classification results using ATC and SVM are shown in Figures 2, 3, and 4 for precision, recall, and f-score respectively.
Results
Best classification performance was achieved with 15 topics for ATC and 100 topics for SVM .
Results
For smaller numbers of topics, ATC performed better than SVM.
SVM is mentioned in 14 sentences in this paper.
Topics mentioned in this paper:
Shardlow, Matthew
Discussion
Whilst the thresholding and simplify everything methods were not significantly different from each other, the SVM method was significantly different from the other two (p < 0.001).
Discussion
This can be seen in the slightly lower recall, yet higher precision attained by the SVM .
Discussion
This indicates that the SVM was better at distinguishing between complex and simple words, but also wrongly identified many CWs.
Experimental Design
Support vector machines (SVM) are statistical classifiers which use labelled training data to predict the class of unseen inputs.
Experimental Design
The training data consist of several features which the SVM uses to distinguish between classes.
Experimental Design
The SVM was chosen as it has been used elsewhere for similar tasks (Gasperin et al., 2009; Hancke et al., 2012; Jauhar and Specia, 2012).
Results
[Table: column headings Everything, Thresholding, SVM]
Results
To analyse the features of the SVM , the correlation coefficient between each feature vector and the vector of feature labels was calculated.
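A minimal sketch of the feature analysis just described: compute the Pearson correlation between each feature column and the label vector. The array names X and y are placeholders, not from the paper.

```python
# Minimal sketch of the feature analysis above: Pearson correlation between
# each feature column and the binary label vector. X (n_samples x n_features)
# and y are placeholder names.
import numpy as np

def feature_label_correlations(X, y):
    y = y.astype(float)
    return np.array([np.corrcoef(X[:, j], y)[0, 1] for j in range(X.shape[1])])
```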
SVM is mentioned in 19 sentences in this paper.
Topics mentioned in this paper:
Wang, Aobo and Kan, Min-Yen
Experiment
We re-implemented Xia and Wong (2008)’s extended Support Vector Machine (SVM) based microtext IWR system to compare with our method.
Experiment
Both the SVM and DT models are provided by the Weka (Hall et al., 2009) toolkit, using its default configuration.
Experiment
Adapted SVM for Joint Classification.
SVM is mentioned in 13 sentences in this paper.
Topics mentioned in this paper:
Yancheva, Maria and Rudzicz, Frank
Discussion and future work
While past research has used logistic regression as a binary classifier (Newman et al., 2003), our experiments show that the best-performing classifiers allow for highly nonlinear class boundaries; SVM and RF models achieve between 62.5% and 91.7% accuracy across age groups — a significant improvement over the baselines of LR and NB, as well as over previous results.
Related Work
Two classifiers, Naïve Bayes (NB) and a support vector machine (SVM), were applied on the tokenized and stemmed statements to obtain best classification accuracies of 70% (abortion topic, NB), 67.4% (death penalty topic, NB), and 77% (friend description, SVM), where the baseline was taken to be 50%.
Related Work
The authors note this as well by demonstrating significantly lower results of 59.8% for NB and 57.8% for SVM when cross-topic classification is performed by training each classifier on two topics and testing on the third.
Results
We evaluate five classifiers: logistic regression (LR), a multilayer perceptron (MLP), naïve Bayes (NB), a random forest (RF), and a support vector machine (SVM).
Results
The SVM is a parametric binary classifier that provides highly nonlinear decision boundaries given particular kernels.
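To illustrate the point about kernels (not the paper's setup or data), a small scikit-learn comparison of a linear and an RBF-kernel SVM on a toy nonlinearly separable dataset:

```python
# Illustration only (not the paper's data or setup): the kernel determines how
# nonlinear the SVM decision boundary can be.
from sklearn.datasets import make_moons
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

X, y = make_moons(n_samples=400, noise=0.2, random_state=0)
for kernel in ("linear", "rbf"):
    score = cross_val_score(SVC(kernel=kernel), X, y, cv=5).mean()
    print(kernel, round(score, 3))    # the RBF kernel handles the curved boundary better
```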
Results
The SVM classifier
SVM is mentioned in 11 sentences in this paper.
Topics mentioned in this paper:
Cohn, Trevor and Specia, Lucia
Conclusion
Model                            MAE     RMSE
μ                                0.5596  0.7053
μ_A                              0.5184  0.6367
μ_S                              0.5888  0.7588
μ_T                              0.6300  0.8270
Pooled SVM                       0.5823  0.7472
Independent_A SVM                0.5058  0.6351
EasyAdapt SVM                    0.7027  0.8816
SINGLE-TASK LEARNING
Independent_A                    0.5091  0.6362
Independent_S                    0.5980  0.7729
Pooled                           0.5834  0.7494
Pooled & {N}                     0.4932  0.6275
MULTITASK LEARNING: Annotator
Combined_A                       0.4815  0.6174
Combined_A & {N}                 0.4909  0.6268
Combined+_A                      0.4855  0.6203
Combined+_A & {N}                0.4833  0.6102
MULTITASK LEARNING: Translation system
Combined_S                       0.5825  0.7482
MULTITASK LEARNING: Sentence pair
Combined_T                       0.5813  0.7410
MULTITASK LEARNING: Combinations
Combined_{A,S}                   0.4988  0.6490
Combined_{A,S} & {N_{A,S}}       0.4707  0.6003
Combined+_{A,S}                  0.4772  0.6094
Combined_{A,S,T}                 0.4588  0.5852
Combined_{A,S,T} & {N_{A,S}}     0.4723  0.6023
Gaussian Process Regression
In typical usage, the kernel hyperparameters for an SVM are fit using held-out estimation, which is inefficient and often involves tying together parameters to limit the search complexity (e.g., using a single scale parameter in the squared exponential).
Gaussian Process Regression
Multiple-kernel learning (Gönen and Alpaydın, 2011) goes some way to addressing this problem within the SVM framework; however, this technique is limited to reweighting linear combinations of kernels and has high computational complexity.
Multitask Quality Estimation 4.1 Experimental Setup
Baselines: The baselines use the SVM regression algorithm with radial basis function kernel and parameters γ, ε and C optimised through grid search and 5-fold cross-validation on the training set.
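A hedged sketch of such a baseline in scikit-learn: SVM regression with an RBF kernel and γ, ε, C chosen by grid search with 5-fold cross-validation. The parameter grid and the synthetic data are placeholders, not the values or features used in the paper.

```python
# Sketch of such a baseline: SVM regression (RBF kernel) with gamma, epsilon
# and C tuned by grid search and 5-fold cross-validation. The parameter grid
# and the synthetic data are placeholders, not the paper's values or features.
from sklearn.datasets import make_regression
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVR

X_train, y_train = make_regression(n_samples=300, n_features=20, noise=0.1,
                                   random_state=0)
param_grid = {
    "gamma":   [1e-3, 1e-2, 1e-1],
    "epsilon": [0.01, 0.1, 0.5],
    "C":       [1, 10, 100],
}
search = GridSearchCV(SVR(kernel="rbf"), param_grid, cv=5,
                      scoring="neg_mean_absolute_error")
search.fit(X_train, y_train)
print(search.best_params_)
```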
Multitask Quality Estimation 4.1 Experimental Setup
μ 0.8279 0.9899; SVM 0.6889 0.8201
Multitask Quality Estimation 4.1 Experimental Setup
μ is a baseline which predicts the training mean, SVM uses the same system as the WMT12 QE task, and the remainder are GP regression models with different kernels (all include additive noise).
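For comparison, a minimal sketch of a GP regression model with an RBF kernel plus additive (white) noise, using scikit-learn rather than the authors' implementation; the synthetic data is a placeholder.

```python
# Minimal sketch of GP regression with an RBF kernel plus additive (white)
# noise, in the spirit of the models listed above; uses scikit-learn rather
# than the authors' implementation, on placeholder synthetic data.
from sklearn.datasets import make_regression
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

X, y = make_regression(n_samples=200, n_features=10, noise=0.5, random_state=0)
kernel = RBF(length_scale=1.0) + WhiteKernel(noise_level=1.0)
gp = GaussianProcessRegressor(kernel=kernel, normalize_y=True).fit(X, y)
print(gp.kernel_)    # kernel hyperparameters fit by marginal-likelihood optimisation
```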
SVM is mentioned in 10 sentences in this paper.
Topics mentioned in this paper:
Liu, Ding and Yang, Xiaofang and Jiang, Minghu
Discussion
We present here our model of text classification and compare it with SVM and KNN on two datasets.
Discussion
Moreover, the QC performs well in text classification compared with SVM and KNN and outperforms them on small-scale training sets.
Experiment
We compared the performance of QC with several classical classification methods, including Support Vector Machine (SVM) and K-nearest neighbor (KNN).
Experiment
We randomly selected training samples from the training pool ten times to train the QC, SVM, and KNN classifiers respectively, and then verified the three trained classifiers on the testing sets; the results are illustrated in Figure 4.
Experiment
We noted that the QC performed better than both KNN and SVM on small-scale training sets, when the number of training samples is less than 50.
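A hedged sketch of this kind of comparison: accuracy of a linear SVM and KNN as the number of training samples grows. The synthetic data and sample sizes are placeholders, not the datasets used in the paper, and the QC model itself is not reproduced.

```python
# Sketch of the small-training-set comparison: accuracy of a linear SVM and
# KNN as the number of training samples grows. Synthetic data and sample
# sizes are placeholders; the QC model itself is not reproduced here.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import LinearSVC

X, y = make_classification(n_samples=2000, n_features=50, random_state=0)
X_pool, X_test, y_pool, y_test = train_test_split(X, y, test_size=0.5, random_state=0)
for n in (20, 50, 100, 500):
    svm = LinearSVC().fit(X_pool[:n], y_pool[:n])
    knn = KNeighborsClassifier().fit(X_pool[:n], y_pool[:n])
    print(n, round(svm.score(X_test, y_test), 3), round(knn.score(X_test, y_test), 3))
```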
SVM is mentioned in 6 sentences in this paper.
Topics mentioned in this paper:
Eidelman, Vladimir and Marton, Yuval and Resnik, Philip
Introduction
We focus on large-margin methods such as SVM (Joachims, 1998) and passive-aggressive algorithms such as MIRA.
The Relative Margin Machine in SMT
It is maximized by minimizing the norm in SVM, or analogously, the proximity constraint in MIRA: arg min_w ||w − w_t||².
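For reference, the proximity constraint above in its standard binary passive-aggressive/MIRA form (textbook notation, not the paper's):

```latex
w_{t+1} = \arg\min_{w}\; \tfrac{1}{2}\lVert w - w_t \rVert^2
\quad \text{s.t.}\quad y_t\, w^{\top} x_t \ge 1
```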
The Relative Margin Machine in SMT
RMM was introduced as a generalization over SVM that incorporates both the margin constraint
The Relative Margin Machine in SMT
Nonetheless, since structured RMM is a generalization of Structured SVM, which shares its underlying objective with MIRA, our intuition is that SMT should be able to benefit as well.
SVM is mentioned in 5 sentences in this paper.
Topics mentioned in this paper:
Ferret, Olivier
Experiments and evaluation
As mentioned before, this classifier is a linear SVM .
Improving a distributional thesaurus
More precisely, we follow (Lee and Ng, 2002), a reference work for WSD, by adopting a Support Vector Machine (SVM) classifier with a linear kernel and three kinds of features for characterizing each considered occurrence
Improving a distributional thesaurus
For the second type of features, we take more precisely the POS of the three words before E and those of the three words after E. Each pair {POS, position} corresponds to a binary feature for the SVM classifier.
Improving a distributional thesaurus
Each instance of the 11 types of collocations is represented by a tuple ⟨lemma1, position1, lemma2, position2⟩ and leads to a binary feature for the SVM classifier.
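A minimal sketch of encoding such {POS, position} pairs and collocation tuples as binary features for a linear SVM; the feature names, example instances, and labels are illustrative, not taken from the paper.

```python
# Sketch: encode {POS, position} pairs and collocation tuples as binary
# features for a linear SVM, following the description above. The feature
# names, example instances, and labels are illustrative only.
from sklearn.feature_extraction import DictVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

def featurize(instance):
    feats = {}
    for pos, position in instance["pos_window"]:                  # e.g. ("NN", -1)
        feats[f"pos_{position}={pos}"] = 1
    for lemma1, position1, lemma2, position2 in instance["collocations"]:
        feats[f"colloc={lemma1}_{position1}_{lemma2}_{position2}"] = 1
    return feats

train_instances = [
    {"pos_window": [("DET", -1), ("NN", 1)], "collocations": [("bank", -2, "river", 2)]},
    {"pos_window": [("VB", -1), ("PRP", 1)], "collocations": [("bank", -1, "account", 3)]},
]
train_labels = ["water_sense", "finance_sense"]

clf = make_pipeline(DictVectorizer(), LinearSVC())
clf.fit([featurize(i) for i in train_instances], train_labels)
```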
SVM is mentioned in 5 sentences in this paper.
Topics mentioned in this paper:
Aker, Ahmet and Paramita, Monica and Gaizauskas, Rob
Abstract
For classification we use an SVM binary classifier and training data taken from the EUROVOC thesaurus.
Feature extraction
To align or map source and target terms we use an SVM binary classifier (Joachims, 2002) with a linear kernel, setting the trade-off between training error and margin to c = 10.
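In scikit-learn terms (not the SVMlight-based setup the paper uses), a linear-kernel binary classifier with the error/margin trade-off set to 10 would look like the following; pair_features and labels are placeholders for the term-pair training data.

```python
# In scikit-learn terms (not the SVMlight-based setup the paper uses): a
# linear-kernel binary classifier with the error/margin trade-off set to 10.
# pair_features and labels are placeholders for the term-pair training data.
from sklearn.svm import SVC

term_aligner = SVC(kernel="linear", C=10)
# term_aligner.fit(pair_features, labels)
```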
Method
For classification purposes we use an SVM binary classifier.
SVM is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Danescu-Niculescu-Mizil, Cristian and Sudhof, Moritz and Jurafsky, Dan and Leskovec, Jure and Potts, Christopher
Predicting politeness
The BOW classifier is an SVM using a unigram feature representation. We consider this to be a strong baseline for this new
Predicting politeness
is an SVM using the linguistic features listed in Table 3 in addition to the unigram features.
Predicting politeness
For new requests, we use class probability estimates obtained by fitting a logistic regression model to the output of the SVM (Witten and Frank, 2005) as predicted politeness scores (with values between 0 and 1; henceforth politeness, by abuse of language).
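A hedged sketch of this Platt-style scaling step, using scikit-learn rather than the Weka implementation cited: fit a logistic regression to the SVM's decision values on held-out data and use its probabilities as scores. The data split and names are placeholders.

```python
# Sketch of Platt-style scaling as described above: fit a logistic regression
# to the SVM's decision values to obtain probability-like scores in [0, 1].
# Uses scikit-learn, not the Weka implementation cited; data are placeholders.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.svm import LinearSVC

X, y = make_classification(n_samples=500, n_features=30, random_state=0)
X_tr, X_cal, y_tr, y_cal = train_test_split(X, y, test_size=0.3, random_state=0)

svm = LinearSVC().fit(X_tr, y_tr)
scaler = LogisticRegression().fit(svm.decision_function(X_cal).reshape(-1, 1), y_cal)

def politeness_score(X_new):
    # probability of the positive (polite) class for new requests
    return scaler.predict_proba(svm.decision_function(X_new).reshape(-1, 1))[:, 1]
```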
SVM is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Perez-Rosas, Veronica and Mihalcea, Rada and Morency, Louis-Philippe
Discussion
As before, the linguistic, acoustic, and visual features are averaged over the entire video, and we use an SVM classifier in tenfold cross validation experiments.
Experiments and Results
We use the entire set of 412 utterances and run tenfold cross-validations using an SVM classifier, as implemented in the Weka toolkit. In line with previous work on emotion recognition in speech (Haq and Jackson, 2009; Anagnostopoulos and Vovoli, 2010), where utterances are selected in a speaker-dependent manner (i.e., utterances from the same speaker are included in both training and test), as well as work on sentence-level opinion classification where document boundaries are not considered in the split performed between the training and test sets (Wilson et al., 2004; Wiegand and Klakow, 2009), the training/test split for each fold is performed at utterance level regardless of the video they belong to.
Multimodal Sentiment Analysis
These simple weighted unigram features have been successfully used in the past to build sentiment classifiers on text, and in conjunction with Support Vector Machines (SVM) have been shown to lead to state-of-the-art performance (Maas et al., 2011).
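A minimal sketch of such a weighted-unigram SVM text classifier: tf-idf unigram features fed to a linear SVM. The toy utterances and labels are illustrative only, not data from the paper.

```python
# Minimal sketch of a weighted-unigram SVM sentiment classifier: tf-idf
# unigram features fed to a linear SVM. The toy utterances and labels are
# illustrative, not data from the paper.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

utterances = ["I really loved this movie", "This was a terrible waste of time",
              "What a wonderful surprise", "I hated every minute of it"]
labels = ["positive", "negative", "positive", "negative"]

clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 1)), LinearSVC())
clf.fit(utterances, labels)
print(clf.predict(["an absolutely wonderful movie"]))
```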
SVM is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Ramteke, Ankit and Malu, Akshat and Bhattacharyya, Pushpak and Nath, J. Saketha
A Machine Learning based approach
We use the SVM classifier with features generated using the following steps.
Conclusions and Future Work
This ontology guides a rule based approach to thwarting detection, and also provides features for an SVM based learning system.
Results
We used the CVX library in Matlab to solve the optimization problem for learning weights and the LIBSVM library to implement the SVM classifier.
SVM is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Silberer, Carina and Ferrari, Vittorio and Lapata, Mirella
Attribute-based Classification
We used an L2-regularized L2-loss linear SVM (Fan et al., 2008) to learn the attribute predictions.
Attribute-based Classification
data was randomly split into a training and validation set of equal size in order to find the optimal cost parameter C. The final SVM for the attribute was trained on the entire training data, i.e., on all positive and negative examples.
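A hedged sketch of this setup with scikit-learn's LinearSVC (which wraps LIBLINEAR): an L2-regularized, squared-hinge-loss linear SVM with C selected on a 50/50 train/validation split and the final model retrained on all training data. The C grid and synthetic data are placeholders.

```python
# Sketch of the attribute-classifier setup above: an L2-regularized,
# squared-hinge-loss linear SVM (LinearSVC wraps LIBLINEAR), with the cost
# parameter C picked on a 50/50 train/validation split and the final model
# retrained on all training data. The C grid and data are placeholders.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import LinearSVC

X, y = make_classification(n_samples=600, n_features=100, random_state=0)
X_tr, X_val, y_tr, y_val = train_test_split(X, y, test_size=0.5, random_state=0)

best_C = max([0.01, 0.1, 1, 10],
             key=lambda C: LinearSVC(penalty="l2", loss="squared_hinge", C=C)
                           .fit(X_tr, y_tr).score(X_val, y_val))
final_clf = LinearSVC(C=best_C).fit(X, y)
print(best_C)
```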
Attribute-based Classification
The SVM learners used the four different feature types proposed in Farhadi et al.
SVM is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Zhu, Jun and Zheng, Xun and Zhang, Bo
Experiments
For gLDA, we learn a binary linear SVM on its topic representations using SVMLight (Joachims, 1999).
Experiments
The results of DiscLDA (Lacoste-Jullien et al., 2009) and linear SVM on raw bag-of-words features were reported in (Zhu et al., 2012).
Experiments
The fact that gLDA+SVM performs better than the standard gSLDA is due to the same reason, since the SVM part of gLDA+SVM can well capture the supervision information to learn a classifier for good prediction, while standard sLDA cannot balance the influence of supervision well.
SVM is mentioned in 3 sentences in this paper.
Topics mentioned in this paper: