Index of papers in Proc. ACL that mention
  • SVM
Bartlett, Susan and Kondrak, Grzegorz and Cherry, Colin
Introduction
We formulate syllabification as a tagging problem, and learn a discriminative tagger from labeled data using a structured support vector machine ( SVM ) (Tsochantaridis et al., 2004).
Structured SVMs
A structured support vector machine ( SVM ) is a large-margin training method that can learn to predict structured outputs, such as tag sequences or parse trees, instead of performing binary classification (Tsochantaridis et al., 2004).
Structured SVMs
We employ a structured SVM that predicts tag sequences, called an SVM Hidden Markov Model, or SVM-HMM.
Structured SVMs
This approach can be considered an SVM because the model parameters are trained discriminatively to separate correct tag sequences from incorrect ones by as large a margin as possible.
Syllabification with Structured SVMs
The SVM framework is less restrictive: we can include o as an emission feature, but we can also include features indicating that the preceding and following letters are m and r respectively.
SVM is mentioned in 16 sentences in this paper.
Wang, William Yang and Hua, Zhenhao
Experiments
The baselines are standard squared-loss linear regression, linear kernel SVM, and nonlinear (Gaussian) kernel SVM .
Experiments
We use the Statistical Toolbox’s linear regression implementation in Matlab, and LibSVM (Chang and Lin, 2011) for training and testing the SVM models.
Experiments
The hyperparameter C in linear SVM, and the γ and C hyperparameters in Gaussian SVM are tuned on the training set using 10-fold cross-validation.
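As a rough illustration of tuning by k-fold cross-validation, the sketch below splits n training examples into folds and picks the candidate value with the best mean validation score. The names `k_fold_indices`, `select_by_cv`, and the `score` callback are hypothetical and are not taken from the paper or from LibSVM; the caller is assumed to train and evaluate a model inside `score`.

```python
from typing import Callable, Iterator, List, Sequence, Tuple


def k_fold_indices(n: int, k: int = 10) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train, validation) index lists for k contiguous folds over n items."""
    if not 2 <= k <= n:
        raise ValueError("need 2 <= k <= n")
    base, extra = divmod(n, k)  # the first `extra` folds get one more item
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)
        val = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n))
        yield train, val
        start += size


def select_by_cv(
    candidates: Sequence[float],
    n: int,
    score: Callable[[float, List[int], List[int]], float],
    k: int = 10,
) -> float:
    """Return the candidate with the highest mean validation score.

    `score(candidate, train_idx, val_idx)` is a caller-supplied (hypothetical)
    function that trains on train_idx and returns accuracy on val_idx.
    """
    def mean_score(candidate: float) -> float:
        scores = [score(candidate, tr, va) for tr, va in k_fold_indices(n, k)]
        return sum(scores) / len(scores)

    return max(candidates, key=mean_score)
```

A grid over two hyperparameters can reuse the same helper by passing tuples as candidates.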
Introduction
Our results significantly outperform standard linear regression and strong SVM baselines.
Related Work
(2003) are among the first to study SVM and text mining methods in the market prediction domain, where they align financial news articles with multiple time series to simulate the 33 stocks in the Hong Kong Hang Seng Index.
Related Work
(2009) model the SEC-mandated annual reports, and perform linear SVM regression with ε-insensitive loss function to predict the measured volatility.
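For readers unfamiliar with the loss used in SVM regression, here is a minimal sketch of the standard ε-insensitive loss: errors smaller than ε cost nothing, and larger errors are penalised linearly. The function name and the default ε are illustrative assumptions, not values from the cited work.

```python
def epsilon_insensitive_loss(y_true: float, y_pred: float, epsilon: float = 0.1) -> float:
    """Standard epsilon-insensitive loss: max(0, |y_true - y_pred| - epsilon)."""
    return max(0.0, abs(y_true - y_pred) - epsilon)
```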
Related Work
Traditional discriminative models, such as linear regression and linear SVM , have been very popular in various text regression tasks, such as predicting movie revenues from reviews (Joshi et al., 2010), understanding the geographic lexical variation (Eisenstein et al., 2010), and predicting food prices from menus (Chahuneau et al., 2012).
SVM is mentioned in 15 sentences in this paper.
Qian, Tieyun and Liu, Bing and Chen, Li and Peng, Zhiyong
Experimental Evaluation
We use logistic regression (LR) with L2 regularization (Fan et al., 2008) and the SVMWWW ( SVM ) system (Joachims, 2007) with its default settings as the classifiers.
Experimental Evaluation
It self-trains two classifiers from the character 3-gram, lexical, and syntactic views using CNG and SVM classifiers (Kourtis and Stamatatos, 2011).
Experimental Evaluation
The original method applied only CNG and SVM on the character n-gram view.
Introduction
However, the self-training method in (Kourtis and Stamatatos, 2011) uses two classifiers (CNG and SVM ) on one view.
Proposed Tri-Training Algorithm
Many classification algorithms give such scores, e.g., SVM and logistic regression.
Related Work
On developing effective learning techniques, supervised classification has been the dominant approach, e.g., neural networks (Graham et al., 2005; Zheng et al., 2006), decision tree (Uzuner and Katz, 2005; Zhao and Zobel, 2005), logistic regression (Madigan et al., 2005), SVM (Diederich et al., 2000; Gamon 2004; Li et al., 2006; Kim et al., 2011), etc.
SVM is mentioned in 13 sentences in this paper.
Duan, Manjuan and White, Michael
Abstract
However, by using an SVM ranker to combine the realizer’s model score together with features from multiple parsers, including ones designed to make the ranker more robust to parsing mistakes, we show that significant increases in BLEU scores can be achieved.
Abstract
Moreover, via a targeted manual analysis, we demonstrate that the SVM reranker frequently manages to avoid vicious ambiguities, while its ranking errors tend to affect fluency much more often than adequacy.
Introduction
Consequently, we examine two reranking strategies, one a simple baseline approach and the other using an SVM reranker (Joachims, 2002).
Introduction
Therefore, to develop a more nuanced self-monitoring reranker that is more robust to such parsing mistakes, we trained an SVM using dependency precision and recall features for all three parses, their n-best parsing results, and per-label precision and recall for each type of dependency, together with the realizer’s normalized perceptron model score as a feature.
Introduction
With the SVM reranker, we obtain a significant improvement in BLEU scores over
Reranking with SVMs 4.1 Methods
Similarly, we conjectured that large differences in the realizer’s perceptron model score may more reliably reflect human fluency preferences than small ones, and thus we combined this score with features for parser accuracy in an SVM ranker.
Reranking with SVMs 4.1 Methods
Additionally, given that parsers may more reliably recover some kinds of dependencies than others, we included features for each dependency type, so that the SVM ranker might learn how to weight them appropriately.
Reranking with SVMs 4.1 Methods
We trained the SVM ranker (Joachims, 2002) with a linear kernel and chose the hyper-parameter c, which tunes the tradeoff between training error and margin, with 6-fold cross-validation on the devset.
SVM is mentioned in 23 sentences in this paper.
Yancheva, Maria and Rudzicz, Frank
Discussion and future work
While past research has used logistic regression as a binary classifier (Newman et al., 2003), our experiments show that the best-performing classifiers allow for highly nonlinear class boundaries; SVM and RF models achieve between 62.5% and 91.7% accuracy across age groups — a significant improvement over the baselines of LR and NB, as well as over previous results.
Related Work
Two classifiers, Naïve Bayes (NB) and a support vector machine (SVM), were applied on the tokenized and stemmed statements to obtain best classification accuracies of 70% (abortion topic, NB), 67.4% (death penalty topic, NB), and 77% (friend description, SVM), where the baseline was taken to be 50%.
Related Work
The authors note this as well by demonstrating significantly lower results of 59.8% for NB and 57.8% for SVM when cross-topic classification is performed by training each classifier on two topics and testing on the third.
Results
We evaluate five classifiers: logistic regression (LR), a multilayer perceptron (MLP), naïve Bayes (NB), a random forest (RF), and a support vector machine (SVM).
Results
The SVM is a parametric binary classifier that provides highly nonlinear decision boundaries given particular kernels.
Results
The SVM classifier
SVM is mentioned in 11 sentences in this paper.
Wang, Aobo and Kan, Min-Yen
Experiment
We re-implemented Xia and Wong (2008)’s extended Support Vector Machine ( SVM ) based microtext IWR system to compare with our method.
Experiment
Both the SVM and DT models are provided by the Weka3 (Hall et al., 2009) toolkit, using its default configuration.
Experiment
Adapted SVM for Joint Classification.
SVM is mentioned in 13 sentences in this paper.
Shardlow, Matthew
Discussion
Whilst the thresholding and simplify everything methods were not significantly different from each other, the SVM method was significantly different from the other two (p < 0.001).
Discussion
This can be seen in the slightly lower recall, yet higher precision attained by the SVM .
Discussion
This indicates that the SVM was better at distinguishing between complex and simple words, but also wrongly identified many CWs.
Experimental Design
Support vector machines ( SVM ) are statistical classifiers which use labelled training data to predict the class of unseen inputs.
Experimental Design
The training data consist of several features which the SVM uses to distinguish between classes.
Experimental Design
The SVM was chosen as it has been used elsewhere for similar tasks (Gasperin et al., 2009; Hancke et al., 2012; Jauhar and Specia, 2012).
Results
To analyse the features of the SVM , the correlation coefficient between each feature vector and the vector of feature labels was calculated.
SVM is mentioned in 19 sentences in this paper.
Sarioglu, Efsun and Yadav, Kabir and Choi, Hyeong-Ah
Background
The support vector machine (SVM) is a popular classification algorithm that attempts to find a decision boundary between classes that is the farthest from any point in the training dataset.
Background
Given labeled training data $(x_t, y_t)$, $t = 1, \dots, N$, where $x_t \in \mathbb{R}^M$ and $y_t \in \{1, -1\}$, SVM tries to find a separating hyperplane with the maximum margin (Platt, 1998).
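For reference, the textbook hard-margin form of this maximum-margin problem can be written as below, with training inputs $x_t$ and labels $y_t \in \{1, -1\}$. The weight vector $w$ and bias $b$ are notation assumed here rather than taken from the excerpt, and the paper may well use a soft-margin variant instead.

```latex
\min_{w,\,b} \; \tfrac{1}{2}\lVert w \rVert^{2}
\quad \text{subject to} \quad
y_t \,\bigl(w^{\top} x_t + b\bigr) \ge 1, \qquad t = 1, \dots, N
```

Under these constraints the margin between the two classes is $2 / \lVert w \rVert$, so minimising $\lVert w \rVert$ maximises the margin.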
Experiments
SVM was chosen as the classification algorithm as it was shown that it performs well in text classification tasks (Joachims, 1998; Yang and Liu, 1999) and it is robust to overfitting (Sebastiani, 2002).
Experiments
Accordingly, the raw text of the reports and topic vectors are compiled into individual files with their corresponding outcomes in ARFF and then classified with SVM .
Related Work
tor classification results with SVM, however (Sriurai, 2011) uses a fixed number of topics, whereas we evaluated different number of topics since typically this is not known in advance.
Results
Classification results using ATC and SVM are shown in Figures 2, 3, and 4 for precision, recall, and f-score respectively.
Results
Best classification performance was achieved with 15 topics for ATC and 100 topics for SVM .
Results
For smaller number of topics, ATC performed better than SVM .
SVM is mentioned in 14 sentences in this paper.
Fukumoto, Fumiyo and Suzuki, Yoshimi and Matsuyoshi, Suguru
Abstract
We applied an error detection and correction technique to the results of positive and negative documents classified by the Support Vector Machines ( SVM ).
Framework of the System
As error candidates, we focus on support vectors (SVs) extracted from the training documents by SVM .
Framework of the System
Training by SVM is performed to find the optimal hyperplane consisting of SVs, and only the SVs affect the performance.
Framework of the System
We set these selected documents to negative training documents (N1), and apply SVM to learn classifiers.
Introduction
uses soft-margin SVM as the underlying classifiers (Liu et al., 2003).
Introduction
They reported that the results were comparable to the current state-of-the-art biased SVM method.
Introduction
Like much previous work on semi-supervised ML, we apply SVM to the positive and unlabeled data, and add the classification results to the training data.
SVM is mentioned in 23 sentences in this paper.
Wang, William Yang and Mayfield, Elijah and Naidu, Suresh and Dittmar, Jeremiah
Prediction Experiments
In the first experiment, we compare the prediction accuracy of our SME model to a widely used discriminative learner in NLP: the linear kernel support vector machine (SVM).
Prediction Experiments
In the second experiment, in addition to the linear kernel SVM, we also compare our SME model to a state-of-the-art sparse generative model of text (Eisenstein et al., 2011a), and vary the size of input vocabulary W exponentially from $2^9$ to the full size of our training vocabulary.
Prediction Experiments
We use threefold cross-validation to infer the learning rate 6 and cost C hyperpriors in the SME and SVM model respectively.
Related Work
Traditional discriminative methods, such as support vector machine (SVM) and logistic regression, have been very popular in various text categorization tasks (Joachims, 1998; Wang and McKeown, 2010) in the past decades.
Related Work
For example, SVM does not have latent variables to model the subtle differences and interactions of features from different domains (e.g.
SVM is mentioned in 21 sentences in this paper.
Meng, Xinfan and Wei, Furu and Liu, Xiaohua and Zhou, Ming and Xu, Ge and Wang, Houfeng
Experiment
MT-SVM: We translate the English labeled data to Chinese using Google Translate and use the translation results to train the SVM classifier for Chinese.
Experiment
SVM: We train an SVM classifier on the Chinese labeled data.
Experiment
First, two monolingual SVM classifiers are trained on English labeled data and Chinese data translated from English labeled data.
Introduction
The experiment results show that CLMM yields 71% in accuracy when no Chinese labeled data are used, which significantly improves Chinese sentiment classification and is superior to the SVM and co-training based methods.
Introduction
When Chinese labeled data are employed, CLMM yields 83% in accuracy, which is remarkably better than the SVM and achieves state-of-the-art performance.
Related Work
(2002) compare the performance of three commonly used machine learning models (Naive Bayes, Maximum Entropy and SVM ).
Related Work
Gamon (2004) shows that introducing deeper linguistic features into SVM can help to improve the performance.
Related Work
English Labeled data are first translated to Chinese, and then two SVM classifiers are trained on English and Chinese labeled data respectively.
SVM is mentioned in 19 sentences in this paper.
Mohler, Michael and Bunescu, Razvan and Mihalcea, Rada
Answer Grading System
Using each of these as features, we use Support Vector Machines ( SVM ) to produce a combined real-number grade.
Answer Grading System
The weight vector u is trained to optimize performance in two scenarios: Regression: An SVM model for regression (SVR) is trained using as target function the grades assigned by the instructors.
Answer Grading System
Ranking: An SVM model for ranking (SVMRank) is trained using as ranking pairs all pairs of student answers $(A_s, A_t)$ such that $grade(A_i, A_s) > grade(A_i, A_t)$, where $A_i$ is the corresponding instructor answer.
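The construction of ranking pairs for one question can be sketched as follows: every ordered pair of student answers in which the first received a strictly higher grade becomes a training preference. The function name and the dictionary input (answer identifier mapped to instructor-assigned grade) are assumptions made for illustration, not the authors' code.

```python
from itertools import permutations
from typing import Dict, Hashable, List, Tuple


def ranking_pairs(grades: Dict[Hashable, float]) -> List[Tuple[Hashable, Hashable]]:
    """Return all (better, worse) pairs of answers for a single question.

    `grades` maps a (hypothetical) student-answer id to its grade.
    Answers with equal grades produce no pair.
    """
    return [(s, t) for s, t in permutations(grades, 2) if grades[s] > grades[t]]
```

Pairs would be built per question, since grades are only comparable relative to the same instructor answer.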
Discussion and Conclusions
The correlation for the BOW-only SVM model for SVMRank improved upon the best BOW feature
Discussion and Conclusions
Likewise, using the BOW-only SVM model for SVR reduces the RMSE by .022 overall compared to the best BOW feature.
Results
5.4 SVM Score Grading
Results
The SVM components of the system are run on the full dataset using a 12-fold cross validation.
Results
Both SVM models are trained using a linear kernel. Results from both the SVR and the SVMRank implementations are reported in Table 7 along with a selection of other measures.
SVM is mentioned in 13 sentences in this paper.
Wu, Zhili and Markert, Katja and Sharoff, Serge
Discussion
As expected, the structural methods on either skewed or flattened hierarchies are not significantly better than the flat SVM .
Discussion
For the flattened hierarchy of 15 leaf genres the maximal accuracy is 54.2% vs. 52.4% for the flat SVM (Figure 3), a nonsignificant improvement.
Experiments
As a baseline we use the accuracy achieved by a standard "flat" SVM.
Experiments
A standard flat SVM achieves an accuracy of 64.4% whereas the best structural SVM based on Lin’s information content distance measure (IC-lin-word-bnc) achieves 68.8% accuracy, significantly better at the 1% level.
Experiments
Table 1 summarizes the best performing measures that all outperform the flat SVM at the 1% level.
Genre Distance Measures
The structural SVM (Section 2) requires a distance measure h between two genres.
Structural SVMs
To strengthen the constraints, the zero value on the right hand side of the inequality for the flat SVM can be replaced by a positive value, corresponding to a distance measure h(yi, m) between two genre classes, leading to the following constraint:
SVM is mentioned in 13 sentences in this paper.
Wan, Xiaojun
Empirical Evaluation 4.1 Evaluation Setup
SVM(CN): This method applies the inductive SVM with only Chinese features for sentiment classification in the Chinese view.
Empirical Evaluation 4.1 Evaluation Setup
SVM(EN): This method applies the inductive SVM with only English features for sentiment classification in the English view.
Empirical Evaluation 4.1 Evaluation Setup
SVM(ENCN1): This method applies the inductive SVM with both English and Chinese features for sentiment classification in the two views.
Introduction
SVM , NB), and the classification performance is far from satisfactory because of the language gap between the original language and the translated language.
Introduction
The SVM classifier is adopted as the basic classifier in the proposed approach.
Related Work 2.1 Sentiment Classification
Standard Naïve Bayes and SVM classifiers have been applied for subjectivity classification in Romanian (Mihalcea et al., 2007; Banea et al., 2008), and the results show that automatic translation is a viable alternative for the construction of resources and tools for subjectivity analysis in a new target language.
Related Work 2.1 Sentiment Classification
To date, many semi-supervised learning algorithms have been developed for addressing the cross-domain text classification problem by transferring knowledge across domains, including Transductive SVM (Joachims, 1999), EM (Nigam et al., 2000), EM-based Naïve Bayes classifier (Dai et al., 2007a), Topic-bridged PLSA (Xue et al., 2008), Co-Clustering based classification (Dai et al., 2007b), two-stage approach (Jiang and Zhai, 2007).
The Co-Training Approach
Typical text classifiers include Support Vector Machine (SVM), Naïve Bayes (NB), Maximum Entropy (ME), K-Nearest Neighbor (KNN), etc.
The Co-Training Approach
In this study, we adopt the widely-used SVM classifier (Joachims, 2002).
The Co-Training Approach
as two sets of vectors in a feature space, SVM constructs a separating hyperplane in the space by maximizing the margin between the two data sets.
SVM is mentioned in 19 sentences in this paper.
Dou, Qing and Bergsma, Shane and Jiampojamarn, Sittichai and Kondrak, Grzegorz
Abstract
We represent words as sequences of substrings, and use the substrings as features in a Support Vector Machine ( SVM ) ranker, which is trained to rank possible stress patterns.
Automatic Stress Prediction
We use a support vector machine ( SVM ) to rank the possible patterns for each sequence (Section 3.2).
Automatic Stress Prediction
These units are used to define the features and outputs used by the SVM ranker.
Automatic Stress Prediction
The SVM can thus generalize from observed words to similarly-spelled, unseen examples.
Introduction
We divide each word into a sequence of substrings, and use these substrings as features for a Support Vector Machine ( SVM ) ranker.
Introduction
The task of the SVM is to rank the true stress pattern above the small number of acceptable alternatives.
Introduction
The SVM ranker achieves exceptional 96.2% word accuracy on the challenging task of predicting the full stress pattern in English.
SVM is mentioned in 16 sentences in this paper.
Dasgupta, Sajib and Ng, Vincent
Evaluation
Transductive SVM .
Evaluation
Specifically, we begin by training an inductive SVM on one labeled example from each class, iteratively labeling the most uncertain unlabeled point on each side of the hyperplane and retraining the SVM until 100 points are labeled.
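The selection step in this loop, picking the most uncertain unlabeled point on each side of the hyperplane, can be sketched as choosing the point with the smallest positive decision value and the point with the negative decision value closest to zero. The function below is a hypothetical illustration; it assumes the signed decision values have already been computed by whatever SVM implementation is in use.

```python
from typing import Optional, Sequence, Tuple


def most_uncertain_each_side(
    scores: Sequence[float],
) -> Tuple[Optional[int], Optional[int]]:
    """Return indices of the points nearest the hyperplane on each side.

    `scores` are signed decision values for the unlabeled points.
    A side with no points yields None.
    """
    positive = [i for i, s in enumerate(scores) if s >= 0]
    negative = [i for i, s in enumerate(scores) if s < 0]
    nearest_pos = min(positive, key=lambda i: scores[i]) if positive else None
    nearest_neg = max(negative, key=lambda i: scores[i]) if negative else None
    return nearest_pos, nearest_neg
```

In an active-learning loop, the two returned points would be labeled, added to the training set, and the classifier retrained before the next selection.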
Evaluation
Finally, we train a transductive SVM on the 100 labeled points and the remaining 1900 unlabeled points, obtaining the results in row 3 of Table 1.
Our Approach
Specifically, we train a discriminative classifier using the support vector machine (SVM) learning algorithm (Joachims, 1999) on the set of unambiguous reviews, and then apply the resulting classifier to all the reviews in the training folds that are not seeds.
Our Approach
As our weakly supervised learner, we employ a transductive SVM .
Our Approach
Hence, instead of training just one SVM classifier, we aim to reduce classification errors by training an ensemble of five classifiers, each of which uses all 100 manually labeled reviews and a different subset of the 500 automatically labeled reviews.
SVM is mentioned in 13 sentences in this paper.
Das, Dipanjan and Smith, Noah A.
Data and Task
Dolan and Brockett (2005) remark that this corpus was created semiautomatically by first training an SVM classifier on a disjoint annotated 10,000 sentence pair dataset and then applying the SVM on an unseen 49,375 sentence pair corpus, with its output probabilities skewed towards over-identification, i.e., towards generating some false paraphrases.
Experimental Evaluation
The SVM was trained to classify positive and negative examples of paraphrase using SVMlight (Joachims, 1999). Metaparameters, tuned on the development data, were the regularization constant and the degree of the polynomial kernel (chosen in $[10^{-5}, 10^{2}]$ and 1-5, respectively).
Experimental Evaluation
It is unsurprising that the SVM performs very well on the MSRPC because of the corpus creation process (see Sec. 4) where an SVM was applied as well, with very similar features and a skewed decision process (Dolan and Brockett, 2005).
Product of Experts
LR (like the QG) provides a probability distribution, but uses surface features (like the SVM ).
Product of Experts
2; this model is on par with the SVM , though trading recall in favor of precision.
Product of Experts
We view it as a probabilistic simulation of the SVM more suitable for combination with the QG.
SVM is mentioned in 12 sentences in this paper.
Celikyilmaz, Asli and Thint, Marcus and Huang, Zhiheng
Experiments
Features  Model  MRR    Top1   Top5
Baseline  -      42.3%  32.7%  54.5%
QTCF      SVM    51.9%  44.6%  63.4%
QTCF      SSL    49.5%  43.1%  60.9%
LexSem    SVM    48.2%  40.6%  61.4%
LexSem    SSL    47.9%  40.1%  58.4%
QComp     SVM    54.2%  47.5%  64.3%
QComp     SSL    51.9%  45.5%  62.4%
Experiments
We performed manual iterative parameter optimization during training based on prediction accuracy to find the best k-nearest parameter for SSL, i.e., $k = \{3, 5, 10, 20, 50\}$, and best $C = \{10^{-2}, \dots, 10^{2}\}$ and $\gamma = \{2^{-2}, \dots, 2^{3}\}$ for RBF kernel SVM.
Experiments
We applied SVM and our graph based SSL method with no summarization to learn models using labeled training and testing datasets.
Feature Extraction for Entailment
The QC model is trained via support vector machines ( SVM ) (Vapnik, 1995) considering different features such as semantic headword feature based on variation of Collins rules, hypernym extraction via Lesk word disambiguation (Lesk, 1988), regular expressions for wh-word indicators, n-grams, word-shapes(capitals), etc.
Graph Summarization
Using a separate learner, e.g., SVM (Vapnik, 1995), we obtain predicted outputs, Y5 = (33f, ..., 39214) of X 5 and append observed labels Y5 = Y5 U YL.
SVM is mentioned in 12 sentences in this paper.
Andreevskaia, Alina and Bergler, Sabine
Domain Adaptation in Sentiment Research
They applied an out-of-domain-trained SVM classifier to label examples from the target domain and then retrained the classifier using these new examples.
Domain Adaptation in Sentiment Research
Depending on the similarity between domains, this method brought up to 15% gain compared to the baseline SVM .
Experiments
Dataset                Movie  News   Blogs  PRs
Dataset size           1066   800    800    1200
unigrams  SVM          68.5   61.5   63.85  76.9
          NB           60.2   59.5   60.5   74.25
          nb features  5410   4544   3615   2832
bigrams   SVM          59.9   63.2   61.5   75.9
          NB           57.0   58.4   59.5   67.8
          nb features  16286  14633  15182  12951
trigrams  SVM          54.3   55.4   52.7   64.4
          NB           53.3   57.0   56.0   69.7
          nb features  20837  18738  19847  19132
Experiments
Table 4: Accuracy of SVM with unigram model
Experiments
results depends on the genre and size of the n-gram: on product reviews, all results are statistically significant at α = 0.025 level; on movie reviews, the difference between Naïve Bayes and SVM is statistically significant at α = 0.01 but the significance diminishes as the size of the n-gram increases; on news, only bigrams produce a statistically significant (α = 0.01) difference between the two machine learning methods, while on blogs the difference between SVMs and Naïve Bayes is most pronounced when unigrams are used (α = 0.025).
Factors Affecting System Performance
To our knowledge, the only work that describes the application of statistical classifiers (SVM) to sentence-level sentiment classification is (Gamon and Aue, 2005).
Integrating the Corpus-based and Dictionary-based Approaches
Using then an SVM meta-classifier trained on a small number of target domain examples to combine the nine base classifiers, they obtained a statistically significant improvement on out-of-domain texts from book reviews, knowledge-base feedback, and product support services survey data.
Lexicon-Based Approach
The baseline performance of the Lexicon-Based System (LBS) described above is presented in Table 5, along with the performance results of the in-domain- and out-of-domain-trained SVM classifier.
Lexicon-Based Approach
                 Movies  News  Blogs  PRs
LBS              57.5    62.3  63.3   59.3
SVM in-dom.      68.5    61.5  63.85  76.9
SVM out-of-dom.
SVM is mentioned in 10 sentences in this paper.
duVerle, David and Prendinger, Helmut
Building a Discourse Parser
[Figure: SVM Training; Feature Extraction; Classification; SVM Models (Binary and Multiclass); Scored RS sub-trees; Bottom-up Tree Construction]
Building a Discourse Parser
Support Vector Machines (SVM) (Vapnik, 1995) are used to model classifiers S and L. SVM refers to a set of supervised learning algorithms that are based on margin maximization.
Building a Discourse Parser
This makes SVM well-fitted to treat classification problems involving relatively large feature spaces such as ours ($\approx 10^5$ features).
Evaluation
4.2 Raw SVM Classification
Evaluation
Although our final goal is to achieve good performance on the entire tree-building task, a useful intermediate evaluation of our system can be conducted by measuring raw performance of SVM classifiers.
Evaluation
Table 1: SVM Classifier performance.
Features
Instrumental to our system’s performance is the choice of a set of salient characteristics (“features”) to be used as input to the SVM algorithm for training and classification.
SVM is mentioned in 10 sentences in this paper.
Cohn, Trevor and Specia, Lucia
Conclusion
Model                                   MAE     RMSE
μ                                       0.5596  0.7053
μ_A                                     0.5184  0.6367
μ_S                                     0.5888  0.7588
μ_T                                     0.6300  0.8270
Pooled SVM                              0.5823  0.7472
Independent_A SVM                       0.5058  0.6351
EasyAdapt SVM                           0.7027  0.8816
SINGLE-TASK LEARNING
Independent_A                           0.5091  0.6362
Independent_S                           0.5980  0.7729
Pooled                                  0.5834  0.7494
Pooled & {N}                            0.4932  0.6275
MULTITASK LEARNING: Annotator
Combined_A                              0.4815  0.6174
Combined_A & {N}                        0.4909  0.6268
Combined+_A                             0.4855  0.6203
Combined+_A & {N}                       0.4833  0.6102
MULTITASK LEARNING: Translation system
Combined_S                              0.5825  0.7482
MULTITASK LEARNING: Sentence pair
Combined_T                              0.5813  0.7410
MULTITASK LEARNING: Combinations
Combined_{A,S}                          0.4988  0.6490
Combined_{A,S} & {N_{A,S}}              0.4707  0.6003
Combined+_{A,S}                         0.4772  0.6094
Combined_{A,S,T}                        0.4588  0.5852
Combined_{A,S,T} & {N_{A,S}}            0.4723  0.6023
Gaussian Process Regression
In typical usage, the kernel hyperparameters for an SVM are fit using held-out estimation, which is inefficient and often involves tying together parameters to limit the search complexity (e.g., using a single scale parameter in the squared exponential).
Gaussian Process Regression
Multiple-kernel learning (Gönen and Alpaydin, 2011) goes some way to addressing this problem within the SVM framework, however this technique is limited to reweighting linear combinations of kernels and has high computational complexity.
Multitask Quality Estimation 4.1 Experimental Setup
Baselines: The baselines use the SVM regression algorithm with radial basis function kernel and parameters γ, ε and C optimised through grid-search and 5-fold cross validation on the training set.
Multitask Quality Estimation 4.1 Experimental Setup
μ    0.8279  0.9899
SVM  0.6889  0.8201
Multitask Quality Estimation 4.1 Experimental Setup
μ is a baseline which predicts the training mean, SVM uses the same system as the WMT12 QE task, and the remainder are GP regression models with different kernels (all include additive noise).
SVM is mentioned in 10 sentences in this paper.
Escalante, Hugo Jair and Solorio, Thamar and Montes-y-Gomez, Manuel
Authorship Attribution With LOWBOW Representations
For both types of representations we consider an SVM classifier under the one-vs-all formulation for facing the AA problem.
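A one-vs-all decision rule can be sketched as follows: one binary scorer is kept per author, and a test document is assigned to the author whose scorer returns the highest decision value. The linear scorers, the feature vectors, and every name in this block are illustrative assumptions; the paper's classifiers use kernels over histograms, which this sketch does not reproduce.

```python
from typing import Dict, Sequence, Tuple

# Hypothetical linear model: (weights, bias) for one "this author vs. the rest" classifier.
LinearModel = Tuple[Sequence[float], float]


def decision_value(model: LinearModel, x: Sequence[float]) -> float:
    """Signed score w . x + b of a single binary classifier."""
    weights, bias = model
    return sum(w * xi for w, xi in zip(weights, x)) + bias


def one_vs_all_predict(models: Dict[str, LinearModel], x: Sequence[float]) -> str:
    """Pick the class whose one-vs-rest classifier scores x highest."""
    return max(models, key=lambda label: decision_value(models[label], x))
```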
Authorship Attribution With LOWBOW Representations
We consider SVM as base classifier because this method has proved to be very effective in a large number of applications, including AA (Houvardas and Stamatatos, 2006; Plakias and Stamatatos, 2008b; Plakias and Stamatatos, 2008a); further, since SVMs are kernel-based methods, they allow us to use local histograms for AA by considering kernels that work over sets of histograms.
Authorship Attribution With LOWBOW Representations
We build a multiclass SVM classifier by considering the pairs of patterns-outputs associated to documents-authors.
Experiments and Results
All our experiments use the SVM implementation provided by Canu et al.
Experiments and Results
Columns show the true author for test documents and rows show the authors predicted by the SVM .
Experiments and Results
The SVM with BOW representation of character n-grams achieved recognition rates of 40% and 50% for BL and JM respectively.
Related Work
applied to this problem, including support vector machine ( SVM ) classifiers (Houvardas and Stamatatos, 2006) and variants thereon (Plakias and Stamatatos, 2008b; Plakias and Stamatatos, 2008a), neural networks (Tearle et al., 2008), Bayesian classifiers (Coyotl-Morales et al., 2006), decision tree methods (Koppel et al., 2009) and similarity based techniques (Keselj et al., 2003; Lambers and Veenman, 2009; Stamatatos, 2009b; Koppel et al., 2009).
Related Work
In this work, we chose an SVM classifier as it has reported acceptable performance in AA and because it will allow us to directly compare results with previous work that has used this same classifier.
SVM is mentioned in 10 sentences in this paper.
Yamangil, Elif and Shieber, Stuart M.
Evaluation
We compared the Gibbs sampling compressor (GS) against a version of maximum a posteriori EM (with Dirichlet parameter greater than 1) and a discriminative STSG based on SVM training (Cohn and Lapata, 2008) ( SVM ).
Evaluation
EM is a natural benchmark, while SVM is also appropriate since it can be taken as the state of the art for our task.
Evaluation
Nonetheless, because the comparison system is a generalization of the extractive SVM compressor of Cohn and Lapata (2007), we do not expect that the results would differ qualitatively.
Introduction
We achieve substantial improvements against a number of baselines including EM, support vector machine ( SVM ) based discriminative training, and variational Bayes (VB).
SVM is mentioned in 10 sentences in this paper.
Ji, Yangfeng and Eisenstein, Jacob
Large-Margin Learning Framework
As we will see, it is possible to learn {Wm} using standard support vector machine ( SVM ) training (holding A fixed), and then make a simple gradient-based update to A (holding {Wm} fixed).
Large-Margin Learning Framework
As is standard in the multi-class linear SVM (Crammer and Singer, 2001), we can solve the problem defined in Equation 6 via Lagrangian optimization:
Large-Margin Learning Framework
If A is fixed, then the optimization problem is equivalent to a standard multi-class SVM , in the transformed feature space f (vi; A).
SVM is mentioned in 10 sentences in this paper.
Topics mentioned in this paper:
Garera, Nikesh and Yarowsky, David
Corpus Details
As our reference algorithm, we used the current state-of-the-art system developed by Boulis and Ostendorf (2005) using unigram and bigram features in an SVM framework.
Corpus Details
Table 12 Top 20 ngram features for gender, ranked by the weights assigned by the linear SVM model
Corpus Details
After extracting the ngrams, an SVM model was trained via the SVMlight toolkit (Joachims, 1999) using the linear kernel with the default toolkit settings.
SVM is mentioned in 9 sentences in this paper.
Topics mentioned in this paper:
Ding, Shilin and Cong, Gao and Lin, Chin-Yew and Zhu, Xiaoyan
Context and Answer Detection
SVM , can be employed, where each pair of question and candidate context will be treated as an instance.
Experiments
Task               Model   Prec(%)  Rec(%)  F1(%)
Context Detection  SVM     75.27    68.80   71.32
Context Detection  C4.5    70.16    64.30   67.21
Context Detection  L-CRF   75.75    72.84   74.45
Answer Detection   SVM     73.31    47.35   57.52
Answer Detection   C4.5    65.36    46.55   54.37
Answer Detection   L-CRF   63.92    58.74   61.22
Experiments
This experiment evaluates the Linear CRF model (Section 3.1) for context and answer detection by comparing it with SVM and C4.5 (Quinlan, 1993).
Experiments
For SVM , we use SVMlight (Joachims, 1999).
Introduction
Experimental results show that 1) Linear CRFs outperform SVM and decision tree in both context and answer detection; 2) Skip-chain CRFs outperform Linear CRFs for answer finding, which demonstrates that context improves answer finding; 3) 2D CRF model improves the performance of Linear CRFs and the combination of 2D CRFs and Skip-chain CRFs achieves better performance for context detection.
Related Work
(2007) used SVM to extract input-reply pairs from forums for chatbot knowledge.
SVM is mentioned in 9 sentences in this paper.
Topics mentioned in this paper:
Branavan, S.R.K. and Kushman, Nate and Lei, Tao and Barzilay, Regina
Experimental Setup
Baselines: To evaluate the performance of our relation extraction, we compare against an SVM classifier trained on the Gold Relations.
Experimental Setup
We test the SVM baseline in a leave-one-out fashion.
Experimental Setup
[Figure legend: Model F-score, SVM F-score, All-text F-score]
Introduction
Our results demonstrate the strength of our relation extraction technique — while using planning feedback as its only source of supervision, it achieves a precondition relation extraction accuracy on par with that of a supervised SVM baseline.
Results
We also show the performance of the supervised SVM baseline.
Results
Feature Analysis Figure 7 shows the top five positive features for our model and the SVM baseline.
Results
Figure 7: The top five positive features on words and dependency types learned by our model (above) and by SVM (below) for precondition prediction.
SVM is mentioned in 8 sentences in this paper.
Topics mentioned in this paper:
Wang, Qin Iris and Schuurmans, Dale and Lin, Dekang
Conclusion and Future Work
Unlike previous proposed approaches, we introduce a convex objective for the semi-supervised learning algorithm by combining a convex structured SVM loss and a convex least square loss.
Introduction
In particular, they present an algorithm for multi-class unsupervised and semi-supervised SVM learning, which relaxes the original non-convex objective into a close convex approximation, thereby allowing a global solution to be obtained.
Introduction
More specifically, for the loss on the unlabeled data part, we substitute the original unsupervised structured SVM loss with a least squares loss, but keep constraints on the inferred prediction targets, which avoids trivialization.
Introduction
ing semi-supervised convex objective to dependency parsing, and obtain significant improvement over the corresponding supervised structured SVM .
Semi-supervised Convex Training for Structured SVM
Although semi-supervised structured SVM learning has been an active research area, semi-supervised structured SVMs have not been used in many real applications to date.
Semi-supervised Convex Training for Structured SVM
By combining the convex structured SVM loss on labeled data (shown in Equation (5)) and the convex least squares loss on unlabeled data (shown in Equation (8)), we obtain a semi-supervised structured large margin loss
Semi-supervised Structured Large Margin Objective
The objective of standard semi-supervised structured SVM is a combination of structured large margin losses on both labeled and unlabeled data.
SVM is mentioned in 7 sentences in this paper.
Topics mentioned in this paper:
HaCohen-Kerner, Yaakov and Kass, Ariel and Peretz, Ariel
Abbreviation Disambiguation
: Maximum Entropy, SVM and C50.
Abstract
An accuracy of 96.09% has been achieved by SVM .
Experiments
Several well-known supervised ML methods have been selected: artificial neural networks (ANN), Naive Bayes (NB), Support Vector Machines ( SVM ) and J48 (Witten and Frank, 1999), an improved variant of the C4.5 decision tree induction.
Experiments
Table 2 shows that SVM achieved the best result with 96.09% accuracy.
Experiments
[Table header: ML Method | ANN | NB | SVM | J48]
SVM is mentioned in 6 sentences in this paper.
Topics mentioned in this paper:
Chan, Wen and Zhou, Xiangdong and Wang, Wei and Chua, Tat-Seng
Experimental Results
We adapt the Support Vector Machine ( SVM ) and Logistic Regression (LR) which have been reported to be effective for classification and the Linear CRF (LCRF) which is used to summarize ordinary text documents in (Shen et al., 2007) as baselines for comparison.
Experimental Results
Table 2 shows that our general CRF model based on question segmentation with group L1 regularization outperforms the baselines significantly in all three measures (gCRF-QS-l1 is 13.99% better than SVM in precision, 9.77% better in recall and 11.72% better in F1 score).
Experimental Results
We note that both SVM and LR,
Introduction
The experimental results show that the proposed model improves performance significantly (in terms of precision, recall and F1 measures, as well as the ROUGE-1, ROUGE-2 and ROUGE-L measures) compared to state-of-the-art methods such as Support Vector Machines ( SVM ), Logistic Regression (LR) and Linear CRF (LCRF) (Shen et al., 2007).
SVM is mentioned in 6 sentences in this paper.
Topics mentioned in this paper:
Li, Tao and Zhang, Yi and Sindhwani, Vikas
Experiments
We also compare the results of SSMFLK with those of two supervised classification methods: Support Vector Machine ( SVM ) and Naive Bayes.
Experiments
[Figure legend: Consistency Method, Harmonic-CMN, Green Function, SVM]
Experiments
[Figure legend: SSMFLK, Consistency Method, Harmonic-CMN, Green Function, SVM, Naive Bayes]
Related Work
Most work in machine learning literature on utilizing labeled features has focused on using them to generate weakly labeled examples that are then used for standard supervised learning: (Schapire et al., 2002) propose one such framework for boosting logistic regression; (Wu and Srihari, 2004) build a modified SVM and (Liu et al., 2004) use a combination of clustering and EM based methods to instantiate similar frameworks.
SVM is mentioned in 6 sentences in this paper.
Topics mentioned in this paper:
Lu, Bin and Tan, Chenhao and Cardie, Claire and K. Tsou, Benjamin
Experimental Setup 4.1 Data Sets and Preprocessing
SVM: This method learns an SVM classifier for each language given the monolingual labeled data; the unlabeled data is not used.
Experimental Setup 4.1 Data Sets and Preprocessing
Monolingual TSVM (TSVM-M): This method learns two transductive SVM (TSVM) classifiers given the monolingual labeled data and the monolingual unlabeled data for each language.
Experimental Setup 4.1 Data Sets and Preprocessing
First, two monolingual SVM classifiers are built based on only the corresponding labeled data, and then they are bootstrapped by adding the most confident predicted examples from the unlabeled data into the training set.
Introduction
maximum entropy and SVM classifiers) as well as two alternative methods for leveraging unlabeled data (transductive SVMs (Joachims, 1999b) and co-training (Blum and Mitchell, 1998)).
Results and Analysis
Among the baselines, the best is Co-SVM; TSVMs do not always improve performance using the unlabeled data compared to the standalone SVM ; and TSVM-B outperforms TSVM-M except for Chinese in the second setting.
Results and Analysis
Significance is tested using paired t-tests with p<0.05: denotes statistical significance compared to the corresponding performance of MaxEnt; * denotes statistical significance compared to SVM ; and r denotes statistical significance compared to Co-SVM.
SVM is mentioned in 6 sentences in this paper.
Topics mentioned in this paper:
Liu, Ding and Yang, Xiaofang and Jiang, Minghu
Discussion
We present here our model of text classification and compare it with SVM and KNN on two datasets.
Discussion
Moreover, the QC performs well in text classification compared with SVM and KNN and outperforms them on small-scale training sets.
Experiment
We compared the performance of QC with several classical classification methods, including Support Vector Machine ( SVM ) and K-nearest neighbor (KNN).
Experiment
We randomly selected training samples from the training pool ten times to train QC, SVM , and KNN classifier respectively and then verified the three trained classifiers on the testing sets, the results of which are illustrated in Figure 4.
Experiment
We noted that the QC performed better than both KNN and SVM on small-scale training sets, when the number of training samples is less than 50.
SVM is mentioned in 6 sentences in this paper.
Topics mentioned in this paper:
Martineau, Justin and Chen, Lu and Cheng, Doreen and Sheth, Amit
Experiments
Methods: We evaluated the overall performance relative to the common SVM bag of words approach that can be ubiquitously found in text mining literature.
Experiments
SVM-TF: Uses a bag of words SVM with term frequency weights.
Experiments
SVM-Delta-IDF: Uses a bag of words SVM classification with TF.Delta-IDF weights (Formula 2) in the feature vectors before training or testing an SVM .
Related Work
(2012) propose an algorithm which first trains individual SVM classifiers on several small, class-balanced, random subsets of the dataset, and then reclassifies each training instance using a majority vote of these individual classifiers.
SVM is mentioned in 6 sentences in this paper.
Topics mentioned in this paper:
Li, Jiwei and Ott, Myle and Cardie, Claire and Hovy, Eduard
Experiments
We use SVMlight (Joachims, 1999) to train our linear SVM classifiers.
Experiments
For SVM , models trained on POS and LIWC features achieve even lower accuracy than Unigram.
Experiments
tive model, SAGE achieves much better results than SVM , and reaches an accuracy of around 0.65 in the cross-domain task.
Feature-based Additive Model
If we instead use SVM , for example, we would have to train classifiers one by one (due to the distinct features from different sources) to draw conclusions regarding the differences between Turker vs Expert vs truthful reviews, positive expert vs negative expert reviews, or reviews from different domains.
Introduction
In the examples in Table 1, we trained a linear SVM classifier on Ott’s Chicago-hotel dataset on unigram features and tested it on a couple of different domains (the details of data acquisition are illustrated in Section 3).
Introduction
Table 1: SVM performance on datasets for a classifier trained on Chicago hotel review based on Unigram feature.
SVM is mentioned in 6 sentences in this paper.
Topics mentioned in this paper:
Li, Shoushan and Huang, Chu-Ren and Zhou, Guodong and Lee, Sophia Yat Mei
Unsupervised Mining of Personal and Impersonal Views
We apply both support vector machine ( SVM ) and Maximum Entropy (ME) algorithms with the help of the SVM-light and Mallet tools.
Unsupervised Mining of Personal and Impersonal Views
We find that ME performs slightly better than SVM on the average.
Unsupervised Mining of Personal and Impersonal Views
Transductive SVM , which seeks the largest separation between labeled and unlabeled data through regularization (Joachims, 1999).
SVM is mentioned in 6 sentences in this paper.
Topics mentioned in this paper:
Mairesse, François and Walker, Marilyn
Parameter Estimation Models
Continuous parameters are modeled with a linear regression model (LR), an M5’ model tree (M5), and a model based on support vector machines with a linear kernel ( SVM ).
Parameter Estimation Models
We test a Naive Bayes classifier (NB), a J48 decision tree (J48), a nearest-neighbor classifier using one neighbor (NN), a Java implementation of the RIPPER rule-based learner (JRIP), the AdaBoost boosting algorithm (ADA), and a support vector machines classifier with a linear kernel ( SVM ).
Parameter Estimation Models
Figure 3: SVM model with a linear kernel predicting the CONTENT POLARITY parameter.
SVM is mentioned in 6 sentences in this paper.
Topics mentioned in this paper:
Qian, Longhua and Hui, Haotian and Hu, Ya'nan and Zhou, Guodong and Zhu, Qiaoming
Abstract
Input: - L, labeled data set - U, unlabeled data set - n, batch size Output: - SVM , classifier Repeat: 1.
Abstract
Train a single classifier SVM on L 2.
Abstract
The objective is to learn SVM classifiers in both languages, denoted as SVM_c and SVM_e respectively, in a BAL fashion to improve their classification performance.
SVM is mentioned in 5 sentences in this paper.
Topics mentioned in this paper:
Ferret, Olivier
Experiments and evaluation
As mentioned before, this classifier is a linear SVM .
Improving a distributional thesaurus
More precisely, we follow (Lee and Ng, 2002), a reference work for WSD, by adopting a Support Vector Machines ( SVM ) classifier with a linear kernel and three kinds of features for characterizing each considered occur-
Improving a distributional thesaurus
For the second type of features, we take more precisely the POS of the three words before E and those of the three words after E. Each pair {POS, position} corresponds to a binary feature for the SVM classifier.
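A minimal sketch of how such {POS, position} binary features could be built, assuming the sentence is already tokenized and tagged. The function name `pos_window_features` and the feature-string format are illustrative assumptions, not taken from the paper.

```python
def pos_window_features(pos_tags, target_index, window=3):
    """Return binary features, one per {POS, position} pair.

    pos_tags: list of POS tags, one per token (assumed input format).
    target_index: index of the target word E in the sentence.
    window: number of words considered on each side of E.
    """
    features = {}
    for offset in range(-window, window + 1):
        if offset == 0:
            continue  # skip the target word itself
        i = target_index + offset
        if 0 <= i < len(pos_tags):
            # The feature name encodes both the tag and its relative position.
            features[f"POS={pos_tags[i]}@{offset:+d}"] = 1
    return features


tags = ["DET", "NOUN", "VERB", "DET", "ADJ", "NOUN"]
feats = pos_window_features(tags, target_index=2)
```

Positions that fall outside the sentence simply produce no feature, so the resulting vector stays sparse.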
Improving a distributional thesaurus
Each instance of the 11 types of collocations is represented by a tuple ⟨lemma1, position1, lemma2, position2⟩ and leads to a binary feature for the SVM classifier.
SVM is mentioned in 5 sentences in this paper.
Topics mentioned in this paper:
Eidelman, Vladimir and Marton, Yuval and Resnik, Philip
Introduction
We focus on large-margin methods such as SVM (Joachims, 1998) and passive-aggressive algorithms such as MIRA.
The Relative Margin Machine in SMT
It is maximized by minimizing the norm in SVM , or analogously, the proximity constraint in MIRA: $\arg\min_{w} \|w - w_t\|^2$.
The Relative Margin Machine in SMT
RMM was introduced as a generalization over SVM that incorporates both the margin constraint
The Relative Margin Machine in SMT
Nonetheless, since structured RMM is a generalization of Structured SVM , which shares its underlying objective with MIRA, our intuition is that SMT should be able to benefit as well.
SVM is mentioned in 5 sentences in this paper.
Topics mentioned in this paper:
Wang, Chang and Fan, James
Experiments
In SVM implementations, the tradeoff parameter between training error and margin was set to 1 for all experiments.
Experiments
We compare our approaches to three state-of-the-art approaches including SVM with convolution tree kernels (Collins and Duffy, 2001), linear regression and SVM with linear kernels (Scholkopf and Smola, 2002).
Experiments
The SVM with linear kernels and the linear regression model used the same features as the manifold models.
SVM is mentioned in 5 sentences in this paper.
Topics mentioned in this paper:
Kim, Jungi and Li, Jin-Ji and Lee, Jong-Hyeok
Experiment
4.3 Classification task — SVM
Experiment
4.3.1 Experimental Setting To test our SVM classifier, we perform the classification task.
Experiment
Table 3: Average tenfold cross-validation accuracies of polarity classification task with SVM .
Term Weighting and Sentiment Analysis
Specifically, we explore the statistical term weighting features of the word generation model with Support Vector Machine ( SVM ), faithfully reproducing previous work as closely as possible (Pang et al., 2002).
SVM is mentioned in 5 sentences in this paper.
Topics mentioned in this paper:
Ciobanu, Alina Maria and Dinu, Liviu P.
Experiments
We experiment with two machine-learning approaches: Naive Bayes and SVM .
Experiments
We report the n-gram values for which the best results are obtained and the hyperparameters for SVM , c and γ.
Experiments
The SVM produces better results for all languages except Portuguese, where the accuracy is equal.
Our Approach
For SVM , we use the wrapper provided by Weka for LibSVM (Chang and Lin, 2011).
SVM is mentioned in 5 sentences in this paper.
Topics mentioned in this paper:
Jiang, Long and Yang, Shiquan and Zhou, Ming and Liu, Xiaohua and Zhu, Qingsheng
Adaptive Pattern-based Bilingual Data Mining
Next, in the pattern learning module, those translation snippet pairs are used to find candidate patterns and then an SVM classifier is built to select the most useful patterns shared by most translation pairs in the whole text.
Adaptive Pattern-based Bilingual Data Mining
After all pattern candidates are extracted, an SVM classifier is used to select the good ones:
Adaptive Pattern-based Bilingual Data Mining
In this SVM model, each pattern candidate pi has the following four features:
Overview of the Proposed Approach
Then an SVM classifier is trained to select good patterns from all extracted pattern candidates.
SVM is mentioned in 5 sentences in this paper.
Topics mentioned in this paper:
Wang, WenTing and Su, Jian and Tan, Chew Lim
Experiments and Results
We employ an SVM coreference resolver trained and tested on ACE 2005 with 79.5% Precision, 66.7% Recall and 72.5% F1 to label coreference mentions of the same named entity in an article.
Incorporating Structural Syntactic Information
And thus an SVM classifier can be learned and then used for recognition.
Introduction
Section 4 introduces the frame work for discourse recognition, as well as the baseline feature space and the SVM classifier.
The Recognition Framework
The classifier learned by SVM is:
The Recognition Framework
One advantage of SVM is that we can use tree kernel approach to capture syntactic parse tree information in a particular high-dimension space.
SVM is mentioned in 5 sentences in this paper.
Topics mentioned in this paper:
Abu-Jbara, Amjad and Radev, Dragomir
Approach
We use Support Vector Machines ( SVM ) with linear kernel as our classifier.
Approach
We use SVM with linear kernel as our classifier.
Evaluation
Sentence Filtering Evaluation: We used Support Vector Machines ( SVM ) with linear kernel as our classifier.
Evaluation
Sentence Classification Evaluation: We used SVM in this step as well.
Evaluation
Author Name Replacement Evaluation: The classifier used in this task is also SVM .
SVM is mentioned in 5 sentences in this paper.
Topics mentioned in this paper:
Park, Keun Chan and Jeong, Yoonjae and Myaeng, Sung Hyon
Experience Detection
While we tested several classifiers, we chose to use two different classifiers based on SVM and Logistic Regression for the final experimental results because they showed the best performance.
Experience Detection
[Table header: Feature | Logistic Regression | SVM]
Lexicon Construction
ME SVM Prec.
SVM is mentioned in 5 sentences in this paper.
Topics mentioned in this paper:
Kalchbrenner, Nal and Grefenstette, Edward and Blunsom, Phil
Experiments
NB        41.0  81.8
BiNB      41.9  83.1
SVM       40.7  79.4
RecNTN    45.7  85.4
MAX-TDNN  37.4  77.1
NBOW      42.4  80.5
DCNN      48.5  86.8
Experiments
SVM is a support vector machine with unigram and bigram features.
Experiments
head word, parser SVM hypernyms, WordNet
SVM is mentioned in 4 sentences in this paper.
Topics mentioned in this paper:
Sun, Jun and Zhang, Min and Tan, Chew Lim
Substructure Spaces for BTKs
In the 1st phase, a kernel based classifier, SVM in our study, is employed to classify each candidate subtree pair as aligned or unaligned.
Substructure Spaces for BTKs
Since SVM is a large margin based discriminative classifier rather than a probabilistic model, we introduce a sigmoid function to convert the distance against the hyperplane to a posterior alignment probability as follows:
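The paper's exact formula is not reproduced in this excerpt. As a hedged illustration only, a common way to map a signed SVM decision value to a probability is a Platt-style sigmoid; the parameters `a` and `b` below are hypothetical and would normally be fitted on held-out data.

```python
import math


def distance_to_probability(distance, a=-1.0, b=0.0):
    """Map a signed distance from the hyperplane to a value in (0, 1).

    a, b: assumed slope and offset of the sigmoid (Platt-style scaling);
    with the defaults this is the plain logistic function of `distance`.
    """
    return 1.0 / (1.0 + math.exp(a * distance + b))


p_boundary = distance_to_probability(0.0)   # a point on the hyperplane
p_aligned = distance_to_probability(2.0)    # positive side of the margin
p_unaligned = distance_to_probability(-2.0) # negative side of the margin
```

Points far on the positive side approach 1, points far on the negative side approach 0, and a point on the decision boundary maps to 0.5 under the default parameters.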
Substructure Spaces for BTKs
We use SVM with binary classes as the classifier.
SVM is mentioned in 4 sentences in this paper.
Topics mentioned in this paper:
Woodsend, Kristian and Lapata, Mirella
Experimental Setup
We learned the feature weights with a linear SVM , using the software SVM-OOPS (Woodsend and Gondzio, 2009).
Experimental Setup
For each phrase, features were extracted and salience scores calculated from the feature weights determined through SVM training.
Experimental Setup
The distance from the SVM hyperplane represents the salience score.
SVM is mentioned in 4 sentences in this paper.
Topics mentioned in this paper:
Sassano, Manabu and Kurohashi, Sadao
Experimental Evaluation and Discussion
We set the degree of the kernels to 3 since cubic kernels with SVM have proved effective for Japanese dependency parsing (Kudo and Matsumoto, 2000; Kudo and Matsumoto, 2002).
Experimental Evaluation and Discussion
Stopping Criteria It is known that increment rate of the number of support vectors in SVM indicates saturation of accuracy improvement during iterations of active learning (Schohn and Cohn, 2000).
Experimental Evaluation and Discussion
It is interesting to examine whether the observation for SVM is also useful for support vectors of the averaged perceptron.
SVM is mentioned in 4 sentences in this paper.
Topics mentioned in this paper:
Jansen, Peter and Surdeanu, Mihai and Clark, Peter
CR + LS + DMM + DPM 39.32* +24% 47.86* +20%
For all experiments we used a linear SVM kernel.
CR + LS + DMM + DPM 39.32* +24% 47.86* +20%
Table 4 shows that of the highest-weighted SVM features learned when training models for HOW questions on YA and Bio, many are shared (e.g., 56.5% of the features in the top half of both DPMs are shared), suggesting that a core set of discourse features may be of utility across domains.
CR + LS + DMM + DPM 39.32* +24% 47.86* +20%
Table 4: Percentage of top features with the highest SVM weights that are shared between Bio HOW and YA models.
SVM is mentioned in 4 sentences in this paper.
Topics mentioned in this paper:
Habash, Nizar and Roth, Ryan
Experimental Settings
The PZD system relies on a set of SVM classifiers trained using morphological and lexical features.
Experimental Settings
The SVM classifiers are built using Yamcha (Kudo and Matsumoto, 2003).
Experimental Settings
Simple features are used directly by the PZD SVM models, whereas Binned features’ (numerical) values are reduced to a small, labeled category set whose labels are used as model features.
SVM is mentioned in 4 sentences in this paper.
Topics mentioned in this paper:
Jiang, Long and Yu, Mo and Zhou, Ming and Liu, Xiaohua and Zhao, Tiejun
Approach Overview
In each of the first two steps, a binary SVM classifier is built to perform the classification.
Experiments
In the experiments, we consider the positive and negative tweets annotated by humans as subjective tweets (i.e., positive instances in the SVM classifiers), which amount to 727 tweets.
Related Work
According to the experimental results, machine learning based classifiers outperform the unsupervised approach, where the best performance is achieved by the SVM classifier with unigram presences as features.
Related Work
In contrast, (Barbosa and Feng, 2010) propose a two-step approach to classify the sentiments of tweets using SVM classifiers with abstract features.
SVM is mentioned in 4 sentences in this paper.
Topics mentioned in this paper:
Gao, Wei and Blitzer, John and Zhou, Ming and Wong, Kam-Fai
Experiments and Results
fied; (2) for the identified query pairs, there should be sufficient statistics of associated clickthrough data; (3) The click frequency should be well distributed at both sides so that the preference order between bilingual document pairs can be derived for SVM learning.
Introduction
For both languages, we achieve significant improvements over monolingual Ranking SVM (RSVM) baselines (Herbrich et al., 2000; Joachims, 2002), which exploit a variety of monolingual features.
Learning to Rank Using Bilingual Information
We resort to Ranking SVM (RSVM) (Herbrich et al., 2000; Joachims, 2002) learning for classification on pairs of instances.
Learning to Rank Using Bilingual Information
The problem is to solve SVM objective: rrgn + A 21.2].
SVM is mentioned in 4 sentences in this paper.
Topics mentioned in this paper:
Tang, Duyu and Wei, Furu and Yang, Nan and Zhou, Ming and Liu, Ting and Qin, Bing
Related Work
(2) SVM : The ngram features and Support Vector Machine are widely used baseline methods to build sentiment classifiers (Pang et al., 2002).
Related Work
LibLinear is used to train the SVM classifier.
Related Work
(3) NBSVM: NBSVM (Wang and Manning, 2012) is a state-of-the-art performer on many sentiment classification datasets, which trades off between Naive Bayes and NB-enhanced SVM .
SVM is mentioned in 4 sentences in this paper.
Topics mentioned in this paper:
Park, Souneil and Lee, Kyung Soon and Song, Junehwa
Abstract
We applied a modified version of HITS algorithm and an SVM classifier trained with pseudo-relevant data for article analysis.
Disputant relation-based method
As for the rest of the sentences, a similarity analysis is conducted with an SVM classifier.
Disputant relation-based method
where SU: number of all sentences of the article Qi: number of quotes from the side i. Qij: number of quotes from either side i or j. Si: number of sentences classified to i by SVM .
Introduction
We applied a modified version of HITS algorithm to identify the key opponents of an issue, and used disputant extraction techniques combined with an SVM classifier for article analysis.
SVM is mentioned in 4 sentences in this paper.
Topics mentioned in this paper:
Sun, Xu and Okazaki, Naoaki and Tsujii, Jun'ichi
Results and Discussion
We compared the performance of the DPLVM with the CRFs and other baseline systems, including the heuristic system (Heu), the HMM model, and the SVM model described in S08, i.e., Sun et al.
Results and Discussion
The SVM method described by Sun et al.
Results and Discussion
In general, the results indicate that all of the sequential labeling models outperformed the SVM regression model with less training time. In the SVM regression approach, a large number of negative examples are explicitly generated for the training, which slowed the process.
SVM is mentioned in 4 sentences in this paper.
Topics mentioned in this paper:
Huang, Minlie and Ye, Borui and Wang, Yichen and Chen, Haiqiang and Cheng, Junjun and Zhu, Xiaoyan
Conclusion
                 # Pos/Neg   Lexicon  SVM
Hownet           627/1,038   0.737    0.756
Hownet+NW        743/1,150   0.770    0.779
Hownet+T100      679/1,172   0.761    0.774
cptHownet        138/125     0.738    0.758
cptHownet+NW     254/237     0.774    0.782
cptHownet+T100   190/159     0.764    0.775
Experiment
The second model is an SVM model in which opinion words are used as features, and 5-fold cross-validation is conducted.
Experiment
This is not necessary for the SVM model.
SVM is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Chen, Yanping and Zheng, Qinghua and Zhang, Wei
Related Work
The TRE systems use techniques such as: Rules (Regulars, Patterns and Propositions) (Miller et al., 1998), Kernel method (Zhang et al., 2006b; Zelenko et al., 2003), Belief network (Roth and Yih, 2002), Linear programming (Roth and Yih, 2007), Maximum entropy (Kambhatla, 2004) or SVM (GuoDong et al., 2005).
Related Work
(2005) introduced a feature based method, which utilized lexicon information around entities and was evaluated on Winnow and SVM classifiers.
Related Work
For each of these relation types, an SVM was trained and tested independently.
SVM is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Joshi, Aditya and Mishra, Abhijit and Senthamilselvan, Nivvedan and Bhattacharyya, Pushpak
Discussion
We use three sentiment classification techniques: Naïve Bayes, MaxEnt and SVM with unigrams, bigrams and trigrams as features.
Discussion
Footnote 7: http://scikit-learn.org/stable/ Footnote 8: In the case of SVM , the probability of the predicted class is computed as given in Platt (1999).
Discussion
MaxEnt (Movie)    -0.29 (72.17)
MaxEnt (Twitter)  -0.26 (71.68)
SVM (Movie)       -0.24 (66.27)
SVM (Twitter)     -0.19 (73.15)
SVM is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Hashimoto, Chikara and Torisawa, Kentaro and Kloetzer, Julien and Sano, Motoki and Varga, István and Oh, Jong-Hoon and Kidawara, Yutaka
Event Causality Extraction Method
An event causality candidate is given a causality score CScore, which is the SVM score (distance from the hyperplane) that is normalized to [0,1] by the sigmoid function. Each event causality candidate may be given multiple original sentences, since a phrase pair can appear in multiple sentences, in which case it is given more than one SVM score.
Experiments
(2011): CEAuns is an unsupervised method that uses CEA to rank event causality candidates, and CEAsup is a supervised method using SVM and the CEA features, whose ranking is based on the SVM scores.
Experiments
The baselines are as follows: Csuns is an unsupervised method that uses 03 for ranking, and Cssup is a supervised method using SVM with 03 as the only feature that uses SVM scores for ranking.
SVM is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Kim, Jungi and Li, Jin-Ji and Lee, Jong-Hyeok
Multilingual Subjectivity System
Previous studies have found that, among several ML-based approaches, the SVM classifier generally performs well in many subjectivity analysis tasks (Pang et al., 2002; Banea et al., 2008).
Multilingual Subjectivity System
An SVM score (a margin or the distance from a learned decision boundary) with a positive value predicts the input as being subjective, and negative value as objective.
Multilingual Subjectivity System
The second and the third approaches are carried out as follows: Corpus-based (T-CB): We translate the MPQA corpus into the target languages sentence by sentence using a web-based service. Using the same method as for S-CB, we train an SVM model for each language with the translated training corpora.
SVM is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Iida, Ryu and Kobayashi, Syumpei and Tokunaga, Takenobu
Empirical Evaluation
SVM ) should be separately created with regards to distinct features.
Empirical Evaluation
We utilised SVMrank as an implementation of the Ranking SVM algorithm, in which the parameter c was set as 1.0 and the remaining parameters were set to their defaults.
Reference Resolution using Extra-linguistic Information
Although the work by Denis and Baldridge (2008) uses Maximum Entropy to create their ranking-based model, we adopt the Ranking SVM algorithm (Joachims, 2002), which learns a weight vector to rank candidates for a given partial ranking of each referent.
SVM is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Prabhakaran, Vinodkumar and Rambow, Owen
Predicting Direction of Power
Handling of undefined values for features in SVM is not straightforward.
Predicting Direction of Power
Most SVM implementations assume the value of 0 by default in such cases, conflating them
Predicting Direction of Power
Since we use a quadratic kernel, we expect the SVM to pick up the interaction between each feature and its indicator feature.
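The idea in the excerpts above, pairing each feature with an indicator so that an undefined value is not conflated with a genuine 0, can be sketched as follows. This is only an illustration under stated assumptions: the function names, the use of `None` to mark an undefined value, and the kernel offset `c = 1.0` are hypothetical and do not come from the paper.

```python
def encode_with_indicator(value):
    """Return (feature_value, is_defined) for a possibly undefined feature.

    An undefined value (None here, by assumption) becomes (0.0, 0.0),
    while a defined 0 becomes (0.0, 1.0), so the two stay distinguishable.
    """
    if value is None:
        return (0.0, 0.0)
    return (float(value), 1.0)


def quadratic_kernel(x, y, c=1.0):
    """Degree-2 polynomial kernel (x.y + c)^2.

    Expanding the square yields products of pairs of input dimensions,
    including the product of a feature and its indicator.
    """
    dot = sum(a * b for a, b in zip(x, y))
    return (dot + c) ** 2
```

With a purely linear kernel the indicator would contribute only an additive term; the squared form is what introduces the feature-indicator interaction.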
SVM is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Costa, Francisco and Branco, António
Comparing the two Datasets
SMO is an implementation of Support Vector Machines ( SVM ), rules.JRip is the RIPPER algorithm, and bayes.NaiveBayes is a Naive Bayes classifier.
Comparing the two Datasets
In task C, the SVM algorithm was also the best performing algorithm among those that were also tried on the English data, but decision trees produced even better results here.
Comparing the two Datasets
The results are: in task A the lazy.KStar classifier scored 58.6%, and the SVM classifier scored 75.5% in task B and 59.4% in task C, with trees .
SVM is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Jeon, Je Hun and Liu, Yang
Previous work
She also exploited a semi-supervised approach using Laplacian SVM classification on a small set of examples.
Prosodic event detection method
Our previous supervised learning approach (Jeon and Liu, 2009) showed that a combined model using Neural Network (NN) classifier for acoustic-prosodic evidence and Support Vector Machine ( SVM ) classifier for syntactic-prosodic evidence performed better than other classifiers.
Prosodic event detection method
We therefore use NN and SVM in this study.
SVM is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Sun, Le and Han, Xianpei
Introduction
Finally, new relation instances are extracted using kernel-based classifiers, e.g., the SVM classifier.
Introduction
We apply the one vs. others strategy for multiple classification using SVM .
Introduction
For SVM training, the parameter C is set to 2.4 for all experiments, and the tree kernel parameter λ is tuned to 0.2 for FTK and 0.4 (the optimal parameter setting used in Qian et al.
SVM is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Feng, Vanessa Wei and Hirst, Graeme
Related work
In particular, starting from EDUs, at each step of the tree-building, a binary SVM classifier is first applied to determine which pair of adjacent discourse constituents should be merged to form a larger span, and another multi-class SVM classifier is then applied to assign the type of discourse relation that holds between the chosen pair.
Related work
Also, the employment of SVM classifiers allows the incorporation of rich features for better data representation (Feng and Hirst, 2012).
Related work
However, HILDA’s approach also has obvious weakness: the greedy algorithm may lead to poor performance due to local optima, and more importantly, the SVM classifiers are not well-suited for solving structural problems due to the difficulty of taking context into account.
SVM is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Pitler, Emily and Louis, Annie and Nenkova, Ani
Experimental setup
We use a Ranking SVM (SVMlight (Joachims, 2002)) to score summaries using our features.
Experimental setup
The Ranking SVM seeks to minimize the number of discordant pairs (pairs in which the gold standard has x1 ranked strictly higher than x2, but the learner ranks x2 strictly higher than x1).
Experimental setup
For system-level evaluation, we treat the real-valued output of the SVM ranker for each summary as the linguistic quality score.
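The notion of a discordant pair used in the excerpts above can be made concrete with a short counting routine. This is a sketch of the quantity being minimized, not of the Ranking SVM training procedure itself; the function name and the convention that a larger number means a higher rank are assumptions.

```python
from itertools import combinations


def count_discordant_pairs(gold, predicted):
    """Count item pairs ordered one way by gold and the opposite way by the learner.

    gold and predicted are equal-length sequences of scores for the same
    items, where a larger value means a higher rank (an assumption here).
    Ties in either sequence are not counted as discordant.
    """
    discordant = 0
    for i, j in combinations(range(len(gold)), 2):
        gold_diff = gold[i] - gold[j]
        pred_diff = predicted[i] - predicted[j]
        # Opposite signs mean the two rankings strictly disagree on this pair.
        if gold_diff * pred_diff < 0:
            discordant += 1
    return discordant
```

For example, with gold scores `[3, 2, 1]` and predicted scores `[0.9, 0.1, 0.5]`, only the second and third items are ordered differently, so the count is 1.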
SVM is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Dong, Li and Wei, Furu and Tan, Chuanqi and Tang, Duyu and Zhou, Ming and Xu, Ke
Experiments
the target-independent (SVM-indep) and target-dependent features and uses SVM as the classifier.
Experiments
SVM-conn: The words, punctuations, emoticons, and #hashtags included in the converted dependency tree are used as the features for SVM .
Experiments
AdaRNN-comb: We combine the root vectors obtained by AdaRNN-w/E with the uni/bi-gram features, and they are fed into an SVM classifier.
SVM is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Daxenberger, Johannes and Gurevych, Iryna
Machine Learning with Edit-Turn-Pairs
                 Baseline      R. Forest     SVM
Accuracy         .799 ±.031    .866 ±.026†   .858 ±.027†
F1mac.           NaN           .789 ±.032    .763 ±.033
Precisionmac.
Machine Learning with Edit-Turn-Pairs
A reduction of the feature set as judged by a χ² ranker improved the results for both Random Forest as well as the SVM , so we limited our feature set to the 100 best features.
Machine Learning with Edit-Turn-Pairs
In a 10-fold cross-validation experiment, we tested a Random Forest classifier (Breiman, 2001) and an SVM (Platt, 1998) with polynomial kernel.
SVM is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Yannakoudakis, Helen and Briscoe, Ted and Medlock, Ben
Approach
In its basic form, a binary SVM classifier learns a linear threshold function that discriminates data points of two categories.
Evaluation
We trained an SVM regression model with our full set of feature types and compared it to the SVM rank preference model.
Introduction
In this paper, we report experiments on rank preference Support Vector Machines (SVMs) trained on a relatively small amount of data, on identification of appropriate feature types derived automatically from generic text processing tools, on comparison with a regression SVM model, and on the robustness of the best model to ‘outlier’ texts.
SVM is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Anzaroot, Sam and Passos, Alexandre and Belanger, David and McCallum, Andrew
Soft Constraints in Dual Decomposition
All we need to employ the structured perceptron algorithm (Collins, 2002) or the structured SVM algorithm (Tsochantaridis et al., 2004) is a black-box procedure for performing MAP inference in the structured linear model given an arbitrary cost vector.
Soft Constraints in Dual Decomposition
This can be ensured by simple modifications of the perceptron and subgradient descent optimization of the structured SVM objective simply by truncating c coordinate-wise to be nonnegative at every learning iteration.
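The coordinate-wise truncation mentioned in the excerpt above amounts to projecting the vector c onto the nonnegative orthant after every update. A minimal sketch follows, assuming a plain subgradient step with a fixed learning rate; the function names and the update form are illustrative assumptions and not the authors' implementation.

```python
def project_nonnegative(c):
    """Truncate each coordinate of c at zero."""
    return [max(0.0, ci) for ci in c]


def projected_subgradient_step(c, subgradient, learning_rate):
    """Take one subgradient step, then truncate so that c stays nonnegative.

    The caller supplies the subgradient of the training objective at c;
    how it is computed (for example via MAP inference) is outside this sketch.
    """
    updated = [ci - learning_rate * gi for ci, gi in zip(c, subgradient)]
    return project_nonnegative(updated)
```

The same truncation can be applied after a perceptron-style update, since only the final clipping step enforces the constraint.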
Soft Constraints in Dual Decomposition
A similar analysis holds for the structured SVM ap-
SVM is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Zhu, Jun and Zheng, Xun and Zhang, Bo
Experiments
For gLDA, we learn a binary linear SVM on its topic representations using SVMLight (Joachims, 1999).
Experiments
The results of DiscLDA (Lacoste-Jullien et al., 2009) and linear SVM on raw bag-of-words features were reported in (Zhu et al., 2012).
Experiments
The fact that gLDA+SVM performs better than the standard gSLDA is due to the same reason, since the SVM part of gLDA+SVM can well capture the supervision information to learn a classifier for good prediction, while standard sLDA can’t well-balance the influence of supervision.
SVM is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Silberer, Carina and Ferrari, Vittorio and Lapata, Mirella
Attribute-based Classification
We used an L2-regularized L2-loss linear SVM (Fan et al., 2008) to learn the attribute predictions.
Attribute-based Classification
data was randomly split into a training and validation set of equal size in order to find the optimal cost parameter C. The final SVM for the attribute was trained on the entire training data, i.e., on all positive and negative examples.
Attribute-based Classification
The SVM learners used the four different feature types proposed in Farhadi et al.
SVM is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Ramteke, Ankit and Malu, Akshat and Bhattacharyya, Pushpak and Nath, J. Saketha
A Machine Learning based approach
We use the SVM classifier with features generated using the following steps.
Conclusions and Future Work
This ontology guides a rule based approach to thwarting detection, and also provides features for an SVM based learning system.
Results
We used the CVX library in Matlab to solve the optimization problem for learning weights and the LIBSVM library to implement the SVM classifier.
SVM is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Perez-Rosas, Veronica and Mihalcea, Rada and Morency, Louis-Philippe
Discussion
As before, the linguistic, acoustic, and visual features are averaged over the entire video, and we use an SVM classifier in tenfold cross validation experiments.
Experiments and Results
We use the entire set of 412 utterances and run tenfold cross validations using an SVM classifier, as implemented in the Weka toolkit. In line with previous work on emotion recognition in speech (Haq and Jackson, 2009; Anagnostopoulos and Vovoli, 2010) where utterances are selected in a speaker dependent manner (i.e., utterances from the same speaker are included in both training and test), as well as work on sentence-level opinion classification where document boundaries are not considered in the split performed between the training and test sets (Wilson et al., 2004; Wiegand and Klakow, 2009), the training/test split for each fold is performed at utterance level regardless of the video they belong to.
Multimodal Sentiment Analysis
These simple weighted unigram features have been successfully used in the past to build sentiment classifiers on text, and in conjunction with Support Vector Machines ( SVM ) have been shown to lead to state-of-the-art performance (Maas et al., 2011).
SVM is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Berg-Kirkpatrick, Taylor and Gillick, Dan and Klein, Dan
Structured Learning
We use a soft-margin support vector machine ( SVM ) (Vapnik, 1998) objective over the full structured output space (Taskar et al., 2003; Tsochantaridis et al., 2004) of extractive and compressive summaries:
Structured Learning
In our application, this approach efficiently solves the structured SVM training problem up to some specified tolerance ε.
Structured Learning
Thus, if loss-augmented prediction turns up no new constraints on a given iteration, the current solution to the reduced problem, w and ξ, is the solution to the full SVM training problem.
SVM is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Bramsen, Philip and Escobar-Molano, Martha and Patel, Ami and Alonso, Rafael
Abstract
Oberlander and Nowson explore using a Naive Bayes and an SVM classifier to perform binary classification of text on each personality dimension.
Abstract
The results of the SVM classifier, shown in line (1) of Table 2, were fairly poor.
Abstract
Training a multiclass SVM on the binned n-gram features from (5) produces 51.6% cross-validation accuracy on training data and 44.4% accuracy on the weighted test set (both numbers should be compared to a 33% baseline).
SVM is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Danescu-Niculescu-Mizil, Cristian and Sudhof, Moritz and Jurafsky, Dan and Leskovec, Jure and Potts, Christopher
Predicting politeness
The BOW classifier is an SVM using a unigram feature representation. We consider this to be a strong baseline for this new
Predicting politeness
is an SVM using the linguistic features listed in Table 3 in addition to the unigram features.
Predicting politeness
For new requests, we use class probability estimates obtained by fitting a logistic regression model to the output of the SVM (Witten and Frank, 2005) as predicted politeness scores (with values between 0 and 1; henceforth politeness, by abuse of language).
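Fitting a logistic regression model to SVM outputs, as described in the excerpt above, maps an unbounded margin to a value between 0 and 1. The following is a minimal sketch of that idea using batch gradient descent on the log loss. It is not the implementation cited in the paper: the function names, the learning rate, and the number of epochs are all assumptions chosen for illustration.

```python
import math


def sigmoid(z):
    """Logistic function."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic_on_scores(scores, labels, learning_rate=0.1, epochs=2000):
    """Fit p(y=1 | s) = sigmoid(a*s + b) to SVM scores s and 0/1 labels y.

    Uses plain batch gradient descent on the average log loss; the
    hyperparameter defaults are illustrative assumptions.
    """
    a, b = 0.0, 0.0
    n = len(scores)
    for _ in range(epochs):
        grad_a = 0.0
        grad_b = 0.0
        for s, y in zip(scores, labels):
            error = sigmoid(a * s + b) - y
            grad_a += error * s
            grad_b += error
        a -= learning_rate * grad_a / n
        b -= learning_rate * grad_b / n
    return a, b


def probability_from_score(svm_output, a, b):
    """Turn a raw SVM output into a score strictly between 0 and 1."""
    return sigmoid(a * svm_output + b)
```

A typical use would fit `a` and `b` on held-out SVM outputs with known labels, then call `probability_from_score` on the SVM output for each new request.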
SVM is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Aker, Ahmet and Paramita, Monica and Gaizauskas, Rob
Abstract
For classification we use an SVM binary classifier and training data taken from the EUROVOC thesaurus.
Feature extraction
To align or map source and target terms we use an SVM binary classifier (Joachims, 2002) with a linear kernel and the tradeoff between training error and margin parameter c = 10.
Method
For classification purposes we use an SVM binary classifier.
SVM is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Rosenthal, Sara and McKeown, Kathleen
Experiments and Results
We experimented with an SVM classifier and found logistic regression to do slightly better.
Related Work
They use an SVM classifier with only n-grams as features.
Related Work
Nowson et al. (2006) employed dictionary and n-gram based content analysis and achieved 91.5% accuracy using an SVM classifier.
SVM is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Croce, Danilo and Moschitti, Alessandro and Basili, Roberto and Palmer, Martha
Experiments
This uses a set of binary SVM classifiers, one for each verb class (frame) 73.
Experiments
In the classification phase the binary classifiers are applied by (i) only considering classes that are compatible with the target verbs; and (ii) selecting the class associated with the maximum positive SVM margin.
Model Analysis and Discussion
In line with the method discussed in (Pighin and Moschitti, 2009b), these fragments are extracted as they appear in most of the support vectors selected during SVM training.
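The classification phase described in the excerpts above, where only classes compatible with the target verb are considered and the one with the maximum positive margin is chosen, can be sketched as follows. This is an illustrative reading, not the authors' code: the function name, the dictionary representation of margins, and returning `None` when no compatible class has a positive margin are assumptions.

```python
def select_class(margins, compatible_classes):
    """Pick the compatible class with the largest positive SVM margin.

    margins maps each class name to the margin produced by its binary
    classifier. Returning None when no compatible class has a positive
    margin is an assumption; the excerpt does not say what happens then.
    """
    best_class = None
    best_margin = 0.0
    for cls in compatible_classes:
        margin = margins.get(cls)
        # Skip classes with no classifier output or a non-positive margin.
        if margin is not None and margin > best_margin:
            best_class = cls
            best_margin = margin
    return best_class
```

Restricting the search to compatible classes means a class with a larger margin is ignored if the target verb cannot evoke it.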
SVM is mentioned in 3 sentences in this paper.
Topics mentioned in this paper: