Index of papers in Proc. ACL that mention

SVM

Seen in text as:

SVM (649)

Seen in 604 sentences in 84 papers.

1. Automatic Syllabification with Structured SVMs for Letter-to-Phoneme Conversion

Bartlett, Susan and Kondrak, Grzegorz and Cherry, Colin

In Proc. ACL 2008, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Introduction	We formulate syllabification as a tagging problem, and learn a discriminative tagger from labeled data using a structured support vector machine ( SVM ) (Tsochantaridis et al., 2004).
Structured SVMs	A structured support vector machine ( SVM ) is a large-margin training method that can learn to predict structured outputs, such as tag sequences or parse trees, instead of performing binary classification (Tsochantaridis et al., 2004).
Structured SVMs	We employ a structured SVM that predicts tag sequences, called an SVM Hidden Markov Model, or SVM-HMM.
Structured SVMs	This approach can be considered an SVM because the model parameters are trained discriminatively to separate correct tag sequences from incorrect ones by as large a margin as possible.
Syllabification with Structured SVMs	The SVM framework is less restrictive: we can include o as an emission feature, but we can also include features indicating that the preceding and following letters are m and r respectively.

SVM is mentioned in 16 sentences in this paper.

Topics mentioned in this paper:

2. A Semiparametric Gaussian Copula Regression Model for Predicting Financial Risks from Earnings Calls

Wang, William Yang and Hua, Zhenhao

In Proc. ACL 2014, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Experiments	The baselines are standard squared-loss linear regression, linear kernel SVM, and nonlinear (Gaussian) kernel SVM .
Experiments	We use the Statistical Toolbox’s linear regression implementation in Matlab, and LibSVM (Chang and Lin, 2011) for training and testing the SVM models.
Experiments	The hyperparameter C in linear SVM, and the γ and C hyperparameters in Gaussian SVM are tuned on the training set using 10-fold cross-validation.
Introduction	Our results significantly outperform standard linear regression and strong SVM baselines.
Related Work	(2003) are among the first to study SVM and text mining methods in the market prediction domain, where they align financial news articles with multiple time series to simulate the 33 stocks in the Hong Kong Hang Seng Index.
Related Work	(2009) model the SEC-mandated annual reports, and performs linear SVM regression with ε-insensitive loss function to predict the measured volatility.
Related Work	Traditional discriminative models, such as linear regression and linear SVM , have been very popular in various text regression tasks, such as predicting movie revenues from reviews (Joshi et al., 2010), understanding the geographic lexical variation (Eisenstein et al., 2010), and predicting food prices from menus (Chahuneau et al., 2012).

SVM is mentioned in 15 sentences in this paper.

Topics mentioned in this paper:

3. Tri-Training for Authorship Attribution with Limited Training Data

Qian, Tieyun and Liu, Bing and Chen, Li and Peng, Zhiyong

In Proc. ACL 2014, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Experimental Evaluation	We use logistic regression (LR) with L2 regularization (Fan et al., 2008) and the SVMmulticlass ( SVM ) system (Joachims, 2007) with its default settings as the classifiers.
Experimental Evaluation	It self-trains two classifiers from the character 3-gram, lexical, and syntactic views using CNG and SVM classifiers (Kourtis and Stamatatos, 2011).
Experimental Evaluation	The original method applied only CNG and SVM on the character n-gram view.
Introduction	However, the self-training method in (Kourtis and Stamatatos, 2011) uses two classifiers (CNG and SVM ) on one view.
Proposed Tri-Training Algorithm	Many classification algorithms give such scores, e.g., SVM and logistic regression.
Related Work	On developing effective learning techniques, supervised classification has been the dominant approach, e.g., neural networks (Graham et al., 2005; Zheng et al., 2006), decision tree (Uzuner and Katz, 2005; Zhao and Zobel, 2005), logistic regression (Madigan et al., 2005), SVM (Diederich et al., 2000; Gamon 2004; Li et al., 2006; Kim et al., 2011), etc.

SVM is mentioned in 13 sentences in this paper.

Topics mentioned in this paper:

4. That's Not What I Meant! Using Parsers to Avoid Structural Ambiguities in Generated Text

Duan, Manjuan and White, Michael

In Proc. ACL 2014, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Abstract	However, by using an SVM ranker to combine the realizer’s model score together with features from multiple parsers, including ones designed to make the ranker more robust to parsing mistakes, we show that significant increases in BLEU scores can be achieved.
Abstract	Moreover, via a targeted manual analysis, we demonstrate that the SVM reranker frequently manages to avoid vicious ambiguities, while its ranking errors tend to affect fluency much more often than adequacy.
Introduction	Consequently, we examine two reranking strategies, one a simple baseline approach and the other using an SVM reranker (Joachims, 2002).
Introduction	Therefore, to develop a more nuanced self-monitoring reranker that is more robust to such parsing mistakes, we trained an SVM using dependency precision and recall features for all three parses, their n-best parsing results, and per-label precision and recall for each type of dependency, together with the realizer’s normalized perceptron model score as a feature.
Introduction	With the SVM reranker, we obtain a significant improvement in BLEU scores over
Reranking with SVMs 4.1 Methods	Similarly, we conjectured that large differences in the realizer’s perceptron model score may more reliably reflect human fluency preferences than small ones, and thus we combined this score with features for parser accuracy in an SVM ranker.
Reranking with SVMs 4.1 Methods	Additionally, given that parsers may more reliably recover some kinds of dependencies than others, we included features for each dependency type, so that the SVM ranker might learn how to weight them appropriately.
Reranking with SVMs 4.1 Methods	We trained the SVM ranker (Joachims, 2002) with a linear kernel and chose the hyper-parameter c, which tunes the tradeoff between training error and margin, with 6-fold cross-validation on the devset.

SVM is mentioned in 23 sentences in this paper.

Topics mentioned in this paper:

perceptron (29)
SVM (23)
BLEU (20)

5. Automatic detection of deception in child-produced speech using syntactic complexity features

Yancheva, Maria and Rudzicz, Frank

In Proc. ACL 2013, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Discussion and future work	RF MLP RF MLP RF SVM
Discussion and future work	While past research has used logistic regression as a binary classifier (Newman et al., 2003), our experiments show that the best-performing classifiers allow for highly nonlinear class boundaries; SVM and RF models achieve between 62.5% and 91.7% accuracy across age groups — a significant improvement over the baselines of LR and NB, as well as over previous results.
Related Work	Two classifiers, Naïve Bayes (NB) and a support vector machine (SVM), were applied on the tokenized and stemmed statements to obtain best classification accuracies of 70% (abortion topic, NB), 67.4% (death penalty topic, NB), and 77% (friend description, SVM ), where the baseline was taken to be 50%.
Related Work	The authors note this as well by demonstrating significantly lower results of 59.8% for NB and 57.8% for SVM when cross-topic classification is performed by training each classifier on two topics and testing on the third.
Results	We evaluate five classifiers: logistic regression (LR), a multilayer perceptron (MLP), naïve Bayes (NB), a random forest (RF), and a support vector machine ( SVM ).
Results	The SVM is a parametric binary classifier that provides highly nonlinear decision boundaries given particular kernels.
Results	The SVM classifier

SVM is mentioned in 11 sentences in this paper.

Topics mentioned in this paper:

6. Mining Informal Language from Chinese Microtext: Joint Word Recognition and Segmentation

Wang, Aobo and Kan, Min-Yen

In Proc. ACL 2013, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Experiment	We re-implemented Xia and Wong (2008)’s extended Support Vector Machine ( SVM ) based microtext IWR system to compare with our method.
Experiment	Both the SVM and DT models are provided by the Weka3 (Hall et al., 2009) toolkit, using its default configuration.
Experiment	Adapted SVM for Joint Classification.

SVM is mentioned in 13 sentences in this paper.

Topics mentioned in this paper:

SVM (13)
baseline systems (11)
CRF (11)

7. A Comparison of Techniques to Automatically Identify Complex Words.

Shardlow, Matthew

In Proc. ACL 2013, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Discussion	Whilst the thresholding and simplify everything methods were not significantly different from each other, the SVM method was significantly different from the other two (p < 0.001).
Discussion	This can be seen in the slightly lower recall, yet higher precision attained by the SVM .
Discussion	This indicates that the SVM was better at distinguishing between complex and simple words, but also wrongly identified many CWs.
Experimental Design	Support vector machines ( SVM ) are statistical classifiers which use labelled training data to predict the class of unseen inputs.
Experimental Design	The training data consist of several features which the SVM uses to distinguish between classes.
Experimental Design	The SVM was chosen as it has been used elsewhere for similar tasks (Gasperin et al., 2009; Hancke et al., 2012; Jauhar and Specia, 2012).
Results	Everything Thresholding SVM
Results	To analyse the features of the SVM , the correlation coefficient between each feature vector and the vector of feature labels was calculated.

SVM is mentioned in 19 sentences in this paper.

Topics mentioned in this paper:

8. Topic Modeling Based Classification of Clinical Reports

Sarioglu, Efsun and Yadav, Kabir and Choi, Hyeong-Ah

In Proc. ACL 2013, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Background	Support vector machines ( SVM ) is a popular classification algorithm that attempts to find a decision boundary between classes that is the farthest from any point in the training dataset.
Background	Given labeled training data $(x_t, y_t)$, $t = 1, \ldots, N$, where $x_t \in \mathbb{R}^M$ and $y_t \in \{1, -1\}$, SVM tries to find a separating hyperplane with the maximum margin (Platt, 1998).
Experiments	SVM was chosen as the classification algorithm as it was shown that it performs well in text classification tasks (Joachims, 1998; Yang and Liu, 1999) and it is robust to overfitting (Sebastiani, 2002).
Experiments	Accordingly, the raw text of the reports and topic vectors are compiled into individual files with their corresponding outcomes in ARFF and then classified with SVM .
Related Work	tor classification results with SVM , however (Sriurai, 2011) uses a fixed number of topics, whereas we evaluated different number of topics since typically this is not known in advance.
Results	Classification results using ATC and SVM are shown in Figures 2, 3, and 4 for precision, recall, and f-score respectively.
Results	Best classification performance was achieved with 15 topics for ATC and 100 topics for SVM .
Results	For smaller number of topics, ATC performed better than SVM .
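
The maximum-margin objective described in the Background sentences above can be written out as the standard hard-margin program. This is a textbook sketch, not a formula quoted from the paper; the weight vector w and bias b are symbols assumed here for illustration, while x_t, y_t and N follow the notation of the quoted sentence.

```latex
\min_{w,\,b} \; \frac{1}{2}\lVert w \rVert^2
\quad \text{subject to} \quad
y_t \,(w^\top x_t + b) \ge 1, \qquad t = 1, \ldots, N
```

Under these constraints the two supporting hyperplanes lie a distance of 2/||w|| apart, so minimizing ||w|| is what maximizes the margin.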

SVM is mentioned in 14 sentences in this paper.

Topics mentioned in this paper:

topic model (23)
SVM (14)
LDA (6)

9. Text Classification from Positive and Unlabeled Data using Misclassified Data Correction

Fukumoto, Fumiyo and Suzuki, Yoshimi and Matsuyoshi, Suguru

In Proc. ACL 2013, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Abstract	We applied an error detection and correction technique to the results of positive and negative documents classified by the Support Vector Machines ( SVM ).
Framework of the System	As error candidates, we focus on support vectors (SVs) extracted from the training documents by SVM .
Framework of the System	Training by SVM is performed to find the optimal hyperplane consisting of SVs, and only the SVs affect the performance.
Framework of the System	We set these selected documents to negative training documents (N1), and apply SVM to learn classifiers.
Introduction	uses soft-margin SVM as the underlying classifiers (Liu et al., 2003).
Introduction	They reported that the results were comparable to the current state-of-the-art biased SVM method.
Introduction	Like much previous work on semi-supervised ML, we apply SVM to the positive and unlabeled data, and add the classification results to the training data.

SVM is mentioned in 23 sentences in this paper.

Topics mentioned in this paper:

SVM (23)
unlabeled data (10)
F-score (6)

10. Historical Analysis of Legal Opinions with a Sparse Mixed-Effects Latent Variable Model

Wang, William Yang and Mayfield, Elijah and Naidu, Suresh and Dittmar, Jeremiah

In Proc. ACL 2012, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Prediction Experiments	In the first experiment, we compare the prediction accuracy of our SME model to a widely used discriminative learner in NLP — the linear kernel support vector machine ( SVM ).
Prediction Experiments	In the second experiment, in addition to the linear kernel SVM , we also compare our SME model to a state-of-the-art sparse generative model of text (Eisenstein et al., 2011a), and vary the size of input vocabulary W exponentially from 2^9 to the full size of our training vocabulary.
Prediction Experiments	We use threefold cross-validation to infer the learning rate 6 and cost C hyperpriors in the SME and SVM model respectively.
Related Work	Traditional discriminative methods, such as support vector machine ( SVM ) and logistic regression, have been very popular in various text categorization tasks (Joachims, 1998; Wang and McKeown, 2010) in the past decades.
Related Work	For example, SVM does not have latent variables to model the subtle differences and interactions of features from different domains (e.g.

SVM is mentioned in 21 sentences in this paper.

Topics mentioned in this paper:

11. Cross-Lingual Mixture Model for Sentiment Classification

Meng, Xinfan and Wei, Furu and Liu, Xiaohua and Zhou, Ming and Xu, Ge and Wang, Houfeng

In Proc. ACL 2012, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Experiment	MT-SVM: We translate the English labeled data to Chinese using Google Translate and use the translation results to train the SVM classifier for Chinese.
Experiment	SVM: We train an SVM classifier on the Chinese labeled data.
Experiment	First, two monolingual SVM classifiers are trained on English labeled data and Chinese data translated from English labeled data.
Introduction	The experiment results show that CLMM yields 71% in accuracy when no Chinese labeled data are used, which significantly improves Chinese sentiment classification and is superior to the SVM and co-training based methods.
Introduction	When Chinese labeled data are employed, CLMM yields 83% in accuracy, which is remarkably better than the SVM and achieves state-of-the-art performance.
Related Work	(2002) compare the performance of three commonly used machine learning models (Naive Bayes, Maximum Entropy and SVM ).
Related Work	Gamon (2004) shows that introducing deeper linguistic features into SVM can help to improve the performance.
Related Work	English Labeled data are first translated to Chinese, and then two SVM classifiers are trained on English and Chinese labeled data respectively.

SVM is mentioned in 19 sentences in this paper.

Topics mentioned in this paper:

12. Learning to Grade Short Answer Questions using Semantic Similarity Measures and Dependency Graph Alignments

Mohler, Michael and Bunescu, Razvan and Mihalcea, Rada

In Proc. ACL 2011, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Answer Grading System	Using each of these as features, we use Support Vector Machines ( SVM ) to produce a combined real-number grade.
Answer Grading System	The weight vector u is trained to optimize performance in two scenarios: Regression: An SVM model for regression (SVR) is trained using as target function the grades assigned by the instructors.
Answer Grading System	Ranking: An SVM model for ranking (SVMRank) is trained using as ranking pairs all pairs of student answers (A_s, A_t) such that grade(A_i, A_s) > grade(A_i, A_t), where A_i is the corresponding instructor answer.
Discussion and Conclusions	The correlation for the BOW-only SVM model for SVMRank improved upon the best BOW feature
Discussion and Conclusions	Likewise, using the BOW-only SVM model for SVR reduces the RMSE by .022 overall compared to the best BOW feature.
Results	5.4 SVM Score Grading
Results	The SVM components of the system are run on the full dataset using a 12-fold cross validation.
Results	Both SVM models are trained using a linear kernel. Results from both the SVR and the SVMRank implementations are reported in Table 7 along with a selection of other measures.
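
The ranking-pair construction quoted under Answer Grading System can be sketched as follows. This is only an illustration under stated assumptions: the function name `ranking_pairs`, the answer ids, and the grades are hypothetical, and the sketch covers pair generation for a single question, not the SVMRank training itself.

```python
from itertools import permutations


def ranking_pairs(grades):
    """Return (better, worse) answer-id pairs for one question.

    `grades` maps a hypothetical student-answer id to the grade the
    instructor assigned; ties produce no pair in either direction.
    """
    return [(s, t) for s, t in permutations(sorted(grades), 2)
            if grades[s] > grades[t]]


# Invented grades for three student answers to the same question.
example = {"a1": 5.0, "a2": 3.5, "a3": 3.5}
pairs = ranking_pairs(example)
print(pairs)  # [('a1', 'a2'), ('a1', 'a3')]
```

Each pair would then become one training constraint for a pairwise ranker; tied answers contribute nothing, which matches the strict inequality in the quoted sentence.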

SVM is mentioned in 13 sentences in this paper.

Topics mentioned in this paper:

13. Fine-Grained Genre Classification Using Structural Learning Algorithms

Wu, Zhili and Markert, Katja and Sharoff, Serge

In Proc. ACL 2010, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Discussion	As expected, the structural methods on either skewed or flattened hierarchies are not significantly better than the flat SVM .
Discussion	For the flattened hierarchy of 15 leaf genres the maximal accuracy is 54.2% vs. 52.4% for the flat SVM (Figure 3), a nonsignificant improvement.
Experiments	As a baseline we use the accuracy achieved by a standard "flat" SVM.
Experiments	A standard flat SVM achieves an accuracy of 64.4% whereas the best structural SVM based on Lin’s information content distance measure (IC-lin-word-bnc) achieves 68.8% accuracy, significantly better at the 1% level.
Experiments	Table 1 summarizes the best performing measures that all outperform the flat SVM at the 1% level.
Genre Distance Measures	The structural SVM (Section 2) requires a distance measure h between two genres.
Structural SVMs	To strengthen the constraints, the zero value on the right hand side of the inequality for the flat SVM can be replaced by a positive value, corresponding to a distance measure h(yi, m) between two genre classes, leading to the following constraint:

SVM is mentioned in 13 sentences in this paper.

Topics mentioned in this paper:

14. Co-Training for Cross-Lingual Sentiment Classification

Wan, Xiaojun

In Proc. ACL 2009, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Empirical Evaluation 4.1 Evaluation Setup	SVM(CN): This method applies the inductive SVM with only Chinese features for sentiment classification in the Chinese view.
Empirical Evaluation 4.1 Evaluation Setup	SVM(EN): This method applies the inductive SVM with only English features for sentiment classification in the English view.
Empirical Evaluation 4.1 Evaluation Setup	SVM(ENCN1): This method applies the inductive SVM with both English and Chinese features for sentiment classification in the two views.
Introduction	SVM , NB), and the classification performance is far from satisfactory because of the language gap between the original language and the translated language.
Introduction	The SVM classifier is adopted as the basic classifier in the proposed approach.
Related Work 2.1 Sentiment Classification	Standard Naïve Bayes and SVM classifiers have been applied for subjectivity classification in Romanian (Mihalcea et al., 2007; Banea et al., 2008), and the results show that automatic translation is a viable alternative for the construction of resources and tools for subjectivity analysis in a new target language.
Related Work 2.1 Sentiment Classification	To date, many semi-supervised learning algorithms have been developed for addressing the cross-domain text classification problem by transferring knowledge across domains, including Transductive SVM (Joachims, 1999), EM (Nigam et al., 2000), EM-based Naïve Bayes classifier (Dai et al., 2007a), Topic-bridged PLSA (Xue et al., 2008), Co-Clustering based classification (Dai et al., 2007b), two-stage approach (Jiang and Zhai, 2007).
The Co-Training Approach	Typical text classifiers include Support Vector Machine ( SVM ), Naïve Bayes (NB), Maximum Entropy (ME), K-Nearest Neighbor (KNN), etc.
The Co-Training Approach	In this study, we adopt the widely-used SVM classifier (Joachims, 2002).
The Co-Training Approach	as two sets of vectors in a feature space, SVM constructs a separating hyperplane in the space by maximizing the margin between the two data sets.
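
The idea of a separating hyperplane with a large margin can be illustrated with a minimal linear SVM trained by subgradient descent on the regularised hinge loss. This is a sketch under stated assumptions, not the solver used in the paper: the function names, the hyperparameters `lam`, `lr` and `epochs`, and the toy 2-D data are all invented for illustration.

```python
def train_linear_svm(points, labels, lam=0.01, lr=0.1, epochs=200):
    """Subgradient descent on the L2-regularised hinge loss."""
    w = [0.0] * len(points[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(points, labels):
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            # Regularisation step: shrinking w favours a wider margin.
            w = [wi * (1.0 - lr * lam) for wi in w]
            if margin < 1:
                # Hinge loss is active: push the hyperplane away from x.
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
    return w, b


def predict(w, b, x):
    """Return +1 or -1 depending on the side of the hyperplane."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b >= 0 else -1


# Invented, linearly separable toy data (two classes in 2-D).
points = [(2.0, 2.0), (3.0, 3.0), (2.0, 3.0),
          (-2.0, -2.0), (-3.0, -3.0), (-2.0, -3.0)]
labels = [1, 1, 1, -1, -1, -1]

w, b = train_linear_svm(points, labels)
print([predict(w, b, x) for x in points])
```

Production SVM packages solve the same objective with far more efficient optimisers and support kernels; the loop above is only meant to make the margin condition `y * (w . x + b) >= 1` concrete.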

SVM is mentioned in 19 sentences in this paper.

Topics mentioned in this paper:

15. A Ranking Approach to Stress Prediction for Letter-to-Phoneme Conversion

Dou, Qing and Bergsma, Shane and Jiampojamarn, Sittichai and Kondrak, Grzegorz

In Proc. ACL 2009, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Abstract	We represent words as sequences of substrings, and use the substrings as features in a Support Vector Machine ( SVM ) ranker, which is trained to rank possible stress patterns.
Automatic Stress Prediction	We use a support vector machine ( SVM ) to rank the possible patterns for each sequence (Section 3.2).
Automatic Stress Prediction	These units are used to define the features and outputs used by the SVM ranker.
Automatic Stress Prediction	The SVM can thus generalize from observed words to similarly-spelled, unseen examples.
Introduction	We divide each word into a sequence of substrings, and use these substrings as features for a Support Vector Machine ( SVM ) ranker.
Introduction	The task of the SVM is to rank the true stress pattern above the small number of acceptable alternatives.
Introduction	The SVM ranker achieves exceptional 96.2% word accuracy on the challenging task of predicting the full stress pattern in English.

SVM is mentioned in 16 sentences in this paper.

Topics mentioned in this paper:

16. Mine the Easy, Classify the Hard: A Semi-Supervised Approach to Automatic Sentiment Classification

Dasgupta, Sajib and Ng, Vincent

In Proc. ACL 2009, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Evaluation	Transductive SVM .
Evaluation	Specifically, we begin by training an inductive SVM on one labeled example from each class, iteratively labeling the most uncertain unlabeled point on each side of the hyperplane and retraining the SVM until 100 points are labeled.
Evaluation	Finally, we train a transductive SVM on the 100 labeled points and the remaining 1900 unlabeled points, obtaining the results in row 3 of Table 1.
Our Approach	Specifically, we train a discriminative classifier using the support vector machine ( SVM ) learning algorithm (Joachims, 1999) on the set of unambiguous reviews, and then apply the resulting classifier to all the reviews in the training folds that are not seeds.
Our Approach	As our weakly supervised learner, we employ a transductive SVM .
Our Approach	Hence, instead of training just one SVM classifier, we aim to reduce classification errors by training an ensemble of five classifiers, each of which uses all 100 manually labeled reviews and a different subset of the 500 automatically labeled reviews.
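
The selection step quoted under Evaluation, labeling the most uncertain unlabeled point on each side of the hyperplane, can be sketched as below. The sketch assumes that "most uncertain" means the smallest absolute decision value, which the quoted sentences do not state; the function name, point ids, and scores are hypothetical.

```python
def most_uncertain_per_side(scores):
    """Pick the unlabeled point closest to the hyperplane on each side.

    `scores` maps a hypothetical point id to its signed decision value.
    Returns (positive_side_id, negative_side_id); either may be None
    when no unlabeled point falls on that side.
    """
    pos = [(s, i) for i, s in scores.items() if s > 0]
    neg = [(-s, i) for i, s in scores.items() if s < 0]
    pos_pick = min(pos)[1] if pos else None
    neg_pick = min(neg)[1] if neg else None
    return pos_pick, neg_pick


# Invented decision values for four unlabeled reviews.
scores = {"r1": 2.3, "r2": 0.1, "r3": -0.4, "r4": -1.7}
picked = most_uncertain_per_side(scores)
print(picked)  # ('r2', 'r3')
```

In an active-learning loop of the kind described, the two selected points would be labeled by hand, added to the training set, and the classifier retrained before the next selection round.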

SVM is mentioned in 13 sentences in this paper.

Topics mentioned in this paper:

17. Paraphrase Identification as Probabilistic Quasi-Synchronous Recognition

Das, Dipanjan and Smith, Noah A.

In Proc. ACL 2009, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Data and Task	Dolan and Brockett (2005) remark that this corpus was created semiautomatically by first training an SVM classifier on a disjoint annotated 10,000 sentence pair dataset and then applying the SVM on an unseen 49,375 sentence pair corpus, with its output probabilities skewed towards over-identification, i.e., towards generating some false paraphrases.
Experimental Evaluation	The SVM was trained to classify positive and negative examples of paraphrase using SVMlight (Joachims, 1999). Metaparameters, tuned on the development data, were the regularization constant and the degree of the polynomial kernel (chosen in [10^-5, 10^2] and 1-5 respectively).
Experimental Evaluation	It is unsurprising that the SVM performs very well on the MSRPC because of the corpus creation process (see Sec.
Experimental Evaluation	4) where an SVM was applied as well, with very similar features and a skewed decision process (Dolan and Brockett, 2005).
Product of Experts	LR (like the QG) provides a probability distribution, but uses surface features (like the SVM ).
Product of Experts	2; this model is on par with the SVM , though trading recall in favor of precision.
Product of Experts	We view it as a probabilistic simulation of the SVM more suitable for combination with the QG.

SVM is mentioned in 12 sentences in this paper.

Topics mentioned in this paper:

18. A Graph-based Semi-Supervised Learning for Question-Answering

Celikyilmaz, Asli and Thint, Marcus and Huang, Zhiheng

In Proc. ACL 2009, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Experiments	Features | Model | MRR | Top1 | Top5; Baseline | - | 42.3% | 32.7% | 54.5%; QTCF | SVM | 51.9% | 44.6% | 63.4%; QTCF | SSL | 49.5% | 43.1% | 60.9%; LexSem | SVM | 48.2% | 40.6% | 61.4%; LexSem | SSL | 47.9% | 40.1% | 58.4%; QComp | SVM | 54.2% | 47.5% | 64.3%; QComp | SSL | 51.9% | 45.5% | 62.4%
Experiments	We performed manual iterative parameter optimization during training based on prediction accuracy to find the best k-nearest parameter for SSL, i.e., k = {3,5,10,20,50}, and best C = {10^-2, .., 10^2} and γ = {2^-2, .., 2^3} for RBF kernel SVM .
Experiments	We applied SVM and our graph based SSL method with no summarization to learn models using labeled training and testing datasets.
Feature Extraction for Entailment	The QC model is trained via support vector machines ( SVM ) (Vapnik, 1995) considering different features such as semantic headword feature based on variation of Collins rules, hypernym extraction via Lesk word disambiguation (Lesk, 1988), regular expressions for wh-word indicators, n-grams, word-shapes(capitals), etc.
Graph Summarization	Using a separate learner, e.g., SVM (Vapnik, 1995), we obtain predicted outputs, Y5 = (33f, ..., 39214) of X 5 and append observed labels Y5 = Y5 U YL.

SVM is mentioned in 12 sentences in this paper.

Topics mentioned in this paper:

19. When Specialists and Generalists Work Together: Overcoming Domain Dependence in Sentiment Tagging

Andreevskaia, Alina and Bergler, Sabine

In Proc. ACL 2008, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Domain Adaptation in Sentiment Research	They applied an out-of-domain-trained SVM classifier to label examples from the target domain and then retrained the classifier using these new examples.
Domain Adaptation in Sentiment Research	Depending on the similarity between domains, this method brought up to 15% gain compared to the baseline SVM .
Experiments	Dataset | Movie | News | Blogs | PRs; Dataset size | 1066 | 800 | 800 | 1200; unigrams, SVM | 68.5 | 61.5 | 63.85 | 76.9; unigrams, NB | 60.2 | 59.5 | 60.5 | 74.25; unigrams, nb features | 5410 | 4544 | 3615 | 2832; bigrams, SVM | 59.9 | 63.2 | 61.5 | 75.9; bigrams, NB | 57.0 | 58.4 | 59.5 | 67.8; bigrams, nb features | 16286 | 14633 | 15182 | 12951; trigrams, SVM | 54.3 | 55.4 | 52.7 | 64.4; trigrams, NB | 53.3 | 57.0 | 56.0 | 69.7; trigrams, nb features | 20837 | 18738 | 19847 | 19132
Experiments	Table 4: Accuracy of SVM with unigram model
Experiments	results depends on the genre and size of the n-gram: on product reviews, all results are statistically significant at α = 0.025 level; on movie reviews, the difference between Naïve Bayes and SVM is statistically significant at α = 0.01 but the significance diminishes as the size of the n-gram increases; on news, only bigrams produce a statistically significant (α = 0.01) difference between the two machine learning methods, while on blogs the difference between SVMs and Naïve Bayes is most pronounced when unigrams are used (α = 0.025).
Factors Affecting System Performance	To our knowledge, the only work that describes the application of statistical classifiers ( SVM ) to sentence-level sentiment classification is (Gamon and Aue, 2005).
Integrating the Corpus-based and Dictionary-based Approaches	Using then an SVM meta-classifier trained on a small number of target domain examples to combine the nine base classifiers, they obtained a statistically significant improvement on out-of-domain texts from book reviews, knowledge-base feedback, and product support services survey data.
Lexicon-Based Approach	The baseline performance of the Lexicon-Based System (LBS) described above is presented in Table 5, along with the performance results of the in-domain- and out-of-domain-trained SVM classifier.
Lexicon-Based Approach	System | Movies | News | Blogs | PRs; LBS | 57.5 | 62.3 | 63.3 | 59.3; SVM in-dom. | 68.5 | 61.5 | 63.85 | 76.9; SVM out-of-dom.

SVM is mentioned in 10 sentences in this paper.

Topics mentioned in this paper:

in-domain (26)
unigrams (14)
SVM (10)

20. A Novel Discourse Parser Based on Support Vector Machine Classification

duVerle, David and Prendinger, Helmut

In Proc. ACL 2009, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Building a Discourse Parser	[Figure: parser pipeline with stages SVM Training, Feature Extraction, Classification, SVM Models (Binary and Multiclass), Scored RS sub-trees, Bottom-up Tree Construction]
Building a Discourse Parser	Support Vector Machines (SVM) (Vapnik, 1995) are used to model classifiers S and L. SVM refers to a set of supervised learning algorithms that are based on margin maximization.
Building a Discourse Parser	This makes SVM well-fitted to treat classification problems involving relatively large feature spaces such as ours (≈ 10^5 features).
Evaluation	4.2 Raw SVM Classification
Evaluation	Although our final goal is to achieve good performance on the entire tree-building task, a useful intermediate evaluation of our system can be conducted by measuring raw performance of SVM classifiers.
Evaluation	Table 1: SVM Classifier performance.
Features	Instrumental to our system’s performance is the choice of a set of salient characteristics (“features”) to be used as input to the SVM algorithm for training and classification.

SVM is mentioned in 10 sentences in this paper.

Topics mentioned in this paper:

discourse parsing (10)
SVM (10)
edus (9)

21. Modelling Annotator Bias with Multi-task Gaussian Processes: An Application to Machine Translation Quality Estimation

Cohn, Trevor and Specia, Lucia

In Proc. ACL 2013, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Conclusion	Model | MAE | RMSE; μ | 0.5596 | 0.7053; μ_A | 0.5184 | 0.6367; μ_S | 0.5888 | 0.7588; μ_T | 0.6300 | 0.8270; Pooled SVM | 0.5823 | 0.7472; Independent_A SVM | 0.5058 | 0.6351; EasyAdapt SVM | 0.7027 | 0.8816; [SINGLE-TASK LEARNING] Independent_A | 0.5091 | 0.6362; Independent_S | 0.5980 | 0.7729; Pooled | 0.5834 | 0.7494; Pooled & {N} | 0.4932 | 0.6275; [MULTITASK LEARNING: Annotator] Combined_A | 0.4815 | 0.6174; Combined_A & {N} | 0.4909 | 0.6268; Combined+_A | 0.4855 | 0.6203; Combined+_A & {N} | 0.4833 | 0.6102; [MULTITASK LEARNING: Translation system] Combined_S | 0.5825 | 0.7482; [MULTITASK LEARNING: Sentence pair] Combined_T | 0.5813 | 0.7410; [MULTITASK LEARNING: Combinations] Combined_{A,S} | 0.4988 | 0.6490; Combined_{A,S} & {N_{A,S}} | 0.4707 | 0.6003; Combined+_{A,S} | 0.4772 | 0.6094; Combined_{A,S,T} | 0.4588 | 0.5852; Combined_{A,S,T} & {N_{A,S}} | 0.4723 | 0.6023
Gaussian Process Regression	In typical usage, the kernel hyperparameters for an SVM are fit using held-out estimation, which is inefficient and often involves tying together parameters to limit the search complexity (e.g., using a single scale parameter in the squared exponential).
Gaussian Process Regression	Multiple-kernel learning (Gönen and Alpaydın, 2011) goes some way to addressing this problem within the SVM framework; however, this technique is limited to reweighting linear combinations of kernels and has high computational complexity.
Multitask Quality Estimation 4.1 Experimental Setup	Baselines: The baselines use the SVM regression algorithm with radial basis function kernel and parameters γ, ε and C optimised through grid-search and 5-fold cross validation on the training set.
Multitask Quality Estimation 4.1 Experimental Setup	μ 0.8279 0.9899 SVM 0.6889 0.8201
Multitask Quality Estimation 4.1 Experimental Setup	μ is a baseline which predicts the training mean, SVM uses the same system as the WMT12 QE task, and the remainder are GP regression models with different kernels (all include additive noise).
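
As a rough illustration of the μ baseline and the two error measures quoted in this entry, the sketch below predicts the training mean for every test item and scores the result with MAE and RMSE. The function names and the toy scores are invented for illustration; they are not taken from the paper or its data.

```python
import math

def mean_baseline(train_targets):
    """Return a predictor that always outputs the mean of the training targets."""
    mu = sum(train_targets) / len(train_targets)
    return lambda _features=None: mu

def mae(gold, predicted):
    """Mean absolute error."""
    return sum(abs(g - p) for g, p in zip(gold, predicted)) / len(gold)

def rmse(gold, predicted):
    """Root mean squared error."""
    return math.sqrt(sum((g - p) ** 2 for g, p in zip(gold, predicted)) / len(gold))

# Toy quality scores (hypothetical values, for illustration only).
train_scores = [1.0, 2.0, 3.0, 4.0]
test_scores = [2.0, 3.0, 5.0]

predict = mean_baseline(train_scores)
predictions = [predict() for _ in test_scores]

baseline_mae = mae(test_scores, predictions)
baseline_rmse = rmse(test_scores, predictions)
```

RMSE penalises large individual errors more heavily than MAE, which is one reason tables of this kind usually report both.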

SVM is mentioned in 10 sentences in this paper.

Topics mentioned in this paper:

22. Local Histograms of Character N-grams for Authorship Attribution

Escalante, Hugo Jair and Solorio, Thamar and Montes-y-Gomez, Manuel

In Proc. ACL 2011, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Authorship Attribution With LOWBOW Representations	For both types of representations we consider an SVM classifier under the one-vs-all formulation for facing the AA problem.
Authorship Attribution With LOWBOW Representations	We consider SVM as base classifier because this method has proved to be very effective in a large number of applications, including AA (Houvardas and Stamatatos, 2006; Plakias and Stamatatos, 2008b; Plakias and Stamatatos, 2008a); further, since SVMs are kernel-based methods, they allow us to use local histograms for AA by considering kernels that work over sets of histograms.
Authorship Attribution With LOWBOW Representations	We build a multiclass SVM classifier by considering the pairs of patterns-outputs associated to documents-authors.
Experiments and Results	All our experiments use the SVM implementation provided by Canu et al.
Experiments and Results	Columns show the true author for test documents and rows show the authors predicted by the SVM .
Experiments and Results	The SVM with BOW representation of character n-grams achieved recognition rates of 40% and 50% for BL and JM respectively.
Related Work	applied to this problem, including support vector machine ( SVM ) classifiers (Houvardas and Stamatatos, 2006) and variants thereon (Plakias and Stamatatos, 2008b; Plakias and Stamatatos, 2008a), neural networks (Tearle et al., 2008), Bayesian classifiers (Coyotl-Morales et al., 2006), decision tree methods (Koppel et al., 2009) and similarity based techniques (Keselj et al., 2003; Lambers and Veenman, 2009; Stamatatos, 2009b; Koppel et al., 2009).
Related Work	In this work, we chose an SVM classifier as it has reported acceptable performance in AA and because it will allow us to directly compare results with previous work that has used this same classifier.
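
The one-vs-all formulation mentioned in this entry can be sketched as follows, assuming one binary scorer per author and a plain linear decision function. The paper itself uses kernels over sets of histograms, so the linear scorer, the author labels, and the weights here are simplifying assumptions made for illustration.

```python
def dot(weights, features):
    """Inner product of two equal-length vectors."""
    return sum(w * x for w, x in zip(weights, features))

def one_vs_all_predict(models, features):
    """Pick the label whose binary classifier gives the highest decision score.

    `models` maps each label to the (weights, bias) pair of a binary
    classifier trained to separate that label from all the others.
    """
    scores = {label: dot(w, features) + b for label, (w, b) in models.items()}
    return max(scores, key=scores.get), scores

# Hypothetical per-author linear models over two toy features.
toy_models = {
    "author_A": ([1.0, -1.0], 0.0),
    "author_B": ([-1.0, 1.0], 0.0),
    "author_C": ([0.5, 0.5], -1.0),
}

predicted_author, author_scores = one_vs_all_predict(toy_models, [0.2, 0.9])
```

With k authors this scheme trains k binary classifiers, and the predicted author is simply the one whose classifier is most confident.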

SVM is mentioned in 10 sentences in this paper.

Topics mentioned in this paper:

n-grams (26)
SVM (10)
word-level (7)

23. Bayesian Synchronous Tree-Substitution Grammar Induction and Its Application to Sentence Compression

Yamangil, Elif and Shieber, Stuart M.

In Proc. ACL 2010, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Evaluation	We compared the Gibbs sampling compressor (GS) against a version of maximum a posteriori EM (with Dirichlet parameter greater than 1) and a discriminative STSG based on SVM training (Cohn and Lapata, 2008) ( SVM ).
Evaluation	EM is a natural benchmark, while SVM is also appropriate since it can be taken as the state of the art for our task.
Evaluation	Nonetheless, because the comparison system is a generalization of the extractive SVM compressor of Cohn and Lapata (2007), we do not expect that the results would differ qualitatively.
Introduction	We achieve substantial improvements against a number of baselines including EM, support vector machine ( SVM ) based discriminative training, and variational Bayes (VB).

SVM is mentioned in 10 sentences in this paper.

Topics mentioned in this paper:

24. Representation Learning for Text-level Discourse Parsing

Ji, Yangfeng and Eisenstein, Jacob

In Proc. ACL 2014, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Large-Margin Learning Framework	As we will see, it is possible to learn {Wm} using standard support vector machine ( SVM ) training (holding A fixed), and then make a simple gradient-based update to A (holding {Wm} fixed).
Large-Margin Learning Framework	As is standard in the multi-class linear SVM (Crammer and Singer, 2001), we can solve the problem defined in Equation 6 via Lagrangian optimization:
Large-Margin Learning Framework	If A is fixed, then the optimization problem is equivalent to a standard multi-class SVM , in the transformed feature space f (vi; A).

SVM is mentioned in 10 sentences in this paper.

Topics mentioned in this paper:

25. Modeling Latent Biographic Attributes in Conversational Genres

Garera, Nikesh and Yarowsky, David

In Proc. ACL 2009, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Corpus Details	As our reference algorithm, we used the current state-of-the-art system developed by Boulis and Ostendorf (2005) using unigram and bigram features in an SVM framework.
Corpus Details	Table 12 Top 20 ngram features for gender, ranked by the weights assigned by the linear SVM model
Corpus Details	After extracting the ngrams, an SVM model was trained via the SVMlight toolkit (Joachims, 1999) using the linear kernel with the default toolkit settings.

SVM is mentioned in 9 sentences in this paper.

Topics mentioned in this paper:

26. Using Conditional Random Fields to Extract Contexts and Answers of Questions from Online Forums

Ding, Shilin and Cong, Gao and Lin, Chin-Yew and Zhu, Xiaoyan

In Proc. ACL 2008, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Context and Answer Detection	SVM , can be employed, where each pair of question and candidate context will be treated as an instance.
Experiments	Model              Prec(%)  Rec(%)  F1(%)
	Context Detection
	SVM                75.27    68.80   71.32
	C4.5               70.16    64.30   67.21
	L-CRF              75.75    72.84   74.45
	Answer Detection
	SVM                73.31    47.35   57.52
	C4.5               65.36    46.55   54.37
	L-CRF              63.92    58.74   61.22
Experiments	This experiment is to evaluate the Linear CRF model (Section 3.1) for context and answer detection by comparing with SVM and C4.5 (Quinlan, 1993).
Experiments	For SVM , we use SVMlight (Joachims, 1999).
Introduction	Experimental results show that 1) Linear CRFs outperform SVM and decision tree in both context and answer detection; 2) Skip-chain CRFs outperform Linear CRFs for answer finding, which demonstrates that context improves answer finding; 3) 2D CRF model improves the performance of Linear CRFs and the combination of 2D CRFs and Skip-chain CRFs achieves better performance for context detection.
Related Work	(2007) used SVM to extract input-reply pairs from forums for chatbot knowledge.

SVM is mentioned in 9 sentences in this paper.

Topics mentioned in this paper:

CRFs (45)
CRF (22)
SVM (9)

27. Learning High-Level Planning from Text

Branavan, S.R.K. and Kushman, Nate and Lei, Tao and Barzilay, Regina

In Proc. ACL 2012, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Experimental Setup	Baselines: To evaluate the performance of our relation extraction, we compare against an SVM classifier trained on the Gold Relations.
Experimental Setup	We test the SVM baseline in a leave-one-out fashion.
Experimental Setup	[Figure legend: Model F-score, SVM F-score, All-text F-score]
Introduction	Our results demonstrate the strength of our relation extraction technique — while using planning feedback as its only source of supervision, it achieves a precondition relation extraction accuracy on par with that of a supervised SVM baseline.
Results	We also show the performance of the supervised SVM baseline.
Results	Feature Analysis Figure 7 shows the top five positive features for our model and the SVM baseline.
Results	Figure 7: The top five positive features on words and dependency types learned by our model (above) and by SVM (below) for precondition prediction.

SVM is mentioned in 8 sentences in this paper.

Topics mentioned in this paper:

28. Semi-Supervised Convex Training for Dependency Parsing

Wang, Qin Iris and Schuurmans, Dale and Lin, Dekang

In Proc. ACL 2008, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Conclusion and Future Work	Unlike previous proposed approaches, we introduce a convex objective for the semi-supervised learning algorithm by combining a convex structured SVM loss and a convex least square loss.
Introduction	In particular, they present an algorithm for multi-class unsupervised and semi-supervised SVM learning, which relaxes the original non-convex objective into a close convex approximation, thereby allowing a global solution to be obtained.
Introduction	More specifically, for the loss on the unlabeled data part, we substitute the original unsupervised structured SVM loss with a least squares loss, but keep constraints on the inferred prediction targets, which avoids trivialization.
Introduction	ing semi-supervised convex objective to dependency parsing, and obtain significant improvement over the corresponding supervised structured SVM .
Semi-supervised Convex Training for Structured SVM	Although semi-supervised structured SVM learning has been an active research area, semi-supervised structured SVMs have not been used in many real applications to date.
Semi-supervised Convex Training for Structured SVM	By combining the convex structured SVM loss on labeled data (shown in Equation (5)) and the convex least squares loss on unlabeled data (shown in Equation (8)), we obtain a semi-supervised structured large margin loss
Semi-supervised Structured Large Margin Objective	The objective of standard semi-supervised structured SVM is a combination of structured large margin losses on both labeled and unlabeled data.

SVM is mentioned in 7 sentences in this paper.

Topics mentioned in this paper:

29. Combined One Sense Disambiguation of Abbreviations

HaCohen-Kerner, Yaakov and Kass, Ariel and Peretz, Ariel

In Proc. ACL 2008, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Abbreviation Disambiguation	: Maximum Entropy, SVM and C50.
Abstract	An accuracy of 96.09% has been achieved by SVM .
Experiments	Several well-known supervised ML methods have been selected: artificial neural networks (ANN), Naive Bayes (NB), Support Vector Machines ( SVM ) and J48 (Witten and Frank, 1999), an improved variant of the C4.5 decision tree induction.
Experiments	Table 2 shows that SVM achieved the best result with 96.09% accuracy.
Experiments	ants ML Method ANN NB SVM J48

SVM is mentioned in 6 sentences in this paper.

Topics mentioned in this paper:

30. Community Answer Summarization for Multi-Sentence Question with Group L1 Regularization

Chan, Wen and Zhou, Xiangdong and Wang, Wei and Chua, Tat-Seng

In Proc. ACL 2012, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Experimental Results	We adapt the Support Vector Machine ( SVM ) and Logistic Regression (LR) which have been reported to be effective for classification and the Linear CRF (LCRF) which is used to summarize ordinary text documents in (Shen et al., 2007) as baselines for comparison.
Experimental Results	Table 2 shows that our general CRF model based on question segmentation with group L1 regularization outperforms the baselines significantly in all three measures (gCRF-QS-l1 is 13.99% better than SVM in precision, 9.77% better in recall and 11.72% better in F1 score).
Experimental Results	We note that both SVM and LR,
Introduction	The experimental results show that the proposed model improves the performance significantly (in terms of precision, recall and F1 measures) as well as the ROUGE-1, ROUGE-2 and ROUGE-L measures as compared to the state-of-the-art methods, such as Support Vector Machines ( SVM ), Logistic Regression (LR) and Linear CRF (LCRF) (Shen et al., 2007).

SVM is mentioned in 6 sentences in this paper.

Topics mentioned in this paper:

CRF (21)
sentence-level (6)
SVM (6)

31. A Non-negative Matrix Tri-factorization Approach to Sentiment Classification with Lexical Prior Knowledge

Li, Tao and Zhang, Yi and Sindhwani, Vikas

In Proc. ACL 2009, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Experiments	We also compare the results of SSMFLK with those of two supervised classification methods: Support Vector Machine ( SVM ) and Naive Bayes.
Experiments	[Figure legend: Consistency Method, Harmonic-CMN, Green Function, SVM]
Experiments	[Figure legend: SSMFLK, Consistency Method, Harmonic-CMN, Green Function, SVM, Naive Bayes]
Related Work	Most work in machine learning literature on utilizing labeled features has focused on using them to generate weakly labeled examples that are then used for standard supervised learning: (Schapire et al., 2002) propose one such framework for boosting logistic regression; (Wu and Srihari, 2004) build a modified SVM and (Liu et al., 2004) use a combination of clustering and EM based methods to instantiate similar frameworks.

SVM is mentioned in 6 sentences in this paper.

Topics mentioned in this paper:

32. Joint Bilingual Sentiment Classification with Unlabeled Parallel Corpora

Lu, Bin and Tan, Chenhao and Cardie, Claire and K. Tsou, Benjamin

In Proc. ACL 2011, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Experimental Setup 4.1 Data Sets and Preprocessing	SVM: This method learns an SVM classifier for each language given the monolingual labeled data; the unlabeled data is not used.
Experimental Setup 4.1 Data Sets and Preprocessing	Monolingual TSVM (TSVM-M): This method learns two transductive SVM (TSVM) classifiers given the monolingual labeled data and the monolingual unlabeled data for each language.
Experimental Setup 4.1 Data Sets and Preprocessing	First, two monolingual SVM classifiers are built based on only the corresponding labeled data, and then they are bootstrapped by adding the most confident predicted examples from the unlabeled data into the training set.
Introduction	maximum entropy and SVM classifiers) as well as two alternative methods for leveraging unlabeled data (transductive SVMs (Joachims, 1999b) and co-training (Blum and Mitchell, 1998)).
Results and Analysis	Among the baselines, the best is Co-SVM; TSVMs do not always improve performance using the unlabeled data compared to the standalone SVM ; and TSVM-B outperforms TSVM-M except for Chinese in the second setting.
Results and Analysis	Significance is tested using paired t-tests with p<0.05: denotes statistical significance compared to the corresponding performance of MaxEnt; * denotes statistical significance compared to SVM ; and r denotes statistical significance compared to Co-SVM.

SVM is mentioned in 6 sentences in this paper.

Topics mentioned in this paper:

33. A Novel Classifier Based on Quantum Computation

Liu, Ding and Yang, Xiaofang and Jiang, Minghu

In Proc. ACL 2013, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Discussion	We present here our model of text classification and compare it with SVM and KNN on two datasets.
Discussion	Moreover, the QC performs well in text classification compared with SVM and KNN and outperforms them on small-scale training sets.
Experiment	We compared the performance of QC with several classical classification methods, including Support Vector Machine ( SVM ) and K-nearest neighbor (KNN).
Experiment	We randomly selected training samples from the training pool ten times to train QC, SVM , and KNN classifier respectively and then verified the three trained classifiers on the testing sets, the results of which are illustrated in Figure 4.
Experiment	We noted that the QC performed better than both KNN and SVM on small-scale training sets, when the number of training samples is less than 50.

SVM is mentioned in 6 sentences in this paper.

Topics mentioned in this paper:

34. Active Learning with Efficient Feature Weighting Methods for Improving Data Quality and Classification Accuracy

Martineau, Justin and Chen, Lu and Cheng, Doreen and Sheth, Amit

In Proc. ACL 2014, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Experiments	Methods: We evaluated the overall performance relative to the common SVM bag of words approach that can be ubiquitously found in text mining literature.
Experiments	SVM-TF: Uses a bag of words SVM with term frequency weights.
Experiments	SVM-Delta-IDF: Uses a bag of words SVM classification with TF.Delta-IDF weights (Formula 2) in the feature vectors before training or testing an SVM .
Related Work	(2012) propose an algorithm which first trains individual SVM classifiers on several small, class-balanced, random subsets of the dataset, and then reclassifies each training instance using a majority vote of these individual classifiers.

SVM is mentioned in 6 sentences in this paper.

Topics mentioned in this paper:

35. Towards a General Rule for Identifying Deceptive Opinion Spam

Li, Jiwei and Ott, Myle and Cardie, Claire and Hovy, Eduard

In Proc. ACL 2014, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Experiments	We use SVMlight (Joachims, 1999) to train our linear SVM classifiers.
Experiments	For SVM , models trained on POS and LIWC features achieve even lower accuracy than Unigram.
Experiments	tive model, SAGE achieves much better results than SVM , and is around 0.65 accurate in the cross-domain task.
Feature-based Additive Model	If we instead use SVM , for example, we would have to train classifiers one by one (due to the distinct features from different sources) to draw conclusions regarding the differences between Turker vs Expert vs truthful reviews, positive expert vs negative expert reviews, or reviews from different domains.
Introduction	In the examples in Table 1, we trained a linear SVM classifier on Ott’s Chicago-hotel dataset on unigram features and tested it on a couple of different domains (the details of data acquisition are illustrated in Section 3).
Introduction	Table 1: SVM performance on datasets for a classifier trained on Chicago hotel review based on Unigram feature.

SVM is mentioned in 6 sentences in this paper.

Topics mentioned in this paper:

36. Employing Personal/Impersonal Views in Supervised and Semi-Supervised Sentiment Classification

Li, Shoushan and Huang, Chu-Ren and Zhou, Guodong and Lee, Sophia Yat Mei

In Proc. ACL 2010, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Unsupervised Mining of Personal and Impersonal Views	We apply both support vector machine ( SVM ) and Maximum Entropy (ME) algorithms with the help of the SVM-light and Mallet tools.
Unsupervised Mining of Personal and Impersonal Views	We find that ME performs slightly better than SVM on the average.
Unsupervised Mining of Personal and Impersonal Views	Transductive SVM , which seeks the largest separation between labeled and unlabeled data through regularization (Joachims, 1999).

SVM is mentioned in 6 sentences in this paper.

Topics mentioned in this paper:

37. Trainable Generation of Big-Five Personality Styles through Data-Driven Parameter Estimation

Mairesse, François and Walker, Marilyn

In Proc. ACL 2008, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Parameter Estimation Models	Continuous parameters are modeled with a linear regression model (LR), an M5’ model tree (M5), and a model based on support vector machines with a linear kernel ( SVM ).
Parameter Estimation Models	We test a Naive Bayes classifier (NB), a J48 decision tree (J48), a nearest-neighbor classifier using one neighbor (NN), a Java implementation of the RIPPER rule-based learner (JRIP), the AdaBoost boosting algorithm (ADA), and a support vector machines classifier with a linear kernel ( SVM ).
Parameter Estimation Models	Figure 3: SVM model with a linear kernel predicting the CONTENT POLARITY parameter.

SVM is mentioned in 6 sentences in this paper.

Topics mentioned in this paper:

38. Bilingual Active Learning for Relation Classification via Pseudo Parallel Corpora

Qian, Longhua and Hui, Haotian and Hu, Ya'nan and Zhou, Guodong and Zhu, Qiaoming

In Proc. ACL 2014, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Abstract	Input: L, labeled data set; U, unlabeled data set; n, batch size. Output: SVM , classifier. Repeat:
Abstract	1. Train a single classifier SVM on L.
Abstract	The objective is to learn SVM classifiers in both languages, denoted as SVM_c and SVM_e respectively, in a BAL fashion to improve their classification performance.

SVM is mentioned in 5 sentences in this paper.

Topics mentioned in this paper:

39. Identifying Bad Semantic Neighbors for Improving Distributional Thesauri

Ferret, Olivier

In Proc. ACL 2013, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Experiments and evaluation	As mentioned before, this classifier is a linear SVM .
Improving a distributional thesaurus	More precisely, we follow (Lee and Ng, 2002), a reference work for WSD, by adopting a Support Vector Machines ( SVM ) classifier with a linear kernel and three kinds of features for characterizing each considered occur-
Improving a distributional thesaurus	For the second type of features, we take more precisely the POS of the three words before E and those of the three words after E. Each pair {POS, position} corresponds to a binary feature for the SVM classifier.
Improving a distributional thesaurus	Each instance of the 11 types of collocations is represented by a tuple ⟨lemma1, position1, lemma2, position2⟩ and leads to a binary feature for the SVM classifier.

SVM is mentioned in 5 sentences in this paper.

Topics mentioned in this paper:

40. Online Relative Margin Maximization for Statistical Machine Translation

Eidelman, Vladimir and Marton, Yuval and Resnik, Philip

In Proc. ACL 2013, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Introduction	We focus on large-margin methods such as SVM (Joachims, 1998) and passive-aggressive algorithms such as MIRA.
The Relative Margin Machine in SMT	It is maximized by minimizing the norm in SVM , or analogously, the proximity constraint in MIRA: $\arg\min_{w} \|w - w_t\|^2$.
The Relative Margin Machine in SMT	RMM was introduced as a generalization over SVM that incorporates both the margin constraint
The Relative Margin Machine in SMT	Nonetheless, since structured RMM is a generalization of Structured SVM , which shares its underlying objective with MIRA, our intuition is that SMT should be able to benefit as well.

SVM is mentioned in 5 sentences in this paper.

Topics mentioned in this paper:

feature set (25)
BLEU (18)
TER (11)

41. Medical Relation Extraction with Manifold Models

Wang, Chang and Fan, James

In Proc. ACL 2014, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Experiments	In SVM implementations, the tradeoff parameter between training error and margin was set to 1 for all experiments.
Experiments	We compare our approaches to three state-of-the-art approaches including SVM with convolution tree kernels (Collins and Duffy, 2001), linear regression and SVM with linear kernels (Schölkopf and Smola, 2002).
Experiments	The SVM with linear kernels and the linear regression model used the same features as the manifold models.

SVM is mentioned in 5 sentences in this paper.

Topics mentioned in this paper:

42. Discovering the Discriminative Views: Measuring Term Weights for Sentiment Analysis

Kim, Jungi and Li, Jin-Ji and Lee, Jong-Hyeok

In Proc. ACL 2009, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Experiment	4.3 Classification task — SVM
Experiment	4.3.1 Experimental Setting To test our SVM classifier, we perform the classification task.
Experiment	Table 3: Average tenfold cross-validation accuracies of polarity classification task with SVM .
Term Weighting and Sentiment Analysis	Specifically, we explore the statistical term weighting features of the word generation model with Support Vector Machine ( SVM ), faithfully reproducing previous work as closely as possible (Pang et al., 2002).

SVM is mentioned in 5 sentences in this paper.

Topics mentioned in this paper:

43. Automatic Detection of Cognates Using Orthographic Alignment

Ciobanu, Alina Maria and Dinu, Liviu P.

In Proc. ACL 2014, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Experiments	We experiment with two machine-learning approaches: Naive Bayes and SVM .
Experiments	We report the n-gram values for which the best results are obtained and the hyperparameters for SVM , c and γ.
Experiments	The SVM produces better results for all languages except Portuguese, where the accuracy is equal.
Our Approach	For SVM , we use the wrapper provided by Weka for LibSVM (Chang and Lin, 2011).

SVM is mentioned in 5 sentences in this paper.

Topics mentioned in this paper:

44. Mining Bilingual Data from the Web with Adaptively Learnt Patterns

Jiang, Long and Yang, Shiquan and Zhou, Ming and Liu, Xiaohua and Zhu, Qingsheng

In Proc. ACL 2009, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Adaptive Pattern-based Bilingual Data Mining	Next, in the pattern learning module, those translation snippet pairs are used to find candidate patterns and then an SVM classifier is built to select the most useful patterns shared by most translation pairs in the whole text.
Adaptive Pattern-based Bilingual Data Mining	After all pattern candidates are extracted, an SVM classifier is used to select the good ones:
Adaptive Pattern-based Bilingual Data Mining	In this SVM model, each pattern candidate pi has the following four features:
Overview of the Proposed Approach	Then an SVM classifier is trained to select good patterns from all extracted pattern candidates.

SVM is mentioned in 5 sentences in this paper.

Topics mentioned in this paper:

45. Kernel Based Discourse Relation Recognition with Temporal Ordering Information

Wang, WenTing and Su, Jian and Tan, Chew Lim

In Proc. ACL 2010, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Experiments and Results	We employ an SVM coreference resolver trained and tested on ACE 2005 with 79.5% Precision, 66.7% Recall and 72.5% F1 to label coreference mentions of the same named entity in an article.
Incorporating Structural Syntactic Information	And thus an SVM classifier can be learned and then used for recognition.
Introduction	Section 4 introduces the framework for discourse recognition, as well as the baseline feature space and the SVM classifier.
The Recognition Framework	The classifier learned by SVM is:
The Recognition Framework	One advantage of SVM is that we can use a tree kernel approach to capture syntactic parse tree information in a particular high-dimensional space.

SVM is mentioned in 5 sentences in this paper.

Topics mentioned in this paper:

46. Coherent Citation-Based Summarization of Scientific Papers

Abu-Jbara, Amjad and Radev, Dragomir

In Proc. ACL 2011, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Approach	We use Support Vector Machines ( SVM ) with linear kernel as our classifier.
Approach	We use SVM with linear kernel as our classifier.
Evaluation	Sentence Filtering Evaluation: We used Support Vector Machines ( SVM ) with linear kernel as our classifier.
Evaluation	Sentence Classification Evaluation: We used SVM in this step as well.
Evaluation	Author Name Replacement Evaluation: The classifier used in this task is also SVM .

SVM is mentioned in 5 sentences in this paper.

Topics mentioned in this paper:

47. Detecting Experiences from Weblogs

Park, Keun Chan and Jeong, Yoonjae and Myaeng, Sung Hyon

In Proc. ACL 2010, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Experience Detection	While we tested several classifiers, we chose to use two different classifiers based on SVM and Logistic Regression for the final experimental results because they showed the best performance.
Experience Detection	Logistic Feature Regression SVM
Lexicon Construction	ME SVM Prec.

SVM is mentioned in 5 sentences in this paper.

Topics mentioned in this paper:

48. A Convolutional Neural Network for Modelling Sentences

Kalchbrenner, Nal and Grefenstette, Edward and Blunsom, Phil

In Proc. ACL 2014, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Experiments	NB        41.0  81.8
	BiNB      41.9  83.1
	SVM       40.7  79.4
	RecNTN    45.7  85.4
	MAX-TDNN  37.4  77.1
	NBOW      42.4  80.5
	DCNN      48.5  86.8
Experiments	SVM is a support vector machine with unigram and bigram features.
Experiments	head word, parser SVM hypernyms, WordNet

SVM is mentioned in 4 sentences in this paper.

Topics mentioned in this paper:

49. Exploring Syntactic Structural Features for Sub-Tree Alignment Using Bilingual Tree Kernels

Sun, Jun and Zhang, Min and Tan, Chew Lim

In Proc. ACL 2010, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Substructure Spaces for BTKs	In the 1st phase, a kernel based classifier, SVM in our study, is employed to classify each candidate subtree pair as aligned or unaligned.
Substructure Spaces for BTKs	Since SVM is a large margin based discriminative classifier rather than a probabilistic model, we introduce a sigmoid function to convert the distance against the hyperplane to a posterior alignment probability as follows:
Substructure Spaces for BTKs	We use SVM with binary classes as the classifier.
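
The exact sigmoid used in this paper is not reproduced in the excerpt above. As a hedged sketch of the general idea, a Platt-style sigmoid maps a signed distance from the hyperplane to a value in (0, 1). The parameter names `slope` and `offset` and their default values are assumptions made for illustration; in practice such parameters are usually fitted on held-out data.

```python
import math

def alignment_probability(distance, slope=-1.0, offset=0.0):
    """Map a signed SVM distance to a pseudo-probability.

    Computes 1 / (1 + exp(slope * distance + offset)). With a negative
    slope, larger positive distances give probabilities closer to 1.
    """
    return 1.0 / (1.0 + math.exp(slope * distance + offset))

p_on_boundary = alignment_probability(0.0)    # a point on the hyperplane
p_confident = alignment_probability(2.0)      # well inside the positive side
p_unlikely = alignment_probability(-2.0)      # well inside the negative side
```

With the default parameters a point on the hyperplane maps to 0.5, and distances of equal magnitude on opposite sides map to complementary probabilities.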

SVM is mentioned in 4 sentences in this paper.

Topics mentioned in this paper:

50. Automatic Generation of Story Highlights

Woodsend, Kristian and Lapata, Mirella

In Proc. ACL 2010, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Experimental Setup	We learned the feature weights with a linear SVM , using the software SVM-OOPS (Woodsend and Gondzio, 2009).
Experimental Setup	For each phrase, features were extracted and salience scores calculated from the feature weights determined through SVM training.
Experimental Setup	The distance from the SVM hyperplane represents the salience score.
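
One way to read "distance from the SVM hyperplane" is the signed geometric distance (w·x + b) / ‖w‖ of a phrase's feature vector. The sketch below ranks phrases by that quantity. Whether the paper normalises by ‖w‖ is not stated in the excerpt, and the weights, bias, and phrase features are hypothetical values chosen for illustration.

```python
import math

def signed_distance(weights, bias, features):
    """Signed geometric distance of a feature vector from the hyperplane w.x + b = 0."""
    norm = math.sqrt(sum(w * w for w in weights))
    activation = sum(w * x for w, x in zip(weights, features)) + bias
    return activation / norm

# Hypothetical learned feature weights and bias.
weights = [3.0, 4.0]
bias = -5.0

# Hypothetical feature vectors for three candidate phrases.
phrases = {
    "phrase_1": [1.0, 1.0],
    "phrase_2": [3.0, 4.0],
    "phrase_3": [0.0, 0.0],
}

salience = {name: signed_distance(weights, bias, x) for name, x in phrases.items()}
ranked = sorted(salience, key=salience.get, reverse=True)
```

Because ‖w‖ is the same for every phrase, dividing by it does not change the ranking; it only rescales the scores.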

SVM is mentioned in 4 sentences in this paper.

Topics mentioned in this paper:

51. Using Smaller Constituents Rather Than Sentences in Active Learning for Japanese Dependency Parsing

Sassano, Manabu and Kurohashi, Sadao

In Proc. ACL 2010, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Experimental Evaluation and Discussion	We set the degree of the kernels to 3 since cubic kernels with SVM have proved effective for Japanese dependency parsing (Kudo and Matsumoto, 2000; Kudo and Matsumoto, 2002).
Experimental Evaluation and Discussion	Stopping Criteria: It is known that the increment rate of the number of support vectors in SVM indicates saturation of accuracy improvement during iterations of active learning (Schohn and Cohn, 2000).
Experimental Evaluation and Discussion	It is interesting to examine whether the observation for SVM is also useful for support vectors of the averaged perceptron.

SVM is mentioned in 4 sentences in this paper.

Topics mentioned in this paper:

52. Discourse Complements Lexical Semantics for Non-factoid Answer Reranking

Jansen, Peter and Surdeanu, Mihai and Clark, Peter

In Proc. ACL 2014, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

CR + LS + DMM + DPM 39.32* +24% 47.86* +20%	For all experiments we used a linear SVM kernel.
CR + LS + DMM + DPM 39.32* +24% 47.86* +20%	Table 4 shows that of the highest-weighted SVM features learned when training models for HOW questions on YA and Bio, many are shared (e.g., 56.5% of the features in the top half of both DPMs are shared), suggesting that a core set of discourse features may be of utility across domains.
CR + LS + DMM + DPM 39.32* +24% 47.86* +20%	Table 4: Percentage of top features with the highest SVM weights that are shared between Bio HOW and YA models.

SVM is mentioned in 4 sentences in this paper.

Topics mentioned in this paper:

53. Using Deep Morphology to Improve Automatic Error Detection in Arabic Handwriting Recognition

Habash, Nizar and Roth, Ryan

In Proc. ACL 2011, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Experimental Settings	The PZD system relies on a set of SVM classifiers trained using morphological and lexical features.
Experimental Settings	The SVM classifiers are built using Yamcha (Kudo and Matsumoto, 2003).
Experimental Settings	Simple features are used directly by the PZD SVM models, whereas Binned features’ (numerical) values are reduced to a small, labeled category set whose labels are used as model features.

SVM is mentioned in 4 sentences in this paper.

Topics mentioned in this paper:

54. Target-dependent Twitter Sentiment Classification

Jiang, Long and Yu, Mo and Zhou, Ming and Liu, Xiaohua and Zhao, Tiejun

In Proc. ACL 2011, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Approach Overview	In each of the first two steps, a binary SVM classifier is built to perform the classification.
Experiments	In the experiments, we consider the positive and negative tweets annotated by humans as subjective tweets (i.e., positive instances in the SVM classifiers), which amount to 727 tweets.
Related Work	According to the experimental results, machine learning based classifiers outperform the unsupervised approach, where the best performance is achieved by the SVM classifier with unigram presences as features.
Related Work	In contrast, (Barbosa and Feng, 2010) propose a two-step approach to classify the sentiments of tweets using SVM classifiers with abstract features.

SVM is mentioned in 4 sentences in this paper.

Topics mentioned in this paper:

55. Exploiting Bilingual Information to Improve Web Search

Gao, Wei and Blitzer, John and Zhou, Ming and Wong, Kam-Fai

In Proc. ACL 2009, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Experiments and Results	fied; (2) for the identified query pairs, there should be sufficient statistics of associated clickthrough data; (3) The click frequency should be well distributed at both sides so that the preference order between bilingual document pairs can be derived for SVM learning.
Introduction	For both languages, we achieve significant improvements over monolingual Ranking SVM (RSVM) baselines (Herbrich et al., 2000; Joachims, 2002), which exploit a variety of monolingual features.
Learning to Rank Using Bilingual Information	We resort to Ranking SVM (RSVM) (Herbrich et al., 2000; Joachims, 2002) learning for classification on pairs of instances.
Learning to Rank Using Bilingual Information	The problem is to solve the SVM objective: $\min_{\vec{w}} \frac{1}{2}\|\vec{w}\|^2 + \lambda \sum_i \sum_j \xi_{ij}$.

SVM is mentioned in 4 sentences in this paper.

Topics mentioned in this paper:

56. Learning Sentiment-Specific Word Embedding for Twitter Sentiment Classification

Tang, Duyu and Wei, Furu and Yang, Nan and Zhou, Ming and Liu, Ting and Qin, Bing

In Proc. ACL 2014, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Related Work	(2) SVM : The ngram features and Support Vector Machine are widely used baseline methods to build sentiment classifiers (Pang et al., 2002).
Related Work	LibLinear is used to train the SVM classifier.
Related Work	(3) NBSVM: NBSVM (Wang and Manning, 2012) is a state-of-the-art performer on many sentiment classification datasets, which trades off between Naive Bayes and NB-enhanced SVM .

SVM is mentioned in 4 sentences in this paper.

Topics mentioned in this paper:

57. Contrasting Opposing Views of News Articles on Contentious Issues

Park, Souneil and Lee, Kyung Soon and Song, Junehwa

In Proc. ACL 2011, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Abstract	We applied a modified version of HITS algorithm and an SVM classifier trained with pseudo-relevant data for article analysis.
Disputant relation-based method	As for the rest of the sentences, a similarity analysis is conducted with an SVM classifier.
Disputant relation-based method	where SU: number of all sentences of the article; Qi: number of quotes from side i; Qij: number of quotes from either side i or j; Si: number of sentences classified to i by SVM .
Introduction	We applied a modified version of HITS algorithm to identify the key opponents of an issue, and used disputant extraction techniques combined with an SVM classifier for article analysis.

SVM is mentioned in 4 sentences in this paper.

Topics mentioned in this paper:

58. Robust Approach to Abbreviating Terms: A Discriminative Latent Variable Model with Global Information

Sun, Xu and Okazaki, Naoaki and Tsujii, Jun'ichi

In Proc. ACL 2009, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Results and Discussion	We compared the performance of the DPLVM with the CRFs and other baseline systems, including the heuristic system (Heu), the HMM model, and the SVM model described in S08, i.e., Sun et al.
Results and Discussion	The SVM method described by Sun et al.
Results and Discussion	In general, the results indicate that all of the sequential labeling models outperformed the SVM regression model with less training time. In the SVM regression approach, a large number of negative examples are explicitly generated for the training, which slowed the process.

SVM is mentioned in 4 sentences in this paper.

Topics mentioned in this paper:

59. New Word Detection for Sentiment Analysis

Huang, Minlie and Ye, Borui and Wang, Yichen and Chen, Haiqiang and Cheng, Junjun and Zhu, Xiaoyan

In Proc. ACL 2014, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Conclusion	                  # Pos/Neg    Lexicon   SVM
	Hownet            627/1,038    0.737     0.756
	Hownet+NW         743/1,150    0.770     0.779
	Hownet+T100       679/1,172    0.761     0.774
	cptHownet         138/125      0.738     0.758
	cptHownet+NW      254/237      0.774     0.782
	cptHownet+T100    190/159      0.764     0.775
Experiment	The second model is an SVM model in which opinion words are used as features, and 5-fold cross validation is conducted.
Experiment	This is not necessary for the SVM model.

SVM is mentioned in 3 sentences in this paper.

Topics mentioned in this paper:

60. Omni-word Feature and Soft Constraint for Chinese Relation Extraction

Chen, Yanping and Zheng, Qinghua and Zhang, Wei

In Proc. ACL 2014, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Related Work	The TRE systems use techniques such as: Rules (Regulars, Patterns and Propositions) (Miller et al., 1998), Kernel method (Zhang et al., 2006b; Zelenko et al., 2003), Belief network (Roth and Yih, 2002), Linear programming (Roth and Yih, 2007), Maximum entropy (Kambhatla, 2004) or SVM (GuoDong et al., 2005).
Related Work	(2005) introduced a feature based method, which utilized lexicon information around entities and was evaluated on Winnow and SVM classifiers.
Related Work	For each type of these relations, an SVM was trained and tested independently.

SVM is mentioned in 3 sentences in this paper.

Topics mentioned in this paper:

61. Measuring Sentiment Annotation Complexity of Text

Joshi, Aditya and Mishra, Abhijit and Senthamilselvan, Nivvedan and Bhattacharyya, Pushpak

In Proc. ACL 2014, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Discussion	We use three sentiment classification techniques: Naïve Bayes, MaxEnt and SVM with unigrams, bigrams and trigrams as features.
Discussion	http://scikit-learn.org/stable/ In the case of SVM , the probability of the predicted class is computed as given in Platt (1999).
Discussion	MaxEnt (Movie) -0.29 (72.17); MaxEnt (Twitter) -0.26 (71.68); SVM (Movie) -0.24 (66.27); SVM (Twitter) -0.19 (73.15)

SVM is mentioned in 3 sentences in this paper.

Topics mentioned in this paper:

62. Toward Future Scenario Generation: Extracting Event Causality Exploiting Semantic Relation, Context, and Association Features

Hashimoto, Chikara and Torisawa, Kentaro and Kloetzer, Julien and Sano, Motoki and Varga, István and Oh, Jong-Hoon and Kidawara, Yutaka

In Proc. ACL 2014, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Event Causality Extraction Method	An event causality candidate is given a causality score CScore, which is the SVM score (distance from the hyperplane) that is normalized to [0,1] by the sigmoid function. Each event causality candidate may be given multiple original sentences, since a phrase pair can appear in multiple sentences, in which case it is given more than one SVM score.
Experiments	(2011): CEAuns is an unsupervised method that uses CEA to rank event causality candidates, and CEAsup is a supervised method using SVM and the CEA features, whose ranking is based on the SVM scores.
Experiments	The baselines are as follows: CSuns is an unsupervised method that uses CS for ranking, and CSsup is a supervised method using SVM with CS as the only feature that uses SVM scores for ranking.

SVM is mentioned in 3 sentences in this paper.

Topics mentioned in this paper:

63. Evaluating Multilanguage-Comparability of Subjectivity Analysis Systems

Kim, Jungi and Li, Jin-Ji and Lee, Jong-Hyeok

In Proc. ACL 2010, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Multilingual Subjectivity System	Previous studies have found that, among several ML-based approaches, the SVM classifier generally performs well in many subjectivity analysis tasks (Pang et al., 2002; Banea et al., 2008).
Multilingual Subjectivity System	An SVM score (a margin or the distance from a learned decision boundary) with a positive value predicts the input as being subjective, and negative value as objective.
Multilingual Subjectivity System	The second and the third approaches are carried out as follows: Corpus-based (T-CB): We translate the MPQA corpus into the target languages sentence by sentence using a web-based service. Using the same method for S-CB, we train an SVM model for each language with the translated training corpora.

SVM is mentioned in 3 sentences in this paper.

Topics mentioned in this paper:

64. Incorporating Extra-Linguistic Information into Reference Resolution in Collaborative Task Dialogue

Iida, Ryu and Kobayashi, Syumpei and Tokunaga, Takenobu

In Proc. ACL 2010, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Empirical Evaluation	SVM ) should be separately created with regards to distinct features.
Empirical Evaluation	We utilised SVMrank as an implementation of the Ranking SVM algorithm, in which the parameter c was set as 1.0 and the remaining parameters were set to their defaults.
Reference Resolution using Extra-linguistic Information	Although the work by Denis and Baldridge (2008) uses Maximum Entropy to create their ranking-based model, we adopt the Ranking SVM algorithm (Joachims, 2002), which learns a weight vector to rank candidates for a given partial ranking of each referent.

SVM is mentioned in 3 sentences in this paper.

Topics mentioned in this paper:

coreference (4)
SVM (3)

65. Predicting Power Relations between Participants in Written Dialog from a Single Thread

Prabhakaran, Vinodkumar and Rambow, Owen

In Proc. ACL 2014, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Predicting Direction of Power	Handling of undefined values for features in SVM is not straightforward.
Predicting Direction of Power	Most SVM implementations assume the value of 0 by default in such cases, conflating them
Predicting Direction of Power	Since we use a quadratic kernel, we expect the SVM to pick up the interaction between each feature and its indicator feature.

SVM is mentioned in 3 sentences in this paper.

Topics mentioned in this paper:

66. Temporal Information Processing of a New Language: Fast Porting with Minimal Resources

Costa, Francisco and Branco, António

In Proc. ACL 2010, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Comparing the two Datasets	SMO is an implementation of Support Vector Machines ( SVM ), rules.JRip is the RIPPER algorithm, and bayes.NaiveBayes is a Naive Bayes classifier.
Comparing the two Datasets	In task C, the SVM algorithm was also the best performing algorithm among those that were also tried on the English data, but decision trees produced even better results here.
Comparing the two Datasets	The results are: in task A the lazy.KStar classifier scored 58.6%, and the SVM classifier scored 75.5% in task B and 59.4% in task C, with trees .

SVM is mentioned in 3 sentences in this paper.

Topics mentioned in this paper:

67. Semi-supervised Learning for Automatic Prosodic Event Detection Using Co-training Algorithm

Jeon, Je Hun and Liu, Yang

In Proc. ACL 2009, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Previous work	She also exploited a semi-supervised approach using Laplacian SVM classification on a small set of examples.
Prosodic event detection method	Our previous supervised learning approach (Jeon and Liu, 2009) showed that a combined model using Neural Network (NN) classifier for acoustic-prosodic evidence and Support Vector Machine ( SVM ) classifier for syntactic-prosodic evidence performed better than other classifiers.
Prosodic event detection method	We therefore use NN and SVM in this study.

SVM is mentioned in 3 sentences in this paper.

Topics mentioned in this paper:

68. A Feature-Enriched Tree Kernel for Relation Extraction

Sun, Le and Han, Xianpei

In Proc. ACL 2014, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Introduction	Finally, new relation instances are extracted using kernel based classifiers, e.g., the SVM classifier.
Introduction	We apply the one vs. others strategy for multiple classification using SVM .
Introduction	For SVM training, the parameter C is set to 2.4 for all experiments, and the tree kernel parameter λ is tuned to 0.2 for FTK and 0.4 (the optimal parameter setting used in Qian et al.

SVM is mentioned in 3 sentences in this paper.

Topics mentioned in this paper:

69. A Linear-Time Bottom-Up Discourse Parser with Constraints and Post-Editing

Feng, Vanessa Wei and Hirst, Graeme

In Proc. ACL 2014, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Related work	In particular, starting from EDUs, at each step of the tree-building, a binary SVM classifier is first applied to determine which pair of adjacent discourse constituents should be merged to form a larger span, and another multi-class SVM classifier is then applied to assign the type of discourse relation that holds between the chosen pair.
Related work	Also, the employment of SVM classifiers allows the incorporation of rich features for better data representation (Feng and Hirst, 2012).
Related work	However, HILDA’s approach also has obvious weaknesses: the greedy algorithm may lead to poor performance due to local optima, and more importantly, the SVM classifiers are not well-suited for solving structural problems due to the difficulty of taking context into account.

SVM is mentioned in 3 sentences in this paper.

Topics mentioned in this paper:

70. Automatic Evaluation of Linguistic Quality in Multi-Document Summarization

Pitler, Emily and Louis, Annie and Nenkova, Ani

In Proc. ACL 2010, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Experimental setup	We use a Ranking SVM (SVMlight (Joachims, 2002)) to score summaries using our features.
Experimental setup	The Ranking SVM seeks to minimize the number of discordant pairs (pairs in which the gold standard has $x_1$ ranked strictly higher than $x_2$, but the learner ranks $x_2$ strictly higher than $x_1$).
Experimental setup	For system-level evaluation, we treat the real-valued output of the SVM ranker for each summary as the linguistic quality score.

SVM is mentioned in 3 sentences in this paper.

Topics mentioned in this paper:

71. Adaptive Recursive Neural Network for Target-dependent Twitter Sentiment Classification

Dong, Li and Wei, Furu and Tan, Chuanqi and Tang, Duyu and Zhou, Ming and Xu, Ke

In Proc. ACL 2014, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Experiments	the target-independent (SVM-indep) and target-dependent features and uses SVM as the classifier.
Experiments	SVM-conn: The words, punctuations, emoticons, and #hashtags included in the converted dependency tree are used as the features for SVM .
Experiments	AdaRNN-comb: We combine the root vectors obtained by AdaRNN-w/E with the uni/bi-gram features, and they are fed into an SVM classifier.

SVM is mentioned in 3 sentences in this paper.

Topics mentioned in this paper:

72. Automatically Detecting Corresponding Edit-Turn-Pairs in Wikipedia

Daxenberger, Johannes and Gurevych, Iryna

In Proc. ACL 2014, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Machine Learning with Edit-Turn-Pairs	                 Baseline      R. Forest      SVM
	Accuracy         .799 ±.031    .866 ±.026†    .858 ±.027†
	F1mac.           NaN           .789 ±.032     .763 ±.033
	Precisionmac.
Machine Learning with Edit-Turn-Pairs	A reduction of the feature set as judged by a χ² ranker improved the results for both Random Forest as well as the SVM , so we limited our feature set to the 100 best features.
Machine Learning with Edit-Turn-Pairs	In a 10-fold cross-validation experiment, we tested a Random Forest classifier (Breiman, 2001) and an SVM (Platt, 1998) with polynomial kernel.

SVM is mentioned in 3 sentences in this paper.

Topics mentioned in this paper:

73. A New Dataset and Method for Automatically Grading ESOL Texts

Yannakoudakis, Helen and Briscoe, Ted and Medlock, Ben

In Proc. ACL 2011, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Approach	In its basic form, a binary SVM classifier learns a linear threshold function that discriminates data points of two categories.
Evaluation	We trained an SVM regression model with our full set of feature types and compared it to the SVM rank preference model.
Introduction	In this paper, we report experiments on rank preference Support Vector Machines (SVMs) trained on a relatively small amount of data, on identification of appropriate feature types derived automatically from generic text processing tools, on comparison with a regression SVM model, and on the robustness of the best model to ‘outlier’ texts.

SVM is mentioned in 3 sentences in this paper.

Topics mentioned in this paper:

74. Learning Soft Linear Constraints with Application to Citation Field Extraction

Anzaroot, Sam and Passos, Alexandre and Belanger, David and McCallum, Andrew

In Proc. ACL 2014, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Soft Constraints in Dual Decomposition	All we need to employ the structured perceptron algorithm (Collins, 2002) or the structured SVM algorithm (Tsochantaridis et al., 2004) is a black-box procedure for performing MAP inference in the structured linear model given an arbitrary cost vector.
Soft Constraints in Dual Decomposition	This can be ensured by simple modifications of the perceptron and subgradient descent optimization of the structured SVM objective simply by truncating c coordinate-wise to be nonnegative at every learning iteration.
Soft Constraints in Dual Decomposition	A similar analysis holds for the structured SVM ap-

SVM is mentioned in 3 sentences in this paper.

Topics mentioned in this paper:

75. Improved Bayesian Logistic Supervised Topic Models with Data Augmentation

Zhu, Jun and Zheng, Xun and Zhang, Bo

In Proc. ACL 2013, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Experiments	For gLDA, we learn a binary linear SVM on its topic representations using SVMLight (Joachims, 1999).
Experiments	The results of DiscLDA (Lacoste-Jullien et al., 2009) and linear SVM on raw bag-of-words features were reported in (Zhu et al., 2012).
Experiments	The fact that gLDA+SVM performs better than the standard gSLDA is due to the same reason, since the SVM part of gLDA+SVM can well capture the supervision information to learn a classifier for good prediction, while standard sLDA can’t well-balance the influence of supervision.

SVM is mentioned in 3 sentences in this paper.

Topics mentioned in this paper:

76. Models of Semantic Representation with Visual Attributes

Silberer, Carina and Ferrari, Vittorio and Lapata, Mirella

In Proc. ACL 2013, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Attribute-based Classification	We used an L2-regularized L2-loss linear SVM (Fan et al., 2008) to learn the attribute predictions.
Attribute-based Classification	data was randomly split into a training and validation set of equal size in order to find the optimal cost parameter C. The final SVM for the attribute was trained on the entire training data, i.e., on all positive and negative examples.
Attribute-based Classification	The SVM learners used the four different feature types proposed in Farhadi et al.

SVM is mentioned in 3 sentences in this paper.

Topics mentioned in this paper:

77. Detecting Turnarounds in Sentiment Analysis: Thwarting

Ramteke, Ankit and Malu, Akshat and Bhattacharyya, Pushpak and Nath, J. Saketha

In Proc. ACL 2013, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

A Machine Learning based approach	We use the SVM classifier with features generated using the following steps.
Conclusions and Future Work	This ontology guides a rule based approach to thwarting detection, and also provides features for an SVM based learning system.
Results	We used the CVX library in Matlab to solve the optimization problem for learning weights and the LIBSVM library to implement the SVM classifier.

SVM is mentioned in 3 sentences in this paper.

Topics mentioned in this paper:

78. Utterance-Level Multimodal Sentiment Analysis

Perez-Rosas, Veronica and Mihalcea, Rada and Morency, Louis-Philippe

In Proc. ACL 2013, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Discussion	As before, the linguistic, acoustic, and visual features are averaged over the entire video, and we use an SVM classifier in tenfold cross validation experiments.
Experiments and Results	We use the entire set of 412 utterances and run ten fold cross validations using an SVM classifier, as implemented in the Weka toolkit. In line with previous work on emotion recognition in speech (Haq and Jackson, 2009; Anagnostopoulos and Vovoli, 2010) where utterances are selected in a speaker dependent manner (i.e., utterances from the same speaker are included in both training and test), as well as work on sentence-level opinion classification where document boundaries are not considered in the split performed between the training and test sets (Wilson et al., 2004; Wiegand and Klakow, 2009), the training/test split for each fold is performed at utterance level regardless of the video they belong to.
Multimodal Sentiment Analysis	These simple weighted unigram features have been successfully used in the past to build sentiment classifiers on text, and in conjunction with Support Vector Machines ( SVM ) have been shown to lead to state-of-the-art performance (Maas et al., 2011).

SVM is mentioned in 3 sentences in this paper.

Topics mentioned in this paper:

79. Jointly Learning to Extract and Compress

Berg-Kirkpatrick, Taylor and Gillick, Dan and Klein, Dan

In Proc. ACL 2011, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Structured Learning	We use a soft-margin support vector machine ( SVM ) (Vapnik, 1998) objective over the full structured output space (Taskar et al., 2003; Tsochantaridis et al., 2004) of extractive and compressive summaries:
Structured Learning	In our application, this approach efficiently solves the structured SVM training problem up to some specified tolerance ε.
Structured Learning	Thus, if loss-augmented prediction turns up no new constraints on a given iteration, the current solution to the reduced problem, w and ξ, is the solution to the full SVM training problem.

SVM is mentioned in 3 sentences in this paper.

Topics mentioned in this paper:

80. Extracting Social Power Relationships from Natural Language

Bramsen, Philip and Escobar-Molano, Martha and Patel, Ami and Alonso, Rafael

In Proc. ACL 2011, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Abstract	Oberlander and Nowson explore using a Naïve Bayes and an SVM classifier to perform binary classification of text on each personality dimension.
Abstract	The results of the SVM classifier, shown in line (1) of Table 2, were fairly poor.
Abstract	Training a multiclass SVM on the binned n-gram features from (5) produces 51.6% cross-validation accuracy on training data and 44.4% accuracy on the weighted test set (both numbers should be compared to a 33% baseline).

SVM is mentioned in 3 sentences in this paper.

Topics mentioned in this paper:

81. A computational approach to politeness with application to social factors

Danescu-Niculescu-Mizil, Cristian and Sudhof, Moritz and Jurafsky, Dan and Leskovec, Jure and Potts, Christopher

In Proc. ACL 2013, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Predicting politeness	The BOW classifier is an SVM using a unigram feature representation. We consider this to be a strong baseline for this new
Predicting politeness	is an SVM using the linguistic features listed in Table 3 in addition to the unigram features.
Predicting politeness	For new requests, we use class probability estimates obtained by fitting a logistic regression model to the output of the SVM (Witten and Frank, 2005) as predicted politeness scores (with values between 0 and 1; henceforth politeness, by abuse of language).

SVM is mentioned in 3 sentences in this paper.

Topics mentioned in this paper:

in-domain (6)
SVM (3)

82. Extracting bilingual terminologies from comparable corpora

Aker, Ahmet and Paramita, Monica and Gaizauskas, Rob

In Proc. ACL 2013, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Abstract	For classification we use an SVM binary classifier and training data taken from the EUROVOC thesaurus.
Feature extraction	To align or map source and target terms we use an SVM binary classifier (Joachims, 2002) with a linear kernel and the tradeoff between training error and margin parameter c = 10.
Method	For classification purposes we use an SVM binary classifier.

SVM is mentioned in 3 sentences in this paper.

Topics mentioned in this paper:

83. Age Prediction in Blogs: A Study of Style, Content, and Online Behavior in Pre- and Post-Social Media Generations

Rosenthal, Sara and McKeown, Kathleen

In Proc. ACL 2011, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Experiments and Results	We experimented with an SVM classifier and found logistic regression to do slightly better.
Related Work	They use an SVM classifier with only n-grams as features.
Related Work	Nowson et al. (2006) employed dictionary and n-gram based content analysis and achieved 91.5% accuracy using an SVM classifier.

SVM is mentioned in 3 sentences in this paper.

Topics mentioned in this paper:

84. Verb Classification using Distributional Similarity in Syntactic and Semantic Structures

Croce, Danilo and Moschitti, Alessandro and Basili, Roberto and Palmer, Martha

In Proc. ACL 2012, part of Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Experiments	This uses a set of binary SVM classifiers, one for each verb class (frame) 73.
Experiments	In the classification phase the binary classifiers are applied by (i) only considering classes that are compatible with the target verbs; and (ii) selecting the class associated with the maximum positive SVM margin.
Model Analysis and Discussion	In line with the method discussed in (Pighin and Moschitti, 2009b), these fragments are extracted as they appear in most of the support vectors selected during SVM training.

SVM is mentioned in 3 sentences in this paper.

Topics mentioned in this paper: