Experiments | Table: Features / Model / MRR / Top1 / Top5. Baseline: 42.3% 32.7% 54.5%. QTCF: SVM 51.9% 44.6% 63.4%; SSL 49.5% 43.1% 60.9%. LexSem: SVM 48.2% 40.6% 61.4%; SSL 47.9% 40.1% 58.4%. QComp: SVM 54.2% 47.5% 64.3%; SSL 51.9% 45.5% 62.4%.
Experiments | We performed manual iterative parameter optimization during training, based on prediction accuracy, to find the best k-nearest-neighbor parameter for SSL, i.e., k ∈ {3, 5, 10, 20, 50}, and the best C ∈ {10^-2, ..., 10^2} and γ ∈ {2^-2, ..., 2^3} for the RBF-kernel SVM.
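A grid search of this kind can be sketched as follows; the `evaluate` function is a hypothetical stand-in for the held-out prediction accuracy of an RBF-kernel SVM trained with a given (C, γ) pair, and the peak location is purely illustrative:

```python
from itertools import product
import math

# Stand-in for held-out accuracy of an RBF-kernel SVM trained with (C, gamma).
# This toy surrogate peaks at C = 1, gamma = 2 (illustrative only).
def evaluate(C, gamma):
    return 1.0 - 0.1 * (abs(math.log10(C)) + abs(math.log2(gamma) - 1))

C_grid = [10 ** e for e in range(-2, 3)]     # {10^-2, ..., 10^2}
gamma_grid = [2 ** e for e in range(-2, 4)]  # {2^-2, ..., 2^3}

# Exhaustively score every (C, gamma) combination and keep the best.
best = max(product(C_grid, gamma_grid), key=lambda cg: evaluate(*cg))
print(best)
```

The same loop structure covers the SSL case by swapping the grid for k ∈ {3, 5, 10, 20, 50}.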
Experiments | We applied SVM and our graph-based SSL method with no summarization to learn models using labeled training and testing datasets.
Feature Extraction for Entailment | The QC model is trained via support vector machines (SVM) (Vapnik, 1995), considering different features such as a semantic headword feature based on a variation of Collins rules, hypernym extraction via Lesk word sense disambiguation (Lesk, 1988), regular expressions for wh-word indicators, n-grams, word shapes (capitalization), etc.
Graph Summarization | Using a separate learner, e.g., SVM (Vapnik, 1995), we obtain predicted outputs Ŷ_S = (ŷ^S_1, ..., ŷ^S_{|S|}) of X_S and append the observed labels: Ȳ = Ŷ_S ∪ Y_L.
Data and Task | Dolan and Brockett (2005) remark that this corpus was created semiautomatically by first training an SVM classifier on a disjoint annotated 10,000 sentence pair dataset and then applying the SVM on an unseen 49,375 sentence pair corpus, with its output probabilities skewed towards over-identification, i.e., towards generating some false paraphrases. |
Experimental Evaluation | The SVM was trained to classify positive and negative examples of paraphrase using SVMlight (Joachims, 1999). Metaparameters, tuned on the development data, were the regularization constant and the degree of the polynomial kernel (chosen in [10^-5, 10^2] and 1-5, respectively).
Experimental Evaluation | It is unsurprising that the SVM performs very well on the MSRPC because of the corpus creation process (see Sec. |
Experimental Evaluation | 4) where an SVM was applied as well, with very similar features and a skewed decision process (Dolan and Brockett, 2005). |
Product of Experts | LR (like the QG) provides a probability distribution, but uses surface features (like the SVM).
Product of Experts | 2; this model is on par with the SVM, though trading recall in favor of precision.
Product of Experts | We view it as a probabilistic simulation of the SVM more suitable for combination with the QG. |
Evaluation | Transductive SVM.
Evaluation | Specifically, we begin by training an inductive SVM on one labeled example from each class, iteratively labeling the most uncertain unlabeled point on each side of the hyperplane and retraining the SVM until 100 points are labeled. |
Evaluation | Finally, we train a transductive SVM on the 100 labeled points and the remaining 1900 unlabeled points, obtaining the results in row 3 of Table 1. |
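The uncertainty-based labeling loop described above can be sketched as follows. All names are hypothetical: a real system would retrain an inductive SVM each round and query a human annotator, whereas here a toy nearest-centroid scorer and a threshold oracle stand in so the loop structure is runnable, and the sketch simplifies to one queried point per round rather than one per side of the hyperplane:

```python
# Toy stand-in for retraining an inductive SVM: a nearest-centroid scorer
# whose decision value is the signed distance to the class midpoint.
def train(labeled):
    pos = [x for x, y in labeled if y == +1]
    neg = [x for x, y in labeled if y == -1]
    p, n = sum(pos) / len(pos), sum(neg) / len(neg)
    return lambda x: x - (p + n) / 2.0

def oracle(x):
    return +1 if x >= 0.5 else -1  # stand-in for the human annotator

labeled = [(0.9, +1), (0.1, -1)]  # one seed example per class
unlabeled = [i / 20.0 for i in range(21) if i not in (2, 18)]

budget = 10  # 100 labeled points in the paper
while len(labeled) < budget and unlabeled:
    f = train(labeled)
    x = min(unlabeled, key=lambda u: abs(f(u)))  # most uncertain point
    unlabeled.remove(x)
    labeled.append((x, oracle(x)))
# a transductive SVM would now be trained on `labeled` plus `unlabeled`
```

The final transductive step is deliberately left as a comment, since it depends on the specific transductive solver used.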
Our Approach | Specifically, we train a discriminative classifier using the support vector machine (SVM) learning algorithm (Joachims, 1999) on the set of unambiguous reviews, and then apply the resulting classifier to all the reviews in the training folds that are not seeds.
Our Approach | As our weakly supervised learner, we employ a transductive SVM . |
Our Approach | Hence, instead of training just one SVM classifier, we aim to reduce classification errors by training an ensemble of five classifiers, each of which uses all 100 manually labeled reviews and a different subset of the 500 automatically labeled reviews. |
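The ensemble construction just described can be sketched as follows; variable names and the subset size are hypothetical, and the prediction step is reduced to a majority vote over {-1, +1} member outputs:

```python
import random

# Five classifiers, each trained on all manually labeled reviews plus a
# different random subset of the automatically labeled ones (sizes here
# are illustrative), combined at test time by majority vote.
random.seed(0)
auto_labeled = list(range(500))  # indices of auto-labeled reviews

subsets = [random.sample(auto_labeled, 100) for _ in range(5)]

def majority_vote(predictions):
    # predictions: one {-1, +1} label per ensemble member
    return +1 if sum(predictions) > 0 else -1

print(majority_vote([+1, +1, -1, +1, -1]))
```

With an odd number of members, the vote is never tied, which is one practical reason to train five classifiers rather than four or six.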
Abstract | We represent words as sequences of substrings, and use the substrings as features in a Support Vector Machine (SVM) ranker, which is trained to rank possible stress patterns.
Automatic Stress Prediction | We use a support vector machine (SVM) to rank the possible patterns for each sequence (Section 3.2).
Automatic Stress Prediction | These units are used to define the features and outputs used by the SVM ranker. |
Automatic Stress Prediction | The SVM can thus generalize from observed words to similarly-spelled, unseen examples. |
Introduction | We divide each word into a sequence of substrings, and use these substrings as features for a Support Vector Machine (SVM) ranker.
Introduction | The task of the SVM is to rank the true stress pattern above the small number of acceptable alternatives. |
Introduction | The SVM ranker achieves exceptional 96.2% word accuracy on the challenging task of predicting the full stress pattern in English. |
Empirical Evaluation 4.1 Evaluation Setup | SVM(CN): This method applies the inductive SVM with only Chinese features for sentiment classification in the Chinese view. |
Empirical Evaluation 4.1 Evaluation Setup | SVM(EN): This method applies the inductive SVM with only English features for sentiment classification in the English view. |
Empirical Evaluation 4.1 Evaluation Setup | SVM(ENCN1): This method applies the inductive SVM with both English and Chinese features for sentiment classification in the two views.
Introduction | SVM, NB), and the classification performance is far from satisfactory because of the language gap between the original language and the translated language.
Introduction | The SVM classifier is adopted as the basic classifier in the proposed approach. |
Related Work 2.1 Sentiment Classification | Standard Naïve Bayes and SVM classifiers have been applied for subjectivity classification in Romanian (Mihalcea et al., 2007; Banea et al., 2008), and the results show that automatic translation is a viable alternative for the construction of resources and tools for subjectivity analysis in a new target language.
Related Work 2.1 Sentiment Classification | To date, many semi-supervised learning algorithms have been developed for addressing the cross-domain text classification problem by transferring knowledge across domains, including Transductive SVM (Joachims, 1999), EM (Nigam et al., 2000), the EM-based Naïve Bayes classifier (Dai et al., 2007a), Topic-bridged PLSA (Xue et al., 2008), Co-Clustering based classification (Dai et al., 2007b), and the two-stage approach (Jiang and Zhai, 2007).
The Co-Training Approach | Typical text classifiers include Support Vector Machine (SVM), Naïve Bayes (NB), Maximum Entropy (ME), K-Nearest Neighbor (KNN), etc.
The Co-Training Approach | In this study, we adopt the widely-used SVM classifier (Joachims, 2002). |
The Co-Training Approach | as two sets of vectors in a feature space, SVM constructs a separating hyperplane in the space by maximizing the margin between the two data sets. |
Building a Discourse Parser | [Figure: parser pipeline. Feature Extraction → SVM Training → SVM Models (Binary and Multiclass) → Classification → Scored RS sub-trees → Bottom-up Tree Construction]
Building a Discourse Parser | Support Vector Machines (SVM) (Vapnik, 1995) are used to model classifiers S and L. SVM refers to a set of supervised learning algorithms that are based on margin maximization. |
Building a Discourse Parser | This makes SVM well suited to classification problems involving relatively large feature spaces such as ours (≈ 10^5 features).
Evaluation | 4.2 Raw SVM Classification |
Evaluation | Although our final goal is to achieve good performance on the entire tree-building task, a useful intermediate evaluation of our system can be conducted by measuring raw performance of SVM classifiers. |
Evaluation | Table 1: SVM Classifier performance. |
Features | Instrumental to our system’s performance is the choice of a set of salient characteristics (“features”) to be used as input to the SVM algorithm for training and classification. |
Corpus Details | As our reference algorithm, we used the current state-of-the-art system developed by Boulis and Ostendorf (2005) using unigram and bigram features in an SVM framework.
Corpus Details | Table 12: Top 20 n-gram features for gender, ranked by the weights assigned by the linear SVM model
Corpus Details | After extracting the n-grams, an SVM model was trained via the SVMlight toolkit (Joachims, 1999) using the linear kernel with the default toolkit settings.
Experiments | We also compare the results of SSMFLK with those of two supervised classification methods: Support Vector Machine (SVM) and Naive Bayes.
Experiments | [Figure: classification accuracy curves comparing SSMFLK, the Consistency Method, Harmonic-CMN, the Green Function method, SVM, and Naive Bayes]
Related Work | Most work in machine learning literature on utilizing labeled features has focused on using them to generate weakly labeled examples that are then used for standard supervised learning: (Schapire et al., 2002) propose one such framework for boosting logistic regression; (Wu and Srihari, 2004) build a modified SVM and (Liu et al., 2004) use a combination of clustering and EM based methods to instantiate similar frameworks. |
Adaptive Pattern-based Bilingual Data Mining | Next, in the pattern learning module, those translation snippet pairs are used to find candidate patterns, and then an SVM classifier is built to select the most useful patterns shared by most translation pairs in the whole text.
Adaptive Pattern-based Bilingual Data Mining | After all pattern candidates are extracted, an SVM classifier is used to select the good ones:
Adaptive Pattern-based Bilingual Data Mining | In this SVM model, each pattern candidate pi has the following four features: |
Overview of the Proposed Approach | Then an SVM classifier is trained to select good patterns from all extracted pattern candidates.
Experiment | 4.3 Classification task — SVM |
Experiment | 4.3.1 Experimental Setting To test our SVM classifier, we perform the classification task. |
Experiment | Table 3: Average tenfold cross-validation accuracies of polarity classification task with SVM.
Term Weighting and Sentiment Analysis | Specifically, we explore the statistical term weighting features of the word generation model with Support Vector Machine (SVM), faithfully reproducing previous work as closely as possible (Pang et al., 2002).
Experiments and Results | fied; (2) for the identified query pairs, there should be sufficient statistics of associated clickthrough data; (3) The click frequency should be well distributed at both sides so that the preference order between bilingual document pairs can be derived for SVM learning. |
Introduction | For both languages, we achieve significant improvements over monolingual Ranking SVM (RSVM) baselines (Herbrich et al., 2000; Joachims, 2002), which exploit a variety of monolingual features.
Learning to Rank Using Bilingual Information | We resort to Ranking SVM (RSVM) (Herbrich et al., 2000; Joachims, 2002) learning for classification on pairs of instances. |
Learning to Rank Using Bilingual Information | The problem is to solve the SVM objective: min_w (1/2)‖w‖^2 + λ Σ_{i,j} ℓ(w · (x_i - x_j)), where the sum runs over preference pairs and ℓ is the hinge loss.
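The pairwise reduction behind Ranking SVM can be sketched as follows; function and variable names are illustrative. Each preference "document i beats document j for the same query" becomes one classification instance whose feature vector is the difference x_i - x_j:

```python
# Convert per-query best-first document lists into pairwise classification
# instances for a Ranking SVM: (x_i - x_j, +1) for every preferred pair,
# plus the mirrored instance (x_j - x_i, -1) to keep the data balanced.
def pairwise_instances(ranked_lists):
    instances = []
    for docs in ranked_lists:
        for i in range(len(docs)):
            for j in range(i + 1, len(docs)):
                diff = [a - b for a, b in zip(docs[i], docs[j])]
                instances.append((diff, +1))
                instances.append(([-d for d in diff], -1))
    return instances

query = [[3.0, 1.0], [2.0, 0.5], [1.0, 0.0]]  # feature vectors, best-first
pairs = pairwise_instances([query])
print(len(pairs))
```

Any binary linear classifier trained on these instances yields a weight vector w that scores documents by w · x, which is how the pairwise objective above reduces to ordinary classification.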
Results and Discussion | We compared the performance of the DPLVM with the CRFs and other baseline systems, including the heuristic system (Heu), the HMM model, and the SVM model described in S08, i.e., Sun et al.
Results and Discussion | The SVM method described by Sun et al. |
Results and Discussion | In general, the results indicate that all of the sequential labeling models outperformed the SVM regression model with less training time. In the SVM regression approach, a large number of negative examples are explicitly generated for the training, which slowed the process.
Previous work | She also exploited a semi-supervised approach using Laplacian SVM classification on a small set of examples. |
Prosodic event detection method | Our previous supervised learning approach (Jeon and Liu, 2009) showed that a combined model using a Neural Network (NN) classifier for acoustic-prosodic evidence and a Support Vector Machine (SVM) classifier for syntactic-prosodic evidence performed better than other classifiers.
Prosodic event detection method | We therefore use NN and SVM in this study. |