Domain Adaptation in Sentiment Research | Most text-level sentiment classifiers use standard machine learning techniques to learn and select features from labeled corpora.
Domain Adaptation in Sentiment Research | There are two alternatives to supervised machine learning that can be used to get around this problem: on the one hand, general lists of sentiment clues/features can be acquired from domain-independent sources such as dictionaries or the Internet; on the other hand, unsupervised and weakly-supervised approaches can be used to take advantage of a small number of annotated in-domain examples and/or of unlabelled in-domain data.
Domain Adaptation in Sentiment Research | In other domains, such as product reviews, the performance of systems that use general word lists is comparable to the performance of supervised machine learning approaches (Gamon and Aue, 2005).
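A minimal sketch of a dictionary-based classifier of the kind compared above: score a text by counting hits against general-purpose positive/negative word lists. The tiny lists and example sentence are illustrative assumptions, not the lexicons used in the cited work.

```python
# Toy general-purpose word lists (illustrative, not a real lexicon).
POSITIVE = {"good", "great", "excellent", "brilliant"}
NEGATIVE = {"bad", "poor", "terrible", "dull"}

def lexicon_sentiment(text):
    """Classify a text by the balance of positive vs. negative clue words."""
    tokens = text.lower().split()
    score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    return "positive" if score > 0 else "negative" if score < 0 else "neutral"

print(lexicon_sentiment("a great camera with excellent battery life"))  # positive
```

Because the word lists are domain-independent, such a classifier needs no in-domain training data at all, which is exactly what makes it attractive for domain adaptation.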
Experiments | results depend on the genre and the size of the n-gram: on product reviews, all results are statistically significant at the α = 0.025 level; on movie reviews, the difference between Naïve Bayes and SVM is statistically significant at α = 0.01, but the significance diminishes as the size of the n-gram increases; on news, only bigrams produce a statistically significant (α = 0.01) difference between the two machine learning methods, while on blogs the difference between SVMs and Naïve Bayes is most pronounced when unigrams are used (α = 0.025).
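The Naïve Bayes side of the comparison above can be sketched in pure Python: a multinomial model over n-gram features with add-one smoothing. The toy corpus and labels are illustrative assumptions, not the paper's data, and the SVM half of the comparison is omitted.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """All contiguous n-grams of a token list, joined as strings."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def train(docs, labels, n=1):
    """Collect per-class n-gram counts and class priors."""
    counts = {0: Counter(), 1: Counter()}
    priors = Counter(labels)
    for doc, y in zip(docs, labels):
        counts[y].update(ngrams(doc.split(), n))
    vocab = set(counts[0]) | set(counts[1])
    return counts, priors, vocab

def predict(doc, counts, priors, vocab, n=1):
    """Pick the class maximizing the smoothed log posterior."""
    scores = {}
    for y in (0, 1):
        total = sum(counts[y].values())
        score = math.log(priors[y] / sum(priors.values()))
        for g in ngrams(doc.split(), n):
            # Laplace-smoothed n-gram likelihood
            score += math.log((counts[y][g] + 1) / (total + len(vocab)))
        scores[y] = score
    return max(scores, key=scores.get)

docs = ["great plot and brilliant acting",
        "terrible pacing and a dull script",
        "excellent fun",
        "boring dull story"]
labels = [1, 0, 1, 0]  # 1 = positive, 0 = negative
model = train(docs, labels, n=1)
print(predict("great acting", *model))  # 1
```

Varying `n` reproduces the experimental knob discussed above: larger n-grams give sparser counts, which is one reason significance can diminish as n grows.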
Integrating the Corpus-based and Dictionary-based Approaches | For this reason, the numbers reported for the corpus-based classifier do not reflect the full potential of machine learning approaches when sufficient in-domain training data is available. |
Introduction | One of the emerging directions in NLP is the development of machine learning methods that perform well not only on the domain on which they were trained, but also on other domains, for which training data is not available or is not sufficient to ensure adequate machine learning.
Conclusion and Future Work | The basic idea is to measure the accuracy improvements of the PPI extraction task by incorporating the parser output as statistical features of a machine learning classifier. |
Evaluation Methodology | the parser output is embedded as statistical features of a machine learning classifier. |
Evaluation Methodology | Recent studies on PPI extraction demonstrated that dependency relations between target proteins are effective features for machine learning classifiers (Katrenko and Adriaans, 2006; Erkan et al., 2007; Sætre et al., 2007).
Introduction | Our approach to parser evaluation is to measure accuracy improvement in the task of identifying protein-protein interaction (PPI) information in biomedical papers, by incorporating the output of different parsers as statistical features in a machine learning classifier (Yakushiji et al., 2005; Katrenko and Adriaans, 2006; Erkan et al., 2007; Sætre et al., 2007).
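The idea of turning parser output into classifier features can be sketched as follows: extract the dependency path between the two target proteins and use it as a string feature. The sentence, protein names, and hand-made parse below are illustrative assumptions, not output of any of the parsers evaluated in the cited work.

```python
from collections import deque

# Hand-made parse of "ProtA activates ProtB": token -> (head, relation).
parse = {
    "ProtA": ("activates", "nsubj"),
    "ProtB": ("activates", "dobj"),
    "activates": (None, "root"),
}

def dependency_path(parse, src, dst):
    """Shortest path between two tokens in the (undirected) dependency graph."""
    graph = {}
    for tok, (head, _) in parse.items():
        if head is not None:
            graph.setdefault(tok, set()).add(head)
            graph.setdefault(head, set()).add(tok)
    queue, seen = deque([[src]]), {src}
    while queue:
        path = queue.popleft()
        if path[-1] == dst:
            return path
        for nxt in graph.get(path[-1], ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None

# The joined path serves as one string-valued feature for the classifier.
feature = "-".join(dependency_path(parse, "ProtA", "ProtB"))
print(feature)  # ProtA-activates-ProtB
```

A better parse yields more reliable paths, which is why classifier accuracy on the PPI task can serve as an indirect measure of parser quality.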
Introduction | Section 3 also shows how to automatically extract and collect counts for context patterns, and how to combine the information using a machine learned classifier. |
Related Work | In particular, note the pioneering work of Paice and Husk (1987), the inclusion of non-referential it detection in a full anaphora resolution system by Lappin and Leass (1994), and the machine learning approach of Evans (2001). |
Related Work | Although machine learned systems can flexibly balance the various indicators and contra-indicators of non-referentiality, a particular feature is only useful if it is relevant to an example in the limited labelled training data.
Introduction | The standard classification process is to find in an auxiliary corpus a set of patterns in which a given training word pair co-appears, and use pattern-word pair co-appearance statistics as features for machine learning algorithms. |
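The feature construction described above can be sketched as follows: for a training word pair, count how often it co-appears in each context pattern over an auxiliary corpus, and use the counts as a feature vector. The pattern templates, toy corpus, and slot-filling scheme are illustrative assumptions, not the paper's actual setup.

```python
# Toy context patterns with X/Y slots for the word pair (illustrative).
patterns = ["{X} such as {Y}", "{X} including {Y}", "{Y} are {X}"]

# Toy auxiliary corpus (illustrative).
corpus = [
    "animals such as dogs",
    "animals including dogs",
    "dogs are animals",
    "animals such as dogs",
]

def pair_features(x, y, corpus, patterns):
    """One count per pattern: how often the filled pattern occurs in the corpus."""
    vec = []
    for pat in patterns:
        filled = pat.replace("{X}", x).replace("{Y}", y)
        vec.append(sum(s.count(filled) for s in corpus))
    return vec

print(pair_features("animals", "dogs", corpus, patterns))  # [2, 1, 1]
```

Each training pair thus becomes a fixed-length count vector, which any standard machine learning algorithm can consume directly.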
Related Work | In this paper, we use these pattern clusters as the (only) source of machine learning features for a nominal relationship classification problem. |
Relationship Classification | In this method we treat the HITS measure for a cluster as a feature for machine learning classification.
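As a rough illustration of the technique named above, HITS authority scores can be computed by power iteration and the score of a cluster node used as a single classifier feature. The toy graph, its edge semantics, and the iteration count are illustrative assumptions, not the paper's pattern-cluster graph.

```python
def hits(edges, iters=50):
    """Power-iteration HITS: returns (authority, hub) score dicts."""
    nodes = {n for e in edges for n in e}
    auth = {n: 1.0 for n in nodes}
    hub = {n: 1.0 for n in nodes}
    for _ in range(iters):
        # Authority: sum of hub scores of nodes linking in, then L2-normalize.
        auth = {n: sum(hub[u] for u, v in edges if v == n) for n in nodes}
        norm = sum(a * a for a in auth.values()) ** 0.5 or 1.0
        auth = {n: a / norm for n, a in auth.items()}
        # Hub: sum of authority scores of nodes linked to, then L2-normalize.
        hub = {n: sum(auth[v] for u, v in edges if u == n) for n in nodes}
        norm = sum(h * h for h in hub.values()) ** 0.5 or 1.0
        hub = {n: h / norm for n, h in hub.items()}
    return auth, hub

# Toy directed graph over three clusters (illustrative).
edges = [("c1", "c2"), ("c1", "c3"), ("c2", "c3")]
auth, hub = hits(edges)
feature = auth["c3"]  # c3 receives the most in-links, so its authority is highest
```

The scalar `feature` would then be one entry in the feature vector handed to the classifier.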
Experiments | We think this shows one of the strengths of machine learning methods such as CRFs. |
Related Work and Discussion | Parallelization has recently regained attention in the machine learning community because of the need for learning from very large sets of data. |
Related Work and Discussion | (2006) presented the MapReduce framework for a wide range of machine learning algorithms, including the EM algorithm. |
Entity-mention Model with ILP | However, normal machine learning algorithms work on attribute-value vectors, which only allow the representation of atomic propositions.
Introduction | Even worse, the number of mentions in an entity is not fixed, which would result in variable-length feature vectors and cause trouble for normal machine learning algorithms.
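One common workaround for the variable-length problem (an assumption here, not necessarily the approach taken in the paper) is to aggregate per-mention features with fixed statistics, so that an entity with any number of mentions maps to a fixed-length vector:

```python
def entity_vector(mention_feats):
    """Aggregate equal-length per-mention feature lists into min/max/mean
    statistics per dimension, yielding a fixed-length entity vector."""
    vec = []
    for dim in zip(*mention_feats):
        vec.extend([min(dim), max(dim), sum(dim) / len(dim)])
    return vec

# Entities with 2 and 3 mentions both map to a 6-dimensional vector.
e1 = entity_vector([[1.0, 0.0], [0.0, 1.0]])
e2 = entity_vector([[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]])
print(len(e1), len(e2))  # 6 6
```

The aggregation discards mention order, which is one reason richer formulations (such as the ILP model discussed above) can be preferable.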
Modelling Coreference Resolution | Both (2) and (1) can be approximated with a machine learning method, leading to the traditional mention-pair model and the entity-mention model for coreference resolution, respectively. |