Abstract | The created thesaurus is then used to expand feature vectors to train a binary classifier.
Introduction | a unigram or a bigram of word lemma) in a review using a feature vector.
Introduction | We model the cross-domain sentiment classification problem as one of feature expansion, where we append additional related features to feature vectors that represent source and target domain reviews in order to reduce the mismatch of features between the two domains. |
Introduction | thesaurus to expand feature vectors in a binary classifier at train and test times by introducing related lexical elements from the thesaurus. |
Sentiment Sensitive Thesaurus | For example, if we know that both excellent and delicious are positive sentiment words, then we can use this knowledge to expand a feature vector that contains the word delicious using the word excellent, thereby reducing the mismatch between features in a test instance and a trained model. |
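The expansion idea above can be sketched as follows; the `expand` function, the `thesaurus` mapping, and the down-weighting factor are illustrative assumptions, not the authors' exact procedure (which ranks expansion candidates using the sentiment-sensitive thesaurus):

```python
def expand(features, thesaurus, weight=0.5):
    """Append related lexical elements to a bag-of-words feature dict.

    `thesaurus` is a hypothetical mapping from a word to its related
    sentiment-bearing words; the 0.5 down-weight is an assumption so
    that original features dominate the expanded ones.
    """
    expanded = dict(features)
    for word, value in features.items():
        for related in thesaurus.get(word, []):
            expanded[related] = expanded.get(related, 0.0) + weight * value
    return expanded

thesaurus = {"delicious": ["excellent"]}
vec = expand({"delicious": 1.0}, thesaurus)
# vec now contains both "delicious" and the related word "excellent",
# reducing the feature mismatch between train and test instances.
```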
Sentiment Sensitive Thesaurus | Let us denote the value of a feature w in the feature vector u representing a lexical element u by f(u, w). The vector u can be seen as a compact representation of the distribution of a lexical element u over the set of features that co-occur with u in the reviews.
Sentiment Sensitive Thesaurus | From the construction of the feature vector u described in the previous paragraph, it follows that w can be either a sentiment feature or another lexical element that co-occurs with u in some review sentence.
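A minimal sketch of building such a distributional vector from co-occurrences, assuming a plain bag-of-words notion of feature (the paper's feature set also includes sentiment features, which this sketch omits):

```python
from collections import Counter

def cooccurrence_vector(target, sentences):
    """Represent a lexical element by the features that co-occur with it.

    Each sentence is a whitespace-tokenized string; the returned Counter
    is the (unnormalized) distribution of co-occurring lexical elements.
    """
    vec = Counter()
    for sent in sentences:
        words = sent.split()
        if target in words:
            vec.update(w for w in words if w != target)
    return vec

sents = ["delicious food here", "excellent food there"]
v = cooccurrence_vector("food", sents)
# "food" is represented by the words it co-occurs with, not by itself
```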
Abstract | The first compresses the query into a query feature vector, which aggregates all document instances in the same query, and then conducts query weighting based on the query feature vector.
Evaluation | Specifically, after document feature aggregation, the number of query feature vectors in all adaptation tasks is no more than 150 in source and target domains. |
Introduction | Take Figure 2 as a toy example, where the document instance is represented as a feature vector with four features. |
Introduction | In this work, we present two simple but very effective approaches attempting to resolve the problem from distinct perspectives: (1) we compress each query into a query feature vector by aggregating all of its document instances, and then conduct query weighting on these query feature vectors; (2) we measure the similarity between the source query and each target query one by one, and then combine these fine-grained similarity values to calculate its importance to the target domain.
Query Weighting | The query can be compressed into a query feature vector, where each feature value is obtained by the aggregate of its corresponding features of all documents in the query.
Query Weighting | We concatenate two types of aggregates to construct the query feature vector: the mean μ = (1/|q|) Σ_{i=1}^{|q|} x_i, where x_i
Query Weighting | is the feature vector of document i and |q| denotes the number of documents in q.
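A sketch of the aggregation step; the excerpt names only the mean, so taking the variance as the second concatenated aggregate is an assumption made for illustration:

```python
import numpy as np

def query_feature_vector(doc_vectors):
    """Compress a query's document instances into one query feature vector.

    Concatenates the per-feature mean with a second aggregate (here the
    per-feature variance; the excerpt is cut off, so this choice is an
    assumption).
    """
    X = np.asarray(doc_vectors, dtype=float)
    return np.concatenate([X.mean(axis=0), X.var(axis=0)])

q = [[1.0, 0.0], [3.0, 2.0]]   # |q| = 2 documents, 2 features each
qv = query_feature_vector(q)   # mean = [2, 1], variance = [1, 1]
```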
Answer Grading System | We use ψ(x_i, x_s) to denote the feature vector associated with a pair of nodes (x_i, x_s), where x_i is a node from the instructor answer A_i and x_s is a node from the student answer A_s.
Answer Grading System | For a given answer pair (A_i, A_s), we assemble the eight graph alignment scores into a feature vector
Answer Grading System | We combine the alignment scores φ_G(A_i, A_s) with the scores φ_B(A_i, A_s) from the lexical semantic similarity measures into a single feature vector φ(A_i, A_s) = [φ_G(A_i, A_s) | φ_B(A_i, A_s)].
Results | We report the results of running the systems on three subsets of features φ(A_i, A_s): BOW features φ_B(A_i, A_s) only, alignment features φ_G(A_i, A_s) only, or the full feature vector (labeled “Hybrid”).
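The concatenation φ(A_i, A_s) = [φ_G | φ_B] is a plain vector join; a minimal sketch with hypothetical score lists (eight alignment scores and two lexical similarity scores, per the excerpt):

```python
import numpy as np

def hybrid_features(phi_G, phi_B):
    """Concatenate graph-alignment scores with BOW similarity scores,
    in the style of phi(A_i, A_s) = [phi_G | phi_B]."""
    return np.concatenate([np.asarray(phi_G, dtype=float),
                           np.asarray(phi_B, dtype=float)])

# hypothetical scores: 8 graph alignment features + 2 lexical features
phi = hybrid_features([0.8] * 8, [0.6, 0.7])
```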
Related Work | Instead, we create new feature vectors Fgen on the basis of the feature vectors Fseed in S. For each class in S, we extract all attribute-value pairs from the feature vectors for this particular class. |
Related Work | For each class, we randomly select features (with replacement) from Fseed and combine them into a new feature vector Fgen, retaining the distribution of the different classes in the data. |
Related Work | As a result, we obtain a more general set of feature vectors Fgen with characteristic features being distributed more evenly over the different feature vectors . |
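A sketch of that generation step; the function name and the fixed per-vector feature count are illustrative assumptions (the excerpt does not state how many features each generated vector receives):

```python
import random

def generate_vectors(seed_vectors, n_new, n_features, rng=None):
    """Create new feature vectors F_gen for one class by sampling
    attribute-value pairs (with replacement) from that class's seed
    vectors F_seed, spreading characteristic features more evenly."""
    rng = rng or random.Random(0)
    pool = [(a, v) for vec in seed_vectors for a, v in vec.items()]
    generated = []
    for _ in range(n_new):
        new_vec = {}
        for _ in range(n_features):
            attr, val = rng.choice(pool)   # sampling WITH replacement
            new_vec[attr] = val
        generated.append(new_vec)
    return generated

seeds = [{"f1": 1, "f2": 0}, {"f2": 1, "f3": 1}]
gen = generate_vectors(seeds, n_new=3, n_features=2)
```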
Results and Discussion | Then (Einstein,he), (Hawking,he) and (Novoselov, he) will all be assigned the feature vector <1, No, Proper Noun, Personal Pronoun, Yes>.
Results and Discussion | Using the same representation of pairs, suppose that for the sequence of markables Biden, Obama, President the markable pairs (Biden,President) and (Obama,President) are assigned the feature vectors <8, No, Proper Noun, Proper Noun, Yes> and <1, No, Proper Noun, Proper Noun, Yes>, respectively.
Results and Discussion | with the second feature vector (distance=1) as coreferent than with the first one (distance=8) in the entire automatically labeled training set.
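The pair representation above can be sketched as follows; `pair_features` and the gender-compatibility check are hypothetical stand-ins for the actual feature extractors, chosen to match the five-slot vectors in the example:

```python
def pair_features(i, j, markables):
    """Hypothetical feature vector for a markable pair:
    <distance, same string?, POS of antecedent, POS of anaphor, compatible?>
    """
    m1, m2 = markables[i], markables[j]
    return (j - i,                                            # distance
            "Yes" if m1["text"] == m2["text"] else "No",      # string match
            m1["pos"], m2["pos"],                             # POS tags
            "Yes" if m1["gender"] == m2["gender"] else "No")  # compatibility

ms = [{"text": "Einstein", "pos": "Proper Noun", "gender": "masc"},
      {"text": "he", "pos": "Personal Pronoun", "gender": "masc"}]
fv = pair_features(0, 1, ms)
# (1, 'No', 'Proper Noun', 'Personal Pronoun', 'Yes'), matching the
# <1, No, Proper Noun, Personal Pronoun, Yes> vector in the example
```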
System Architecture | After filtering, we then calculate a feature vector for each generated pair that survived filters (i)–(iv).
Clustering phrase pairs directly using the K-means algorithm | We thus propose to represent each phrase pair instance (including its bilingual one-word contexts) as feature vectors, i.e., points of a vector space.
Clustering phrase pairs directly using the K-means algorithm | then use these data points to partition the space into clusters, and subsequently assign each phrase pair instance the cluster of its corresponding feature vector as label. |
Clustering phrase pairs directly using the K-means algorithm | In the same fashion, we can incorporate multiple tagging schemes (e.g., word clusterings of different granularities) into the same feature vector.
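A plain k-means sketch over such feature vectors (pure Python for self-containment; a real system would use an optimized library, and the toy 2-D points stand in for phrase pair instance vectors):

```python
import random

def kmeans(points, k, iters=20, seed=0):
    """Partition feature vectors (tuples) into k clusters and label
    each instance with the index of its nearest final centroid."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)

    def nearest(p):
        return min(range(k),
                   key=lambda c: sum((a - b) ** 2
                                     for a, b in zip(p, centroids[c])))
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[nearest(p)].append(p)
        # recompute each centroid as the mean of its assigned points
        centroids = [tuple(sum(xs) / len(xs) for xs in zip(*cl)) if cl
                     else centroids[j] for j, cl in enumerate(clusters)]
    return [nearest(p) for p in points]

pts = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 4.9)]
labels = kmeans(pts, k=2)   # two well-separated groups of instances
```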
Background | Distributional similarity algorithms differ in their feature representation: Some use a binary representation: each predicate is represented by one feature vector where each feature is a pair of arguments (Szpektor et al., 2004; Yates and Etzioni, 2009). |
Learning Typed Entailment Graphs | 2) Feature representation Each example pair of predicates (p1, p2) is represented by a feature vector, where each feature is a specific distributional
Learning Typed Entailment Graphs | We want to use P_uv to derive the posterior P(G|F), where F = ∪_{u≠v} F_uv and F_uv is the feature vector for a node pair (u, v).
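The binary argument-pair representation mentioned in the Background excerpt can be sketched as follows; the tuple format and the `predicate_vector` name are assumptions made for illustration:

```python
def predicate_vector(predicate, extractions):
    """Binary distributional representation: one feature per argument
    pair observed with the predicate (in the style attributed to
    Szpektor et al., 2004; Yates and Etzioni, 2009)."""
    return {(a1, a2): 1
            for p, a1, a2 in extractions if p == predicate}

# hypothetical extraction tuples: (predicate, arg1, arg2)
data = [("born_in", "Einstein", "Ulm"),
        ("native_of", "Einstein", "Ulm")]
v1 = predicate_vector("born_in", data)
v2 = predicate_vector("native_of", data)
# identical feature sets are evidence of distributional similarity
```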
Abstract | The results of these experiments were not particularly strong, likely owing to the increased sparsity of the feature vectors.
Abstract | Binning: Next, we wished to explore longer n-grams of words or POS tags and to reduce the sparsity of the feature vectors.
Abstract | Self-Training: Besides sparse feature vectors, another factor likely to be hurting our classifier was the limited amount of training data.
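The binning idea can be sketched as mapping raw n-gram counts to coarse frequency bins, collapsing the long sparse tail into a handful of indicator features; the bin boundaries here are illustrative, since the excerpt does not specify them:

```python
def bin_feature(count, bins=(1, 2, 5, 10)):
    """Map a raw n-gram count to a coarse frequency-bin feature.

    The boundaries (1, 2, 5, 10) are an assumption for illustration;
    any count above the last boundary falls into an open-ended bin.
    """
    for threshold in bins:
        if count <= threshold:
            return f"bin<={threshold}"
    return f"bin>{bins[-1]}"

bin_feature(3)   # counts 3..5 share one feature instead of three
```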
Our Method | 5: for each word w ∈ t do 6: Get the feature vector w⃗: w⃗ = repr_w(w, t).
Our Method | 10: end if 11: end for 12: Get the feature vector t⃗:
Our Method | 4: Get the feature vector w⃗: w⃗ = repr_w(w, t).
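The loop in the pseudocode can be sketched in Python; `repr_w` and `combine` are stand-ins for the paper's actual word and sentence representation functions, which the excerpt does not define:

```python
def sentence_vector(t, repr_w, combine):
    """Sketch of the pseudocode: compute a feature vector for each
    word w in sentence t, then combine them into the vector for t."""
    word_vectors = []
    for w in t:                      # for each word w in t do
        v = repr_w(w, t)             #   get the word feature vector
        word_vectors.append(v)
    return combine(word_vectors)     # get the feature vector for t

# toy usage: one-hot word vectors summed into a sentence vector
vocab = {"good": 0, "movie": 1}
repr_w = lambda w, t: [1.0 if i == vocab.get(w) else 0.0 for i in range(2)]
combine = lambda vs: [sum(x) for x in zip(*vs)]
sv = sentence_vector(["good", "movie"], repr_w, combine)
```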