Conclusion and Future Work | These documents are converted to a bag-of-words input and fed into neural networks. |
Introduction | The levels inferred by the neural network correspond to distinct levels of concepts, where high-level representations are obtained from the low-level bag-of-words input.
Topic Similarity Model with Neural Network | The N most relevant documents d_f and d_e are retrieved and converted into high-dimensional bag-of-words inputs f and e for representation learning.
Topic Similarity Model with Neural Network | Assuming that the input is an n-of-V binary vector X representing the bag-of-words (V is the vocabulary size), an auto-encoder consists of an encoding process g(X) and a decoding process h(g(X)). The objective of the auto-encoder is to minimize the reconstruction error L(h(g(X)), X).
Topic Similarity Model with Neural Network | In our task, for each sentence, we treat the retrieved N relevant documents as a single large document and convert it to a bag-of-words vector X, as shown in Figure 2.
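As a concrete illustration of the auto-encoder setup described above, the following sketch merges the N retrieved documents into a single n-of-V binary bag-of-words vector X and computes the reconstruction error L(h(g(X)), X). The helper names, the one-layer weights, and the squared-error loss are assumptions for illustration, not the paper's implementation.

```python
import numpy as np

def bag_of_words(documents, vocab):
    """Merge the N retrieved documents into one n-of-V binary vector X.
    `documents` is a list of token lists and `vocab` maps word -> index;
    both are hypothetical inputs, not the paper's data structures."""
    x = np.zeros(len(vocab))
    for doc in documents:
        for word in doc:
            if word in vocab:
                x[vocab[word]] = 1.0  # binary presence, as in an n-of-V vector
    return x

def reconstruction_error(x, W_enc, W_dec):
    """Reconstruction error L(h(g(X)), X) for a one-layer auto-encoder:
    g(x) = sigmoid(W_enc @ x), h(z) = sigmoid(W_dec @ z); squared error
    is used here purely for illustration."""
    sigmoid = lambda a: 1.0 / (1.0 + np.exp(-a))
    z = sigmoid(W_enc @ x)            # encoding g(X)
    x_hat = sigmoid(W_dec @ z)        # decoding h(g(X))
    return np.sum((x_hat - x) ** 2)   # L(h(g(X)), X)
```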
Conclusion | In this paper we apply recursive neural networks to political ideology detection, a problem where previous work relies heavily on bag-of-words models and hand-designed lexica. |
Related Work | In general, work in this category tends to combine traditional surface lexical modeling (e.g., bag-of-words) with hand-designed syntactic features or lexicons.
Related Work | Most previous work on ideology detection ignores the syntactic structure of the language in use in favor of familiar bag-of-words representations.
Related Work | E.g., Gerrish and Blei (2011) predict the voting patterns of Congress members based on bag-of-words representations of bills and inferred political leanings of those members. |
Where Compositionality Helps Detect Ideological Bias | Table 1 shows the RNN models outperforming the bag-of-words baselines as well as the word2vec baseline on both datasets.
Where Compositionality Helps Detect Ideological Bias | We obtain better results on Convote than on IBC with both bag-of-words and RNN models. |
Abstract | We rely on the tree kernel technology to automatically extract and learn features with better generalization power than bag-of-words.
Experiments | We conjecture that sentiment prediction for the AUTO category is largely driven by one-shot phrases and statements where it is hard to improve upon the bag-of-words and sentiment lexicon features.
Experiments | The bag-of-words model seems to be affected by the data sparsity problem which becomes a crucial issue when only a small training set is available. |
Introduction | The comment contains a product name, xoom, and some negative expressions; thus, a bag-of-words model would derive a negative polarity for this product.
Introduction | Clearly, the bag-of-words model lacks the structural information linking the sentiment with the target product.
Representations and models | Such classifiers are traditionally based on bag-of-words and more advanced features. |
Abstract | These regularizers impose linguistic bias in feature weights, enabling us to incorporate prior knowledge into conventional bag-of-words models. |
Introduction | For tasks like text classification, sentiment analysis, and text-driven forecasting, this is an open question, as cheap “bag-of-words” models often perform well. |
Introduction | We embrace the conventional bag-of-words representation of text, instead bringing linguistic bias to bear on regularization. |
Introduction | Our experiments demonstrate that structured regularizers can squeeze higher performance out of conventional bag-of-words models on seven out of the eight text categorization tasks tested, in six cases with more compact models than the best-performing unstructured-regularized model.
Related and Future Work | Overall, our results demonstrate that linguistic structure in the data can be used to improve bag-of-words models, through structured regularization. |
Related and Future Work | Our experimental focus has been on a controlled comparison between regularizers for a fixed model family (the simplest available, linear with bag-of-words features). |
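As one concrete form such a structured regularizer could take, the sketch below computes a group-lasso penalty over bag-of-words weights, with groups defined by linguistic units; the grouping scheme and all names are illustrative assumptions rather than the paper's exact formulation.

```python
import numpy as np

def group_lasso_penalty(w, groups, lam=1.0):
    """Group-lasso style structured regularizer: the sum of L2 norms of
    weight sub-vectors, where each group collects the bag-of-words
    features that co-occur in one linguistic unit (e.g., a sentence or
    parse constituent). `w` is the weight vector, `groups` is a list of
    index arrays, and `lam` is the regularization strength; all of these
    are hypothetical, for illustration only."""
    return lam * sum(np.linalg.norm(w[idx]) for idx in groups)
```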
Approach | This is a distributed bag-of-words approach as sentence ordering is not taken into account by the model. |
Approach | The use of a nonlinearity enables the model to learn interesting interactions between words in a document, which the bag-of-words approach of ADD is not capable of learning. |
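To make the contrast concrete, here is a minimal sketch of an additive bag-of-words composition versus a nonlinear composition over adjacent word pairs, which can capture interactions that plain addition cannot; this is not the paper's exact model, and `W` and `b` are hypothetical parameters.

```python
import numpy as np

def compose_add(word_vecs):
    """ADD: sum the word embeddings; word order and interactions are lost."""
    return np.sum(word_vecs, axis=0)

def compose_nonlinear(word_vecs, W, b):
    """A sketch of a nonlinear composition: apply tanh to a linear map of
    each adjacent word pair before summing, so the document vector can
    reflect interactions between neighbouring words that plain addition
    cannot. W (d x 2d) and b (d) are illustrative parameters."""
    if len(word_vecs) < 2:
        return compose_add(word_vecs)
    pairs = zip(word_vecs[:-1], word_vecs[1:])
    return np.sum([np.tanh(W @ np.concatenate([u, v]) + b) for u, v in pairs],
                  axis=0)
```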
Related Work | (2013) proposed a bag-of-words autoencoder model, where the bag-of-words representation in one language is used to train the embeddings in another. |
Related Work | Hermann and Blunsom (2014) propose a large-margin learner for multilingual word representations, similar to the basic additive model proposed here, which, like the approaches above, relies on a bag-of-words model for sentence representations. |
Identifying Key Medical Relations | The similarity of two sentences is defined as the bag-of-words similarity of the dependency paths connecting arguments. |
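For illustration, a bag-of-words similarity over the two dependency paths could look like the sketch below; the extract only says "bag-of-words similarity", so the choice of cosine here is an assumption.

```python
from collections import Counter
import math

def bow_cosine(path_tokens_a, path_tokens_b):
    """Cosine similarity between the bag-of-words vectors of two
    dependency paths; cosine is an assumed choice, used for illustration."""
    ca, cb = Counter(path_tokens_a), Counter(path_tokens_b)
    dot = sum(ca[w] * cb[w] for w in ca.keys() & cb.keys())
    na = math.sqrt(sum(v * v for v in ca.values()))
    nb = math.sqrt(sum(v * v for v in cb.values()))
    return dot / (na * nb) if na and nb else 0.0
```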
Relation Extraction with Manifold Models | (7) Bag-of-words features modeling the dependency path.
Relation Extraction with Manifold Models | (8) Bag-of-words features modeling the whole sentence.
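A minimal sketch of how features (7) and (8) could be assembled; the prefixes and helper names are assumptions for illustration, not the paper's implementation.

```python
from collections import Counter

def bow_features(tokens, prefix):
    """Count-based bag-of-words features, keyed with a prefix so that
    dependency-path words and whole-sentence words stay distinct."""
    return {f"{prefix}:{w}": c for w, c in Counter(tokens).items()}

def relation_features(dep_path_tokens, sentence_tokens):
    """Feature (7): bag-of-words over the dependency path between the
    arguments; feature (8): bag-of-words over the whole sentence.
    Both token lists are hypothetical inputs."""
    feats = {}
    feats.update(bow_features(dep_path_tokens, "path"))   # feature (7)
    feats.update(bow_features(sentence_tokens, "sent"))   # feature (8)
    return feats
```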
Abstract | By performing a probability integral transform, our approach moves beyond the standard count-based bag-of-words models in NLP, and improves previous work on text regression by incorporating the correlation among local features in the form of a semiparametric Gaussian copula.
Copula Models for Text Regression | By doing this, we are essentially performing a probability integral transform, an important statistical technique that moves beyond the count-based bag-of-words feature space to the space of marginal cumulative density functions.
Copula Models for Text Regression | This is of crucial importance to modeling text data: instead of using the classic bag-of-words representation that uses raw counts, we are now working with uniform marginal CDFs, which helps cope with the overfitting issue due to noise and data sparsity.
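A small sketch of the idea, assuming a rank-based empirical CDF estimate computed per feature; the exact estimator and any smoothing used in the paper may differ.

```python
import numpy as np

def probability_integral_transform(counts):
    """Rank-based probability integral transform: map each feature's raw
    counts to empirical marginal CDF values in (0, 1), computed per
    feature (column) across training examples. Ties are broken
    arbitrarily by argsort; this is a sketch, not the paper's estimator.
    `counts` is an (n_examples, n_features) array of raw counts."""
    n = counts.shape[0]
    ranks = counts.argsort(axis=0).argsort(axis=0) + 1  # ranks 1..n per column
    return ranks / (n + 1.0)                            # uniform marginal CDFs
```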
Introduction | bag-of-words or indivisible n-gram). |
Related Work | One method considers the phrases as bag-of-words and employs a convolution model to transform the word embeddings to phrase embeddings (Collobert et al., 2011; Kalchbrenner and Blunsom, 2013). |
Related Work | (2013) also use bag-of-words but learn BLEU-sensitive phrase embeddings.