Index of papers in Proc. ACL that mention
  • bag-of-words
Nastase, Vivi and Strapparava, Carlo
Abstract
In a straightforward bag-of-words experimental setup we add etymological ancestors of the words in the documents, and investigate the performance of a model built on English data, on Italian test data (and vice versa).
Abstract
The results show not only a statistically significant but also a large improvement, a jump of almost 40 points in F1-score, over the raw (vanilla bag-of-words) representation.
Cross Language Text Categorization
The most frequently, and successfully, used document representation is the bag-of-words (BoWs).
Cross Language Text Categorization
As is commonly done in text categorization (Sebastiani, 2005), the documents in our data are represented as bag-of-words, and classification is done using support vector machines (SVMs).
Cross Language Text Categorization
The bag-of-words representation for each document is expanded with the corresponding etymological features.
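Taken together, the excerpts above describe a bag-of-words + SVM pipeline whose feature space is expanded with etymological ancestors. Here is a minimal sketch of that kind of setup, assuming scikit-learn; the etymology lookup, documents, and labels below are hypothetical placeholders, not the authors' resources.

```python
# Sketch only: bag-of-words features expanded with (hypothetical) etymological
# roots, classified with a linear SVM.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

etymological_roots = {"computer": "latin_computare", "madre": "latin_mater"}  # toy lookup

def expand_with_etymology(doc):
    # Append the etymological ancestor of each known word, so that a shared
    # Latin root can bridge English and Italian vocabulary.
    tokens = doc.lower().split()
    roots = [etymological_roots[t] for t in tokens if t in etymological_roots]
    return " ".join(tokens + roots)

train_docs = ["the computer industry grew", "economic policy and markets"]
train_labels = ["tech", "economy"]

model = make_pipeline(CountVectorizer(), LinearSVC())
model.fit([expand_with_etymology(d) for d in train_docs], train_labels)
print(model.predict([expand_with_etymology("il computer nuovo")]))
```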
Discussion
Feature filtering is commonly done in machine learning when the data has many features, and in text categorization when using the bag-of-words representation in particular.
Discussion
The difference in results on the two dictionary versions was significant: a 4 and 5 point increase respectively in micro-averaged F1-score in the bag-of-words setting for English training→Italian testing and Italian training→English testing, and a 2 and 6 point increase in the LSA setting.
Introduction
We start with the basic setup, representing the documents as bag-of-words, where we train a model on the English training data, and use this model to categorize documents from the Italian test data (and vice versa).
Introduction
We then add the etymological roots of the words in the data to the bag-of-words, and notice a large increase in performance of 21 points in terms of F1-score.
Introduction
We then use the bag-of-words representation of the training data to build a semantic space using LSA, and use the generated word vectors to represent the training and test data.
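A rough sketch of this LSA step, assuming scikit-learn's CountVectorizer and TruncatedSVD; the documents and the number of components are illustrative only.

```python
# Sketch only: build an LSA space from bag-of-words counts, then fold
# training and test documents into it.
from sklearn.decomposition import TruncatedSVD
from sklearn.feature_extraction.text import CountVectorizer

train_docs = ["stock markets fell sharply", "the team won the match",
              "central bank raises rates", "players scored three goals"]

vectorizer = CountVectorizer()
X = vectorizer.fit_transform(train_docs)        # documents x vocabulary counts

lsa = TruncatedSVD(n_components=2, random_state=0)
train_vectors = lsa.fit_transform(X)            # training documents in LSA space

# Test documents are represented with the same fitted vocabulary and projection.
test_vectors = lsa.transform(vectorizer.transform(["bank rates rise again"]))
```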
bag-of-words is mentioned in 12 sentences in this paper.
Yih, Wen-tau and Chang, Ming-Wei and Meek, Christopher and Pastusiak, Andrzej
Conclusions
Following the word-alignment paradigm, we find that the rich lexical semantic information improves the models consistently in the unstructured bag-of-words setting and also in the framework of learning latent structures.
Experiments
For the unstructured, bag-of-words setting, we tested logistic regression (LR) and boosted decision trees (BDT).
Introduction
Due to the variety of word choices and inherent ambiguities in natural languages, bag-of-words approaches with simple surface-form word matching tend to produce brittle results with poor prediction accuracy (Bilotti et al., 2007).
Learning QA Matching Models
In this section, we investigate the effectiveness of various learning models for matching questions and sentences, including the bag-of-words setting
Learning QA Matching Models
5.1 Bag-of-Words Model
Learning QA Matching Models
The bag-of-words model treats each question and sentence as an unstructured bag of words.
Problem Definition
For instance, if we assume a naive complete bipartite matching, then effectively it reduces to the simple bag-of-words model.
Related Work
Observing the limitations of the bag-of-words models, Wang et al.
bag-of-words is mentioned in 10 sentences in this paper.
Xie, Boyi and Passonneau, Rebecca J. and Wu, Leon and Creamer, Germán G.
Experiments
Experiments evaluate the FWD and SemTree feature spaces compared to two baselines: bag-of-words (BOW) and supervised latent Dirichlet allocation (sLDA) (Blei and McAuliffe, 2007).
Introduction
Our main contribution is a novel tree representation based on semantic frame parses that performs significantly better than enriched bag-of-words vectors.
Introduction
On the polarity task, the semantic frame features encoded as trees perform significantly better across years and sectors than bag-of-words vectors (BOW), and outperform BOW vectors enhanced with semantic frame features, and a supervised topic modeling approach.
Methods
Table 1 lists 24 types of features, including semantic Frame attributes, bag-of-Words, and scores for words in the Dictionary of Affect in Language by part of speech (pDAL).
Methods
Bag-of-Words features include term frequency and tfidf of unigrams, bigrams, and trigrams.
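A compact illustration of such features, assuming scikit-learn; ngram_range=(1, 3) covers unigrams through trigrams, and the documents are toy examples.

```python
# Sketch only: raw term frequencies and tf-idf weights over 1- to 3-grams.
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer

docs = ["shares of the company rose", "the company missed earnings estimates"]

tf = CountVectorizer(ngram_range=(1, 3)).fit_transform(docs)      # term frequencies
tfidf = TfidfVectorizer(ngram_range=(1, 3)).fit_transform(docs)   # tf-idf weights
```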
Motivation
Bag-of-Words (BOW) document representation is difficult to surpass for many document classification tasks, but cannot capture the degree of semantic similarity among these sentences.
Related Work
NLP has recently been applied to financial text for market analysis, primarily using bag-of-words (BOW) document representation.
Related Work
Table 1: FWD features (Frame, bag-of-Words, part-of-speech DAL score) and their value types.
bag-of-words is mentioned in 8 sentences in this paper.
Iyyer, Mohit and Enns, Peter and Boyd-Graber, Jordan and Resnik, Philip
Conclusion
In this paper we apply recursive neural networks to political ideology detection, a problem where previous work relies heavily on bag-of-words models and hand-designed lexica.
Related Work
In general, work in this category tends to combine traditional surface lexical modeling (e.g., bag-of-words) with hand-designed syntactic features or lexicons.
Related Work
Most previous work on ideology detection ignores the syntactic structure of the language in use in favor of familiar bag-of-words representations for
Related Work
E.g., Gerrish and Blei (2011) predict the voting patterns of Congress members based on bag-of-words representations of bills and inferred political leanings of those members.
Where Compositionality Helps Detect Ideological Bias
Experimental Results Table 1 shows the RNN models outperforming the bag-of-words baselines as well as the word2vec baseline on both datasets.
Where Compositionality Helps Detect Ideological Bias
We obtain better results on Convote than on IBC with both bag-of-words and RNN models.
bag-of-words is mentioned in 7 sentences in this paper.
Cui, Lei and Zhang, Dongdong and Liu, Shujie and Chen, Qiming and Li, Mu and Zhou, Ming and Yang, Muyun
Conclusion and Future Work
These documents are converted to a bag-of-words input and fed into neural networks.
Introduction
The levels inferred from neural network correspond to distinct levels of concepts, where high-level representations are obtained from low-level bag-of-words input.
Topic Similarity Model with Neural Network
The most relevant N documents d_f and d_e are retrieved and converted to a high-dimensional, bag-of-words input f and e for the representation learning.
Topic Similarity Model with Neural Network
Assuming that the input is an n-of-V binary vector X representing the bag-of-words (V is the vocabulary size), an auto-encoder consists of an encoding process g(X) and a decoding process h(·). The objective of the auto-encoder is to minimize the reconstruction error L(h(g(X)), X).
Topic Similarity Model with Neural Network
In our task, for each sentence, we treat the retrieved N relevant documents as a single large document and convert it to a bag-of-words vector X in Figure 2.
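The auto-encoder objective quoted above, minimizing a reconstruction error L(h(g(X)), X) over an n-of-V binary input, can be sketched as follows. PyTorch, the layer sizes, and the cross-entropy reconstruction loss are assumptions made for illustration, not details taken from the paper.

```python
# Sketch only: a one-layer auto-encoder over binary bag-of-words vectors.
import torch
import torch.nn as nn

V, hidden = 1000, 100                          # vocabulary size, latent size
X = (torch.rand(8, V) > 0.99).float()          # toy batch of n-of-V binary vectors

g = nn.Sequential(nn.Linear(V, hidden), nn.Sigmoid())   # encoding process g(X)
h = nn.Sequential(nn.Linear(hidden, V), nn.Sigmoid())   # decoding process h(.)

optimizer = torch.optim.Adam(list(g.parameters()) + list(h.parameters()), lr=1e-3)
for _ in range(10):
    loss = nn.functional.binary_cross_entropy(h(g(X)), X)   # reconstruction error
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```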
bag-of-words is mentioned in 7 sentences in this paper.
Bramsen, Philip and Escobar-Molano, Martha and Patel, Ami and Alonso, Rafael
Abstract
Previous work in traditional text classification and its variants — such as sentiment analysis — has achieved successful results by using the bag-of-words representation; that is, by treating text as a collection of words with no interdependencies, training a classifier on a large feature set of word unigrams which appear in the corpus.
Abstract
Defining features in this manner allows us to both explore the bag-of-words representation as well as use groups of n-grams as features, which we believed would be a better fit for this problem.
Abstract
In experiments based on the bag-of-words model, we only consider an absolute frequency threshold, whereas in later experiments, we also take into account the relative frequency ratio threshold.
bag-of-words is mentioned in 6 sentences in this paper.
Yogatama, Dani and Smith, Noah A.
Abstract
These regularizers impose linguistic bias in feature weights, enabling us to incorporate prior knowledge into conventional bag-of-words models.
Introduction
For tasks like text classification, sentiment analysis, and text-driven forecasting, this is an open question, as cheap “bag-of-words” models often perform well.
Introduction
We embrace the conventional bag-of-words representation of text, instead bringing linguistic bias to bear on regularization.
Introduction
Our experiments demonstrate that structured regularizers can squeeze higher performance out of conventional bag-of-words models on seven out of eight of the text categorization tasks tested, in six cases with more compact models than the best-performing unstructured-regularized model.
Related and Future Work
Overall, our results demonstrate that linguistic structure in the data can be used to improve bag-of-words models, through structured regularization.
Related and Future Work
Our experimental focus has been on a controlled comparison between regularizers for a fixed model family (the simplest available, linear with bag-of-words features).
bag-of-words is mentioned in 6 sentences in this paper.
Severyn, Aliaksei and Moschitti, Alessandro and Uryupina, Olga and Plank, Barbara and Filippova, Katja
Abstract
We rely on the tree kernel technology to automatically extract and learn features with better generalization power than bag-of-words.
Experiments
We conjecture that sentiment prediction for AUTO category is largely driven by one-shot phrases and statements where it is hard to improve upon the bag-of-words and sentiment lexicon features.
Experiments
The bag-of-words model seems to be affected by the data sparsity problem which becomes a crucial issue when only a small training set is available.
Introduction
The comment contains a product name xoom and some negative expressions, thus, a bag-of-words model would derive a negative polarity for this product.
Introduction
Clearly, the bag-of-words lacks the structural information linking the sentiment with the target product.
Representations and models
Such classifiers are traditionally based on bag-of-words and more advanced features.
bag-of-words is mentioned in 6 sentences in this paper.
Mayfield, Elijah and Penstein Rosé, Carolyn
Background
Our baseline approach to both problems is to use a bag-of-words model of the contribution, and use machine learning for classification.
Background
We build a contextual feature space, described in section 4.2, to enhance our baseline bag-of-words model.
Background
This is a distinction that a bag-of-words model would have difficulty with.
bag-of-words is mentioned in 6 sentences in this paper.
Danescu-Niculescu-Mizil, Cristian and Cheng, Justin and Kleinberg, Jon and Lee, Lillian
Never send a human to do a machine’s job.
Our first formulation of the prediction task uses a standard bag-of-words model.
Never send a human to do a machine’s job.
If there were no information in the textual content of a quote to determine whether it were memorable, then an SVM employing bag-of-words features should perform no better than chance.
Never send a human to do a machine’s job.
Even a relatively small number of distinctiveness features, on their own, improve significantly over the much larger bag-of-words model.
bag-of-words is mentioned in 5 sentences in this paper.
Sarioglu, Efsun and Yadav, Kabir and Choi, Hyeong-Ah
Abstract
Representing reports according to their topic distributions is more compact than bag-of-words representation and can be processed faster than raw text in subsequent automated processes.
Background
2.1 Bag-of-Words (BOW) Representation
Background
One way of doing this is bag-of-words (BoW) representation where each document becomes a vector of its words/tokens.
Conclusion
Firstly, bag-of-words representation is replaced with topic vectors which provide good dimensionality reduction and still get comparable classification performance.
bag-of-words is mentioned in 4 sentences in this paper.
Hermann, Karl Moritz and Blunsom, Phil
Approach
This is a distributed bag-of-words approach as sentence ordering is not taken into account by the model.
Approach
The use of a nonlinearity enables the model to learn interesting interactions between words in a document, which the bag-of-words approach of ADD is not capable of learning.
Related Work
(2013) proposed a bag-of-words autoencoder model, where the bag-of-words representation in one language is used to train the embeddings in another.
Related Work
Hermann and Blunsom (2014) propose a large-margin learner for multilingual word representations, similar to the basic additive model proposed here, which, like the approaches above, relies on a bag-of-words model for sentence representations.
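A toy sketch of the additive, order-insensitive composition these excerpts refer to; the embedding table and its dimensionality are made up for illustration.

```python
# Sketch only: a sentence vector as the sum of its word embeddings.
import numpy as np

embeddings = {"the": np.array([0.1, 0.2]),
              "cat": np.array([0.3, -0.1]),
              "sat": np.array([-0.2, 0.4])}

def add_compose(tokens):
    # Order is ignored, hence a distributed bag-of-words composition.
    return sum(embeddings[t] for t in tokens if t in embeddings)

sentence_vec = add_compose(["the", "cat", "sat"])
```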
bag-of-words is mentioned in 4 sentences in this paper.
Ravi, Sujith
Abstract
Following a probabilistic decipherment approach, we first introduce a new framework for decipherment training that is flexible enough to incorporate any number/type of features (besides simple bag-of-words) as side-information used for estimating translation models.
Bayesian MT Decipherment via Hash Sampling
Firstly, we would like to include as many features as possible to represent the source/target words in our framework besides simple bag-of-words context similarity (for example, left-context, right-context, and other general-purpose features based on topic models, etc.).
Introduction
Secondly, we introduce a new feature-based representation for sampling translation candidates that allows one to incorporate any amount of additional features (beyond simple bag-of-words) as side-information during decipherment training.
bag-of-words is mentioned in 3 sentences in this paper.
Labutov, Igor and Lipson, Hod
Approach
We use the document's binary bag-of-words vector v_j, and compute the document's vector space representation through a matrix-vector product of the embedding matrix with v_j.
Results and Discussion
[Table fragment] Results by number of training examples (.5K, 5K, 20K), with and without added bag-of-words features.
Results and Discussion
Additional features: Across all embeddings, appending the document’s binary bag-of-words representation increases classification accuracy.
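A small numpy sketch of this document representation; the matrix name D, the dimensions, and the index positions are illustrative assumptions, not the paper's notation.

```python
# Sketch only: document vector as a matrix-vector product of an embedding
# matrix with the document's binary bag-of-words vector, optionally
# concatenated with the raw bag-of-words features.
import numpy as np

V, d = 5000, 50                          # vocabulary size, embedding dimension
D = np.random.randn(d, V)                # word embedding matrix (illustrative)
v_j = np.zeros(V)
v_j[[12, 345, 678]] = 1.0                # binary bag-of-words vector of document j

doc_vec = D @ v_j                        # vector-space representation of the document
features = np.concatenate([doc_vec, v_j])  # append binary BoW features
```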
bag-of-words is mentioned in 3 sentences in this paper.
Celikyilmaz, Asli and Hakkani-Tur, Dilek and Tur, Gokhan and Sarikaya, Ruhi
Abstract
Second, by going beyond a bag-of-words approach, it takes into account the inherent sequential nature of utterances to learn semantic classes based on context.
Experiments
Similarly, if no Markov properties are used (bag-of-words), MTR reduces to w-LDA.
Related Work and Motivation
Standard topic models, such as Latent Dirichlet Allocation (LDA) (Blei et al., 2003), use a bag-of-words approach, which disregards word order and clusters words together that appear in a similar global context.
bag-of-words is mentioned in 3 sentences in this paper.
Wang, Chang and Fan, James
Identifying Key Medical Relations
The similarity of two sentences is defined as the bag-of-words similarity of the dependency paths connecting arguments.
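One common way to realize such a bag-of-words similarity is cosine similarity over token counts of the two dependency paths; the sketch below assumes that choice, and the path tokens are hypothetical.

```python
# Sketch only: cosine similarity between bag-of-words count vectors of two paths.
import math
from collections import Counter

def bow_cosine(path_a, path_b):
    a, b = Counter(path_a), Counter(path_b)
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * \
           math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

print(bow_cosine(["drug", "treats", "disease"], ["drug", "treats", "symptom"]))
```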
Relation Extraction with Manifold Models
(7) Bag-of-words features modeling the dependency path.
Relation Extraction with Manifold Models
(8) Bag-of-words features modeling the whole sentence.
bag-of-words is mentioned in 3 sentences in this paper.
Wang, William Yang and Hua, Zhenhao
Abstract
By performing probability integral transform, our approach moves beyond the standard count-based bag-of-words models in NLP, and improves previous work on text regression by incorporating the correlation among local features in the form of semiparametric Gaussian copula.
Copula Models for Text Regression
By doing this, we are essentially performing the probability integral transform, an important statistical technique that moves beyond the count-based bag-of-words feature space to the space of marginal cumulative distribution functions.
Copula Models for Text Regression
This is of crucial importance to modeling text data: instead of using the classic bag-of-words representation that uses raw counts, we are now working with uniform marginal CDFs, which helps cope with the overfitting issue due to noise and data sparsity.
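A minimal sketch of the probability integral transform on count features, using empirical ranks as marginal CDF estimates; the rank-based estimator and the toy matrix are assumptions for illustration.

```python
# Sketch only: replace raw bag-of-words counts with empirical marginal CDF values.
import numpy as np
from scipy.stats import rankdata

counts = np.array([[3., 0., 1.],
                   [0., 2., 5.],
                   [1., 1., 0.]])       # documents x features, raw counts

n = counts.shape[0]
# Rank each feature column and rescale to (0, 1): the probability integral
# transform applied with the empirical CDF of each marginal.
u = np.apply_along_axis(lambda col: rankdata(col) / (n + 1), 0, counts)
```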
bag-of-words is mentioned in 3 sentences in this paper.
Lin, Dekang and Wu, Xiaoyun
Query Classification
The baseline system uses only the words in the queries as features (the bag-of-words representation), treating the query classification problem as a typical text categorization problem.
Query Classification
Here, bow indicates the use of bag-of-words features; WN refers to word clusters of size N; and PN refers to phrase clusters of size N. All the clusters are soft clusters created with the web corpus using 3-word context windows.
Query Classification
The bag-of-words features alone have dismal performance.
bag-of-words is mentioned in 3 sentences in this paper.
Zhang, Jiajun and Liu, Shujie and Li, Mu and Zhou, Ming and Zong, Chengqing
Introduction
bag-of-words or indivisible n-gram).
Related Work
One method considers the phrases as bag-of-words and employs a convolution model to transform the word embeddings to phrase embeddings (Collobert et al., 2011; Kalchbrenner and Blunsom, 2013).
Related Work
(2013) also use bag-of-words but learn BLEU sensitive phrase embeddings.
bag-of-words is mentioned in 3 sentences in this paper.