Index of papers in Proc. ACL that mention
  • unigram
Bollegala, Danushka and Weir, David and Carroll, John
Distribution Prediction
For this purpose, we represent a word w using unigrams and bigrams that co-occur with w in a sentence as follows.
Distribution Prediction
Using a standard stop word list, we filter out frequent non-content unigrams and select the remainder as unigram features to represent a sentence.
Distribution Prediction
Bigram features capture negations more accurately than unigrams , and have been found to be useful for sentiment classification tasks.
unigram is mentioned in 15 sentences in this paper.
Topics mentioned in this paper:
Chambers, Nathanael
Previous Work
They learned unigram language models (LMs) for specific time periods and scored articles with log-likelihood ratio scores.
Previous Work
Kanhabua and Norvag (2008; 2009) extended this approach with the same model, but expanded its unigrams with POS tags, collocations, and tf-idf scores.
Previous Work
As above, they learned unigram LMs, but instead measured the KL-divergence between a document and a time period’s LM.
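As a rough illustration of that scoring scheme (not the authors' implementation; the add-one smoothing and toy data below are assumptions), a document can be assigned to the time period whose unigram LM minimizes the KL-divergence:

```python
import math
from collections import Counter

def unigram_lm(tokens, vocab, alpha=1.0):
    """Add-alpha smoothed unigram distribution over a fixed vocabulary."""
    counts = Counter(tokens)
    total = sum(counts.values()) + alpha * len(vocab)
    return {w: (counts[w] + alpha) / total for w in vocab}

def kl_divergence(p, q):
    """KL(p || q) for two distributions defined over the same vocabulary."""
    return sum(p[w] * math.log(p[w] / q[w]) for w in p if p[w] > 0)

# Toy example: pick the time period whose LM is closest to the document.
period_texts = {"1990s": "court ruling trial judge".split(),
                "2000s": "website online email internet".split()}
doc = "the judge issued a ruling after the trial".split()
vocab = set(doc) | {w for toks in period_texts.values() for w in toks}

doc_lm = unigram_lm(doc, vocab)
scores = {period: kl_divergence(doc_lm, unigram_lm(toks, vocab))
          for period, toks in period_texts.items()}
print(min(scores, key=scores.get))  # period with the smallest divergence
```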
Timestamp Classifiers
The unigrams w are lowercased tokens.
Timestamp Classifiers
model as the Unigram NLLR.
Timestamp Classifiers
Followup work by Kanhabua and Norvag (2008) applied two filtering techniques to the unigrams in the model:
unigram is mentioned in 15 sentences in this paper.
Topics mentioned in this paper:
Schütze, Hinrich
Experimental Setup
256,873 unique unigrams and 4,494,222 unique bigrams.
Experimental Setup
We cluster unigrams (i = 1) and bigrams (i = 2).
Experimental Setup
For all experiments, |B1| = |B2| (except in cases where |B2| exceeds the number of unigrams; see below).
Models
The parameters d’, d”, and d’” are the discounts for unigrams , bigrams and trigrams, respectively, as defined by Chen and Goodman (1996, p. 20, (26)).
Models
232) is the set of unigram (resp.
Models
We cluster bigram histories and unigram histories separately and write pB(w3|w1w2) for the bigram cluster model and pB(w3|w2) for the unigram cluster model.
Related work
symbol | denotation: Σw (sum over all unigrams w)
unigram is mentioned in 23 sentences in this paper.
Topics mentioned in this paper:
Johnson, Mark and Demuth, Katherine and Frank, Michael
Abstract
We show how to model the task of inferring which objects are being talked about (and which words refer to which objects) as standard grammatical inference, and describe PCFG-based unigram models and adaptor grammar-based collocation models for the task.
Introduction
The unigram model we describe below corresponds most closely to the Frank
Introduction
2.1 Topic models and the unigram PCFG
Introduction
This leads to our first model, the unigram grammar, which is a PCFG.1
unigram is mentioned in 14 sentences in this paper.
Topics mentioned in this paper:
Börschinger, Benjamin and Johnson, Mark and Demuth, Katherine
Experiments 4.1 The data
Best performance for both the Unigram and the Bigram model in the GOLD-p condition is achieved under the left-right setting, in line with the standard analyses of /t/-deletion as primarily being determined by the preceding and the following context.
Experiments 4.1 The data
For the LEARN-p condition, the Bigram model still performs best in the left-right setting but the Unigram model’s performance drops
Experiments 4.1 The data
Unigram
Introduction
We find that models that capture bigram dependencies between underlying forms provide considerably more accurate estimates of those probabilities than corresponding unigram or “bag of words” models of underlying forms.
The computational model
Our models build on the Unigram and the Bigram model introduced in Goldwater et al.
The computational model
Figure 1 shows the graphical model for our joint Bigram model (the Unigram case is trivially recovered by generating the U_{i,j}s directly from L rather than from L_{U_{i,j-1}}).
unigram is mentioned in 22 sentences in this paper.
Topics mentioned in this paper:
Bollegala, Danushka and Weir, David and Carroll, John
A Motivating Example
(unigrams) development, civilization
Feature Expansion
{w1, ..., wN}, where the elements wi are either unigrams or bigrams that appear in the review d. We then represent a review d by a real-valued term-frequency vector d ∈ R^N, where the value of the j-th element dj is set to the total number of occurrences of the unigram or bigram wj in the review d. To find the suitable candidates to expand a vector d for the review d, we define a ranking score score(ui, d) for each base entry ui in the thesaurus as follows:
Feature Expansion
Moreover, we weight the relatedness scores for each word wj by its normalized term-frequency to emphasize the salient unigrams and bigrams in a review.
Feature Expansion
This is particularly important because we would like to score base entries ui considering all the unigrams and bigrams that appear in a review d, instead of considering each unigram or bigram individually.
Introduction
a unigram or a bigram of word lemma) in a review using a feature vector.
Sentiment Sensitive Thesaurus
We select unigrams and bigrams from each sentence.
Sentiment Sensitive Thesaurus
For the remainder of this paper, we will refer to unigrams and bigrams collectively as lexical elements.
Sentiment Sensitive Thesaurus
Previous work on sentiment classification has shown that both unigrams and bigrams are useful for training a sentiment classifier (Blitzer et al., 2007).
unigram is mentioned in 14 sentences in this paper.
Topics mentioned in this paper:
Hall, David and Klein, Dan
Experiments
Besides the heuristic baseline, we tried our model-based approach using Unigrams, Bigrams and Anchored Unigrams , with and without learning the parametric edit distances.
Learning
To find this maximizer for any given 7m, we need to find a marginal distribution over the edges connecting any two languages a and d. With this distribution, we calculate the expected “alignment unigrams.” That is, for each pair of phonemes x and y (or the empty phoneme ε), we need to find the quantity:
Message Approximation
In the context of transducers, previous authors have focused on a combination of n-best lists and unigram back-off models (Dreyer and Eisner, 2009), a schematic diagram of which is in Figure 2(d).
Message Approximation
Figure 2: Various topologies for approximating topologies: (a) a unigram model, (b) a bigram model, (c) the anchored unigram model, and (d) the n-best plus backoff model used in Dreyer and Eisner (2009).
Message Approximation
Another is to choose 7'(w) to be a unigram language model over the language in question with a geometric probability over lengths.
unigram is mentioned in 21 sentences in this paper.
Topics mentioned in this paper:
Celikyilmaz, Asli and Hakkani-Tur, Dilek
Experiments and Discussions
We use R-1 (recall against unigrams), R-2 (recall against bigrams), and R-SU4 (recall against skip-4 bigrams).
Experiments and Discussions
Note that R-2 is a measure of bigram recall and sumHLDA of HybHSumg is built on unigrams rather than bigrams.
Regression Model
(I) nGram Meta-Features (NMF): For each document cluster D, we identify the most frequent (non-stop-word) unigrams, i.e., vfreq = {wi}_{i=1}^r ⊂ V, where r is a model parameter giving the number of most frequent unigram features.
Regression Model
We measure observed unigram probabilities for each wi ∈ vfreq with pD(wi) = nD(wi) / Σ_{j=1}^{|V|} nD(wj), where nD(wi) is the number of times wi appears in D and |V| is the total number of unigrams.
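A minimal sketch of that relative-frequency computation, restricted to the r most frequent unigrams (the toy token list and stop-word set are illustrative):

```python
from collections import Counter

def frequent_unigram_probs(cluster_tokens, r, stopwords=frozenset()):
    """p_D(w) = n_D(w) / sum_j n_D(w_j), kept only for the r most frequent non-stop unigrams."""
    counts = Counter(w for w in cluster_tokens if w not in stopwords)
    total = sum(counts.values())  # denominator runs over the whole vocabulary
    return {w: n / total for w, n in counts.most_common(r)}

tokens = "oil spill cleanup crews oil coast spill response".split()
print(frequent_unigram_probs(tokens, r=3, stopwords={"the", "a"}))
```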
Regression Model
To characterize this feature, we reuse the r most frequent unigrams, i.e., wi ∈ vfreq.
Tree-Based Sentence Scoring
* sparse unigram distributions (sim1) at each topic l on com: similarity between p(wom,l | zom = l, com, vl) and p(wsn,l | zsn = l, com, vl)
Tree-Based Sentence Scoring
— sim1: We define two sparse (discrete) unigram distributions for candidate om and summary sn at each node l on a vocabulary identified with words generated by the topic at that node, vl ⊂ V. Given wom = {w1, ..., w_|om|}, let wom,l ⊂ wom be the set of words in om that are generated from topic zom at level l on path com.
Tree-Based Sentence Scoring
The discrete unigram distribution pom,l = p(wom,l | zom = l, com, vl) represents the probability over all words vl assigned to topic zom at level l, by sampling only for words in wom,l.
unigram is mentioned in 11 sentences in this paper.
Topics mentioned in this paper:
Elliott, Desmond and Keller, Frank
Abstract
The evaluation of computer-generated text is a notoriously difficult problem, however, the quality of image descriptions has typically been measured using unigram BLEU and human judgements.
Abstract
We estimate the correlation of unigram and Smoothed BLEU, TER, ROUGE-SU4, and Meteor against human judgements on two data sets.
Abstract
The main finding is that unigram BLEU has a weak correlation, and Meteor has the strongest correlation with human judgements.
Introduction
The main finding of our analysis is that TER and unigram BLEU are weakly correlated with human judgements.
Methodology
Unigram BLEU without a brevity penalty has been reported by Kulkarni et al.
Methodology
(2011) to perform a sentence-level analysis, setting n = 1 and no brevity penalty to get the unigram BLEU measure, or n = 4 with the brevity penalty to get the Smoothed BLEU measure.
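With n = 1 and no brevity penalty, BLEU reduces to clipped unigram precision against the reference; a small sketch of that measure (whitespace tokenization and the toy sentences are assumptions, not the paper's evaluation setup):

```python
from collections import Counter

def unigram_bleu(candidate, reference):
    """Clipped unigram precision, i.e. BLEU with n = 1 and no brevity penalty."""
    cand, ref = Counter(candidate.split()), Counter(reference.split())
    clipped = sum(min(c, ref[w]) for w, c in cand.items())
    return clipped / max(sum(cand.values()), 1)

print(unigram_bleu("a man rides a bike", "a man is riding a bicycle"))  # 3/5 = 0.6
```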
Methodology
We set dskip = 4 and award partial credit for unigram only matches, otherwise known as ROUGE-SU4.
unigram is mentioned in 21 sentences in this paper.
Topics mentioned in this paper:
Kobayashi, Hayato
Introduction
In Section 3, we theoretically derive the tradeoff formulae of the cutoff for unigram models, k-gram models, and topic models, each of which represents its perplexity with respect to a reduced vocabulary, under the assumption that the corpus follows Zipf’s law.
Perplexity on Reduced Corpora
3.1 Perplexity of Unigram Models
Perplexity on Reduced Corpora
Let us consider the perplexity of a unigram model learned from a reduced corpus.
Perplexity on Reduced Corpora
In unigram models, a predictive distribution p′ on a reduced corpus w′ can be simply calculated as p′(w′) = f(w′)/N′.
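That predictive distribution is just a relative frequency over the tokens that survive the vocabulary cutoff; a minimal sketch (the cutoff value here is arbitrary):

```python
from collections import Counter

def reduced_unigram_model(tokens, cutoff):
    """Keep only words occurring at least `cutoff` times, then p'(w') = f(w') / N'."""
    counts = Counter(tokens)
    kept = {w: f for w, f in counts.items() if f >= cutoff}
    n_prime = sum(kept.values())  # N': number of tokens in the reduced corpus
    return {w: f / n_prime for w, f in kept.items()}

corpus = "the cat sat on the mat the cat slept".split()
print(reduced_unigram_model(corpus, cutoff=2))  # {'the': 0.6, 'cat': 0.4}
```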
unigram is mentioned in 13 sentences in this paper.
Topics mentioned in this paper:
Chan, Yee Seng and Ng, Hwee Tou
Automatic Evaluation Metrics
Then, unigram matching is performed on the remaining words that are not matched using paraphrases.
Automatic Evaluation Metrics
Based on the matches, ParaEval will then elect to use either unigram precision or unigram recall as its score for the sentence pair.
Automatic Evaluation Metrics
Based on the number of word or unigram matches and the amount of string fragmentation represented by the alignment, METEOR calculates a score for the pair of strings.
unigram is mentioned in 24 sentences in this paper.
Topics mentioned in this paper:
Andreevskaia, Alina and Bergler, Sabine
Experiments
Consistent with findings in the literature (Cui et al., 2006; Dave et al., 2003; Gamon and Aue, 2005), on the large corpus of movie review texts, the in-domain-trained system based solely on unigrams had lower accuracy than the similar system trained on bigrams.
Experiments
On sentences, however, we have observed an inverse pattern: unigrams performed better than bigrams and trigrams.
Experiments
Due to lower frequency of higher-order n-grams (as opposed to unigrams ), higher-order n-gram language models are more sparse, which increases the probability of missing a particular sentiment marker in a sentence (Table 33).
Factors Affecting System Performance
System runs with unigrams , bigrams, and trigrams as features and with different training set sizes are presented.
Lexicon-Based Approach
One of the limitations of general lexicons and dictionaries, such as WordNet (Fellbaum, 1998), as training sets for sentiment tagging systems is that they contain only definitions of individual words and, hence, only unigrams could be effectively learned from dictionary entries.
Lexicon-Based Approach
Since the structure of WordNet glosses is fairly different from that of other types of corpora, we developed a system that used the list of human-annotated adjectives from (Hatzivassiloglou and McKeown, 1997) as a seed list and then learned additional unigrams
unigram is mentioned in 14 sentences in this paper.
Topics mentioned in this paper:
Johnson, Mark
Word segmentation with adaptor grammars
Figure 1: The unigram word adaptor grammar, which uses a unigram model to generate a sequence of words, where each word is a sequence of phonemes.
Word segmentation with adaptor grammars
3.1 Unigram word adaptor grammar
Word segmentation with adaptor grammars
(2007a) presented an adaptor grammar that defines a unigram model of word segmentation and showed that it performs as well as the unigram DP word segmentation model presented by (Goldwater et al., 2006a).
unigram is mentioned in 25 sentences in this paper.
Topics mentioned in this paper:
Tan, Chenhao and Lee, Lillian and Pang, Bo
Introduction
twitter unigram TTT * YES (54%) twitter bigram TTT * YES (52%) personal unigram MT * YES (52%) personal bigram — NO (48%)
Introduction
We measure a tweet’s similarity to expectations by its score according to the relevant language model, (1/|T|) Σ_{w∈T} log p(w), where T refers to either all the unigrams (unigram model) or all and only bigrams (bigram model). We trained a Twitter-community language model from our 558M unpaired tweets, and personal language models from each author’s tweet history.
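The score in this excerpt is an average per-token log-probability under the chosen language model; a minimal sketch for the unigram case (the add-alpha smoothing, the <unk> handling, and the toy data are assumptions, not the authors' setup):

```python
import math
from collections import Counter

def train_unigram_lm(tokens, alpha=0.1):
    """Add-alpha smoothed unigram LM with an <unk> entry for unseen words."""
    counts, vocab = Counter(tokens), set(tokens) | {"<unk>"}
    total = sum(counts.values()) + alpha * len(vocab)
    return {w: (counts[w] + alpha) / total for w in vocab}

def avg_logprob(tweet_tokens, lm):
    """(1/|T|) * sum_{w in T} log p(w), mapping unseen words to <unk>."""
    logps = [math.log(lm.get(w, lm["<unk>"])) for w in tweet_tokens]
    return sum(logps) / len(logps)

lm = train_unigram_lm("follow us for breaking news and updates".split())
print(avg_logprob("breaking news update".split(), lm))
```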
Introduction
headline unigram TT YES (53%) headline bigram TTTT * YES (52%)
unigram is mentioned in 14 sentences in this paper.
Topics mentioned in this paper:
Tang, Duyu and Wei, Furu and Yang, Nan and Zhou, Ming and Liu, Ting and Qin, Bing
Related Work
We learn embedding for unigrams , bigrams and trigrams separately with same neural network and same parameter setting.
Related Work
The contexts of unigram (bigram/trigram) are the surrounding unigrams (bigrams/trigrams), respectively.
Related Work
25$ employs the embedding of unigrams , bigrams and trigrams separately and conducts the matrix-vector operation of ac on the sequence represented by columns in each lookup table.
unigram is mentioned in 13 sentences in this paper.
Topics mentioned in this paper:
Mukherjee, Arjun and Liu, Bing
Model
For notational convenience, we use terms to denote both words ( unigrams ) and phrases (n-grams).
Phrase Ranking based on Relevance
Topics in most topic models like LDA are usually unigram distributions.
Phrase Ranking based on Relevance
For each word, a topic is sampled first, then its status as a unigram or bigram is sampled, and finally the word is sampled from a topic-specific unigram or bigram distribution.
Phrase Ranking based on Relevance
Yet another thread of research post-processes the discovered topical unigrams to form multi-word phrases using likelihood scores (Blei and Lafferty, 2009).
unigram is mentioned in 9 sentences in this paper.
Topics mentioned in this paper:
Blunsom, Phil and Cohn, Trevor
Background
This work differs from previous Bayesian models in that we explicitly model a complex backoff path using a hierarchical prior, such that our model jointly infers distributions over tag trigrams, bigrams and unigrams and whole words and their character level representation.
Experiments
Note that the bigram PYP-HMM outperforms the closely related BHMM (the main difference being that we smooth tag bigrams with unigrams ).
The PYP-HMM
The trigram transition distribution, Tij, is drawn from a hierarchical PYP prior which backs off to a bigram Bj and then a unigram U distribution,
The PYP-HMM
This allows the modelling of trigram tag sequences, while smoothing these estimates with their corresponding bigram and unigram distributions.
The PYP-HMM
That is, each table at one level is equivalent to a customer at the next deeper level, creating the invariants: K_{hw} = n_{uw} and K_{uw} = n_{∅w}, where u = t_{l−1} indicates the unigram backoff context of h. The recursion terminates at the lowest level where the base distribution is static.
unigram is mentioned in 9 sentences in this paper.
Topics mentioned in this paper:
Li, Jiwei and Ott, Myle and Cardie, Claire and Hovy, Eduard
Experiments
We train the OVR classifier on three sets of features: LIWC, Unigram, and POS.
Experiments
In particular, the three-class classifier is around 65% accurate at distinguishing between Employee, Customer, and Turker for each of the domains using Unigram, significantly higher than a random guess.
Experiments
Best performance is achieved on Unigram features, constantly outperforming LIWC and POS features in both three-class and two-class settings in the hotel domain.
Introduction
In the examples in Table 1, we trained a linear SVM classifier on Ott’s Chicago-hotel dataset on unigram features and tested it on a couple of different domains (the details of data acquisition are illustrated in Section 3).
Introduction
Table 1: SVM performance on datasets for a classifier trained on Chicago hotel review based on Unigram feature.
unigram is mentioned in 9 sentences in this paper.
Topics mentioned in this paper:
Nuhn, Malte and Ney, Hermann
Definitions
Given a ciphertext f_1^N, we define the unigram count N_f of f ∈ V_f as
Definitions
Similarly, we define language model matrices S for the unigram and the bigram case.
Definitions
The unigram language model Sf is defined as
Introduction
In Section 4 we show that decipherment using a unigram language model corresponds to solving a linear sum assignment problem (LSAP).
unigram is mentioned in 9 sentences in this paper.
Topics mentioned in this paper:
Razmara, Majid and Siahbani, Maryam and Haffari, Reza and Sarkar, Anoop
Conclusion
However, OOVs can be considered as n-grams (phrases) instead of unigrams.
Conclusion
In this scenario, we also can look for paraphrases and translations for phrases containing OOVs and add them to the phrase-table as new translations along with the translations for unigram OOVs.
Experiments & Results 4.1 Experimental Setup
Table 4: Intrinsic results of different types of graphs when using unigram nodes on Europarl.
Experiments & Results 4.1 Experimental Setup
Type | Node | MRR % | RCL %
Bipartite | unigram | 5.2 | 12.5
Bipartite | bigram | 6.8 | 15.7
Tripartite | unigram | 5.9 | 12.6
Tripartite | bigram | 6.9 | 15.9
Baseline | bigram | 3.9 | 7.7
Experiments & Results 4.1 Experimental Setup
Table 5: Results on using unigram or bigram nodes.
unigram is mentioned in 9 sentences in this paper.
Topics mentioned in this paper:
Kalchbrenner, Nal and Grefenstette, Edward and Blunsom, Phil
Experiments
The baselines NB and BINB are Naive Bayes classifiers with, respectively, unigram features and unigram and bigram features.
Experiments
SVM is a support vector machine with unigram and bigram features.
Experiments
unigram , POS, head chunks 91.0
Introduction
On the hand-labelled test set, the network achieves a greater than 25% reduction in the prediction error with respect to the strongest unigram and bigram baseline reported in Go et al.
unigram is mentioned in 9 sentences in this paper.
Topics mentioned in this paper:
Kim, Joohyun and Mooney, Raymond
Reranking Features
Long-range Unigram .
Reranking Features
in the parse tree: f(L2 ⇝ left) = 1 and f(L4 ⇝ turn) = 1. Two-level Long-range Unigram.
Reranking Features
Unigram .
unigram is mentioned in 8 sentences in this paper.
Topics mentioned in this paper:
Bramsen, Philip and Escobar-Molano, Martha and Patel, Ami and Alonso, Rafael
Abstract
Previous work in traditional text classification and its variants — such as sentiment analysis — has achieved successful results by using the bag-of-words representation; that is, by treating text as a collection of words with no interdependencies, training a classifier on a large feature set of word unigrams which appear in the corpus.
Abstract
Few of these tactics would be effectively encapsulated by word unigrams .
Abstract
Many would be better modeled by POS tag unigrams (with no word information) or by longer n-grams consisting of either words, POS tags, or a combination of the two.
unigram is mentioned in 8 sentences in this paper.
Topics mentioned in this paper:
Wang, Xiaolin and Utiyama, Masao and Finch, Andrew and Sumita, Eiichiro
Complexity Analysis
For the monolingual bigram model, the number of states in the HMM is U times more than that of the monolingual unigram model, as the states at specific position of F are not only related to the length of the current word, but also related to the length of the word before it.
Complexity Analysis
Thus its complexity is U² times the unigram model’s complexity:
Complexity Analysis
( unigram ) 0.729 0.804 3 s 50 s Prop.
Methods
This section uses a unigram model for description convenience, but the method can be extended to n-gram models.
unigram is mentioned in 7 sentences in this paper.
Topics mentioned in this paper:
Wintrode, Jonathan and Khudanpur, Sanjeev
Term and Document Frequency Statistics
Figure 4: Difference between observed and predicted IDFw for Tagalog unigrams .
Term and Document Frequency Statistics
3.1 Unigram Probabilities
Term and Document Frequency Statistics
We encounter the burstiness property of words again by looking at unigram occurrence probabilities.
unigram is mentioned in 7 sentences in this paper.
Topics mentioned in this paper:
Zhang, Hui and Chiang, David
Smoothing on count distributions
p′(w | u′). For the example above, the estimates for the unigram model p′(w) are p′(cat) ≈ 0.489 and p′(dog) ≈ 0.511.
Smoothing on count distributions
For the example above, the count distributions used for the unigram distribution would be:
              | r = 0 | r = 1
p(c(cat) = r) | 0.14  | 0.86
p(c(dog) = r) | 0.1   | 0.9
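Those unigram estimates are consistent with normalizing the expected counts implied by the count distributions (0.86 and 0.9 here); a quick arithmetic check:

```python
def expected_count(count_dist):
    """E[c(w)] for a distribution over integer count values."""
    return sum(r * p for r, p in count_dist.items())

count_dists = {"cat": {0: 0.14, 1: 0.86}, "dog": {0: 0.1, 1: 0.9}}
expectations = {w: expected_count(d) for w, d in count_dists.items()}
total = sum(expectations.values())
print({w: e / total for w, e in expectations.items()})  # ~0.489 and ~0.511
```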
Smoothing on integral counts
Absolute discounting chooses p′(w | u′) to be the maximum-likelihood unigram distribution; under KN smoothing (Kneser and Ney, 1995), it is chosen to make p in (2) satisfy the following constraint for all (n − 1)-grams u′w:
Word Alignment
Following common practice in language modeling, we use the unigram distribution p( f ) as the lower-order distribution.
Word Alignment
As shown in Table 1, for KN smoothing, interpolation with the unigram distribution performs the best, while for WB smoothing, interestingly, interpolation with the uniform distribution performs the best.
Word Alignment
In WB smoothing, p’( f) is the empirical unigram distribution.
unigram is mentioned in 7 sentences in this paper.
Topics mentioned in this paper:
Saluja, Avneesh and Hassan, Hany and Toutanova, Kristina and Quirk, Chris
Evaluation
In our first set of experiments, we looked at the impact of choosing bigrams over unigrams as our basic unit of representation, along with performance of LP (Eq.
Evaluation
Using unigrams (“SLP l-gram”) actually does worse than the baseline, indicating the importance of focusing on translations for sparser bigrams.
Evaluation
It is relatively straightforward to combine both unigrams and bigrams in one source graph, but for experimental clarity we did not mix these phrase lengths.
Generation & Propagation
Although our technique applies to phrases of any length, in this work we concentrate on unigram and bigram phrases, which provides substantial computational cost savings.
Introduction
Unlike previous work (Irvine and Callison-Burch, 2013a; Razmara et al., 2013), we use higher order n-grams instead of restricting to unigrams , since our approach goes beyond OOV mitigation and can enrich the entire translation model by using evidence from monolingual text.
Related Work
Recent improvements to BLI (Tamura et al., 2012; Irvine and Callison-Burch, 2013b) have contained a graph-based flavor by presenting label propagation-based approaches using a seed lexicon, but evaluation is once again done on top-1 or top-3 accuracy, and the focus is on unigrams .
Related Work
(2013) and Irvine and Callison-Burch (2013a) conduct a more extensive evaluation of their graph-based BLI techniques, where the emphasis and end-to-end BLEU evaluations concentrated on OOVs, i.e., unigrams , and not on enriching the entire translation model.
unigram is mentioned in 7 sentences in this paper.
Topics mentioned in this paper:
Mochihashi, Daichi and Yamada, Takeshi and Ueda, Naonori
Inference
While previous work used p(k|Θ) = (1 − p($))^{k−1} p($), this is only true for unigrams.
Inference
the number of tables tew for w in word unigrams .
Nested Pitman-Yor Language Model
Thus far we have assumed that the unigram G1 is already given, but of course it should also be generated as G1 ~ PY(G0, d, θ).
Nested Pitman-Yor Language Model
Note that this is different from unigrams, which are posterior distributions given the data.
Nested Pitman-Yor Language Model
When a word w is generated from its parent at the unigram node, it means that w
Pitman-Yor process and n-gram models
Suppose we have a unigram word distribution G1 = {p(·)}, where · ranges over each word in the lexicon.
Pitman-Yor process and n-gram models
In this representation, each n-gram context h (including the null context ε for unigrams) is a Chinese restaurant whose customers are the n-gram counts seated over the tables 1 ··· t_hw.
unigram is mentioned in 7 sentences in this paper.
Topics mentioned in this paper:
Yannakoudakis, Helen and Briscoe, Ted and Medlock, Ben
Approach
(a) Word unigrams (b) Word bigrams
Approach
(a) PoS unigrams (b) PoS bigrams (c) PoS trigrams
Approach
Word unigrams and bigrams are lower-cased and used in their inflected forms.
Previous work
The Bayesian Essay Test Scoring sYstem (BETSY) (Rudner and Liang, 2002) uses multinomial or Bernoulli Naive Bayes models to classify texts into different classes (e.g. pass/fail, grades A–F) based on content and style features such as word unigrams and bigrams, sentence length, number of verbs, noun–verb pairs etc.
Validity tests
(a) word unigrams within a sentence (b) word bigrams within a sentence (c) word trigrams within a sentence
unigram is mentioned in 6 sentences in this paper.
Topics mentioned in this paper:
Szarvas, Gy"orgy
Conclusions
Our finding that token unigram features are capable of solving the task accurately agrees with the results of previous works on hedge classification ((Light et al., 2004), (Med-
Methods
For trigrams, bigrams and unigrams — processed separately — we calculated a new class-conditional probability for each feature x, discarding those observations of x in speculative instances where x was not among the two highest ranked candidates.
Results
About half of these were the kind of phrases that had no unigram components of themselves in the feature set, so these could be regarded as meaningful standalone features.
Results
Our model using just unigram features achieved a BEP(spec) score of 78.68% and an Fβ=1(spec) score of 80.23%, which means that using bigram and trigram hedge cues here significantly improved the performance (the differences in BEP(spec) and Fβ=1(spec) scores were 5.23% and 4.97%, respectively).
Results
Our experiments revealed that in radiology reports, which mainly concentrate on listing the identified diseases and symptoms (facts) and the physician’s impressions (speculative parts), detecting hedge instances can be performed accurately using unigram features.
unigram is mentioned in 6 sentences in this paper.
Topics mentioned in this paper:
Das, Dipanjan and Smith, Noah A.
Data and Task
Notice the high lexical overlap between the two sentences ( unigram overlap of 100% in one direction and 72% in the other).
Data and Task
19 is another true paraphrase pair with much lower lexical overlap ( unigram overlap of 50% in one direction and 30% in the other).
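Those percentages are directional unigram overlaps (the fraction of one sentence's word types that also occur in the other); a quick sketch with made-up sentences:

```python
def unigram_overlap(src, tgt):
    """Fraction of unique unigrams in `src` that also occur in `tgt`."""
    src_set, tgt_set = set(src.split()), set(tgt.split())
    return len(src_set & tgt_set) / len(src_set)

s1 = "the senate passed the bill on friday"
s2 = "lawmakers approved the bill friday"
print(unigram_overlap(s1, s2), unigram_overlap(s2, s1))  # one value per direction
```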
Experimental Evaluation
(2006), using features calculated directly from S1 and S2 without recourse to any hidden structure: proportion of word unigram matches, proportion of lemmatized unigram matches, BLEU score (Papineni et al., 2001), BLEU score on lemmatized tokens, F measure (Turian et al., 2003), difference of sentence length, and proportion of dependency relation overlap.
Experimental Evaluation
This is accomplished by eliminating lines 12 and 13 from the definition of pm and redefining pword to be the unigram word distribution estimated from the Gigaword corpus, as in G0, without the help of WordNet.
QG for Paraphrase Modeling
Here a_w is the Good-Turing unigram probability estimate of a word w from the Gigaword corpus (Graff, 2003).
QG for Paraphrase Modeling
As noted, the distributions pm, the word unigram weights in Eq.
unigram is mentioned in 6 sentences in this paper.
Topics mentioned in this paper:
Sun, Xu and Wang, Houfeng and Li, Wenjie
System Architecture
To derive word features, first of all, our system automatically collects a list of word unigrams and bigrams from the training data.
System Architecture
To avoid overfitting, we only collect the word unigrams and bigrams whose frequency is larger than 2 in the training set.
System Architecture
This list of word unigrams and bigrams is then used as a unigram-dictionary and a bigram-dictionary to generate word-based unigram and bigram features.
unigram is mentioned in 6 sentences in this paper.
Topics mentioned in this paper:
Yao, Xuchen and Van Durme, Benjamin and Clark, Peter
Experiments
This is because the weights of unigram to trigram features in a log-linear CRF model are a balanced consequence of maximization.
Experiments
A unigram feature might end up with lower weight because another trigram containing this unigram gets a higher weight.
Experiments
Then we would have missed this feature if we only used top unigram features.
Method
Unigram QA Model The QA system uses up to trigram features (Table 1 shows examples of unigram and bigram features).
Method
We drop this strict constraint (which may need further smoothing) and only use unigram features, not by simply extracting “good” unigram features from the trained model, but by retraining the model with only unigram features.
unigram is mentioned in 6 sentences in this paper.
Topics mentioned in this paper:
Baumel, Tal and Cohen, Raphael and Elhadad, Michael
Algorithms
When constructing a summary, we update the unigram distribution of the constructed summary so that it includes a smoothed distribution of the previous summaries in order to eliminate redundancy between the successive steps in the chain.
Algorithms
For example, when we summarize the documents that were retrieved as a result to the first query, we calculate the unigram distribution in the same manner as we did in Focused KLSum; but for the second query, we calculate the unigram distribution as if all the sentences we selected for the previous summary were selected for the current query too, with a damping factor.
Algorithms
In this variant, the Unigram Distribution estimate of word X is computed as:
Previous Work
KLSum adopts a language model approach to compute relevance: the documents in the input set are modeled as a distribution over words (the original algorithm uses a unigram distribution over the bag of words in documents D).
Previous Work
KLSum is a sentence extraction algorithm: it searches for a subset of the sentences in D with a unigram distribution as similar as possible to that of the overall collection D, but with a limited length.
Previous Work
After the words are classified, the algorithm uses a KLSum variant to find the summary that best matches the unigram distribution of topic specific words.
unigram is mentioned in 6 sentences in this paper.
Topics mentioned in this paper:
Fleischman, Michael and Roy, Deb
Evaluation
The remaining 93 unlabeled games are used to train unigram , bigram, and trigram grounded language models.
Evaluation
Only unigrams, bigrams, and trigrams that are not proper names, appear greater than three times, and are not composed only of stop words were used.
Evaluation
with traditional unigram , bigram, and trigram language models generated from a combination of the closed captioning transcripts of all training games and data from the switchboard corpus (see below).
Linguistic Mapping
In the discussion that follows, we describe a method for estimating unigram grounded language models.
unigram is mentioned in 5 sentences in this paper.
Topics mentioned in this paper:
Ramanath, Rohan and Liu, Fei and Sadeh, Norman and Smith, Noah A.
Approach
0,; is generated by repeatedly sampling from a distribution over terms that includes all unigrams and bigrams except those that occur in fewer than 5% of the documents and in more than 98% of the documents.
Approach
models (e. g., a bigram may be generated by as many as three draws from the emission distribution: once for each unigram it contains and once for the bigram).
Evaluation
We derived unigram tfidf vectors for each section in each of 50 randomly sampled policies per category.
Experiment
The implementation uses unigram features and cosine similarity.
Experiment
Our second baseline is latent Dirichlet allocation (LDA; Blei et al., 2003), with ten topics and online variational Bayes for inference (Hoffman et al., 2010). To more closely match our models, LDA is given access to the same unigram and bigram tokens.
unigram is mentioned in 5 sentences in this paper.
Topics mentioned in this paper:
Li, Fangtao and Gao, Yang and Zhou, Shuchang and Si, Xiance and Dai, Decheng
Experiments
Besides unigram and bigram, the most effective textual feature is URL.
Proposed Features
3.1.1 Unigrams and Bigrams The most common type of feature for text classi-
Proposed Features
feature selection method χ² (Yang and Pedersen, 1997) to select the top 200 unigrams and bigrams as features.
Proposed Features
The top ten unigrams related to deceptive answers are shown on Table 1.
unigram is mentioned in 5 sentences in this paper.
Topics mentioned in this paper:
Roark, Brian and Allauzen, Cyril and Riley, Michael
Experimental results
Note that unigrams in the models are never pruned, hence all models assign probabilities over an identical vocabulary and perplexity is comparable across models.
Marginal distribution constraints
Thus the unigram distribution is with respect to the bigram model, the bigram model is with respect to the trigram model, and so forth.
Model constraint algorithm
Thus we process each history length in descending order, finishing with the unigram state.
Model constraint algorithm
This can be particularly clearly seen at the unigram state, which has an arc for every unigram (the size of the vocabulary): for every bigram state (also order of the vocabulary), in the naive algorithm we must look for every possible arc.
unigram is mentioned in 4 sentences in this paper.
Topics mentioned in this paper:
Huang, Zhiheng and Chang, Yi and Long, Bo and Crespo, Jean-Francois and Dong, Anlei and Keerthi, Sathiya and Wu, Su-Lin
Experiments
In particular, we use the unigrams of the current and its neighboring words, word bigrams, prefixes and suffixes of the current word, capitalization, all-number, punctuation, and tag bigrams for POS, CoNLL2000 and CoNLL 2003 datasets.
Experiments
For supertag dataset, we use the same features for the word inputs, and the unigrams and bigrams for gold POS inputs.
Problem formulation
To simplify the discussion, we divide the features into two groups: unigram label features and bi-gram label features.
Problem formulation
Unigram features are of the form fk(yt, xt), which are concerned with the current label and arbitrary feature patterns from the input sequence.
unigram is mentioned in 4 sentences in this paper.
Topics mentioned in this paper:
Bojar, Ondřej and Kos, Kamil and Mareċek, David
Extensions of SemPOS
For the purposes of the combination, we compute BLEU only on unigrams up to fourgrams (denoted BLEU1, ..., BLEU4) but including the brevity penalty as usual.
Extensions of SemPOS
This is also confirmed by the observation that using BLEU alone is rather unreliable for Czech and BLEU-1 (which judges unigrams only) is even worse.
Problems of BLEU
Fortunately, there are relatively few false positives in n-gram based metrics: 6.3% of unigrams and far fewer higher n-grams.
Problems of BLEU
This amounts to 34% of running unigrams , giving enough space to differ in human judgments and still remain unscored.
unigram is mentioned in 4 sentences in this paper.
Topics mentioned in this paper:
Gormley, Matthew R. and Mitchell, Margaret and Van Durme, Benjamin and Dredze, Mark
Approaches
We consider both template unigrams and bigrams, combining two templates in sequence.
Approaches
Constructing all feature template unigrams and bigrams would yield an unwieldy number of features.
Experiments
Our primary feature set IGC consists of 127 template unigrams that emphasize coarse properties (i.e., properties 7, 9, and 11 in Table 1).
Experiments
However, the original unigram Bjorkelund features (Bdeflmemh), which were tuned for a high-resource model, obtain higher F1 than our information gain set using the same features in unigram and bigram templates (IGB).
unigram is mentioned in 4 sentences in this paper.
Topics mentioned in this paper:
P, Deepak and Visweswariah, Karthik
Conclusions and Future Work
We model and harness lexical correlations using translation models, in the company of unigram language models that are used to characterize reply posts, and formulate a clustering-based EM approach for solution identification.
Introduction
We model the lexical correlation and solution post character using regularized translation models and unigram language models respectively.
Our Approach
Consider a unigram language model S that models the lexical characteristics of solution posts, and a translation model T that models the lexical correlation between problems and solutions.
Our Approach
Consider the post and reply vocabularies to be of sizes A and B respectively; then, the translation model would have A x B variables, whereas the unigram language model has only B variables.
unigram is mentioned in 4 sentences in this paper.
Topics mentioned in this paper:
Prabhakaran, Vinodkumar and Rambow, Owen
Predicting Direction of Power
Baseline (Always Superior) | 52.54
Baseline (Word Unigrams + Bigrams) | 68.56
THRNew | 55.90
THRPR | 54.30
DIAPR | 54.05
THRPR + THRNew | 61.49
DIAPR + THRPR + THRNew | 62.47
LEX | 70.74
LEX + DIAPR + THRPR | 67.44
LEX + DIAPR + THRPR + THRNew | 68.56
BEST (= LEX + THRNew) | 73.03
BEST (Using p1 features only) | 72.08
BEST (Using IMt features only) | 72.11
BEST (Using Mt only) | 71.27
BEST (No Indicator Variables) | 72.44
Predicting Direction of Power
We found the best setting to be using both unigrams and bigrams for all three types of ngrams, by tuning in our dev set.
Predicting Direction of Power
We also use a stronger baseline using word unigrams and bigrams as features, which obtained an accuracy of 68.6%.
unigram is mentioned in 4 sentences in this paper.
Topics mentioned in this paper:
Kruengkrai, Canasai and Uchimoto, Kiyotaka and Kazama, Jun'ichi and Wang, Yiou and Torisawa, Kentaro and Isahara, Hitoshi
Training method
Table 2: Unigram features.
Training method
We broadly classify features into two categories: unigram and bigram features.
Training method
Unigram features: Table 2 shows our unigram features.
unigram is mentioned in 4 sentences in this paper.
Topics mentioned in this paper:
Garera, Nikesh and Yarowsky, David
Corpus Details
As our reference algorithm, we used the current state-of-the-art system developed by Boulis and Ostendorf (2005) using unigram and bigram features in a SVM framework.
Corpus Details
For each conversation side, a training example was created using unigram and bigram features with tf-idf weighting, as done in standard text classification approaches.
Corpus Details
Also, the named entity “Mike” shows up as a discriminative unigram; this may be due to the self-introduction at the beginning of the conversations and “Mike” being a common male name.
unigram is mentioned in 4 sentences in this paper.
Topics mentioned in this paper:
Lavergne, Thomas and Cappé, Olivier and Yvon, François
Conditional Random Fields
Using only unigram features {f_{y,x}}_{(y,x)∈Y×X} results in a model equivalent to a simple bag-of-tokens position-by-position logistic regression model.
Conditional Random Fields
The same idea can be used when the set {μ_{y,x_{t+1}}}_{y∈Y} of unigram features is sparse.
Conditional Random Fields
The features used in Nettalk experiments take the form f_{y,w} (unigram) and f_{y′,y,w} (bigram), where w is an n-gram of letters.
unigram is mentioned in 4 sentences in this paper.
Topics mentioned in this paper:
Lin, Shih-Hsiang and Chen, Berlin
Experimental results and discussions 6.1 Baseline experiments
Another is that BC utilizes a rich set of features to characterize a given spoken sentence while LM is constructed solely on the basis of the lexical ( unigram ) information.
Experimental setup 5.1 Data
They are, respectively, the ROUGE-1 (unigram) measure, the ROUGE-2 (bigram) measure and the ROUGE-L (longest common subsequence) measure (Lin, 2004).
Proposed Methods
In the LM approach, each sentence in a document can be simply regarded as a probabilistic generative model consisting of a unigram distribution (the so-called “bag-of-words” assumption) for generating the document (Chen et al., 2009):
Proposed Methods
To mitigate this potential defect, a unigram probability estimated from a general collection, which models the general distribution of words in the target language, is often used to smooth the sentence model.
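A minimal sketch of such a smoothed sentence likelihood (the Jelinek-Mercer-style interpolation weight and the toy texts are assumptions; the paper may use a different smoothing scheme):

```python
import math
from collections import Counter

def unigram_probs(tokens):
    """Maximum-likelihood unigram distribution over a token list."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}

def sentence_score(sentence, document, collection, lam=0.7):
    """log P(document | sentence model), smoothed with a collection unigram model."""
    p_sent, p_bg = unigram_probs(sentence), unigram_probs(collection)
    return sum(math.log(lam * p_sent.get(w, 0.0) + (1 - lam) * p_bg.get(w, 1e-9))
               for w in document)

doc = "the storm closed schools and roads across the region".split()
background = doc + "weather officials said recovery work continues".split()
candidate = "the storm closed many roads".split()
print(sentence_score(candidate, doc, background))
```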
unigram is mentioned in 4 sentences in this paper.
Topics mentioned in this paper:
Mayfield, Elijah and Adamson, David and Penstein Rosé, Carolyn
Cue Discovery for Content Selection
{x1, ..., xm} consists of m unigram features representing the observed vocabulary used in our corpus.
Experimental Results
We use a binary unigram feature space, and we perform 7-fold cross-validation.
Prediction
One challenge of this approach is our underlying unigram feature space - tree-based algorithms are generally poor classifiers for the high-dimensionality, low-information features in a lexical feature space (Han et al., 2001).
Prediction
splits than would unigrams alone.
unigram is mentioned in 4 sentences in this paper.
Topics mentioned in this paper:
Zhong, Zhi and Ng, Hwee Tou
Experiments
We use the Lemur toolkit (Ogilvie and Callan, 2001) version 4.11 as the basic retrieval tool, and select the default unigram LM approach based on KL-divergence and Dirichlet-prior smoothing method in Lemur as our basic retrieval approach.
The Language Modeling Approach to IR
The most commonly used language model in IR is the unigram model, in which terms are assumed to be independent of each other.
The Language Modeling Approach to IR
In the rest of this paper, language model will refer to the unigram language model.
The Language Modeling Approach to IR
With the unigram model, the negative KL-divergence between model θ_q of query q and model θ_d of document d is calculated as follows:
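A rough sketch of ranking by negative KL-divergence with a Dirichlet-smoothed document model: for a maximum-likelihood query model this reduces, up to a query-only constant, to Σ_w p(w|θ_q) log p(w|θ_d). The prior value and toy texts below are illustrative assumptions:

```python
import math
from collections import Counter

def dirichlet_doc_model(doc_tokens, collection_probs, mu=10):
    """p(w | theta_d) with Dirichlet-prior smoothing against the collection model.
    mu is kept small for this toy example; values around 2000 are typical in practice."""
    counts, dlen = Counter(doc_tokens), len(doc_tokens)
    return lambda w: (counts[w] + mu * collection_probs.get(w, 1e-9)) / (dlen + mu)

def neg_kl_score(query_tokens, doc_model):
    """-KL(theta_q || theta_d), dropping the query-entropy term (constant across documents)."""
    q_counts, qlen = Counter(query_tokens), len(query_tokens)
    return sum((c / qlen) * math.log(doc_model(w)) for w, c in q_counts.items())

collection = "solar power plant energy grid storage battery cost".split()
coll_probs = {w: c / len(collection) for w, c in Counter(collection).items()}
docs = {"d1": "solar power and battery storage cost".split(),
        "d2": "grid maintenance schedule".split()}
query = "solar battery cost".split()
ranked = sorted(docs, reverse=True,
                key=lambda d: neg_kl_score(query, dirichlet_doc_model(docs[d], coll_probs)))
print(ranked)  # documents ordered by -KL(query model || document model)
```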
unigram is mentioned in 4 sentences in this paper.
Topics mentioned in this paper:
Carpuat, Marine and Daume III, Hal and Henry, Katharine and Irvine, Ann and Jagarlamudi, Jagadeesh and Rudinger, Rachel
New Sense Indicators
As such, we compute unigram log probabilities (via smoothed relative frequencies) of each word under consideration in the old domain and the new domain.
New Sense Indicators
However, we do not simply want to capture unusual words, but words that are unlikely in context, so we also need to look at the respective unigram log probabilities: 635' and Eflgw.
New Sense Indicators
From these four values, we compute corpus-level (and therefore type-based) statistics of the new domain n-gram log probability (Eflgw, the difference between the n-gram probabilities in each domain (623” — 6:51), the difference between the n-gram and unigram probabilities in the new domain (EQSW — 633‘”), and finally the combined difference: 623"” — [SSW + 63:: — 635’).
unigram is mentioned in 4 sentences in this paper.
Topics mentioned in this paper:
Kaji, Nobuhiro and Fujiwara, Yasuhiro and Yoshinaga, Naoki and Kitsuregawa, Masaru
Introduction
where we explicitly distinguish the unigram feature function φ^U_k and the bigram feature function φ^B_k. Comparing the form of the two functions, we can see that our discussion on HMMs can be extended to perceptrons by substituting Σ_k w_k φ^U_k(x, y_n) and Σ_k w_k φ^B_k(x, y_{n−1}, y_n) for log p(x_n|y_n) and log p(y_n|y_{n−1}).
Introduction
For unigram features, we compute the maximum, max_y Σ_k w_k φ^U_k(x, y), as a preprocess in
Introduction
In POS tagging, we used unigrams of the current and its neighboring words, word bigrams, prefixes and suffixes of the current word, capitalization, and tag bigrams.
unigram is mentioned in 4 sentences in this paper.
Topics mentioned in this paper:
Kauchak, David
Why Does Unsimplified Data Help?
This is particularly important for unigrams (i.e.
Why Does Unsimplified Data Help?
Table 3 shows the percentage of unigrams , bigrams and trigrams from the two test sets that are found in the simple and normal training data.
Why Does Unsimplified Data Help?
Even at the unigram level, the normal data contained significantly more of the test set unigrams than the simple data.
unigram is mentioned in 4 sentences in this paper.
Topics mentioned in this paper:
Levitan, Rivka and Elson, David
Features
We also count the number of unigrams the two transcripts have in common and the length, absolute and relative, of the longest unigram overlap.
Features
In addition, we look at the number of characters and unigrams and the audio duration of each query, with the intuition that the length of a query may be correlated with its likelihood of being retried (or a retry).
Prediction task
T-tests between the two categories showed that all edit distance features—character, word, reduced, and phonetic; raw and normalized—are significantly more similar between retry query pairs. Similarly, the number of unigrams the two queries have in common is significantly higher for retries.
unigram is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Sauper, Christina and Haghighi, Aria and Barzilay, Regina
Experiments
The DISCRIMINATIVE baseline for this task is a standard maximum entropy discriminative binary classifier over unigrams .
Model
Global Distributions: At the global level, we draw several unigram distributions: a global background distribution θ_B and attribute distributions θ_a for each attribute.
Model
Product Level: For the ith product, we draw property unigram distributions 6351, .
unigram is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Pauls, Adam and Klein, Dan
Experiments
We used a simple measure for isolating the syntactic likelihood of a sentence: we take the log-probability under our model and subtract the log-probability under a unigram model, then normalize by the length of the sentence. This measure, which we call the syntactic log-odds ratio (SLR), is a crude way of “subtracting out” the semantic component of the generative probability, so that sentences that use rare words are not penalized for doing so.
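Computing the SLR itself is simple once both log-probabilities are available; a schematic sketch (the model log-probability below is a placeholder value, not output from the authors' model):

```python
import math
from collections import Counter

def unigram_logprob(tokens, unigram_counts, total):
    """Log-probability of a sentence under a simple count-based unigram model."""
    return sum(math.log(unigram_counts.get(w, 1) / total) for w in tokens)

def syntactic_log_odds_ratio(tokens, model_logprob, unigram_counts, total):
    """SLR = (log p_model(sentence) - log p_unigram(sentence)) / sentence length."""
    return (model_logprob - unigram_logprob(tokens, unigram_counts, total)) / len(tokens)

counts = Counter("the cat sat on the mat".split() * 10)
total = sum(counts.values())
sentence = "the cat sat on the mat".split()
# model_logprob stands in for the log-probability assigned by the syntactic model.
print(syntactic_log_odds_ratio(sentence, model_logprob=-8.5,
                               unigram_counts=counts, total=total))
```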
Experiments
(2004) also report using a parser probability normalized by the unigram probability (but not length), and did not find it effective.
Treelet Language Modeling
p(w|P, R, r′, w_1, w_2) to p(w|P, R, r′, w_1) and then p(w|P, R, r′). From there, we back off to p(w|P, R), where R is the sibling immediately to the right of P, then to a raw PCFG p(w|P), and finally to a unigram distribution.
unigram is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Sun, Weiwei and Wan, Xiaojun
Structure-based Stacking
• Character unigrams: c_k (i − l ≤ k ≤ i + l)
• Character bigrams: c_k c_{k+1} (i − l ≤ k < i + l)
Structure-based Stacking
• Character label unigrams: c_k^ppd (i − l_ppd ≤ k ≤ i + l_ppd)
Structure-based Stacking
0 Unigram features: C(sk) (i — l0 3 k: S +l0), Tctb(3k) (i — 1351) S k?
unigram is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
DeNero, John and Chiang, David and Knight, Kevin
Computing Feature Expectations
where h(t) is the unigram prefix of bigram t.
Consensus Decoding Algorithms
where T1 is the set of unigrams in the language, and δ(e, t) is an indicator function that equals 1 if t appears in e and 0 otherwise.
Consensus Decoding Algorithms
Figure 1: For the linear similarity measure U(e; e′), which computes unigram precision, the MBR translation can be found by iterating either over sentence pairs (Algorithm 1) or over features (Algorithm 2).
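A sketch of a unigram-precision similarity in the spirit of the indicator formulation quoted above (toy sentences assumed; not necessarily the paper's exact definition of U):

```python
from collections import Counter

def unigram_precision(e_tokens, e_prime_tokens):
    """Fraction of tokens in e whose unigram type also appears in e' (indicator-style match)."""
    counts, other = Counter(e_tokens), set(e_prime_tokens)
    matched = sum(c for w, c in counts.items() if w in other)
    return matched / sum(counts.values())

e = "the talks resumed today".split()
e_prime = "talks resumed in geneva today".split()
print(unigram_precision(e, e_prime))  # 3 of 4 tokens matched -> 0.75
```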
unigram is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Ott, Myle and Choi, Yejin and Cardie, Claire and Hancock, Jeffrey T.
Automated Approaches to Deceptive Opinion Spam Detection
Specifically, we consider the following three n-gram feature sets, with the corresponding features lowercased and unstemmed: UNIGRAMS , BIGRAMS+, TRIGRAMS+, where the superscript + indicates that the feature set subsumes the preceding feature set.
Automated Approaches to Deceptive Opinion Spam Detection
We consider all three n-gram feature sets, namely UNIGRAMS , BIGRAMS+, and TRIGRAMS+, with corresponding language models smoothed using the interpolated Kneser-Ney method (Chen and Goodman, 1996).
Automated Approaches to Deceptive Opinion Spam Detection
We use SVMlight (Joachims, 1999) to train our linear SVM models on all three approaches and feature sets described above, namely POS, LIWC, UNIGRAMS , BIGRAMS+, and TRIGRAMS+.
unigram is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Woodsend, Kristian and Lapata, Mirella
Experimental Setup
The mapping of sentence labels to phrase labels was unsupervised: if the phrase came from a sentence labeled (1), and there was a unigram overlap (excluding stop words) between the phrase and any of the original highlights, we marked this phrase with a positive label.
Experimental Setup
Our feature set comprised surface features such as sentence and paragraph position information, POS tags, unigram and bigram overlap with the title, and whether high-scoring tf.idf words were present in the phrase (66 features in total).
Experimental Setup
We report unigram overlap (ROUGE-1) as a means of assessing informativeness and the longest common subsequence (ROUGE-L) as a means of assessing fluency.
unigram is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Li, Zhifei and Eisner, Jason and Khudanpur, Sanjeev
Experimental Results
As shown in Table 2a, decoding with a single variational n-gram model (VM) as per (14) improves the Viterbi baseline (except the case with a unigram VM), though often not statistically significant.
Experimental Results
The interpolation between a VM and a word penalty feature (“wp”) improves over the unigram
Experimental Results
This is necessarily true, but it is interesting to see that most of the improvement is obtained just by moving from a unigram to a bigram model.
unigram is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Kim, Young-Bum and Snyder, Benjamin
Inference
where n(t) and n(t, t′) are, respectively, unigram and bigram tag counts excluding those containing character w. Conversely, n′(t) and n′(t, t′) are, respectively, unigram and bigram tag counts only including those containing character w. The notation a^(n) denotes the ascending factorial: a(a + 1) ··· (a + n − 1).
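The ascending factorial that appears in these collapsed Gibbs updates is a one-liner; a tiny sketch:

```python
def ascending_factorial(a, n):
    """a^(n) = a * (a + 1) * ... * (a + n - 1), with the empty product equal to 1."""
    result = 1.0
    for i in range(n):
        result *= a + i
    return result

print(ascending_factorial(2.5, 3))  # 2.5 * 3.5 * 4.5 = 39.375
```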
Inference
where n(w) is the unigram count of character w, and n(t′) is the unigram count of tag t′, over all character tokens (including w).
Inference
where n(j, k, t) and n(j, k, t, t′) are the numbers of languages currently assigned to cluster k which have more than j occurrences of unigram (t) and bigram (t, t′), respectively.
unigram is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Tsuruoka, Yoshimasa and Tsujii, Jun'ichi and Ananiadou, Sophia
Log-Linear Models
The features used in this experiment were unigrams and bigrams of neighboring words, and unigrams , bigrams and trigrams of neighboring POS tags.
Log-Linear Models
For the features, we used unigrams of neighboring chunk tags, substrings (shorter than 10 characters) of the current word, and the shape of the word (e. g. “IL-2” is converted into “AA-#”), on top of the features used in the text chunking experiments.
Log-Linear Models
For the features, we used unigrams and bigrams of neighboring words, prefixes and suffixes of the current word, and some characteristics of the word.
unigram is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Ji, Yangfeng and Eisenstein, Jacob
Experiments
The vocabulary V includes all unigrams after down-casing.
Experiments
In total, there are 16250 unique unigrams in V.
Experiments
fication for visualization is we consider only the top 1000 frequent unigrams in the RST—DT training set.
unigram is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Kozareva, Zornitsa
Task A: Polarity Classification
We studied the influence of unigrams, bigrams and a combination of the two, and saw that the best performing feature set consists of the combination of unigrams and bigrams.
Task A: Polarity Classification
In this paper, we will refer from now on to n-grams as the combination of unigrams and bigrams.
Task B: Valence Prediction
Those include n-grams ( unigrams , bigrams and combination of the two), LIWC scores.
unigram is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Wan, Xiaojun
Empirical Evaluation 4.1 Evaluation Setup
In the above experiments, all features ( unigram + bigram) are used.
The Co-Training Approach
The English or Chinese features used in this study include both unigrams and bigrams, and the feature weight is simply set to term frequency.
The Co-Training Approach
For Chinese text, a unigram refers to a Chinese word and a bigram refers to two adjacent Chinese words.
unigram is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Setiawan, Hendra and Zhou, Bowen and Xiang, Bing and Shen, Libin
Experiments
As the backbone of our string-to-dependency system, we train 3-gram models for left and right dependencies and unigram for head using the target side of the bilingual training data.
Introduction
In this way, we hope to upgrade the unigram formulation of existing reordering models to a higher order formulation.
Related Work
Our TNO model is closely related to the Unigram Orientation Model (UOM) (Tillman, 2004), which is the de facto reordering model of phrase-based SMT (Koehn et al., 2007).
unigram is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Perez-Rosas, Veronica and Mihalcea, Rada and Morency, Louis-Philippe
Multimodal Sentiment Analysis
We use a bag-of-words representation of the video transcriptions of each utterance to derive unigram counts, which are then used as linguistic features.
Multimodal Sentiment Analysis
The remaining words represent the unigram features, which are then associated with a value corresponding to the frequency of the unigram inside each utterance transcription.
Multimodal Sentiment Analysis
These simple weighted unigram features have been successfully used in the past to build sentiment classifiers on text, and in conjunction with Support Vector Machines (SVM) have been shown to lead to state-of-the-art performance (Maas et al., 2011).
unigram is mentioned in 3 sentences in this paper.
Topics mentioned in this paper: