Background | These generally consist of a projection layer that maps words, sub-word units or n-grams to high-dimensional embeddings; the latter are then combined component-wise with an operation such as summation.
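A minimal sketch of such a projection-plus-summation layer (dimensions, token ids, and random weights are illustrative assumptions, not from the paper):

```python
import numpy as np

# Projection layer: each word / sub-word unit / n-gram id maps to an embedding row.
rng = np.random.default_rng(0)
vocab_size, dim = 10, 4
embeddings = rng.normal(size=(vocab_size, dim))

def embed_sum(token_ids):
    # Combine the looked-up embeddings component-wise by summation.
    return embeddings[token_ids].sum(axis=0)

sentence = [3, 1, 7]  # toy token ids
vec = embed_sum(sentence)
assert vec.shape == (dim,)
```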
Background | The trained weights in the filter m correspond to a linguistic feature detector that learns to recognise a specific class of n-grams.
Background | These n-grams have size n ≤ m, where m is the width of the filter.
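As a toy illustration (filter weights and input values are invented), a narrow 1-D convolution with a filter of width m produces one activation per window of m consecutive positions, so a trained filter can respond to any n-gram with n ≤ m:

```python
# Filter of width m; trained weights would act as an n-gram feature detector.
m = 3
filt = [1.0, -1.0, 0.5]
sequence = [0.2, 1.0, -0.3, 0.7, 0.1]  # toy per-word feature values

# One activation per window of m consecutive positions (narrow convolution).
activations = [sum(w * x for w, x in zip(filt, sequence[i:i + m]))
               for i in range(len(sequence) - m + 1)]
assert len(activations) == len(sequence) - m + 1
```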
Experiments | The ability to extract features based on long n-grams and to hierarchically combine these features is highly beneficial.
Experiments | In the first layer, the sequence is a continuous n-gram from the input sentence; in higher layers, sequences can be made of multiple separate n-grams.
Experiments | The feature detectors learn to recognise not just single n-grams, but patterns within n-grams that have syntactic, semantic or structural significance. |
Introduction | Since individual sentences are rarely observed or not observed at all, one must represent a sentence in terms of features that depend on the words and short n-grams in the sentence that are frequently observed. |
Introduction | by which the features of the sentence are extracted from the features of the words or n-grams.
Properties of the Sentence Model | The filters m of the wide convolution in the first layer can learn to recognise specific n-grams that have size less than or equal to the filter width m; as we see in the experiments, m in the first layer is often set to a relatively large value
Properties of the Sentence Model | The subsequence of n-grams extracted by the generalised pooling operation induces invariance to absolute positions, but maintains their order and relative positions.
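One concrete instance of such a pooling operation is k-max pooling, sketched below (an assumption about the operation intended; the paper's exact variant may differ): it keeps the k largest activations in their original order, so absolute positions are discarded but relative order survives.

```python
def kmax_pool(values, k):
    # Pick the positions of the k largest values, then restore their order.
    top = sorted(sorted(range(len(values)), key=lambda i: values[i])[-k:])
    return [values[i] for i in top]

# 0.9 (position 1) and 0.8 (position 3) survive, still in their original order.
assert kmax_pool([0.1, 0.9, 0.2, 0.8, 0.3], k=2) == [0.9, 0.8]
```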
Properties of the Sentence Model | This gives the RNN excellent performance at language modelling, but it is suboptimal for remembering at once the n-grams further back in the input sentence. |
Abstract | Our model incorporates heterogeneous relational evidence about both hypernymy and siblinghood, captured by semantic features based on patterns and statistics from Web n-grams and Wikipedia abstracts. |
Analysis | Here, our Web n-grams dataset (which only contains frequent n-grams) and Wikipedia abstracts do not suffice and we would need to add richer Web data for such world knowledge to be reflected in the features.
Experiments | Feature sources: The n-gram semantic features are extracted from the Google n-grams corpus (Brants and Franz, 2006), a large collection of English n-grams (for n = 1 to 5) and their frequencies computed from almost 1 trillion tokens (95 billion sentences) of Web text. |
Experiments | For this, we use a hash-trie on term pairs (similar to that of Bansal and Klein (2011)), and scan once through the n-gram (or abstract) set, skipping many n-grams (or abstracts) based on fast checks of missing unigrams, exceeding length, suffix mismatches, etc.
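The single-pass filtering can be sketched as follows (a simplified stand-in for the hash-trie; the term pairs, length limit, and checks shown are toy assumptions):

```python
# Index the term pairs by their unigrams, then scan the n-gram stream once,
# skipping any n-gram that fails a cheap check before doing expensive matching.
term_pairs = [("dog", "animal"), ("oak", "tree")]
needed_unigrams = {w for pair in term_pairs for w in pair}

def scan(ngrams, max_len=5):
    hits = []
    for ng in ngrams:
        toks = ng.split()
        if len(toks) > max_len:
            continue                      # fast check: exceeding length
        if not needed_unigrams & set(toks):
            continue                      # fast check: missing unigrams
        hits.append(ng)                   # only now do the expensive match
    return hits

assert scan(["a dog is an animal", "red car", "tall oak tree"]) == \
    ["a dog is an animal", "tall oak tree"]
```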
Features | The Web n-grams corpus has broad coverage but is limited to up to 5-grams, so it may not contain pattern-based evidence for various longer multi-word terms and pairs.
Features | Similar to the Web n-grams case, we also fire Wikipedia-based pattern order features. |
Introduction | The belief propagation approach allows us to efficiently and effectively incorporate heterogeneous relational evidence via hypernymy and siblinghood (e.g., coordination) cues, which we capture by semantic features based on simple surface patterns and statistics from Web n-grams and Wikipedia abstracts. |
Related Work | All the patterns and counts for our Web and Wikipedia edge and sibling features described above are extracted after stemming the words in the terms, the n-grams, and the abstracts (using the Porter stemmer).
Experimental Evaluation | CNG is a profile-based method which represents the author as the N most frequent character n-grams of all his/her training texts.
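A minimal sketch of such a profile (parameter values and the input text are illustrative):

```python
from collections import Counter

def profile(texts, n=3, N=5):
    # Count all character n-grams across the author's texts, keep the top N.
    counts = Counter()
    for t in texts:
        counts.update(t[i:i + n] for i in range(len(t) - n + 1))
    return [g for g, _ in counts.most_common(N)]

p = profile(["the cat sat on the mat"], n=3, N=3)
assert len(p) == 3 and all(len(g) == 3 for g in p)
```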
Proposed Tri-Training Algorithm | The features in the character view are the character n-grams of a document. |
Proposed Tri-Training Algorithm | Character n-grams are simple and easily available for any natural language. |
Proposed Tri-Training Algorithm | We use four content-independent structures including n-grams of POS tags (n = 1..3) and rewrite rules (Kim et al., 2011). |
Related Work | Example features include function words (Argamon et al., 2007), richness features (Gamon 2004), punctuation frequencies (Graham et al., 2005), character (Grieve, 2007), word (Burrows, 1992) and POS n-grams (Gamon, 2004; Hirst and Feiguina, 2007), rewrite rules (Halteren et al., 1996), and similarities (Qian and Liu, 2013). |
Our Approach | The features we use are character n-grams around mismatches. |
Our Approach | i) n-grams around gaps, i.e., we account only for insertions and deletions; |
Our Approach | ii) n-grams around any type of mismatch, i.e., we account for all three types of mismatches. |
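The two variants can be sketched as follows (function names, window size, and the toy alignment are assumptions; '-' marks a gap in a pairwise character alignment):

```python
def mismatch_ngrams(aligned_a, aligned_b, n=3, gaps_only=False):
    # Collect character n-grams in a small window around each mismatch.
    feats = []
    for i, (a, b) in enumerate(zip(aligned_a, aligned_b)):
        if a == b:
            continue
        if gaps_only and a != "-" and b != "-":
            continue  # variant (i): keep only insertions and deletions
        lo, hi = max(0, i - 1), i + n - 1
        feats.append((aligned_a[lo:hi], aligned_b[lo:hi]))
    return feats

# 'color' vs 'colour': one insertion gap, so one feature pair is extracted.
assert mismatch_ngrams("colo-r", "colour", gaps_only=True) == [("o-r", "our")]
```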
Abstract | Contrasting the predictive ability of statistics derived from 6 different corpora, we find intuitive results showing that, e.g., a British corpus over-predicts the speed with which an American will react to the words ward and duke, and that the Google n-grams over-predicts familiarity with technology terms. |
Fitting Behavioral Data 2.1 Data | 2Surprisingly, fife was determined to be one of the words with the largest frequency asymmetry between Switchboard and the Google n-grams corpus. |
Introduction | Specifically, we predict human data from three widely used psycholinguistic experimental paradigms—lexical decision, word naming, and picture naming—using unigram frequency estimates from Google n-grams (Brants and Franz, 2006), Switchboard (Godfrey et al., 1992), spoken and written English portions of CELEX (Baayen et al., 1995), and spoken and written portions of the British National Corpus (BNC Consortium, 2007). |
Introduction | For example, Google n-grams overestimates the ease with which humans will process words related to the web (tech, code, search, site), while the Switchboard corpus—a collection of informal telephone conversations between strangers—overestimates how quickly humans will react to colloquialisms (heck, dam) and backchannels (wow, right). |
Generation & Propagation | For the unlabeled phrases, the set of possible target translations could be extremely large (e.g., all target language n-grams).
Generation & Propagation | A naïve way to achieve this goal would be to extract all n-grams, from n = 1 to a maximum n-gram order, from the monolingual data, but this strategy would lead to a combinatorial explosion in the number of target phrases.
Generation & Propagation | This set of candidate phrases is filtered to include only n-grams occurring in the target monolingual corpus, and helps to prune passed-through OOV words and invalid translations. |
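A toy sketch of this filtering step (the corpus and candidate phrases are invented): candidates are kept only if they occur as n-grams in the target monolingual corpus.

```python
def extract_ngrams(tokens, max_n):
    # All n-grams from n = 1 up to max_n, as space-joined strings.
    return {" ".join(tokens[i:i + n])
            for n in range(1, max_n + 1)
            for i in range(len(tokens) - n + 1)}

mono = "the big house is big".split()
corpus_ngrams = extract_ngrams(mono, max_n=2)

candidates = ["big house", "house big", "the big", "blue house"]
kept = [c for c in candidates if c in corpus_ngrams]
assert kept == ["big house", "the big"]
```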
Introduction | Unlike previous work (Irvine and Callison-Burch, 2013a; Razmara et al., 2013), we use higher order n-grams instead of restricting to unigrams, since our approach goes beyond OOV mitigation and can enrich the entire translation model by using evidence from monolingual text. |
Experiments | Following evaluations in machine translation as well as previous work in sentence compression (Unno et al., 2006; Clarke and Lapata, 2008; Martins and Smith, 2009; Napoles et al., 2011b; Thadani and McKeown, 2013), we evaluate system performance using F1 metrics over n-grams and dependency edges produced by parsing system output with RASP (Briscoe et al., 2006) and the Stanford parser. |
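The n-gram part of this evaluation can be sketched as follows (a simplified F1 over bigram multisets; the dependency-edge F1 computed from parser output is omitted):

```python
from collections import Counter

def ngram_f1(sys_toks, ref_toks, n=2):
    # Multiset overlap of n-grams between system output and reference.
    sys_c = Counter(tuple(sys_toks[i:i + n]) for i in range(len(sys_toks) - n + 1))
    ref_c = Counter(tuple(ref_toks[i:i + n]) for i in range(len(ref_toks) - n + 1))
    overlap = sum((sys_c & ref_c).values())
    if overlap == 0:
        return 0.0
    p = overlap / sum(sys_c.values())
    r = overlap / sum(ref_c.values())
    return 2 * p * r / (p + r)

f1 = ngram_f1("the cat sat".split(), "the cat sat down".split(), n=2)
assert abs(f1 - 0.8) < 1e-9
```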
Experiments | We report results over the following systems grouped into three categories of models: tokens + n-grams, tokens + dependencies, and joint models.
Introduction | Joint methods have also been proposed that invoke integer linear programming (ILP) formulations to simultaneously consider multiple structural inference problems—both over n-grams and input dependencies (Martins and Smith, 2009) or n-grams and all possible dependencies (Thadani and McKeown, 2013). |
Multi-Structure Sentence Compression | C. In addition, we define bigram indicator variables y_ij ∈ {0, 1} to represent whether a particular order-preserving bigram ⟨t_i, t_j⟩ from S is present as a contiguous bigram in C, as well as dependency indicator variables z_ij ∈ {0, 1} corresponding to whether the dependency arc t_i → t_j is present in the dependency parse of C. The score for a given compression C can now be defined to factor over its tokens, n-grams and dependencies as follows.
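A toy sketch of such a factored score (weights and indicator settings are invented): each token, order-preserving bigram, and dependency arc contributes its weight only when its 0/1 indicator variable is switched on.

```python
def score(x, y, z, tok_w, bigram_w, dep_w):
    # Sum per-token, per-bigram, and per-arc scores gated by indicators.
    s = sum(tok_w[i] * x[i] for i in x)
    s += sum(bigram_w[ij] * y[ij] for ij in y)
    s += sum(dep_w[ij] * z[ij] for ij in z)
    return s

x = {0: 1, 1: 1, 2: 0}        # token indicators
y = {(0, 1): 1, (1, 2): 0}    # order-preserving bigram indicators
z = {(0, 1): 1}               # dependency-arc indicators
total = score(x, y, z,
              tok_w={0: 0.5, 1: 0.2, 2: 0.9},
              bigram_w={(0, 1): 1.0, (1, 2): 0.3},
              dep_w={(0, 1): 0.4})
assert abs(total - 2.1) < 1e-9
```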
Introduction | Recent approaches to this task have been based on slot-filling (Yang et al., 2011; Elliott and Keller, 2013), combining web-scale n-grams (Li et al., 2011), syntactic tree substitution (Mitchell et al., 2012), and description-by-retrieval (Farhadi et al., 2010; Ordonez et al., 2011; Hodosh et al., 2013). |
Methodology | p_n measures the effective overlap by calculating the ratio of the maximum number of n-grams co-occurring between a candidate and a reference to the total number of n-grams in the candidate text.
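A minimal sketch of this modified n-gram precision (clipping against reference counts as in BLEU; the candidate and reference here are toy inputs):

```python
from collections import Counter

def p_n(cand, refs, n):
    # Candidate n-gram counts, clipped by the maximum reference count per n-gram.
    cand_c = Counter(tuple(cand[i:i + n]) for i in range(len(cand) - n + 1))
    max_ref = Counter()
    for ref in refs:
        ref_c = Counter(tuple(ref[i:i + n]) for i in range(len(ref) - n + 1))
        max_ref |= ref_c  # element-wise maximum over references
    clipped = sum(min(c, max_ref[g]) for g, c in cand_c.items())
    return clipped / sum(cand_c.values())

# "the the cat" vs reference "the cat": the second "the" is clipped away.
p1 = p_n("the the cat".split(), ["the cat".split()], n=1)
assert abs(p1 - 2 / 3) < 1e-9
```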
Methodology | (2012); to the best of our knowledge, the only image description work to use higher-order n-grams with BLEU is Elliott and Keller (2013). |
Experiments | 4.4 Analysis of the Effects of Different N-grams
Experiments | To evaluate the effects of different n-grams for our proposed transfer model, we compared the uni-/bi-/tri-gram transfer models in SMT, and illustrate the results in Fig- |
Experiments | Translation quality of the transfer model with different n-gram orders.
Introduction | Yet, though many language models more sophisticated than N-grams have been proposed, N-grams are empirically hard to beat in terms of WER.
Motivation | Given the rise of unsupervised latent topic modeling with Latent Dirichlet Allocation (Blei et al., 2003) and similar latent variable approaches for discovering meaningful word co-occurrence patterns in large text corpora, we ought to be able to leverage these topic contexts instead of merely N-grams.
Motivation | information retrieval, and again, interpolate latent topic models with N-grams to improve retrieval performance. |
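Such interpolation is typically a simple linear mixture of the two probability estimates, sketched here with invented probabilities and mixing weight:

```python
def interpolate(p_ngram, p_topic, lam=0.7):
    # Linear interpolation: lam weights the N-gram estimate,
    # (1 - lam) weights the latent-topic estimate.
    return lam * p_ngram + (1 - lam) * p_topic

p = interpolate(p_ngram=0.02, p_topic=0.08, lam=0.7)
assert abs(p - 0.038) < 1e-12
```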