Index of papers in Proc. ACL 2012 that mention
  • n-gram
Pauls, Adam and Klein, Dan
Abstract
We propose a simple generative, syntactic language model that conditions on overlapping windows of tree context (or treelets) in the same way that n-gram language models condition on overlapping windows of linear context.
Abstract
We estimate the parameters of our model by collecting counts from automatically parsed text using standard n-gram language model estimation techniques, allowing us to train a model on over one billion tokens of data using a single machine in a matter of hours.
Introduction
At the same time, because n-gram language models only condition on a local window of linear word-level context, they are poor models of long-range syntactic dependencies.
Introduction
Although several lines of work have proposed generative syntactic language models that improve on n-gram models for moderate amounts of data (Chelba, 1997; Xu et al., 2002; Charniak, 2001; Hall, 2004; Roark,
Introduction
Our model can be trained simply by collecting counts and using the same smoothing techniques normally applied to n-gram models (Kneser and Ney, 1995), enabling us to apply techniques developed for scaling n-gram models out of the box (Brants et al., 2007; Pauls and Klein, 2011).
Treelet Language Modeling
The common denominator of most n-gram language models is that they assign probabilities roughly according to empirical frequencies for observed n-grams, but fall back to distributions conditioned on smaller contexts for unobserved n-grams, as shown in Figure 1(a).
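The fallback idea described in this excerpt can be made concrete with a small sketch. The snippet below uses a simple stupid-backoff-style discount rather than the Kneser-Ney smoothing the paper actually cites; the function names, the discount factor alpha, and the toy corpus are illustrative assumptions, not the paper's implementation.

```python
from collections import Counter

def count_ngrams(tokens, max_n):
    """Collect raw counts for every n-gram order 1..max_n."""
    counts = Counter()
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            counts[tuple(tokens[i:i + n])] += 1
    return counts

def backoff_score(counts, context, word, total_unigrams, alpha=0.4):
    """Relative-frequency score for word given context, falling back to
    shorter contexts whenever the full n-gram was never observed."""
    ngram = tuple(context) + (word,)
    if counts[ngram] > 0:
        denom = counts[tuple(context)] if context else total_unigrams
        return counts[ngram] / denom
    if not context:
        return 1e-9                      # unseen word: small floor value
    # back off: drop the oldest context word and apply a fixed discount
    return alpha * backoff_score(counts, context[1:], word, total_unigrams, alpha)

tokens = "the cat sat on the mat".split()
counts = count_ngrams(tokens, 3)
total = sum(c for k, c in counts.items() if len(k) == 1)
print(backoff_score(counts, ("the",), "cat", total))        # observed bigram
print(backoff_score(counts, ("sat", "the"), "mat", total))  # backs off to ("the", "mat")
```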
Treelet Language Modeling
As in the n-gram case, we would like to pick h to be large enough to capture relevant dependencies, but small enough that we can obtain meaningful estimates from data.
Treelet Language Modeling
Although it is tempting to think that we can replace the left-to-right generation of n-gram models with the purely top-down generation of typical PCFGs, in practice, words are often highly predictive of the words that follow them — indeed, n-gram models would be terrible language models if this were not the case.
n-gram is mentioned in 15 sentences in this paper.
Topics mentioned in this paper:
Chen, David
Background
To learn the meaning of an n-gram w, Chen and Mooney first collect all navigation plans g that co-occur with w. This forms the initial candidate meaning set for w.
Conclusion
In contrast to the previous approach that computed common subgraphs between different contexts in which an n-gram appeared, we instead focus on small, connected subgraphs and introduce an algorithm, SGOLL, that is an order of magnitude faster.
Online Lexicon Learning Algorithm
Even though they use beam-search to limit the size of the candidate set, if the initial candidate meaning set for an n-gram is large, it can take a long time to take just one pass through the list of all candidates.
Online Lexicon Learning Algorithm
function Update(training example (e, p)): for n-gram w that appears in e do
Online Lexicon Learning Algorithm
Increase the count of examples, each n-gram w, and each subgraph g; end function
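The two pseudocode fragments above describe a count-updating step over (instruction, plan) training examples. The sketch below is a minimal reconstruction of that step, assuming n-grams are enumerated from the instruction and the plan's small connected subgraphs are already available; the identifiers (example_count, cooccur_count, max_n) and the string representation of subgraphs are illustrative, not the paper's own.

```python
from collections import defaultdict

# Illustrative count tables for an SGOLL-style online update.
example_count = 0
ngram_count = defaultdict(int)                          # count of each n-gram w
cooccur_count = defaultdict(lambda: defaultdict(int))   # counts of (w, subgraph g) pairs

def ngrams(words, max_n=4):
    """All n-grams of length 1..max_n in an instruction."""
    return [tuple(words[i:i + n])
            for n in range(1, max_n + 1)
            for i in range(len(words) - n + 1)]

def update(instruction_words, plan_subgraphs):
    """One online pass over a training example: bump the counts that are
    later used to score candidate (n-gram, subgraph) pairings."""
    global example_count
    example_count += 1
    for w in ngrams(instruction_words):
        ngram_count[w] += 1
        for g in plan_subgraphs:        # small connected subgraphs of the plan
            cooccur_count[w][g] += 1

update("turn and go to the sofa".split(), ["Turn(LEFT)", "Travel(sofa)"])
```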
n-gram is mentioned in 9 sentences in this paper.
Topics mentioned in this paper:
Liu, Chang and Ng, Hwee Tou
Discussion and Future Work
The TESLA-M metric allows each n-gram to have a weight, which is primarily used to discount function words.
Experiments
Based on these synonyms, TESLA-CELAB is able to award less trivial n-gram matches.
Experiments
The covered n-gram matching rule is then able to award tricky n-grams.
Motivation
We formulate the n-gram matching process as a real-valued linear programming problem, which can be solved efficiently.
The Algorithm
The basic n-gram matching problem is shown in Figure 2.
The Algorithm
We observe that once an n-gram has been matched, all its sub-n-grams should be considered matched as well; we call this the covered n-gram matching rule.
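To make the covered n-gram matching rule concrete, the sketch below simply enumerates every sub-n-gram that would be credited once a larger n-gram is matched. It only illustrates the rule itself: as the following excerpt notes, TESLA-CELAB folds this constraint into the linear program rather than applying it as post-processing, and the example tokens here are invented.

```python
def sub_ngrams(ngram):
    """All contiguous sub-n-grams (including single tokens) of a matched n-gram."""
    n = len(ngram)
    return {tuple(ngram[i:j]) for i in range(n) for j in range(i + 1, n + 1)}

# If ("machine", "translation", "system") is matched, the covered rule also
# credits ("machine",), ("translation", "system"), and so on.
print(sorted(sub_ngrams(("machine", "translation", "system"))))
```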
The Algorithm
However, we cannot simply perform covered n-gram matching as a post processing step.
n-gram is mentioned in 9 sentences in this paper.
Topics mentioned in this paper:
Tang, Hao and Keshet, Joseph and Livescu, Karen
Feature functions
We use $p_1^n$ to denote the n-gram substring $p_1 \ldots p_n$.
Feature functions
The two substrings $\bar{a}$ and $\bar{b}$ are said to be equal if they have the same length and $a_i = b_i$ for $1 \le i \le n$. For a given sub-word unit n-gram $\bar{u} \in \mathcal{P}^n$, we use the shorthand $\bar{u} \in \bar{p}$ to mean that we can find $\bar{u}$ in $\bar{p}$, i.e., there exists an index $i$ such that $p_i \ldots p_{i+n-1} = \bar{u}$. We use $|\bar{p}|$ to denote the length of the sequence $\bar{p}$.
Feature functions
Similarly to (Zweig et al., 2010), we adapt TF and IDF by treating a sequence of sub-word units as a “document” and n-gram sub-sequences as “words.” In this analogy, we use sub-sequences in surface pronunciations to “search” for baseforms in the dictionary.
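A rough sketch of that adaptation: each dictionary baseform (a sequence of sub-word units) is treated as a "document" and its n-grams as "words", and the query's n-grams "search" for the best-matching baseform. Letters stand in for sub-word units, and the bigram order, scoring formula, and toy dictionary below are assumptions for illustration only.

```python
import math
from collections import Counter

def ngrams(units, n):
    return [tuple(units[i:i + n]) for i in range(len(units) - n + 1)]

def tf_idf_scores(query_units, dictionary_baseforms, n=2):
    """TF-IDF match score of the query against each baseform, with baseforms
    as 'documents' and their n-grams as 'words'."""
    docs = [Counter(ngrams(b, n)) for b in dictionary_baseforms]
    num_docs = len(docs)
    df = Counter()
    for d in docs:
        df.update(d.keys())              # document frequency of each n-gram
    query = Counter(ngrams(query_units, n))
    scores = []
    for d in docs:
        score = 0.0
        for g, q_tf in query.items():
            if g in d:
                idf = math.log(num_docs / df[g])
                score += q_tf * d[g] * idf
        scores.append(score)
    return scores

baseforms = [list("probably"), list("probable"), list("problem")]
print(tf_idf_scores(list("probly"), baseforms, n=2))
```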
n-gram is mentioned in 8 sentences in this paper.
Topics mentioned in this paper:
Chen, Wenliang and Zhang, Min and Li, Haizhou
Dependency language model
The standard N-gram based language model predicts the next word based on the N-1 immediate previous words.
Dependency language model
However, the traditional N-gram language model cannot capture long-distance word relations.
Dependency language model
The N-gram DLM predicts the next child of a head based on the N-1 immediate previous children and the head itself.
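Read literally, this prediction rule corresponds to a factorization over each head's child sequence, something like the following sketch (which ignores details such as the separate treatment of left and right children):

$$P\bigl(c_1, \ldots, c_m \mid h\bigr) \approx \prod_{k=1}^{m} P\bigl(c_k \mid c_{k-N+1}, \ldots, c_{k-1}, h\bigr)$$

where $c_1, \ldots, c_m$ are the children of head $h$ in the order they are generated.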
Experiments
Then, we studied the effect of adding different N-gram DLMs to MSTl.
Introduction
The N-gram DLM has the ability to predict the next child based on the N-1 immediate previous children and their head (Shen et al., 2008).
Introduction
The DLM-based features can capture the N-gram information of the parent-children structures for the parsing model.
n-gram is mentioned in 7 sentences in this paper.
Topics mentioned in this paper:
Celikyilmaz, Asli and Hakkani-Tur, Dilek
Data and Approach Overview
[Figure: Entity List Prior and Web N-Gram Context Prior]
Data and Approach Overview
[Figure: n-grams from web query logs used as priors on domain-specific entity and act parameters]
Experiments
* Base-MCM: Our first version injects an informative prior for domain, dialog act, and slot topic distributions using information extracted only from labeled training utterances, injected as prior constraints (the corpus n-gram base measure) during topic assignments.
MultiLayer Context Model - MCM
* Web n-Gram Context Base Measure: As explained in §3, we use the web n-grams as additional information for calculating the base measures of the Dirichlet topic distributions.
MultiLayer Context Model - MCM
* Corpus n-Gram Base Measure: Similar to other measures, MCM also encodes n-gram constraints as word-frequency features extracted from labeled utterances.
n-gram is mentioned in 5 sentences in this paper.
Topics mentioned in this paper:
Chen, Boxing and Kuhn, Roland and Larkin, Samuel
BLEU and PORT
First, define n-gram precision p(n) and recall r(n):
BLEU and PORT
where P_g(N) is the geometric average of n-gram precisions
BLEU and PORT
The average precision and average recall used in PORT (unlike those used in BLEU) are the arithmetic averages of n-gram precisions P_a(N) and recalls R_a(N):
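Spelled out, the two kinds of averages over the n-gram precisions p(n) and recalls r(n) mentioned in these excerpts are as follows (the notation is assumed from the excerpts, with N the maximum n-gram order):

$$P_g(N) = \Bigl(\prod_{n=1}^{N} p(n)\Bigr)^{1/N}, \qquad P_a(N) = \frac{1}{N}\sum_{n=1}^{N} p(n), \qquad R_a(N) = \frac{1}{N}\sum_{n=1}^{N} r(n)$$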
Experiments
As usual, French-English is the outlier: the two outputs here are typically so similar that BLEU and Qmean tuning yield very similar n-gram statistics.
n-gram is mentioned in 4 sentences in this paper.
Topics mentioned in this paper:
Guo, Weiwei and Diab, Mona
Experiments and Results
The performance of WTMF on CDR is compared with (a) an Information Retrieval model (IR) based on surface word matching, (b) an n-gram model (N-gram) that captures phrase overlap by returning the number of overlapping n-grams as the similarity score of two sentences, (c) LSA, which uses the svds() function in Matlab, and (d) LDA, which uses Gibbs sampling for inference (Griffiths and Steyvers, 2004).
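A minimal sketch of such an n-gram-overlap baseline is given below; counting distinct shared n-grams up to order 3 is an assumption for illustration, not necessarily the exact setting used in the comparison.

```python
def ngram_overlap(sent1, sent2, max_n=3):
    """Similarity as the number of n-grams shared by two tokenized sentences,
    summed over orders 1..max_n."""
    def grams(tokens, n):
        return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}
    return sum(len(grams(sent1, n) & grams(sent2, n)) for n in range(1, max_n + 1))

print(ngram_overlap("the cat sat on the mat".split(),
                    "a cat sat on a mat".split()))
```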
Experiments and Results
The similarity of two sentences is computed by cosine similarity (except N-gram).
Experiments and Results
We mainly compare the performance of IR, N-gram, LSA, LDA, and WTMF models.
n-gram is mentioned in 4 sentences in this paper.
Topics mentioned in this paper:
Liu, Shujie and Li, Chi-Ho and Li, Mu and Zhou, Ming
Features and Training
Here simple n-gram similarity is used for the sake of efficiency.
Features and Training
As with GC, there are four features with respect to the value of n in the n-gram similarity measure.
Graph Construction
Solid lines are edges connecting nodes with sufficient source side n-gram similarity, such as the one between "E A M N" and "E A B C".
Introduction
Collaborative decoding (Li et al., 2009) scores the translation of a source span by its n-gram similarity to the translations by other systems.
n-gram is mentioned in 4 sentences in this paper.
Topics mentioned in this paper:
Nuhn, Malte and Mauser, Arne and Ney, Hermann
Abstract
On the task shown in (Ravi and Knight, 2011) we obtain better results with only 5% of the computational effort when running our method with an n-gram language model.
Experimental Evaluation
being approximately 15 to 20 times faster than their n-gram based approach.
Experimental Evaluation
To summarize: Our method is significantly faster than n-gram LM based approaches and obtains better results than any previously published method.
Translation Model
Stochastically generate the target sentence according to an n-gram language model.
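Generating a target sentence stochastically from an n-gram language model just means sampling one word at a time from the conditional distribution given the previous history. The toy sketch below does this for a bigram model; the dictionary format, the <s>/</s> markers, and the probabilities are illustrative assumptions, not the paper's deciphering model.

```python
import random

def sample_sentence(bigram_probs, max_len=20, start="<s>", stop="</s>"):
    """Draw a sentence word by word from a bigram language model, given as a
    dict mapping a history word to a {next_word: probability} dict."""
    sentence, prev = [], start
    for _ in range(max_len):
        next_words, probs = zip(*bigram_probs[prev].items())
        word = random.choices(next_words, weights=probs)[0]
        if word == stop:
            break
        sentence.append(word)
        prev = word
    return sentence

toy_lm = {"<s>":   {"hello": 0.6, "hi": 0.4},
          "hello": {"world": 0.7, "</s>": 0.3},
          "hi":    {"there": 1.0},
          "world": {"</s>": 1.0},
          "there": {"</s>": 1.0}}
print(sample_sentence(toy_lm))
```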
n-gram is mentioned in 4 sentences in this paper.
Topics mentioned in this paper:
Bansal, Mohit and Klein, Dan
Abstract
To address semantic ambiguities in coreference resolution, we use Web n-gram features that capture a range of world knowledge in a diffuse but robust way.
Introduction
In order to harness the information on the Web without presupposing a deep understanding of all Web text, we instead turn to a diverse collection of Web n-gram counts (Brants and Franz, 2006) which, in aggregate, contain diffuse and indirect, but often robust, cues to reference.
Semantics via Web Features
These clusters come from distributional K-Means clustering (with K = 1000) on phrases, using the n-gram context as features.
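The n-gram context features behind such a clustering can be sketched as sparse count vectors of neighboring words. The window size, single-word "phrases", and toy corpus below are simplifying assumptions, and the K-Means step itself (K = 1000 in the excerpt) is omitted.

```python
from collections import Counter, defaultdict

def context_features(corpus_sentences, targets, window=1):
    """Represent each target phrase by counts of the words appearing in a small
    window around its occurrences; these sparse vectors are what a K-Means
    step would then cluster."""
    feats = defaultdict(Counter)
    for sent in corpus_sentences:
        for i, w in enumerate(sent):
            if w in targets:
                left = sent[max(0, i - window):i]
                right = sent[i + 1:i + 1 + window]
                feats[w].update(left + right)
    return feats

corpus = [["the", "president", "spoke", "today"],
          ["the", "senator", "spoke", "briefly"]]
print(context_features(corpus, {"president", "senator"}))
```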
n-gram is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Razmara, Majid and Foster, George and Sankaran, Baskaran and Sarkar, Anoop
Conclusion & Future Work
In addition, we can extend our approach by applying some of the techniques used in other system combination approaches such as consensus decoding, using n-gram features, tuning using forest-based MERT, among other possible extensions.
Related Work 5.1 Domain Adaptation
In other words, it requires all component models to fully decode each sentence, compute n-gram expectations from each component model and calculate posterior probabilities over translation derivations.
Related Work 5.1 Domain Adaptation
Finally, the main techniques used in this work, such as Minimum Bayes Risk decoding, the use of n-gram features, and tuning using MERT, are orthogonal to our approach.
n-gram is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Zweig, Geoffrey and Platt, John C. and Meek, Christopher and Burges, Christopher J.C. and Yessenalina, Ainur and Liu, Qiang
Sentence Completion via Language Modeling
3.1 Backoff N-gram Language Model
Sentence Completion via Language Modeling
3.2 Maximum Entropy Class-Based N-gram Language Model
Sentence Completion via Latent Semantic Analysis
4.3 A LSA N-gram Language Model
n-gram is mentioned in 3 sentences in this paper.
Topics mentioned in this paper: