Discussion and Future Work | In the current formulation of TESLA-CELAB, two n-grams X and Y are either synonyms that match each other completely, or are completely unrelated.
Experiments | Compared to BLEU, TESLA allows more sophisticated weighting of n-grams and measures of word similarity, including synonym relations.
Experiments | The covered n-gram matching rule is then able to award credit to tricky n-grams.
Motivation | For example, between a pair of synonymous phrases, higher-order n-grams can still have no match and will be penalized accordingly, even though the phrases themselves are synonyms.
Motivation | N-grams which cross natural word boundaries and are meaningless by themselves can be particularly tricky.
The Algorithm | Two n-grams are connected if they are identical, or if they are identified as synonyms by Cilin. |
The Algorithm | Notice that all n-grams are put in the same matching problem regardless of n, unlike in translation evaluation metrics designed for European languages. |
The Algorithm | This enables us to designate n-grams with different values of n as synonyms, such as a bigram (n = 2) and a unigram (n = 1).
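A minimal sketch of this single matching pool across n-gram orders: all orders go into one candidate set, and two n-grams are connected if identical or listed as synonyms. The `SYNONYMS` table and the `connected` predicate here are illustrative stand-ins, not the actual Cilin lookup.

```python
def char_ngrams(chars, max_n=4):
    """All n-grams of order 1..max_n over a character sequence,
    pooled together regardless of n."""
    return [tuple(chars[i:i + n])
            for n in range(1, max_n + 1)
            for i in range(len(chars) - n + 1)]

# Hypothetical synonym table; entries may have different orders,
# e.g. a bigram listed as a synonym of a unigram.
SYNONYMS = {("washing", "machine"): {("washer",)}}

def connected(x, y):
    """Two n-grams are connected if identical or identified as synonyms."""
    return x == y or y in SYNONYMS.get(x, set()) or x in SYNONYMS.get(y, set())
```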
Data and Approach Overview | We represent each utterance u as a vector w_u of N_u word n-grams (segments) w_{uj}, each of which is chosen from a fixed-size vocabulary W of size V. We use entity lists obtained from web sources (explained next) to identify segments in the corpus.
Data and Approach Overview | Web n-Grams (G). |
Experiments | Our vocabulary consists of n-grams and segments (phrases) in utterances that are extracted using web n-grams and entity lists of §3.
MultiLayer Context Model - MCM | * Web n-Gram Context Base Measure: As explained in §3, we use the web n-grams as additional information for calculating the base measures of the Dirichlet topic distributions.
MultiLayer Context Model - MCM | In (1) we assume that entities (E) are more indicative of the domain than other n-grams (G) and should be more dominant in the sampling decisions for domain topics.
MultiLayer Context Model - MCM | During Gibbs sampling, we keep track of the frequency of draws of domain, dialog act and slot indicating n-grams w_j, in the M^D, M^A and M^S matrices, respectively.
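The bookkeeping described here can be sketched as three count matrices, one per layer, incremented as each n-gram is drawn for a topic during sampling. The class and method names are assumptions for illustration, not MCM's actual implementation.

```python
from collections import defaultdict

class TopicCountMatrices:
    """Counts of n-gram draws per layer: M^D (domain), M^A (dialog act),
    M^S (slot). Illustrative sketch of the Gibbs-sampling bookkeeping."""

    def __init__(self):
        self.M = {"D": defaultdict(int),   # domain draws
                  "A": defaultdict(int),   # dialog-act draws
                  "S": defaultdict(int)}   # slot draws

    def draw(self, layer, ngram, topic):
        """Record that `ngram` was drawn for `topic` in the given layer."""
        self.M[layer][(ngram, topic)] += 1

    def count(self, layer, ngram, topic):
        return self.M[layer][(ngram, topic)]
```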
Semantics via Web Features | As the source of Web information, we use the Google n-grams corpus (Brants and Franz, 2006) which contains English n-grams (n = 1 to 5) and their Web frequency counts, derived from nearly 1 trillion word tokens and 95 billion sentences. |
Semantics via Web Features | Using the n-grams corpus (for n = 1 to 5), we collect co-occurrence web counts by allowing a varying number of wildcards between h_1 and h_2 in the query.
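A sketch of the wildcard-query idea: for two head words h_1 and h_2, generate patterns with 0 up to max_n − 2 wildcard slots between them and sum the matching corpus counts. The in-memory `ngram_counts` mapping is an assumed stand-in for the real Google n-grams lookup.

```python
def wildcard_patterns(h1, h2, max_n=5):
    """Query patterns 'h1 * ... * h2' with 0..max_n-2 wildcards."""
    return [tuple([h1] + ["*"] * k + [h2]) for k in range(max_n - 1)]

def cooccurrence_count(h1, h2, ngram_counts, max_n=5):
    """Sum web counts of all n-grams matching any wildcard pattern.
    `ngram_counts` maps n-gram tuples to frequencies; '*' matches one token."""
    total = 0
    for pattern in wildcard_patterns(h1, h2, max_n):
        for gram, c in ngram_counts.items():
            if len(gram) == len(pattern) and all(
                    p == "*" or p == g for p, g in zip(pattern, gram)):
                total += c
    return total
```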
Semantics via Web Features | These clusters are derived from the V2 Google n-grams corpus.
BLEU and PORT | translation hypothesis to compute the counts of the reference n-grams.
Experiments | Both BLEU and PORT perform matching of n-grams up to n = 4. |
Experiments | In all tuning experiments, both BLEU and PORT performed lower-case matching of n-grams up to n = 4.
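The n-gram matching both metrics share can be sketched as clipped (modified-precision) counting over lower-cased tokens, as in BLEU:

```python
from collections import Counter

def ngram_counts(tokens, n):
    """Multiset of n-grams of order n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def clipped_matches(hyp, ref, max_n=4):
    """Clipped n-gram match counts per order, BLEU-style: each hypothesis
    n-gram is credited at most as often as it appears in the reference."""
    hyp = [t.lower() for t in hyp]   # lower-cased matching
    ref = [t.lower() for t in ref]
    matches = {}
    for n in range(1, max_n + 1):
        h, r = ngram_counts(hyp, n), ngram_counts(ref, n)
        matches[n] = sum(min(c, r[g]) for g, c in h.items())
    return matches
```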
Experiments | The BLEU-tuned and Qmean-tuned systems generate similar numbers of matching n-grams, but Qmean-tuned systems produce fewer n-grams (thus, shorter translations). |
Experiments | This is mainly due to the additional minimum support constraint we added which discards many noisy lexical entries from infrequently seen n-grams.
Online Lexicon Learning Algorithm | Given an instruction and the corresponding navigation plan, we first segment the instruction into word tokens and construct n-grams from them.
Online Lexicon Learning Algorithm | From the corresponding navigation plan, we find all connected subgraphs of size less than or equal to m. We then update the co-occurrence counts between all the n-grams w and all the connected subgraphs g.
Online Lexicon Learning Algorithm | We also update the counts of how many examples we have encountered so far and the counts of the n-grams w and subgraphs g.
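The online counting steps above can be sketched as follows, assuming the connected subgraphs are passed in as hashable identifiers; the class and field names are illustrative, not the paper's implementation.

```python
from collections import Counter

def word_ngrams(tokens, max_n):
    """All word n-grams of order 1..max_n from a token list."""
    return [tuple(tokens[i:i + n])
            for n in range(1, max_n + 1)
            for i in range(len(tokens) - n + 1)]

class CooccurrenceLexicon:
    """Online co-occurrence counts between instruction n-grams and
    connected subgraphs of the navigation plan (illustrative sketch)."""

    def __init__(self, max_n=4):
        self.max_n = max_n
        self.examples = 0            # examples seen so far
        self.ngram_counts = Counter()
        self.graph_counts = Counter()
        self.pair_counts = Counter()

    def update(self, tokens, subgraphs):
        """Update all counts from one (instruction, plan) example."""
        self.examples += 1
        grams = set(word_ngrams(tokens, self.max_n))
        graphs = set(subgraphs)
        self.ngram_counts.update(grams)
        self.graph_counts.update(graphs)
        self.pair_counts.update((w, g) for w in grams for g in graphs)
```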
Evaluation | Table 3: MWE identification with CRF: base denotes the features corresponding to token properties and word n-grams.
MWE-dedicated Features | Word n-grams.
MWE-dedicated Features | POS n-grams.
Experimental Design | Field bigrams/trigrams Analogously to the lexical features mentioned above, we introduce a series of nonlocal features that capture field n-grams, given a specific record.
Related Work | Local and nonlocal information (e.g., word n-grams, long-
Results | The 1-BEST system has some grammaticality issues, which we avoid by defining features over lexical n-grams and repeated words.
Conclusion and Future Work | In this paper, we only tried the Dice coefficient of n-grams and symmetric sentence-level BLEU as similarity measures.
Features and Training | Tl(e,e') is the propagating probability in equation (8), with the similarity measure Sim(e,e') defined as the Dice coefficient over the set of all n-grams in e and those in e'. |
Features and Training | where NGr_n(x) is the set of n-grams in string x, and Dice(A, B) is the Dice coefficient over sets A and B: Dice(A, B) = 2|A ∩ B| / (|A| + |B|).
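This similarity measure can be sketched directly from the definition: collect the n-gram sets of both strings and take the Dice coefficient. Function names here are illustrative.

```python
def ngram_set(tokens, max_n=4):
    """Set of all n-grams of order 1..max_n in a token list."""
    return {tuple(tokens[i:i + n])
            for n in range(1, max_n + 1)
            for i in range(len(tokens) - n + 1)}

def dice(a, b):
    """Dice coefficient over sets: 2|A ∩ B| / (|A| + |B|)."""
    if not a and not b:
        return 1.0
    return 2 * len(a & b) / (len(a) + len(b))

def sim(e1, e2, max_n=4):
    """Sim(e, e'): Dice coefficient over the n-gram sets of e and e'."""
    return dice(ngram_set(e1, max_n), ngram_set(e2, max_n))
```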
Introduction | While traditional LMs use word n-grams, where the n − 1 previous words predict the next word, newer models integrate long-span information in making decisions.
Introduction | For example, incorporating long-distance dependencies and syntactic structure can help the LM better predict words by complementing the predictive power of n-grams (Chelba and Jelinek, 2000; Collins et al., 2005; Filimonov and Harper, 2009; Kuo et al., 2009). |
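The traditional word n-gram baseline being complemented here can be sketched as a maximum-likelihood model where the n − 1 previous words predict the next word; this is an illustrative sketch (no smoothing), not any cited paper's implementation.

```python
from collections import Counter, defaultdict

def train_ngram_lm(sentences, n=3):
    """MLE n-gram model: map each (n-1)-word context to next-word counts."""
    ctx_next = defaultdict(Counter)
    for sent in sentences:
        padded = ["<s>"] * (n - 1) + sent + ["</s>"]
        for i in range(n - 1, len(padded)):
            ctx = tuple(padded[i - n + 1:i])
            ctx_next[ctx][padded[i]] += 1
    return ctx_next

def predict(model, context):
    """Most likely next word given the n-1 previous words, or None."""
    counts = model.get(tuple(context))
    return counts.most_common(1)[0][0] if counts else None
```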
Syntactic Language Models | Structured language modeling incorporates syntactic parse trees to identify the head words in a hypothesis for modeling dependencies beyond n-grams.
Experiments | Feature templates such as rule n-grams and rule shapes only work if iterative mixing (algorithm 3) or feature selection (algorithm 4) is used.
Introduction | Such features include rule ids, rule-local n-grams , or types of rule shapes. |
Local Features for Synchronous CFGs | Rule n-grams: These features identify n-grams of consecutive items in a rule. |
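Extracting these features can be sketched as sliding an n-gram window over the rule's item sequence (terminals and nonterminal placeholders alike); the feature-string format is an assumption for illustration.

```python
def rule_ngram_features(rule_items, n=2):
    """Features identifying n-grams of consecutive items in a rule,
    e.g. the right-hand side ['X1', 'of', 'X2'] of an SCFG rule."""
    return ["rule_%dgram=%s" % (n, "_".join(rule_items[i:i + n]))
            for i in range(len(rule_items) - n + 1)]
```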