Abstract | Experimental results show that the proposed method is comparable to supervised segmenters on the in-domain NIST OpenMT corpus, and yields a 0.96 BLEU relative improvement on the out-of-domain NTCIR PatentMT corpus.
Complexity Analysis | Character-based segmentation, the LDC segmenter, and the Stanford Chinese segmenters were used as the baseline methods.
Complexity Analysis | Training started from the assumption that there was no previous segmentation of each sentence (pair), and the number of iterations was fixed.
Complexity Analysis | The monolingual bigram model, however, was slower to converge, so we initialized it from the segmentations of the unigram model and used 10 iterations.
Introduction | Though supervised approaches that train segmenters on manually segmented corpora are widely used (Chang et al., 2008), the criteria for manually annotating words are arbitrary, and the available annotated corpora are limited in both quantity and genre variety.
Methods | The set F is chosen to represent an unsegmented foreign-language sentence (a sequence of characters), because an unsegmented sentence can be seen as the set of all possible segmentations of the sentence, denoted F, i.e.
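As a concrete illustration (ours, not the paper's), a sentence of n characters has 2**(n-1) possible segmentations, one binary choice per inter-character gap; a minimal sketch enumerating the set F:

```python
def all_segmentations(chars):
    """Enumerate every segmentation of a character sequence: each of the
    len(chars) - 1 gaps is independently a word boundary or not, so a
    sentence of n characters has 2 ** (n - 1) segmentations."""
    if len(chars) <= 1:
        yield [chars] if chars else []
        return
    for rest in all_segmentations(chars[1:]):
        yield [chars[0]] + rest                 # put a boundary after chars[0]
        yield [chars[0] + rest[0]] + rest[1:]   # no boundary: extend the first word

# e.g. list(all_segmentations("abc")) yields
# [['a','b','c'], ['ab','c'], ['a','bc'], ['abc']]
```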
Experiments | • Self-training Segmenters (STS): two variant models were defined following the approach of Subramanya et al. (2010), which uses the supervised CRF model's decodings of unlabeled examples, enriched with empirical and constraint information, as additional labeled data to retrain a CRF model (a minimal sketch of the loop follows this list).
Experiments | • Virtual Evidences Segmenters (VES): two variant models based on the approach of Zeng et al. (2013) were defined.
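A minimal sketch of the self-training loop behind STS, assuming a hypothetical `segmenter_cls` exposing `fit` and `decode`; the actual CRF features and the empirical/constraint information the paper incorporates are not reproduced here:

```python
def self_train(segmenter_cls, labeled, unlabeled, rounds=1):
    """Self-training: decode unlabeled sentences with the supervised model
    and retrain on the union, treating the decodings as labeled data."""
    model = segmenter_cls()
    model.fit(labeled)
    for _ in range(rounds):
        pseudo = [model.decode(sentence) for sentence in unlabeled]
        model = segmenter_cls()            # retrain from scratch
        model.fit(labeled + pseudo)
    return model
```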
Experiments | This behaviour illustrates that conventional optimizations to the monolingual supervised model, e.g., accumulating more supervised data or predefining segmentation properties, are insufficient to help the model achieve better segmentations for SMT.
Introduction | Prior work showed that these models help to find segmentations tailored for SMT, since the bilingual word-occurrence feature can be captured by character-based alignment (Och and Ney, 2003).
Introduction | Instead of directly merging the characters into concrete segmentations, this work attempts to extract word boundary distributions for character-level trigrams (types) from the “chars-to-word” mappings.
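A rough sketch of the idea, under the simplifying assumption that the chars-to-word mappings have already been realized as segmented sentences; for each character trigram type it estimates how often a word boundary follows the middle character:

```python
from collections import Counter, defaultdict

def boundary_distributions(segmented_sentences):
    """Estimate, for each character trigram type, the probability that a
    word boundary follows its middle character. Input: sentences as
    lists of words (an assumption of this sketch)."""
    hits = defaultdict(Counter)
    for words in segmented_sentences:
        chars = ''.join(words)
        ends, pos = set(), 0
        for w in words:                  # offsets at which a word ends
            pos += len(w)
            ends.add(pos)
        for i in range(1, len(chars) - 1):
            trigram = chars[i - 1:i + 2]
            hits[trigram][(i + 1) in ends] += 1
    return {t: c[True] / sum(c.values()) for t, c in hits.items()}
```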
Methodology | It is worth mentioning that prior works used candidate words in a straightforward way, treating them as gold segmentations, either as dictionary units or as labeled resources.
Introduction | The model accounts for possible segmentations of a sentence into potential motifs, and prefers recurrent and cohesive motifs through features that capture frequency-based and statistical |
Introduction | A slightly modified version of the Viterbi algorithm could also be used to find segmentations that are constrained to agree with some given motif boundaries, while segmenting the other parts of the sentence optimally under these constraints.
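A minimal sketch of such a constrained Viterbi, assuming a hypothetical unigram scoring function `word_logprob`; forcing a boundary simply rules out any candidate word that straddles it:

```python
import math

def constrained_segment(chars, word_logprob, forced_boundaries):
    """Viterbi word segmentation under a unigram model, with boundaries
    forced at the given inter-character positions (1 .. len(chars) - 1)."""
    n = len(chars)
    forced = set(forced_boundaries)
    best = [-math.inf] * (n + 1)   # best[i]: score of the best split of chars[:i]
    back = [0] * (n + 1)           # back[i]: start index of the last word
    best[0] = 0.0
    for i in range(1, n + 1):
        for j in range(i):
            if any(j < b < i for b in forced):
                continue           # chars[j:i] would straddle a forced boundary
            score = best[j] + word_logprob(chars[j:i])
            if score > best[i]:
                best[i], back[i] = score, j
    words, i = [], n
    while i > 0:                   # follow backpointers to recover the words
        words.append(chars[back[i]:i])
        i = back[i]
    return words[::-1]
```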
Introduction | Additionally, a few features of the segmentation model were minor orthographic features based on word shape (length and capitalization patterns).
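For illustration (function name ours), a typical word-shape feature collapses character classes and squeezes repeats, keeping the length as a separate feature:

```python
def word_shape(token):
    """Collapse a token into a coarse shape: uppercase -> 'X',
    lowercase -> 'x', digits -> 'd' (e.g. 'McDonald' -> 'XxXx')."""
    shape = ''.join('X' if c.isupper() else 'x' if c.islower() else
                    'd' if c.isdigit() else c for c in token)
    # squeeze repeats so the shape reflects the pattern, not the length
    squeezed = ''.join(c for i, c in enumerate(shape) if i == 0 or c != shape[i - 1])
    return squeezed, len(token)
```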
Word segmentation results | We collect 800 sample segmentations of each utterance.
Word segmentation results | The most frequent segmentation among these 800 samples is the one we score in the evaluations below.
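Selecting the segmentation to score is simply a mode over the samples; a minimal sketch, assuming each sample is a list of words:

```python
from collections import Counter

def modal_segmentation(samples):
    """Return the most frequent segmentation among posterior samples
    (here, 800 per utterance); each sample is a list of words."""
    counts = Counter(tuple(words) for words in samples)  # tuples are hashable
    return list(counts.most_common(1)[0][0])
```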
Word segmentation results | Here we evaluate the word segmentations found by the “function word” Adaptor Grammar model described in section 2.3 and compare them to those of the baseline grammar with collocations and phonotactics from Johnson and Goldwater (2009).
Abstract | However, state-of-the-art Arabic word segmenters are either limited to formal Modern Standard Arabic, performing poorly on Arabic text featuring dialectal vocabulary and grammar, or rely on linguistic knowledge that is hand-tuned for each dialect. |
Arabic Word Segmentation Model | Some incorrect segmentations produced by the original system could be ruled out with the knowledge of these statistics. |
Error Analysis | In 36 of the 100 sampled errors, we conjecture that the presence of the error indicates a shortcoming of the feature set, resulting in segmentations that make sense locally but are not plausible given the full token. |
Error Analysis | 4.3 Context-sensitive segmentations and multiple word senses |
Experimental Setup | To generate the desegmentation table, we analyze the segmentations from the Arabic side of the parallel training data to collect mappings from morpheme sequences to surface forms. |
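A minimal sketch of collecting such a table, under the simplifying assumption that the segmented and surface versions of the training data have already been aligned word by word:

```python
from collections import Counter, defaultdict

def build_desegmentation_table(aligned_pairs):
    """Map each morpheme sequence (space-joined as the key) to the surface
    forms it was observed with, keeping counts so the most frequent
    surface form can be chosen at desegmentation time."""
    table = defaultdict(Counter)
    for morphemes, surface in aligned_pairs:
        table[' '.join(morphemes)][surface] += 1
    return table
```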
Methods | where [prefix], [stem] and [suffix] are non-overlapping sets of morphemes, whose members are easily determined using the segmenter’s segment boundary markers. The second disjunct of Equation 1 covers words that have no clear stem, such as the Arabic له lh “for him”, segmented as l+ “for” +h “him”.
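A sketch of the word-level pattern behind Equation 1, using the '+' boundary markers to classify morphemes (function names are ours):

```python
import re

def morpheme_class(m):
    """'l+' is a prefix, '+h' is a suffix, an unmarked morpheme is a stem."""
    if m.endswith('+'):
        return 'p'   # prefix
    if m.startswith('+'):
        return 'x'   # suffix
    return 's'       # stem

def word_pattern_ok(morphemes):
    """Equation 1: a word is prefix* stem suffix*, or, for stemless words
    such as l+ +h 'for him', prefix+ suffix+."""
    tags = ''.join(morpheme_class(m) for m in morphemes)
    return re.fullmatch(r'p*sx*|p+x+', tags) is not None
```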
Related Work | For many segmentations, especially unsupervised ones, this amounts to simple concatenation.
Related Work | However, more complex segmentations, such as the Arabic tokenization provided by MADA (Habash et al., 2009), require further orthographic adjustments to reverse normalizations performed during segmentation.
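Putting the two cases together, a minimal desegmentation sketch: simple concatenation by default, with a learned table (laid out as in the earlier sketch, our assumption rather than MADA's API) consulted first to undo orthographic normalizations:

```python
def desegment(morphemes, table=None):
    """Reassemble a surface token from its morphemes. If a desegmentation
    table is available, prefer its most frequent observed surface form;
    otherwise strip the '+' markers and concatenate."""
    key = ' '.join(morphemes)
    if table and key in table:
        return table[key].most_common(1)[0][0]
    return ''.join(m.strip('+') for m in morphemes)
```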