Experiments | • Self-training Segmenters (STS): two variant models were defined following the approach reported in (Subramanya et al., 2010), which uses the supervised CRF model's decodings of unlabeled examples, incorporating empirical and constraint information, as additional labeled data to retrain the CRF model.
Experiments | • Virtual Evidences Segmenters (VES): two variant models were defined based on the approach in (Zeng et al., 2013).
Experiments | This behaviour illustrates that the conventional optimizations of the monolingual supervised model, e.g., accumulating more supervised data or predefining segmentation properties, are insufficient to help the model achieve better segmentations for SMT.
Introduction | The prior works showed that these models help to find some segmentations tailored for SMT, since the bilingual word occurrence feature can be captured by the character-based alignment (Och and Ney, 2003). |
Introduction | Instead of directly merging the characters into concrete segmentations, this work attempts to extract word boundary distributions for character-level trigrams (types) from the “chars-to-word” mappings.
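This extraction can be sketched minimally as follows, assuming the chars-to-word mappings are available as a corpus of segmented word lists; all names here are hypothetical, not the paper's:

```python
from collections import defaultdict

def trigram_boundary_dist(segmented_corpus):
    """For each character trigram (type), estimate the probability that a
    word boundary follows its middle character, counted over a corpus of
    segmented sentences (lists of words)."""
    # trigram -> [times a boundary follows the middle char, total occurrences]
    counts = defaultdict(lambda: [0, 0])
    for words in segmented_corpus:
        chars = "".join(words)
        bounds, i = set(), 0
        for w in words:
            i += len(w)
            bounds.add(i)  # boundary position after each word
        for j in range(1, len(chars) - 1):
            tri = chars[j - 1:j + 2]
            counts[tri][1] += 1
            counts[tri][0] += (j + 1) in bounds  # boundary after middle char?
    return {t: b / n for t, (b, n) in counts.items()}
```

A soft distribution like this, rather than a hard merge, is what lets downstream models weigh competing boundary hypotheses.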
Methodology | It is worth mentioning that prior works presented a straightforward usage for candidate words, treating them as gold segmentations, either dictionary units or labeled resources.
Abstract | Experimental results show that the proposed method is comparable to supervised segmenters on the in-domain NIST OpenMT corpus, and yields a 0.96 BLEU relative increase on the out-of-domain NTCIR PatentMT corpus.
Complexity Analysis | Character-based segmentation, LDC segmenter and Stanford Chinese segmenters were used as the baseline methods. |
Complexity Analysis | The training started from the assumption that there were no previous segmentations of each sentence (pair), and the number of iterations was fixed.
Complexity Analysis | The monolingual bigram model, however, was slower to converge, so we started it from the segmentations of the unigram model and used 10 iterations.
Introduction | Although supervised-learning approaches, which train segmenters on manually segmented corpora, are widely used (Chang et al., 2008), the criteria for manually annotating words are arbitrary, and the available annotated corpora are limited in both quantity and genre variety.
Methods | The set F is chosen to represent an unsegmented foreign-language sentence (a sequence of characters), because an unsegmented sentence can be seen as the set of all possible segmentations of the sentence, denoted F, i.e.
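The identification of an unsegmented sentence with the set of its segmentations can be made concrete with a small enumeration sketch (a hypothetical helper, not from the paper): a sentence of n characters has n-1 internal boundary slots, each either a word boundary or not, giving 2^(n-1) segmentations.

```python
from itertools import combinations

def all_segmentations(chars):
    """Enumerate every segmentation of a character sequence by choosing
    each subset of the n-1 internal boundary positions."""
    n = len(chars)
    segs = []
    for k in range(n):  # number of internal boundaries
        for cuts in combinations(range(1, n), k):
            points = [0, *cuts, n]
            segs.append([chars[i:j] for i, j in zip(points, points[1:])])
    return segs
```

For "abc" this yields the four segmentations ["abc"], ["a","bc"], ["ab","c"], and ["a","b","c"].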
Abstract | Existing segmentation metrics such as Pk, WindowDiff, and Segmentation Similarity (S) are all able to award partial credit for near misses between boundaries, but are biased towards segmentations containing few or tightly clustered boundaries.
Introduction | A variety of segmentation granularities, or atomic units, exist, including segmentations at the morpheme (e.g., Sirts and Alumäe 2012), word (e.g., Chang et al.
Introduction | Segmentations can also represent the structure of text as being organized linearly (e.g., Hearst 1997), hierarchically (e.g., Eisenstein 2009), etc. |
Introduction | Theoretically, segmentations could also contain varying bound- |
Related Work | Many early studies evaluated automatic segmenters using information retrieval (IR) metrics such as precision, recall, etc. |
Related Work | To attempt to overcome this issue, both Passonneau and Litman (1993) and Hearst (1993) conflated multiple manual segmentations into one that contained only those boundaries which the majority of coders agreed upon. |
Related Work | IR metrics were then used to compare automatic segmenters to this majority solution. |
Experiments | Japanese word segmentation, with all supervised segmentations removed in advance. |
Experiments | Semi-supervised results used only 10K sentences (1/5) of supervised segmentations.
Experiments | segmentations.
Inference | When we repeat this process, it is expected to mix rapidly because it implicitly considers all possible segmentations of the given string at the same time. |
Inference | Segmentations before the final k characters are marginalized using the following recursive relationship: |
Inference | Figure 4: Forward filtering of a[t] to marginalize out possible segmentations j before t − k.
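The recursive marginalization described above can be sketched for a unigram word model as follows; `word_prob` is a hypothetical stand-in for the model's word probability, and the paper's actual forward variable additionally tracks the length of the last word for the bigram model:

```python
def forward(chars, word_prob, max_len=4):
    """Forward variable a[t]: total probability of chars[:t], summed over
    all segmentations, via a[t] = sum_k word_prob(chars[t-k:t]) * a[t-k],
    so the exponential set of segmentations is never enumerated."""
    n = len(chars)
    a = [0.0] * (n + 1)
    a[0] = 1.0  # empty prefix
    for t in range(1, n + 1):
        for k in range(1, min(max_len, t) + 1):  # last word has length k
            a[t] += word_prob(chars[t - k:t]) * a[t - k]
    return a
```

With word_prob(w) = 0.5^len(w), every segmentation of "abc" contributes 0.5^3, so a[3] sums the four segmentations to 0.5.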
Introduction | In order to extract “words” from text streams, unsupervised word segmentation is an important research area because the criteria for creating supervised training data could be arbitrary, and will be suboptimal for applications that rely on segmentations.
Introduction | It is particularly difficult to create “correct” training data for speech transcripts, colloquial texts, and classics, where segmentations are often ambiguous, and it is simply impossible for unknown languages whose properties computational linguists might seek to uncover.
Datasets | For evaluation, we used a standard set of reference segmentations (Galley et al., 2003) of 25 meetings. |
Datasets | Segmentations are binary, i.e., each point of the document is either a segment boundary or not, and on average each meeting has 8 segment boundaries. |
Datasets | To get reference segmentations, we assign each turn a real value from 0 to 1 indicating how much a turn changes the topic.
Topic Segmentation Experiments | Evaluation Metrics To evaluate segmentations, we use Pk (Beeferman et al., 1999) and WindowDiff (WD) (Pevzner and Hearst, 2002).
Topic Segmentation Experiments | First, they require both hypothesized and reference segmentations to be binary. |
Topic Segmentation Experiments | Many algorithms (e.g., probabilistic approaches) give non-binary segmentations where candidate boundaries have real-valued scores (e.g., probability or confidence).
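For binary segmentations, Pk can be sketched as follows; this is a minimal reading of Beeferman et al. (1999), not a reference implementation, and it assumes boundaries are given as gap positions over a document of n units:

```python
def pk(ref_bounds, hyp_bounds, n, k=None):
    """Pk: the probability that a window of width k straddles a boundary
    in one segmentation but not in the other. Boundaries are gap indices
    in 1..n-1; position i belongs to segment seg_id(i)."""
    def seg_id(bounds, i):
        return sum(1 for b in bounds if b <= i)  # segment containing unit i
    if k is None:
        # conventional choice: half the mean reference segment length
        k = max(1, n // (2 * (len(ref_bounds) + 1)))
    errors = sum(
        (seg_id(ref_bounds, i) == seg_id(ref_bounds, i + k))
        != (seg_id(hyp_bounds, i) == seg_id(hyp_bounds, i + k))
        for i in range(n - k)
    )
    return errors / (n - k)
```

Identical segmentations score 0; a hypothesis that misses the single reference boundary is penalized only for the windows that straddle it, which is the partial credit for near misses mentioned above.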
Experiments | We evaluated both word segmentation (Seg) and joint word segmentation and POS tagging (Seg & Tag).
Experiments | For Seg, a token is considered correct if the word boundary is correctly identified.
Experiments | For Seg & Tag, both the word boundary and its POS tag have to be correctly identified to be counted as a correct token. |
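The Seg criterion above can be sketched as span-matching F1 (a hypothetical helper; for Seg & Tag the spans would additionally carry POS tags):

```python
def seg_f1(gold, pred):
    """Token-level F1 for word segmentation: a predicted token is correct
    iff its (start, end) character span matches a gold token exactly."""
    def spans(words):
        out, i = set(), 0
        for w in words:
            out.add((i, i + len(w)))
            i += len(w)
        return out
    g, p = spans(gold), spans(pred)
    tp = len(g & p)
    if tp == 0:
        return 0.0
    prec, rec = tp / len(p), tp / len(g)
    return 2 * prec * rec / (prec + rec)
```

Note that a single wrong boundary invalidates both tokens it touches, so segmentation F1 penalizes boundary errors twice over.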
Abstract | We present a nonparametric Bayesian model that jointly induces morpheme segmentations of each language under consideration and at the same time identifies cross-lingual morpheme patterns, or abstract morphemes. |
Experimental SetUp | We obtained gold standard segmentations of the Arabic translation with a handcrafted Arabic morphological analyzer which utilizes manually constructed word lists and compatibility rules and is further trained on a large corpus of hand-annotated Arabic data (Habash and Rambow, 2005).
Experimental SetUp | We don’t have gold standard segmentations for the English and Aramaic portions of the data, and thus restrict our evaluation to Hebrew and Arabic. |
Introduction | the space of joint segmentations.
Introduction | For each language in the pair, the model favors segmentations which yield high frequency morphemes. |
Model | For word w in language ℓ, we consider at once all possible segmentations, and for each segmentation all possible alignments.
Model | We are thus considering at once all possible segmentations of w along with all possible alignments involving morphemes in w with some subset of previously sampled language-ℓ morphemes.
Conclusions and future work | The CRF segmentation provides a list of segmentations A: A1, A2, ..., AN, with conditional probabilities P(A1|S), P(A2|S), ..., P(AN|S).
Conclusions and future work | If we continue performing the CRF conversion to cover all N (N ≥ k) segmentations, eventually we will get:
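One illustrative use of such an N-best list (not the paper's procedure) is to renormalize the truncated probability mass and read off per-boundary marginals, assuming each segmentation Ai is represented by its set of boundary positions:

```python
def boundary_marginals(nbest):
    """Given an N-best list of (boundary_set, P(Ai|S)) pairs, compute the
    marginal probability of each boundary under the renormalized list.
    A truncated list's probabilities need not sum to 1, hence z."""
    z = sum(p for _, p in nbest)
    marg = {}
    for bounds, p in nbest:
        for b in bounds:
            marg[b] = marg.get(b, 0.0) + p / z
    return marg
```

Boundaries shared by all high-probability segmentations approach marginal 1.0, which is one way to quantify the "eventually covering all N segmentations" intuition above.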
Joint optimization and its fast decoding algorithm | The joint optimization considers all the segmentation possibilities and sums the probability over all the alternative segmentations which generate the same output. |
Joint optimization and its fast decoding algorithm | However, exact inference by listing all possible candidates explicitly and summing over all possible segmentations is intractable, because the computational complexity grows exponentially with the length of the source word.
Joint optimization and its fast decoding algorithm | In the segmentation step, the number of possible segmentations is 2^N, where N is the length of the source word and 2 is the size of the tagging set.
A Generative PCFG Model | Our use of an unweighted lattice reflects our belief that all the segmentations of the given input sentence are a priori equally likely; the only reason to prefer one segmentation over another is the overall syntactic context, which is modeled via the PCFG derivations.
A Generative PCFG Model | (1996), who consider the kind of probabilities a generative parser should get from a PoS tagger and conclude that these should be P(w|t) “and nothing fancier”. In our setting, therefore, the lattice is not used to induce a probability distribution on a linear context; rather, it is used as a common denominator of state-indexation of all segmentation possibilities of a surface form.
Experimental Setup | We use the HSPELL (Har’el and Kenigsberg, 2004) wordlist as a lexeme-based lexicon for pruning segmentations involving invalid segments.
Experimental Setup | To evaluate the performance on the segmentation task, we report SEG, the standard harmonic mean (F1) of segmentation precision and recall (as defined in Bar-Haim et al.
Previous Work on Hebrew Processing | Morphological analyzers for Hebrew that analyze a surface form in isolation have been proposed by Segal (2000), Yona and Wintner (2005), and recently by the knowledge center for processing Hebrew (Itai et al., 2006). |
Previous Work on Hebrew Processing | Tsarfaty (2006) used a morphological analyzer (Segal, 2000), a PoS tagger (Bar-Haim et al., 2005), and a general purpose parser (Schmid, 2000) in an integrated framework in which morphological and syntactic components interact to share information, leading to improved performance on the joint task.
Introduction | The model accounts for possible segmentations of a sentence into potential motifs, and prefers recurrent and cohesive motifs through features that capture frequency-based and statistical |
Introduction | A slightly modified version of Viterbi could also be used to find segmentations that are constrained to agree with some given motif boundaries, but can segment other parts of the sentence optimally under these constraints. |
Introduction | Additionally, a few features of the segmentation model captured minor orthographic properties based on word shape (length and capitalization patterns).
Experiments | We then used the SEG algorithm to learn the weight distribution model. |
Methods 2.1 Document Level and Profile Based CDC | The chained entities are first objectified into the relation strength matrix R using SEG, the details of which are described in the following section.
Methods 2.1 Document Level and Profile Based CDC | Algorithm 2 SEG (Freund et al., 1997). Input: initial weight distribution p1; learning rate η > 0; training set {⟨st, yt⟩}. 1: for t = 1 to T do 2: Predict using:
Methods 2.1 Document Level and Profile Based CDC | We adopt the Specialist Exponentiated Gradient ( SEG ) (Freund et al., 1997) algorithm to learn the mixing weights of the specialists’ prediction (Algorithm 2) in an online manner. |
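A sketch of one SEG round is given below. The awake-set semantics (only active specialists predict and are reweighted, with their total mass conserved) follows Freund et al. (1997), but the square loss, its gradient-based multiplicative update, and all names here are illustrative assumptions, not the paper's exact formulation:

```python
import math

def seg_update(p, x, y, awake, eta=0.5):
    """One Specialist EG round: mix the awake specialists' predictions,
    then multiplicatively reweight them by exp(-eta * grad) and rescale
    so the awake probability mass is conserved; sleepers are untouched."""
    mass = sum(p[i] for i in awake)
    yhat = sum(p[i] * x[i] for i in awake) / mass  # mixed prediction
    new = dict(p)
    # square-loss gradient update (assumed loss), awake specialists only
    upd = {i: p[i] * math.exp(-2 * eta * (yhat - y) * x[i]) for i in awake}
    z = sum(upd.values())
    for i in awake:
        new[i] = upd[i] * mass / z  # renormalize within the awake set
    return yhat, new
```

Conserving the awake mass is what lets specialists abstain on examples outside their expertise without their weight decaying.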
Sentence Selection: Single Language Pair | where Hx is the space of all possible segmentations for the OOV fragment x,
Sentence Selection: Single Language Pair | We let Hx be all possible segmentations of the fragment x for which the resulting phrase lengths are not greater than the maximum length constraint for phrase extraction in the underlying SMT model.
Sentence Selection: Single Language Pair | Since we do not know anything about the segmentations a priori, we put a uniform distribution over such segmentations.
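The constrained set Hx and its uniform prior can be sketched as follows (a hypothetical helper; `max_len` stands in for the phrase-extraction length limit):

```python
def constrained_segmentations(x, max_len):
    """All segmentations of fragment x whose parts are each at most
    max_len characters long; under the uniform prior, each receives
    probability 1/|Hx|."""
    if not x:
        return [[]]
    segs = []
    for i in range(1, min(max_len, len(x)) + 1):  # length of the first part
        for rest in constrained_segmentations(x[i:], max_len):
            segs.append([x[:i]] + rest)
    return segs
```

For a four-character fragment with max_len = 2 there are five admissible segmentations, so each gets prior probability 0.2.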
Alignment | However, (DeNero et al., 2006) experienced similar over-fitting with short phrases due to the fact that the same word sequence can be segmented in different ways, leading to specific segmentations being learned for specific training sentence pairs. |
Introduction | Ideally, we would produce all possible segmentations and alignments during training. |
Related Work | When given a bilingual sentence pair, we can usually assume there are a number of equally correct phrase segmentations and corresponding alignments. |
Related Work | As a result of this ambiguity, different segmentations are recruited for different examples during training. |
Experiments | [Figure: F1 of SEG on MQA; unrecoverable plot residue removed (labels SEG, CAP, F1, MQA)]
Joint Query Annotation | Q = {CAP, TAG, SEG}.
Models 2.1 Baseline Models | When tested against a human-annotated gold standard of linguistic morpheme segmentations for Finnish, this algorithm outperforms competing unsupervised methods, achieving an F-score of 67.0% on a 3 million sentence corpus (Creutz and Lagus, 2006).
Models 2.1 Baseline Models | In order to get robust, common segmentations, we trained the segmenter on the 5000 most frequent words; we then used this to segment the entire data set.
Models 2.1 Baseline Models | Of the phrases that included segmentations (‘Morph’ in Table 1), roughly a third were ‘productive’, i.e. |
Model | Figure 2 shows the F1 scores of the proposed model (SegTagDep) on CTB-Sc-l with respect to the training epoch and different parsing feature weights, where “Seg”, “Tag”, and “Dep” respectively denote the F1 scores of word segmentation, POS tagging, and dependency parsing.
Model | Beam Seg Tag Dep Speed |
Model | System         Seg    Tag
Model | Kruengkrai '09 97.87  93.67
Model | Zhang '10      97.78  93.67
Model | Sun '11        98.17  94.02
Model | Wang '11       98.11  94.18
Model | SegTag         97.66  93.61
Model | SegTagDep      97.73  94.46
Model | SegTag(d)      98.18  94.08
Model | SegTagDep(d)   98.26  94.64
Experiments | To estimate the probabilities of proposed models, the corresponding phrase segmentations for bilingual sentences are required. |
Experiments | As we want to check what actually happened during decoding in the real situation, cross-fold translation is used to obtain the corresponding phrase segmentations . |
Experiments | Afterwards, we generate the corresponding phrase segmentations for the remaining 5% bi- |
Semi-supervised Learning via Co-regularizing Both Models | Since each of the models has its own merits, their consensuses signify high confidence segmentations . |
Semi-supervised Learning via Co-regularizing Both Models | )”, the two segmentations shown in Figure 1 are the predictions from a character-based and word-based model. |
Semi-supervised Learning via Co-regularizing Both Models | Figure 1: The segmentations given by the character-based and word-based models, where the boxed words refer to the segmentation agreements.
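The agreement computation can be sketched as follows, assuming both models segment the same character sequence and an agreement is a token whose character span coincides in both outputs (a hypothetical helper, not the authors' code):

```python
def agreements(seg_a, seg_b):
    """Tokens on which two segmentations of the same characters agree:
    the intersection of their (start, end) character spans."""
    def spans(words):
        out, i = set(), 0
        for w in words:
            out.add((i, i + len(w)))
            i += len(w)
        return out
    common = spans(seg_a) & spans(seg_b)
    chars = "".join(seg_a)  # both cover the same character sequence
    return [chars[i:j] for i, j in sorted(common)]
```

These agreed spans are exactly the high-confidence segments that the co-regularization above treats as consensus supervision.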
Experiments | Pipeline: Seg P 97.35, R 98.02, F1 97.69; Tag P 93.51, R 94.15, F1 93.83; Parse P 81.58, R 82.95, F1 82.26
Experiments | Flat word structures: Seg P 97.32, R 98.13, F1 97.73; Tag P 94.09, R 94.88, F1 94.48; Parse P 83.39, R 83.84, F1 83.61
Experiments | Annotated word structures: Seg P 97.49, R 98.18, F1 97.84; Tag P 94.46, R 95.14, F1 94.80; Parse P 84.42, R 84.43, F1 84.43; WS P 94.02, R 94.69, F1 94.35
Word segmentation results | 800 sample segmentations of each utterance. |
Word segmentation results | The most frequent segmentation in these 800 sample segmentations is the one we score in the evaluations below. |
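Picking the modal segmentation from posterior samples can be sketched as below (a hypothetical helper; the mode over samples approximates the MAP segmentation):

```python
from collections import Counter

def modal_segmentation(samples):
    """Return the most frequent segmentation among posterior samples,
    e.g. the 800 Gibbs samples drawn per utterance."""
    counts = Counter(tuple(s) for s in samples)
    return list(counts.most_common(1)[0][0])
```

Scoring the mode rather than a single sample reduces the variance that any one Gibbs sweep would introduce into the evaluation.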
Word segmentation results | Here we evaluate the word segmentations found by the “function word” Adaptor Grammar model described in section 2.3 and compare it to the baseline grammar with collocations and phonotactics from Johnson and Goldwater (2009). |
Abstract | However, state-of-the-art Arabic word segmenters are either limited to formal Modern Standard Arabic, performing poorly on Arabic text featuring dialectal vocabulary and grammar, or rely on linguistic knowledge that is hand-tuned for each dialect. |
Arabic Word Segmentation Model | Some incorrect segmentations produced by the original system could be ruled out with the knowledge of these statistics. |
Error Analysis | In 36 of the 100 sampled errors, we conjecture that the presence of the error indicates a shortcoming of the feature set, resulting in segmentations that make sense locally but are not plausible given the full token. |
Error Analysis | 4.3 Context-sensitive segmentations and multiple word senses |
Experimental Setup | To generate the desegmentation table, we analyze the segmentations from the Arabic side of the parallel training data to collect mappings from morpheme sequences to surface forms. |
Methods | where [prefix], [stem] and [suffix] are non-overlapping sets of morphemes, whose members are easily determined using the segmenter’s segment boundary markers. The second disjunct of Equation 1 covers words that have no clear stem, such as the Arabic lh “for him”, segmented as l+ “for” +h “him”.
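Classifying segments by their boundary markers can be sketched as follows, assuming the common convention that '+' is attached on the word-internal side of a morpheme (as in the l+ / +h example above); the helper is hypothetical:

```python
def classify(segments):
    """Label each segment as prefix, suffix, or stem from its '+'
    boundary markers: prefixes end with '+', suffixes start with '+',
    anything else is treated as a stem."""
    kinds = []
    for s in segments:
        if s.endswith('+') and not s.startswith('+'):
            kinds.append('prefix')
        elif s.startswith('+') and not s.endswith('+'):
            kinds.append('suffix')
        else:
            kinds.append('stem')
    return kinds
```

A word like lh then yields ['prefix', 'suffix'] with no stem, which is the case the second disjunct of Equation 1 handles.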
Related Work | For many segmentations, especially unsupervised ones, this amounts to simple concatenation.
Related Work | However, more complex segmentations, such as the Arabic tokenization provided by MADA (Habash et al., 2009), require further orthographic adjustments to reverse normalizations performed during segmentation.
Data and Evaluation | use the coauthor’s segmentations as the gold standard. |
Discussion | Also to be investigated is a quantitative study of the effects of high-precision/low-recall vs. low-precision/high-recall segmenters on the construction of discourse trees. |
Results | Additionally, we compared SLSeg and SPADE to the original RST segmentations of the three RST texts taken from RST literature. |
Abstract | Since it is hard to achieve the best segmentations with tagset IB, we propose an indirect way to use these constraints in the following section, instead of applying these constraints as straightforwardly as in English POS tagging. |
Abstract | w ∈ segGEN(c), where the function segGEN maps a character sequence c to the set of all possible segmentations of c. For example, w = (c_1...c_{l_1})...(c_{n−l_k+1}...c_n) represents a segmentation of k words, where the lengths of the first and last words are l_1 and l_k respectively.
Abstract | We transform tagged character sequences to word segmentations first, and then evaluate word segmentations by F-measure, as defined in Section 5.2.
Abstract | That is, the probability of an output string is split among many distinct derivations (e.g., trees or segmentations).
Background 2.1 Terminology | (2003)), where different segmentations lead to the same translation string (Figure 1), and in syntax-based systems (e.g., Chiang (2007)), where different derivation trees yield the same string (Figure 2).
Background 2.1 Terminology | Figure 1: Segmentation ambiguity in phrase-based MT: two different segmentations lead to the same translation string.
Evaluation | Our baseline proposes the most frequent tag (proper name) for all possible segmentations of the token, in a uniform distribution. |
Method | We hypothesize a uniform distribution among the possible segmentations and aggregate a distribution of possible tags for the analysis. |
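Aggregating a tag distribution under a uniform prior over segmentations can be sketched as below (a hypothetical helper; each analysis is the list of tags it licenses and shares its uniform mass among them):

```python
from collections import defaultdict

def tag_distribution(analyses):
    """Each possible segmentation (analysis) of a token gets equal
    probability; the tags it licenses split that mass uniformly."""
    dist = defaultdict(float)
    u = 1.0 / len(analyses)
    for tags in analyses:
        for t in tags:
            dist[t] += u / len(tags)
    return dict(dist)
```

Tags licensed by many analyses accumulate mass, so the aggregated distribution prefers tags that are compatible with more of the possible segmentations.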
Previous Work | (of all words in a given sentence) and the POS tagging (of the known words) is based on a Viterbi search over a lattice composed of all possible word segmentations and the possible classifications of all observed characters. |