Index of papers in Proc. ACL that mention
  • segmentations
Zeng, Xiaodong and Chao, Lidia S. and Wong, Derek F. and Trancoso, Isabel and Tian, Liang
Experiments
• Self-training Segmenters (STS): two variant models were defined by the approach reported in (Subramanya et al., 2010), which uses the supervised CRF model’s decodings for unlabeled examples, incorporating empirical and constraint information, as additional labeled data to retrain a CRF model.
Experiments
• Virtual Evidences Segmenters (VES): two variant models based on the approach in (Zeng et al., 2013) were defined.
Experiments
This behaviour illustrates that the conventional optimizations to the monolingual supervised model, e.g., accumulating more supervised data or predefined segmentation properties, are insufficient to help the model achieve better segmentations for SMT.
Introduction
Prior works showed that these models help to find segmentations tailored for SMT, since the bilingual word-occurrence feature can be captured by character-based alignment (Och and Ney, 2003).
Introduction
Instead of directly merging the characters into concrete segmentations, this work attempts to extract word boundary distributions for character-level trigrams (types) from the “chars-to-word” mappings.
Methodology
It is worth mentioning that prior works presented a straightforward usage for candidate words, treating them as golden segmentations, either dictionary units or labeled resources.
segmentations is mentioned in 18 sentences in this paper.
Topics mentioned in this paper:
Wang, Xiaolin and Utiyama, Masao and Finch, Andrew and Sumita, Eiichiro
Abstract
Experimental results show that the proposed method is comparable to supervised segmenters on the in-domain NIST OpenMT corpus, and yields a 0.96 BLEU relative increase on the out-of-domain NTCIR PatentMT corpus.
Complexity Analysis
Character-based segmentation, LDC segmenter and Stanford Chinese segmenters were used as the baseline methods.
Complexity Analysis
The training was started by assuming that there were no previous segmentations on each sentence (pair), and the number of iterations was fixed.
Complexity Analysis
The monolingual bigram model, however, was slower to converge, so we started it from the segmentations of the unigram model and used 10 iterations.
Introduction
Though supervised-learning approaches which involve training segmenters on manually segmented corpora are widely used (Chang et al., 2008), the criteria for manually annotating words are arbitrary, and the available annotated corpora are limited in both quantity and genre variety.
Methods
The set F is chosen to represent an unsegmented foreign language sentence (a sequence of characters), because an unsegmented sentence can be seen as the set of all possible segmentations of the sentence, denoted F, i.e.
segmentations is mentioned in 11 sentences in this paper.
Topics mentioned in this paper:
Fournier, Chris
Abstract
Existing segmentation metrics such as Pk, WindowDiff, and Segmentation Similarity (S) are all able to award partial credit for near misses between boundaries, but are biased towards segmentations containing few or tightly clustered boundaries.
Introduction
A variety of segmentation granularities, or atomic units, exist, including segmentations at the morpheme (e.g., Sirts and Alumäe 2012), word (e.g., Chang et al.
Introduction
Segmentations can also represent the structure of text as being organized linearly (e.g., Hearst 1997), hierarchically (e.g., Eisenstein 2009), etc.
Introduction
Theoretically, segmentations could also contain varying bound-
Related Work
Many early studies evaluated automatic segmenters using information retrieval (IR) metrics such as precision, recall, etc.
Related Work
To attempt to overcome this issue, both Passonneau and Litman (1993) and Hearst (1993) conflated multiple manual segmentations into one that contained only those boundaries which the majority of coders agreed upon.
Related Work
IR metrics were then used to compare automatic segmenters to this majority solution.
segmentations is mentioned in 47 sentences in this paper.
Topics mentioned in this paper:
Mochihashi, Daichi and Yamada, Takeshi and Ueda, Naonori
Experiments
Japanese word segmentation, with all supervised segmentations removed in advance.
Experiments
Semi-supervised results used only 10K sentences (1/5) of supervised segmentations.
Experiments
segmentations.
Inference
When we repeat this process, it is expected to mix rapidly because it implicitly considers all possible segmentations of the given string at the same time.
Inference
Segmentations before the final k characters are marginalized using the following recursive relationship:
Inference
Figure 4: Forward filtering of α[t] to marginalize out possible segmentations j before t - k.
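A minimal sketch of the forward-filtering recursion quoted above, assuming a unigram word model word_prob(w) and a maximum word length L (both hypothetical names); this is not the authors' code.

```python
def forward_filter(chars, word_prob, L):
    """alpha[t][k] is the marginal probability of chars[:t] with the last k
    characters forming a word; summing over the previous column marginalizes
    all possible segmentations before position t - k."""
    n = len(chars)
    alpha = [[0.0] * (L + 1) for _ in range(n + 1)]
    for t in range(1, n + 1):
        for k in range(1, min(L, t) + 1):
            w = "".join(chars[t - k:t])
            prev = 1.0 if t == k else sum(alpha[t - k][1:])
            alpha[t][k] = word_prob(w) * prev
    return alpha
```

A segmentation of the whole string can then be drawn by sampling word boundaries backwards from alpha, which is what lets the sampler implicitly consider all segmentations of a sentence at once.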
Introduction
In order to extract “words” from text streams, unsupervised word segmentation is an important research area because the criteria for creating supervised training data could be arbitrary, and will be suboptimal for applications that rely on segmentations.
Introduction
It is particularly difficult to create “correct” training data for speech transcripts, colloquial texts, and classics where segmentations are often ambiguous, let alone for unknown languages whose properties computational linguists might seek to uncover.
segmentations is mentioned in 12 sentences in this paper.
Topics mentioned in this paper:
Nguyen, Viet-An and Boyd-Graber, Jordan and Resnik, Philip
Datasets
For evaluation, we used a standard set of reference segmentations (Galley et al., 2003) of 25 meetings.
Datasets
Segmentations are binary, i.e., each point of the document is either a segment boundary or not, and on average each meeting has 8 segment boundaries.
Datasets
To get reference segmentations, we assign each turn a real value from 0 to 1 indicating how much a turn changes the topic.
Topic Segmentation Experiments
Evaluation Metrics: To evaluate segmentations, we use Pk (Beeferman et al., 1999) and WindowDiff (WD) (Pevzner and Hearst, 2002).
Topic Segmentation Experiments
First, they require both hypothesized and reference segmentations to be binary.
Topic Segmentation Experiments
Many algorithms (e.g., probabilistic approaches) give non-binary segmentations where candidate boundaries have real-valued scores (e.g., probability or confidence).
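A minimal sketch of the WindowDiff metric mentioned above, over binary boundary vectors; the interface and the default window size are assumptions of this sketch, not code from the paper.

```python
def window_diff(ref, hyp, k=None):
    """ref, hyp: equal-length lists of 0/1 boundary indicators.
    Counts sliding windows of size k in which the two segmentations
    disagree on the number of boundaries, normalized by window count."""
    n = len(ref)
    if k is None:
        # common convention: half the average reference segment length
        k = max(1, round(n / (2 * (sum(ref) + 1))))
    windows = n - k
    errors = sum(1 for i in range(windows)
                 if sum(ref[i:i + k]) != sum(hyp[i:i + k]))
    return errors / windows
```

Pk is computed in a similar sliding-window fashion, but it only checks whether the two window endpoints fall in the same segment rather than comparing boundary counts.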
segmentations is mentioned in 11 sentences in this paper.
Topics mentioned in this paper:
Kruengkrai, Canasai and Uchimoto, Kiyotaka and Kazama, Jun'ichi and Wang, Yiou and Torisawa, Kentaro and Isahara, Hitoshi
Experiments
We evaluated both word segmentation (Seg) and joint word segmentation and POS tagging (Seg & Tag).
Experiments
For Seg, a token is considered correct if the word boundary is correctly identified.
Experiments
For Seg & Tag, both the word boundary and its POS tag have to be correctly identified to be counted as a correct token.
segmentations is mentioned in 8 sentences in this paper.
Topics mentioned in this paper:
Snyder, Benjamin and Barzilay, Regina
Abstract
We present a nonparametric Bayesian model that jointly induces morpheme segmentations of each language under consideration and at the same time identifies cross-lingual morpheme patterns, or abstract morphemes.
Experimental SetUp
We obtained gold standard segmentations of the Arabic translation with a handcrafted Arabic morphological analyzer which utilizes manually constructed word lists and compatibility rules and is further trained on a large corpus of hand-annotated Arabic data (Habash and Rambow, 2005).
Experimental SetUp
We don’t have gold standard segmentations for the English and Aramaic portions of the data, and thus restrict our evaluation to Hebrew and Arabic.
Introduction
the space of joint segmentations.
Introduction
For each language in the pair, the model favors segmentations which yield high frequency morphemes.
Model
For word w in language ℰ, we consider at once all possible segmentations, and for each segmentation all possible alignments.
Model
We are thus considering at once: all possible segmentations of w along with all possible alignments involving morphemes in w with some subset of previously sampled language-ℱ morphemes.
segmentations is mentioned in 7 sentences in this paper.
Topics mentioned in this paper:
Yang, Dong and Dixon, Paul and Furui, Sadaoki
Conclusions and future work
The CRF segmentation provides a list of segmentations A: A1, A2, ..., AN, with conditional probabilities P(A1|S), P(A2|S), ..., P(AN|S).
Conclusions and future work
If we continue performing the CRF conversion to cover all N (N ≥ k) segmentations, eventually we will get:
Joint optimization and its fast decoding algorithm
The joint optimization considers all the segmentation possibilities and sums the probability over all the alternative segmentations which generate the same output.
Joint optimization and its fast decoding algorithm
However, exact inference by listing all possible candidates explicitly and summing over all possible segmentations is intractable, because the computational complexity grows exponentially with the source word's length.
Joint optimization and its fast decoding algorithm
In the segmentation step, the number of possible segmentations is 2^N, where N is the length of the source word and 2 is the size of the tagging set.
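To make the joint optimization concrete, here is a minimal sketch (assumptions of this note, not the authors' implementation) that sums P(A_i|S) * P(T|A_i) over an N-best list of CRF segmentations A_i, so that probability mass accumulates on any output T shared by several segmentations; nbest_segmentations and convert_prob are hypothetical interfaces.

```python
from collections import defaultdict

def joint_scores(nbest_segmentations, convert_prob):
    """nbest_segmentations: list of (segmentation, P(A|S)) pairs.
    convert_prob(segmentation): dict mapping each output T to P(T|A)."""
    scores = defaultdict(float)
    for seg, p_seg in nbest_segmentations:
        for output, p_out in convert_prob(seg).items():
            scores[output] += p_seg * p_out  # marginalize over segmentations
    best = max(scores, key=scores.get)
    return best, dict(scores)
```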
segmentations is mentioned in 7 sentences in this paper.
Topics mentioned in this paper:
Goldberg, Yoav and Tsarfaty, Reut
A Generative PCFG Model
Our use of an unweighted lattice reflects our belief that all the segmentations of the given input sentence are a priori equally likely; the only reason to prefer one segmentation over another is the overall syntactic context, which is modeled via the PCFG derivations.
A Generative PCFG Model
(1996) who consider the kind of probabilities a generative parser should get from a PoS tagger, and conclude that these should be P(w|t) “and nothing fancier”. In our setting, therefore, the lattice is not used to induce a probability distribution on a linear context, but rather, it is used as a common denominator of state-indexation of all segmentation possibilities of a surface form.
Experimental Setup
We use the HSPELL (Har’el and Kenigsberg, 2004) wordlist as a lexeme-based lexicon for pruning segmentations involving invalid segments.
Experimental Setup
To evaluate the performance on the segmentation task, we report SEG, the standard harmonic mean (F1) of segmentation Precision and Recall (as defined in Bar-Haim et al.
Previous Work on Hebrew Processing
Morphological analyzers for Hebrew that analyze a surface form in isolation have been proposed by Segal (2000), Yona and Wintner (2005), and recently by the knowledge center for processing Hebrew (Itai et al., 2006).
Previous Work on Hebrew Processing
Tsarfaty (2006) used a morphological analyzer (Segal, 2000), a PoS tagger (Bar-Haim et al., 2005), and a general purpose parser (Schmid, 2000) in an integrated framework in which morphological and syntactic components interact to share information, leading to improved performance on the joint task.
segmentations is mentioned in 6 sentences in this paper.
Topics mentioned in this paper:
Srivastava, Shashank and Hovy, Eduard
Introduction
The model accounts for possible segmentations of a sentence into potential motifs, and prefers recurrent and cohesive motifs through features that capture frequency-based and statistical
Introduction
A slightly modified version of Viterbi could also be used to find segmentations that are constrained to agree with some given motif boundaries, but can segment other parts of the sentence optimally under these constraints.
Introduction
Additionally, a few features for the segmentation model contained minor orthographic features based on word shape (length and capitalization patterns).
segmentations is mentioned in 6 sentences in this paper.
Topics mentioned in this paper:
Huang, Jian and Taylor, Sarah M. and Smith, Jonathan L. and Fotiadis, Konstantinos A. and Giles, C. Lee
Experiments
We then used the SEG algorithm to learn the weight distribution model.
Methods 2.1 Document Level and Profile Based CDC
The chained entities are first objectified into the relation strength matrix R using SEG, the details of which are described in the following section.
Methods 2.1 Document Level and Profile Based CDC
Algorithm 2: SEG (Freund et al., 1997). Input: initial weight distribution p1; learning rate η > 0; training set {⟨st, yt⟩}. 1: for t = 1 to T do 2: Predict using:
Methods 2.1 Document Level and Profile Based CDC
We adopt the Specialist Exponentiated Gradient (SEG) (Freund et al., 1997) algorithm to learn the mixing weights of the specialists’ prediction (Algorithm 2) in an online manner.
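As a rough sketch of what one round of such an update could look like, the following implements a Specialist Exponentiated Gradient-style step for square loss; the variable names, the loss gradient, and the renormalization details are assumptions of this sketch, not the algorithm listing from the paper.

```python
import math

def seg_update(weights, awake, predictions, target, eta):
    """weights: dict specialist -> current weight (a distribution).
    awake: set of specialists making a prediction this round.
    predictions: dict specialist -> prediction in [0, 1] (awake only).
    target: observed outcome in [0, 1]; eta: learning rate."""
    # Master prediction: weight-normalized average over awake specialists.
    awake_mass = sum(weights[s] for s in awake)
    y_hat = sum(weights[s] * predictions[s] for s in awake) / awake_mass

    # Multiplicative update for awake specialists; sleeping ones keep their weight.
    new = dict(weights)
    for s in awake:
        new[s] = weights[s] * math.exp(-2 * eta * (y_hat - target) * predictions[s])
    # Renormalize so the awake specialists retain their original total mass.
    new_mass = sum(new[s] for s in awake)
    for s in awake:
        new[s] *= awake_mass / new_mass
    return y_hat, new
```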
segmentations is mentioned in 5 sentences in this paper.
Topics mentioned in this paper:
Haffari, Gholamreza and Sarkar, Anoop
Sentence Selection: Single Language Pair
where Hx is the space of all possible segmentations for the OOV fragment x.
Sentence Selection: Single Language Pair
We let Hx be all possible segmentations of the fragment x for which the resulting phrase lengths are not greater than the maximum length constraint for phrase extraction in the underlying SMT model.
Sentence Selection: Single Language Pair
Since we do not know anything about the segmentations a priori, we have put a uniform distribution over such segmentations.
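A minimal sketch of what this segmentation space and its uniform distribution might look like, assuming the OOV fragment is a sequence of tokens and no segment may exceed max_len; the function names are hypothetical, not the paper's.

```python
def segmentations(tokens, max_len):
    """All ways to split tokens into contiguous segments of length <= max_len."""
    if not tokens:
        return [[]]
    result = []
    for k in range(1, min(max_len, len(tokens)) + 1):
        head = tuple(tokens[:k])
        for rest in segmentations(tokens[k:], max_len):
            result.append([head] + rest)
    return result

def uniform_segmentation_dist(tokens, max_len):
    """Assign every admissible segmentation the same probability."""
    segs = segmentations(tokens, max_len)
    p = 1.0 / len(segs)
    return [(seg, p) for seg in segs]
```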
segmentations is mentioned in 5 sentences in this paper.
Topics mentioned in this paper:
Wuebker, Joern and Mauser, Arne and Ney, Hermann
Alignment
However, (DeNero et al., 2006) experienced similar over-fitting with short phrases due to the fact that the same word sequence can be segmented in different ways, leading to specific segmentations being learned for specific training sentence pairs.
Introduction
Ideally, we would produce all possible segmentations and alignments during training.
Related Work
When given a bilingual sentence pair, we can usually assume there are a number of equally correct phrase segmentations and corresponding alignments.
Related Work
As a result of this ambiguity, different segmentations are recruited for different examples during training.
segmentations is mentioned in 4 sentences in this paper.
Topics mentioned in this paper:
Bendersky, Michael and Croft, W. Bruce and Smith, David A.
Experiments
SEG
Experiments
SEG F1 MQA
Joint Query Annotation
ZQ = {CAP, TAG, SEG}.
segmentations is mentioned in 4 sentences in this paper.
Topics mentioned in this paper:
Clifton, Ann and Sarkar, Anoop
Models 2.1 Baseline Models
When tested against a human-annotated gold standard of linguistic morpheme segmentations for Finnish, this algorithm outperforms competing unsupervised methods, achieving an F-score of 67.0% on a 3 million sentence corpus (Creutz and Lagus, 2006).
Models 2.1 Baseline Models
In order to get robust, common segmentations, we trained the segmenter on the 5000 most frequent words; we then used this to segment the entire data set.
Models 2.1 Baseline Models
Of the phrases that included segmentations (‘Morph’ in Table 1), roughly a third were ‘productive’, i.e.
segmentations is mentioned in 4 sentences in this paper.
Topics mentioned in this paper:
Hatori, Jun and Matsuzaki, Takuya and Miyao, Yusuke and Tsujii, Jun'ichi
Model
Figure 2 shows the F1 scores of the proposed model (SegTagDep) on CTB-Sc-l with respect to the training epoch and different parsing feature weights, where “Seg”, “Tag”, and “Dep” respectively denote the F1 scores of word segmentation, POS tagging, and dependency parsing.
Model
Beam Seg Tag Dep Speed
Model
System          Seg    Tag
Kruengkrai '09  97.87  93.67
Zhang '10       97.78  93.67
Sun '11         98.17  94.02
Wang '11        98.11  94.18
SegTag          97.66  93.61
SegTagDep       97.73  94.46
SegTag(d)       98.18  94.08
SegTagDep(d)    98.26  94.64
segmentations is mentioned in 4 sentences in this paper.
Topics mentioned in this paper:
Wang, Kun and Zong, Chengqing and Su, Keh-Yih
Experiments
To estimate the probabilities of proposed models, the corresponding phrase segmentations for bilingual sentences are required.
Experiments
As we want to check what actually happened during decoding in the real situation, cross-fold translation is used to obtain the corresponding phrase segmentations.
Experiments
Afterwards, we generate the corresponding phrase segmentations for the remaining 5% bi-
segmentations is mentioned in 4 sentences in this paper.
Topics mentioned in this paper:
Zeng, Xiaodong and Wong, Derek F. and Chao, Lidia S. and Trancoso, Isabel
Semi-supervised Learning via Co-regularizing Both Models
Since each of the models has its own merits, their consensuses signify high confidence segmentations .
Semi-supervised Learning via Co-regularizing Both Models
The two segmentations shown in Figure 1 are the predictions from a character-based and a word-based model.
Semi-supervised Learning via Co-regularizing Both Models
Figure 1: The segmentations given by the character-based and word-based models, where the boxed words refer to the segmentation agreements.
segmentations is mentioned in 4 sentences in this paper.
Topics mentioned in this paper:
Zhang, Meishan and Zhang, Yue and Che, Wanxiang and Liu, Ting
Experiments
Pipeline: Seg 97.35 / 98.02 / 97.69 (P/R/F); Tag 93.51 / 94.15 / 93.83; Parse 81.58 / 82.95 / 82.26
Experiments
Flat word structures: Seg 97.32 / 98.13 / 97.73; Tag 94.09 / 94.88 / 94.48; Parse 83.39 / 83.84 / 83.61
Experiments
Annotated word structures: Seg 97.49 / 98.18 / 97.84; Tag 94.46 / 95.14 / 94.80; Parse 84.42 / 84.43 / 84.43; WS 94.02 / 94.69 / 94.35
segmentations is mentioned in 4 sentences in this paper.
Topics mentioned in this paper:
Johnson, Mark and Christophe, Anne and Dupoux, Emmanuel and Demuth, Katherine
Word segmentation results
800 sample segmentations of each utterance.
Word segmentation results
The most frequent segmentation in these 800 sample segmentations is the one we score in the evaluations below.
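A small sketch of the decoding rule just described, i.e. picking the most frequent segmentation among posterior samples (a form of maximum-marginal decoding); it assumes each sample is a list of words and is not taken from the paper's code.

```python
from collections import Counter

def most_frequent_segmentation(sample_segmentations):
    """Return the segmentation that occurs most often among the samples."""
    counts = Counter(tuple(seg) for seg in sample_segmentations)
    best, _ = counts.most_common(1)[0]
    return list(best)
```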
Word segmentation results
Here we evaluate the word segmentations found by the “function word” Adaptor Grammar model described in section 2.3 and compare it to the baseline grammar with collocations and phonotactics from Johnson and Goldwater (2009).
segmentations is mentioned in 4 sentences in this paper.
Topics mentioned in this paper:
Monroe, Will and Green, Spence and Manning, Christopher D.
Abstract
However, state-of-the-art Arabic word segmenters are either limited to formal Modern Standard Arabic, performing poorly on Arabic text featuring dialectal vocabulary and grammar, or rely on linguistic knowledge that is hand-tuned for each dialect.
Arabic Word Segmentation Model
Some incorrect segmentations produced by the original system could be ruled out with the knowledge of these statistics.
Error Analysis
In 36 of the 100 sampled errors, we conjecture that the presence of the error indicates a shortcoming of the feature set, resulting in segmentations that make sense locally but are not plausible given the full token.
Error Analysis
4.3 Context-sensitive segmentations and multiple word senses
segmentations is mentioned in 4 sentences in this paper.
Topics mentioned in this paper:
Salameh, Mohammad and Cherry, Colin and Kondrak, Grzegorz
Experimental Setup
To generate the desegmentation table, we analyze the segmentations from the Arabic side of the parallel training data to collect mappings from morpheme sequences to surface forms.
Methods
where [prefix], [stem] and [suffix] are non-overlapping sets of morphemes, whose members are easily determined using the segmenter’s segment boundary markers. The second disjunct of Equation 1 covers words that have no clear stem, such as the Arabic lh “for him”, segmented as l+ “for” +h “him”.
Related Work
For many segmentations, especially unsupervised ones, this amounts to simple concatenation.
Related Work
However, more complex segmentations, such as the Arabic tokenization provided by MADA (Habash et al., 2009), require further orthographic adjustments to reverse normalizations performed during segmentation.
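As an illustration of the simple-concatenation case only (MADA-style orthographic adjustments are not modeled), here is a sketch that rejoins morphemes carrying '+' segment-boundary markers; the marker convention is an assumption, not necessarily the segmenter's exact format.

```python
def desegment(tokens):
    """Concatenate '+'-marked affixes onto their neighbours, e.g.
    ['l+', '+h'] -> ['lh'] (hypothetical example)."""
    out = []
    for tok in tokens:
        if out and (tok.startswith('+') or out[-1].endswith('+')):
            out[-1] = out[-1].rstrip('+') + tok.lstrip('+')
        else:
            out.append(tok)
    return out
```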
segmentations is mentioned in 4 sentences in this paper.
Topics mentioned in this paper:
Tofiloski, Milan and Brooke, Julian and Taboada, Maite
Data and Evaluation
use the coauthor’s segmentations as the gold standard.
Discussion
Also to be investigated is a quantitative study of the effects of high-precision/low-recall vs. low-precision/high-recall segmenters on the construction of discourse trees.
Results
Additionally, we compared SLSeg and SPADE to the original RST segmentations of the three RST texts taken from RST literature.
segmentations is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Zhao, Qiuye and Marcus, Mitch
Abstract
Since it is hard to achieve the best segmentations with tagset IB, we propose an indirect way to use these constraints in the following section, instead of applying these constraints as straightforwardly as in English POS tagging.
Abstract
w ∈ segGEN(c), where function segGEN maps character sequence c to the set of all possible segmentations of c. For example, w = (c1...cl1)...(cn-lk+1...cn) represents a segmentation of k words, and the lengths of the first and last words are l1 and lk respectively.
Abstract
We transform tagged character sequences to word segmentations first, and then evaluate word segmentations by F-measure, as defined in Section 5.2.
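A minimal sketch of that transformation and evaluation, assuming the two-tag IB scheme mentioned earlier ('B' begins a word, 'I' continues it); the function names are this sketch's, not the paper's.

```python
def tags_to_spans(tags):
    """Convert per-character tags into a set of word spans (start, end)."""
    spans, start = [], 0
    for i, t in enumerate(tags):
        if t == 'B' and i > 0:
            spans.append((start, i))
            start = i
    spans.append((start, len(tags)))
    return set(spans)

def segmentation_f1(gold_tags, pred_tags):
    """Word-level F-measure: a word is correct only if its exact span matches."""
    gold, pred = tags_to_spans(gold_tags), tags_to_spans(pred_tags)
    correct = len(gold & pred)
    if correct == 0:
        return 0.0
    p, r = correct / len(pred), correct / len(gold)
    return 2 * p * r / (p + r)
```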
segmentations is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Li, Zhifei and Eisner, Jason and Khudanpur, Sanjeev
Abstract
That is, the probability of an output string is split among many distinct derivations (e.g., trees or segmentations).
Background 2.1 Terminology
(2003)), where different segmentations lead to the same translation string (Figure 1), and in syntax-based systems (e.g., Chiang (2007)), where different derivation trees yield the same string (Figure 2).
Background 2.1 Terminology
Figure 1: Segmentation ambiguity in phrase-based MT: two different segmentations lead to the same translation string.
segmentations is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Adler, Meni and Goldberg, Yoav and Gabay, David and Elhadad, Michael
Evaluation
Our baseline proposes the most frequent tag (proper name) for all possible segmentations of the token, in a uniform distribution.
Method
We hypothesize a uniform distribution among the possible segmentations and aggregate a distribution of possible tags for the analysis.
Previous Work
(of all words in a given sentence) and the POS tagging (of the known words) is based on a Viterbi search over a lattice composed of all possible word segmentations and the possible classifications of all observed characters.
segmentations is mentioned in 3 sentences in this paper.
Topics mentioned in this paper: