Graph-based Semi-Supervised Model for Joint Chinese Word Segmentation and Part-of-Speech Tagging
Xiaodong Zeng, Derek F. Wong, Lidia S. Chao and Isabel Trancoso

Article Structure

Abstract

This paper introduces a graph-based semi-supervised joint model of Chinese word segmentation and part-of-speech tagging.

Introduction

Word segmentation and part-of-speech (POS) tagging are two critical and necessary initial procedures with respect to the majority of high-level Chinese language processing tasks such as syntax parsing, information extraction and machine translation.

Related Work

Prior supervised joint S&T models report an approximate 0.2%-1.3% improvement in F-score over supervised pipeline ones.

Background

3.1 Supervised Character-based Model

Method

The emphasis of this work is on building a joint S&T model based on two different kinds of data sources, labeled and unlabeled data.

Topics

unlabeled data

Appears in 33 sentences as: unlabeled data (34)
In Graph-based Semi-Supervised Model for Joint Chinese Word Segmentation and Part-of-Speech Tagging
  1. One constructs a nearest-neighbor similarity graph over all trigrams of labeled and unlabeled data for propagating syntactic information, i.e., label distributions.
    Page 1, “Abstract”
  2. The derived label distributions are regarded as virtual evidences to regularize the learning of linear conditional random fields (CRFs) on unlabeled data.
    Page 1, “Abstract”
  3. Therefore, semi-supervised joint S&T appears to be a natural solution for easily incorporating accessible unlabeled data to improve the joint S&T model.
    Page 1, “Introduction”
  4. Motivated by the works in (Subramanya et al., 2010; Das and Smith, 2011), for structured problems, graph-based label propagation can be employed to infer valuable syntactic information (n-gram-level label distributions) from labeled data to unlabeled data.
    Page 1, “Introduction”
  5. labeled and unlabeled data to achieve the semi-supervised learning.
    Page 2, “Introduction”
  6. Sun and Xu (2011) enhanced a CWS model by interpolating statistical features of unlabeled data into the CRFs model.
    Page 2, “Related Work”
  7. Wang et al. (2011) proposed a semi-supervised pipeline S&T model by incorporating n-gram and lexicon features derived from unlabeled data.
    Page 2, “Related Work”
  8. Different from their concern, our emphasis is to learn the semi-supervised model by injecting the label information from a similarity graph constructed from labeled and unlabeled data.
    Page 2, “Related Work”
  9. Jiao et al. (2006), extended by Mann and McCallum (2007), reported a semi-supervised CRFs model which aims to guide the learning by minimizing the conditional entropy of unlabeled data.
    Page 2, “Related Work”
  10. The emphasis of this work is on building a joint S&T model based on two different kinds of data sources, labeled and unlabeled data.
    Page 3, “Method”
  11. In essence, this learning problem can be treated as incorporating certain gainful information, e.g., prior knowledge or label constraints, of unlabeled data into the supervised model.
    Page 3, “Method”

See all papers in Proc. ACL 2013 that mention unlabeled data.

semi-supervised

Appears in 31 sentences as: Semi-Supervised (1) semi-supervised (30)
  1. This paper introduces a graph-based semi-supervised joint model of Chinese word segmentation and part-of-speech tagging.
    Page 1, “Abstract”
  2. Empirical results on Chinese Treebank (CTB-7) and Microsoft Research (MSR) corpora reveal that the proposed model can yield better results than the supervised baselines and other competitive semi-supervised CRFs in this task.
    Page 1, “Abstract”
  3. Therefore, semi-supervised joint S&T appears to be a natural solution for easily incorporating accessible unlabeled data to improve the joint S&T model.
    Page 1, “Introduction”
  4. This study focuses on using a graph-based label propagation method to build a semi-supervised joint S&T model.
    Page 1, “Introduction”
  5. labeled and unlabeled data to achieve the semi-supervised learning.
    Page 2, “Introduction”
  6. There are few explorations of semi-supervised approaches for CWS or POS tagging in previous works.
    Page 2, “Related Work”
  7. (2008) described a Bayesian semi-supervised CWS model by considering the segmentation as the hidden variable in machine translation.
    Page 2, “Related Work”
  8. Wang et al. (2011) proposed a semi-supervised pipeline S&T model by incorporating n-gram and lexicon features derived from unlabeled data.
    Page 2, “Related Work”
  9. Different from their concern, our emphasis is to learn the semi-supervised model by injecting the label information from a similarity graph constructed from labeled and unlabeled data.
    Page 2, “Related Work”
  10. also differs from other semi-supervised CRFs algorithms.
    Page 2, “Related Work”
  11. Jiao et al. (2006), extended by Mann and McCallum (2007), reported a semi-supervised CRFs model which aims to guide the learning by minimizing the conditional entropy of unlabeled data.
    Page 2, “Related Work”


CRFs

Appears in 30 sentences as: CRFs (32)
  1. The derived label distributions are regarded as virtual evidences to regularize the learning of linear conditional random fields (CRFs) on unlabeled data.
    Page 1, “Abstract”
  2. Empirical results on Chinese Treebank (CTB-7) and Microsoft Research (MSR) corpora reveal that the proposed model can yield better results than the supervised baselines and other competitive semi-supervised CRFs in this task.
    Page 1, “Abstract”
  3. The derived label distributions are regarded as prior knowledge to regularize the learning of a sequential model, conditional random fields (CRFs) in this case, on both
    Page 1, “Introduction”
  4. Section 3 reviews the background, including the supervised character-based joint S&T model based on CRFs and graph-based label propagation.
    Page 2, “Introduction”
  5. Sun and Xu (2011) enhanced a CWS model by interpolating statistical features of unlabeled data into the CRFs model.
    Page 2, “Related Work”
  6. also differs from other semi-supervised CRFs algorithms.
    Page 2, “Related Work”
  7. Jiao et al. (2006), extended by Mann and McCallum (2007), reported a semi-supervised CRFs model which aims to guide the learning by minimizing the conditional entropy of unlabeled data.
    Page 2, “Related Work”
  8. The proposed approach regularizes the CRFs by the graph information.
    Page 2, “Related Work”
  9. Subramanya et al. (2010) proposed a graph-based self-train-style semi-supervised CRFs algorithm.
    Page 2, “Related Work”
  10. The first-order CRFs model (Lafferty et al., 2001) has been the most common one in this task.
    Page 2, “Background”
  11. The goal is to learn a CRFs model in the form,
    Page 2, “Background”
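Several of these excerpts refer to the first-order linear-chain CRFs model (Lafferty et al., 2001). For orientation, its standard conditional form is given below; this is the textbook statement, not a reproduction of the paper's own equation, with Λ the feature weights and f_k the feature functions instantiated from the feature templates:

```latex
p_\Lambda(\mathbf{y} \mid \mathbf{x})
  = \frac{1}{Z(\mathbf{x})}
    \exp\Big( \sum_{t=1}^{T} \sum_{k=1}^{K} \lambda_k \, f_k(y_{t-1}, y_t, \mathbf{x}, t) \Big),
\qquad
Z(\mathbf{x})
  = \sum_{\mathbf{y}'} \exp\Big( \sum_{t=1}^{T} \sum_{k=1}^{K} \lambda_k \, f_k(y'_{t-1}, y'_t, \mathbf{x}, t) \Big)
```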


POS tagging

Appears in 20 sentences as: POS tag (3) POS Tagging (1) POS tagging (14) POS tags (3)
  1. The traditional way of segmentation and tagging is performed in a pipeline approach, first segmenting a sentence into words, and then assigning each word a POS tag.
    Page 1, “Introduction”
  2. The pipeline approach is very simple to implement, but frequently causes error propagation, given that wrong segmentations in the earlier stage harm the subsequent POS tagging (Ng and Low, 2004).
    Page 1, “Introduction”
  3. The joint approaches of word segmentation and POS tagging (joint S&T) are proposed to resolve these two tasks simultaneously.
    Page 1, “Introduction”
  4. In the past years, several proposed supervised joint models (Ng and Low, 2004; Zhang and Clark, 2008; Jiang et al., 2009; Zhang and Clark, 2010) achieved reasonably accurate results, but the outstanding problem among these models is that they rely heavily on a large amount of labeled data, i.e., segmented texts with POS tags.
    Page 1, “Introduction”
  5. Graph-based label propagation methods have recently shown they can outperform the state-of-the-art in several natural language processing (NLP) tasks, e.g., POS tagging (Subramanya et al., 2010), knowledge acquisition (Talukdar et al., 2008), and shallow semantic parsing for unknown predicates (Das and Smith, 2011).
    Page 1, “Introduction”
  6. As far as we know, however, these methods have not yet been applied to resolve the problem of joint Chinese word segmentation (CWS) and POS tagging.
    Page 1, “Introduction”
  7. There are few explorations of semi-supervised approaches for CWS or POS tagging in previous works.
    Page 2, “Related Work”
  8. To perform segmentation and tagging simultaneously in a uniform framework, according to Ng and Low (2004), the tag is composed of a word boundary part, and a POS part, e.g., "B_NN" refers to the first character in a word with POS tag "NN".
    Page 2, “Background”
  9. As for the POS tag, we shall use the 33 tags in the Chinese Treebank.
    Page 2, “Background”
  10. In fact, the sparsity is also a common phenomenon among character-based CWS and POS tagging.
    Page 5, “Method”
  11. The performance measurement indicators for word segmentation and POS tagging (joint S&T) are the balanced F-score, F = 2PR/(P+R), the harmonic mean of precision (P) and recall (R), and out-of-vocabulary recall (OOV-R).
    Page 6, “Method”
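Item 8's composite tag scheme can be made concrete with a tiny conversion helper. This is an illustrative sketch only: the four-way boundary alphabet (B: begin, M: middle, E: end, S: single-character word) is an assumption following common practice after Ng and Low (2004); only the "B_NN" example is confirmed by the excerpt.

```python
def to_char_tags(tagged_words):
    """Convert word-level (word, POS) pairs into character-level joint
    tags of the form '<boundary>_<POS>', e.g. 'B_NN' for the first
    character of an NN word.

    The B/M/E/S boundary alphabet is assumed, not quoted from the paper.
    """
    tags = []
    for word, pos in tagged_words:
        if len(word) == 1:
            tags.append(f"S_{pos}")        # single-character word
            continue
        for i in range(len(word)):
            b = "B" if i == 0 else ("E" if i == len(word) - 1 else "M")
            tags.append(f"{b}_{pos}")
    return tags
```

With this encoding, joint S&T reduces to a single character-level sequence labeling problem, which is what allows one CRFs model to do both tasks at once.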


graph-based

Appears in 15 sentences as: Graph-based (3) graph-based (12)
  1. This paper introduces a graph-based semi-supervised joint model of Chinese word segmentation and part-of-speech tagging.
    Page 1, “Abstract”
  2. The proposed approach is based on a graph-based label propagation technique.
    Page 1, “Abstract”
  3. This study focuses on using a graph-based label propagation method to build a semi-supervised joint S&T model.
    Page 1, “Introduction”
  4. Graph-based label propagation methods have recently shown they can outperform the state-of-the-art in several natural language processing (NLP) tasks, e.g., POS tagging (Subramanya et al., 2010), knowledge acquisition (Talukdar et al., 2008), and shallow semantic parsing for unknown predicates (Das and Smith, 2011).
    Page 1, “Introduction”
  5. Motivated by the works in (Subramanya et al., 2010; Das and Smith, 2011), for structured problems, graph-based label propagation can be employed to infer valuable syntactic information (n-gram-level label distributions) from labeled data to unlabeled data.
    Page 1, “Introduction”
  6. Section 3 reviews the background, including the supervised character-based joint S&T model based on CRFs and graph-based label propagation.
    Page 2, “Introduction”
  7. Subramanya et al. (2010) proposed a graph-based self-train-style semi-supervised CRFs algorithm.
    Page 2, “Related Work”
  8. 3.2 Graph-based Label Propagation
    Page 3, “Background”
  9. Graph-based label propagation, a critical subclass of semi-supervised learning (SSL), has been widely used and shown to outperform other SSL methods (Chapelle et al., 2006).
    Page 3, “Background”
  10. Typically, graph-based label propagation algorithms are run in two main steps: graph construction and label propagation.
    Page 3, “Background”
  11. The proposed approach employs a transductive graph-based label propagation method to acquire such gainful information, i.e., label distributions from a similarity graph constructed over labeled and unlabeled data.
    Page 3, “Method”
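The two main steps in item 10 (graph construction, then label propagation) can be sketched generically. Everything below is an assumption-laden illustration: the graph is taken as a pre-built weighted adjacency map over trigram types (the paper's similarity measure from feature templates is not reproduced), and the update rule is plain clamped neighbour averaging, not necessarily the paper's propagation objective.

```python
def propagate(graph, seed_labels, n_classes, n_iter=50):
    """Generic iterative label propagation over a similarity graph.

    graph: {node: {neighbour: weight}} -- symmetrized nearest-neighbour
        similarity graph over trigram types (construction not shown).
    seed_labels: {node: distribution} -- label distributions for nodes
        observed in the labeled data; these are clamped every sweep.
    Unlabeled nodes start uniform and repeatedly take the weighted
    average of their neighbours' current distributions.
    """
    uniform = [1.0 / n_classes] * n_classes
    q = {v: list(seed_labels.get(v, uniform)) for v in graph}
    for _ in range(n_iter):
        new_q = {}
        for v, nbrs in graph.items():
            if v in seed_labels:            # labeled nodes stay fixed
                new_q[v] = list(seed_labels[v])
                continue
            total = sum(nbrs.values()) or 1.0
            new_q[v] = [sum(w * q[u][c] for u, w in nbrs.items()) / total
                        for c in range(n_classes)]
        q = new_q
    return q
```

The converged distributions on unlabeled trigrams are the kind of "label distributions" the excerpts describe, subsequently used as virtual evidence when training the CRFs.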


joint model

Appears in 9 sentences as: joint model (6) joint models (2) jointly modeled (1)
  1. This paper introduces a graph-based semi-supervised joint model of Chinese word segmentation and part-of-speech tagging.
    Page 1, “Abstract”
  2. An inductive character-based joint model is obtained eventually.
    Page 1, “Abstract”
  3. In the past years, several proposed supervised joint models (Ng and Low, 2004; Zhang and Clark, 2008; Jiang et al., 2009; Zhang and Clark, 2010) achieved reasonably accurate results, but the outstanding problem among these models is that they rely heavily on a large amount of labeled data, i.e., segmented texts with POS tags.
    Page 1, “Introduction”
  4. The state-of-the-art joint models include reranking approaches (Shi and Wang, 2007), hybrid approaches (Nakagawa and Uchimoto, 2007; Jiang et al., 2008; Sun, 2011), and single-model approaches (Ng and Low, 2004; Zhang and Clark, 2008; Kruengkrai et al., 2009; Zhang and Clark, 2010).
    Page 2, “Related Work”
  5. It is directed to maximize the conditional likelihood of hidden states with the derived label distributions on unlabeled data, i.e., p(y, v|x), where y and v are jointly modeled but
    Page 5, “Method”
  6. Firstly, as expected, for the two supervised baselines, the joint model outperforms the pipeline one, especially on segmentation.
    Page 7, “Method”
  7. This outcome verifies the commonly accepted fact that the joint model can substantially improve the pipeline one, since POS tags provide additional information to word segmentation (Ng and Low, 2004).
    Page 7, “Method”
  8. An interesting phenomenon found among the comparisons with the baselines is that the supervised joint model (Baseline II) is even competitive with the semi-supervised pipeline one (Wang et al., 2011).
    Page 7, “Method”
  9. A statistical analysis of the segmentation and tagging results of the supervised joint model (Baseline II) and our model is carried out to comprehend the influence of the graph-based semi-supervised behavior.
    Page 8, “Method”


word segmentation

Appears in 9 sentences as: Word segmentation (1) word segmentation (8)
  1. This paper introduces a graph-based semi-supervised joint model of Chinese word segmentation and part-of-speech tagging.
    Page 1, “Abstract”
  2. Word segmentation and part-of-speech (POS) tagging are two critical and necessary initial procedures with respect to the majority of high-level Chinese language processing tasks such as syntax parsing, information extraction and machine translation.
    Page 1, “Introduction”
  3. The joint approaches of word segmentation and POS tagging (joint S&T) are proposed to resolve these two tasks simultaneously.
    Page 1, “Introduction”
  4. As far as we know, however, these methods have not yet been applied to resolve the problem of joint Chinese word segmentation (CWS) and POS tagging.
    Page 1, “Introduction”
  5. The performance measurement indicators for word segmentation and POS tagging (joint S&T) are the balanced F-score, F = 2PR/(P+R), the harmonic mean of precision (P) and recall (R), and out-of-vocabulary recall (OOV-R).
    Page 6, “Method”
  6. This outcome verifies the commonly accepted fact that the joint model can substantially improve the pipeline one, since POS tags provide additional information to word segmentation (Ng and Low, 2004).
    Page 7, “Method”
  7. Overall, for word segmentation, it obtains average improvements of 1.43% and 8.09% in F-score and OOV-R over others; for POS tagging, it achieves average improvements of 1.09% and 7.73%.
    Page 8, “Method”
  8. For word segmentation, the most significant improvement of our model is mainly concentrated on two kinds of words which are known for their difficulties in terms of CWS: a) named entities (NE), e.g., "天津港" (Tianjin port) and "保税区" (free tax zone); and b) Chinese numbers (CN), e.g., "八亿五千万" (eight hundred and fifty million) and "百分之七十二" (seventy-two percent).
    Page 8, “Method”
  9. This study introduces a novel semi-supervised approach for joint Chinese word segmentation and POS tagging.
    Page 9, “Method”


F-score

Appears in 7 sentences as: F-score (7)
  1. Experiments on the data from the Chinese Treebank (CTB-7) and Microsoft Research (MSR) show that the proposed model results in significant improvement over other comparative candidates in terms of F-score and out-of-vocabulary (OOV) recall.
    Page 2, “Introduction”
  2. Prior supervised joint S&T models report an approximate 0.2%-1.3% improvement in F-score over supervised pipeline ones.
    Page 2, “Related Work”
  3. The performance measurement indicators for word segmentation and POS tagging (joint S&T) are the balanced F-score, F = 2PR/(P+R), the harmonic mean of precision (P) and recall (R), and out-of-vocabulary recall (OOV-R).
    Page 6, “Method”
  4. It obtains 0.92% and 2.32% increases in terms of F-score and OOV-R respectively.
    Page 7, “Method”
  5. On the whole, for segmentation, they achieve average improvements of 1.02% and 6.8% in F-score and OOV-R; whereas for POS tagging, the average increments of F-score and OOV-R are 0.87% and 6.45%.
    Page 7, “Method”
  6. Overall, for word segmentation, it obtains average improvements of 1.43% and 8.09% in F-score and OOV-R over others; for POS tagging, it achieves average improvements of 1.09% and 7.73%.
    Page 8, “Method”
  7. Figure 3 illustrates the curves of F-score and OOV-R for segmentation and tagging respectively, as the unlabeled data size is progressively increased in steps of 6,000 sentences.
    Page 8, “Method”
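The metric in item 3 can be computed per sentence from character-offset word spans; below is a minimal sketch. The span-matching convention is the standard CWS one, assumed here rather than quoted from the paper.

```python
def segmentation_prf(gold_words, pred_words):
    """Precision, recall and balanced F-score (F = 2PR/(P+R)) for one
    segmented sentence, comparing words as (start, end) character spans."""
    def spans(words):
        out, i = set(), 0
        for w in words:
            out.add((i, i + len(w)))
            i += len(w)
        return out

    gold, pred = spans(gold_words), spans(pred_words)
    correct = len(gold & pred)          # words with identical boundaries
    p = correct / len(pred)
    r = correct / len(gold)
    f = 2 * p * r / (p + r) if correct else 0.0
    return p, r, f
```

The F returned here is exactly the harmonic mean F = 2PR/(P+R) named in item 3; corpus-level scores aggregate the correct/predicted/gold counts over all sentences before dividing.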


labeled data

Appears in 7 sentences as: labeled data (7)
  1. In the past years, several proposed supervised joint models (Ng and Low, 2004; Zhang and Clark, 2008; Jiang et al., 2009; Zhang and Clark, 2010) achieved reasonably accurate results, but the outstanding problem among these models is that they rely heavily on a large amount of labeled data, i.e., segmented texts with POS tags.
    Page 1, “Introduction”
  2. However, the production of such labeled data is extremely time-consuming and expensive (Jiao et al., 2006; Jiang et al., 2009).
    Page 1, “Introduction”
  3. Motivated by the works in (Subramanya et al., 2010; Das and Smith, 2011), for structured problems, graph-based label propagation can be employed to infer valuable syntactic information (n-gram-level label distributions) from labeled data to unlabeled data.
    Page 1, “Introduction”
  4. It is especially helpful for the graph to make connections with trigrams that may not have been seen in labeled data but have similar label information.
    Page 4, “Method”
  5. The first term in Equation (5) is the same as Equation (2), which is the traditional CRFs learning objective function on the labeled data.
    Page 5, “Method”
  6. To satisfy the characteristic of the semi-supervised learning problem, the training set, i.e., the labeled data, is formed by a relatively small amount of annotated texts sampled from CTB-7.
    Page 6, “Method”
  7. Very often, these words do not exist in the labeled data, so it is hard for the supervised model to learn their features.
    Page 8, “Method”


feature templates

Appears in 6 sentences as: feature templates (6)
  1. But overall, our approach differs in three important aspects: first, novel feature templates are defined for measuring the similarity between vertices.
    Page 2, “Related Work”
  2. the baseline feature templates of joint S&T are the ones used in (Ng and Low, 2004; Jiang et al., 2008), as shown in Table 1. Λ = {λ1, λ2, ..., λK} ∈ R^K are the weight parameters to be learned.
    Page 3, “Background”
  3. Table 1: The feature templates of joint S&T.
    Page 3, “Background”
  4. The feature templates are from Zhao et al.
    Page 7, “Method”
  5. The same feature templates in (Wang et al., 2011) are used, i.e., "+n-gram+cluster+lexicon".
    Page 7, “Method”
  6. The feature templates introduced in Section 3.1 are used.
    Page 7, “Method”


objective function

Appears in 6 sentences as: objective function (6)
  1. And third, the derived label information from the graph is smoothed into the model by optimizing a modified objective function.
    Page 2, “Related Work”
  2. This objective function can be optimized by the stochastic gradient method or other numerical optimization methods.
    Page 3, “Background”
  3. The squared-loss criterion is used to formulate the objective function.
    Page 5, “Method”
  4. Thus, the objective function can be optimized by L-BFGS-B (Zhu et al., 1997), a generic quasi-Newton gradient-based optimizer.
    Page 5, “Method”
  5. The first term in Equation (5) is the same as Equation (2), which is the traditional CRFs learning objective function on the labeled data.
    Page 5, “Method”
  6. Thus, the objective function in Equation (5) is optimized as follows: for the instances i = 1, 2, ..., l, the parameters Λ are learned in the supervised manner; for the instances i = l+1, l+2, ..., l+u, in the E-step, the expected value of the Q function is computed, based on the current model parameters Λ.
    Page 6, “Method”
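The excerpts name the ingredients of this objective (a supervised CRFs likelihood term identical to Equation (2), a squared-loss term on unlabeled data tying the model to the graph-derived distributions, and the hyperparameter α) without reproducing the formula. As a hedged sketch only, one standard way these pieces combine is shown below; the Gaussian regularizer σ and the graph-derived distributions q̂ are notational assumptions, and this is not necessarily the paper's exact Equation (5):

```latex
\mathcal{J}(\Lambda) =
\underbrace{\sum_{i=1}^{l} \log p_\Lambda\big(\mathbf{y}^{(i)} \mid \mathbf{x}^{(i)}\big)
  - \frac{\lVert \Lambda \rVert^2}{2\sigma^2}}_{\text{supervised CRFs term, as in Equation (2)}}
\; - \; \alpha \sum_{i=l+1}^{l+u} \sum_{t}
  \big\lVert p_\Lambda\big(\cdot \mid \mathbf{x}^{(i)}, t\big) - \hat{q}_{i,t} \big\rVert^2
```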


proposed model

Appears in 4 sentences as: proposed model (3) Proposed Models (1)
  1. Empirical results on Chinese Treebank (CTB-7) and Microsoft Research (MSR) corpora reveal that the proposed model can yield better results than the supervised baselines and other competitive semi-supervised CRFs in this task.
    Page 1, “Abstract”
  2. Experiments on the data from the Chinese Treebank (CTB-7) and Microsoft Research (MSR) show that the proposed model results in significant improvement over other comparative candidates in terms of F-score and out-of-vocabulary (OOV) recall.
    Page 2, “Introduction”
  3. 5.2 Baseline and Proposed Models
    Page 7, “Method”
  4. The proposed model will also be compared with the semi-supervised pipeline S&T model described in (Wang et al., 2011).
    Page 7, “Method”


Chinese word

Appears in 3 sentences as: Chinese word (3)
  1. This paper introduces a graph-based semi-supervised joint model of Chinese word segmentation and part-of-speech tagging.
    Page 1, “Abstract”
  2. As far as we know, however, these methods have not yet been applied to resolve the problem of joint Chinese word segmentation (CWS) and POS tagging.
    Page 1, “Introduction”
  3. This study introduces a novel semi-supervised approach for joint Chinese word segmentation and POS tagging.
    Page 9, “Method”


Chinese word segmentation

Appears in 3 sentences as: Chinese word segmentation (3)
  1. This paper introduces a graph-based semi-supervised joint model of Chinese word segmentation and part-of-speech tagging.
    Page 1, “Abstract”
  2. As far as we know, however, these methods have not yet been applied to resolve the problem of joint Chinese word segmentation (CWS) and POS tagging.
    Page 1, “Introduction”
  3. This study introduces a novel semi-supervised approach for joint Chinese word segmentation and POS tagging.
    Page 9, “Method”


hyperparameters

Appears in 3 sentences as: hyperparameters (3)
  1. μ and λ are two hyperparameters whose values are discussed in Section 5.
    Page 5, “Method”
  2. Based on the development data, the hyperparameters of our model were tuned among the following settings: for the graph propagation, μ ∈ {0.2, 0.5, 0.8} and λ ∈ {0.1, 0.3, 0.5, 0.8}; for the CRFs training, α ∈ {0.1, 0.3, 0.5, 0.7, 0.9}.
    Page 7, “Method”
  3. With the chosen set of hyperparameters, the test data was used to measure the final performance.
    Page 7, “Method”


machine translation

Appears in 3 sentences as: machine translation (3)
  1. Word segmentation and part-of-speech (POS) tagging are two critical and necessary initial procedures with respect to the majority of high-level Chinese language processing tasks such as syntax parsing, information extraction and machine translation.
    Page 1, “Introduction”
  2. (2008) described a Bayesian semi-supervised CWS model by considering the segmentation as the hidden variable in machine translation.
    Page 2, “Related Work”
  3. Unlike this model, the proposed approach is targeted at a general model, instead of one oriented to the machine translation task.
    Page 2, “Related Work”
