A Chinese-English Organization Name Translation System Using Heuristic Web Mining and Asymmetric Alignment
Yang, Fan and Zhao, Jun and Liu, Kang

Article Structure

Abstract

In this paper, we propose a novel system for translating organization names from Chinese to English with the assistance of web resources.

Introduction

The task of Named Entity (NE) translation is to translate a named entity from the source language to the target language, which plays an important role in machine translation and cross-language information retrieval (CLIR).

Related Work

In the past few years, researchers have proposed many approaches for organization translation.

The Framework of Our System

The framework of our ON translation system, shown in Figure 1, has four modules.

The Chunking-based Segmentation for Chinese ONs

In this section, we will illustrate a chunking-based Chinese ON segmentation method, which

Heuristic Query Construction

In order to use web information to assist Chinese-English ON translation, we must first retrieve the bilingual web pages effectively.

Asymmetric Alignment Method for Equivalent Extraction

After we have obtained the web pages with the assistance of a search engine, we extract the equivalent candidates from the bilingual web pages.

Experiments

We carried out experiments to investigate how much web knowledge improves the performance of ON translation.

Conclusion

In this paper, we present a new approach which translates the Chinese ON into English with the assistance of web resources.

Topics

NER

Appears in 8 sentences as: NER (9)
In A Chinese-English Organization Name Translation System Using Heuristic Web Mining and Asymmetric Alignment
  1. However, the named entity recognition (NER) will always introduce some mistakes.
    Page 2, “Introduction”
  2. In order to avoid NER mistakes, we propose an asymmetric alignment method which aligns the Chinese ON with an English sentence directly and then extracts the English fragment with the largest alignment score as the equivalent.
    Page 2, “Introduction”
  3. The asymmetric alignment method can avoid the influence of improper results of NER and generate an explicit matching between the source and the target phrases which can guarantee the precision of alignment.
    Page 2, “Introduction”
  4. We do not need to run an English NER process, which may make mistakes.
    Page 2, “Introduction”
  5. 1) The traditional alignment method needs the NER process on both sides, but the NER process often introduces mistakes.
    Page 5, “Asymmetric Alignment Method for Equivalent Extraction”
  6. The NER process is not necessary because we align the Chinese ON with English sentences directly.
    Page 5, “Asymmetric Alignment Method for Equivalent Extraction”
  7. The asymmetric alignment method can avoid the mistakes made in the NER process and give an explicit alignment matching.
    Page 7, “Experiments”
  8. Our method can overcome the mistakes introduced in the NER process.
    Page 8, “Experiments”
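
The sentences above describe aligning the Chinese ON directly with an English sentence and keeping the fragment with the largest alignment score. The following is a minimal sketch of that idea, not the paper's actual scoring function: the lexicon of word translation probabilities, the `penalty` parameter, and the names `fragment_score` and `best_fragment` are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical lexicon: (Chinese word, lowercased English word) -> probability.
Lexicon = Dict[Tuple[str, str], float]


def fragment_score(cn_words: List[str], fragment: List[str],
                   lexicon: Lexicon, penalty: float = 0.1) -> float:
    """Score one English fragment against the Chinese ON (assumed scoring)."""
    # Each Chinese word contributes its best lexicon match inside the fragment.
    coverage = sum(
        max((lexicon.get((cw, ew.lower()), 0.0) for ew in fragment), default=0.0)
        for cw in cn_words
    )
    # English words that match no Chinese word at all are penalized,
    # so the score does not simply grow with fragment length.
    unmatched = sum(
        1 for ew in fragment
        if all(lexicon.get((cw, ew.lower()), 0.0) == 0.0 for cw in cn_words)
    )
    return coverage - penalty * unmatched


def best_fragment(cn_words: List[str], sentence: List[str],
                  lexicon: Lexicon, penalty: float = 0.1) -> Tuple[List[str], float]:
    """Enumerate every contiguous fragment and keep the highest-scoring one."""
    best: List[str] = []
    best_score = float("-inf")
    for start in range(len(sentence)):
        for end in range(start + 1, len(sentence) + 1):
            fragment = sentence[start:end]
            score = fragment_score(cn_words, fragment, lexicon, penalty)
            if score > best_score:
                best, best_score = fragment, score
    return best, best_score


# Toy data, invented for this example only.
toy_lexicon: Lexicon = {("中国", "china"): 0.9, ("银行", "bank"): 0.9}
toy_sentence = "The Bank of China opened a new branch".split()
fragment, score = best_fragment(["中国", "银行"], toy_sentence, toy_lexicon)
print(" ".join(fragment))
```

Because only the Chinese side is a recognized name and the English side is a raw sentence, no English NER step is needed in this sketch; the fragment boundaries fall out of the score maximization.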

See all papers in Proc. ACL 2009 that mention NER.

translation model

Appears in 8 sentences as: Translation Model (1) translation model (6) translation models (1)
In A Chinese-English Organization Name Translation System Using Heuristic Web Mining and Asymmetric Alignment
  1. The first type of methods translates ONs by building a statistical translation model.
    Page 2, “Related Work”
  2. The statistical translation model can give an output for any input.
    Page 2, “Related Work”
  3. The performance of the statistical ON translation model is dependent on the precision of the Chinese ON segmentation to some extent.
    Page 3, “The Chunking-based Segmentation for Chinese ONs”
  4. In order to evaluate the influence of segmentation results upon the statistical ON translation system, we compare the results of two translation models.
    Page 6, “Experiments”
  5. For constructing a statistical ON translation model, we use GIZA++ to align the Chinese NEs and the English NEs in the training set.
    Page 6, “Experiments”
  6. Q2: the Chinese ON and the results of the statistical translation model.
    Page 7, “Experiments”
  7. 7.5 Comparison between Statistical ON Translation Model and Our Method
    Page 8, “Experiments”
  8. Compared with the statistical ON translation model, we can see that the performance is improved from 18.29% to 48.71% (the bold data shown in column 1 and column 3 of Table 5) by using our Chinese-English ON translation system.
    Page 8, “Experiments”

translation system

Appears in 6 sentences as: translation system (6)
In A Chinese-English Organization Name Translation System Using Heuristic Web Mining and Asymmetric Alignment
  1. The experimental results show that the proposed method outperforms the baseline statistical machine translation system by 30.42%.
    Page 1, “Abstract”
  2. For solving these two problems, we propose a Chinese-English organization name translation system using heuristic web mining and asymmetric alignment, which has three innovations.
    Page 1, “Introduction”
  3. The framework of our ON translation system, shown in Figure 1, has four modules.
    Page 3, “The Framework of Our System”
  4. In order to evaluate the influence of segmentation results upon the statistical ON translation system, we compare the results of two translation models.
    Page 6, “Experiments”
  5. Then the phrase-based machine translation system MOSES is adopted to translate the 503 Chinese NEs in the testing set into English.
    Page 6, “Experiments”
  6. Compared with the statistical ON translation model, we can see that the performance is improved from 18.29% to 48.71% (the bold data shown in column 1 and column 3 of Table 5) by using our Chinese-English ON translation system.
    Page 8, “Experiments”
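
The 30.42% quoted from the abstract and the 18.29% to 48.71% figures quoted from the experiments agree if the former is read as an absolute gain in percentage points. The relative gain is shown alongside only for comparison; it is computed here and is not a figure reported in the quoted sentences.

```latex
\[
48.71\% - 18.29\% = 30.42 \text{ percentage points},
\qquad
\frac{48.71 - 18.29}{18.29} \approx 1.66 .
\]
```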

CRFs

Appears in 5 sentences as: CRFs (5)
In A Chinese-English Organization Name Translation System Using Heuristic Web Mining and Asymmetric Alignment
  1. CRFs Chunking Model
    Page 3, “The Framework of Our System”
  2. 4.3 The CRFs Model for Chunking
    Page 3, “The Chunking-based Segmentation for Chinese ONs”
  3. As a discriminative probabilistic model for joint sequence labeling with flexible feature fusion, Conditional Random Fields (CRFs) [J. Lafferty et al., 2001] is believed to be one of the best probabilistic models for sequence labeling tasks.
    Page 3, “The Chunking-based Segmentation for Chinese ONs”
  4. So the CRFs model is employed for chunking.
    Page 3, “The Chunking-based Segmentation for Chinese ONs”
  5. Features used in CRFs model
    Page 4, “The Chunking-based Segmentation for Chinese ONs”

Chinese-English

Appears in 4 sentences as: Chinese-English (4)
In A Chinese-English Organization Name Translation System Using Heuristic Web Mining and Asymmetric Alignment
  1. For solving these two problems, we propose a Chinese-English organization name translation system using heuristic web mining and asymmetric alignment, which has three innovations.
    Page 1, “Introduction”
  2. In order to use web information to assist Chinese-English ON translation, we must first retrieve the bilingual web pages effectively.
    Page 4, “Heuristic Query Construction”
  3. Compared with the statistical ON translation model, we can see that the performance is improved from 18.29% to 48.71% (the bold data shown in column 1 and column 3 of Table 5) by using our Chinese-English ON translation system.
    Page 8, “Experiments”
  4. It proves that our system can work well on the Chinese-English ON translation task.
    Page 8, “Conclusion”

machine translation

Appears in 4 sentences as: machine translation (4)
In A Chinese-English Organization Name Translation System Using Heuristic Web Mining and Asymmetric Alignment
  1. The experimental results show that the proposed method outperforms the baseline statistical machine translation system by 30.42%.
    Page 1, “Abstract”
  2. The task of Named Entity (NE) translation is to translate a named entity from the source language to the target language, which plays an important role in machine translation and cross-language information retrieval (CLIR).
    Page 1, “Introduction”
  3. Then the phrase-based machine translation system MOSES is adopted to translate the 503 Chinese NEs in the testing set into English.
    Page 6, “Experiments”
  4. First, word order determination is difficult in statistical machine translation (SMT), while search engines are insensitive to this problem.
    Page 8, “Experiments”

Chinese words

Appears in 3 sentences as: Chinese words (3)
In A Chinese-English Organization Name Translation System Using Heuristic Web Mining and Asymmetric Alignment
  1. The selection of the Chinese words to be translated will take into consideration both the translation confidence of the words and the information contents that they contain for the whole ON.
    Page 2, “Introduction”
  2. When Chinese words are aligned with English words, the mistakes made in Chinese segmentation may result in wrong alignment results.
    Page 3, “The Chunking-based Segmentation for Chinese ONs”
  3. A Chinese NE C0={CW1, CW2, ..., CWn} is a sequence of Chinese words CW, and the English
    Page 5, “Asymmetric Alignment Method for Equivalent Extraction”

named entity

Appears in 3 sentences as: named entities (1) Named Entity (1) named entity (2)
In A Chinese-English Organization Name Translation System Using Heuristic Web Mining and Asymmetric Alignment
  1. The task of Named Entity (NE) translation is to translate a named entity from the source language to the target language, which plays an important role in machine translation and cross-language information retrieval (CLIR).
    Page 1, “Introduction”
  2. 3) Asymmetric alignment: When we extract the translation equivalent from the web pages, the traditional method should recognize the named entities in the target language sentence first, and then the extracted NEs will be aligned with the source ON.
    Page 2, “Introduction”
  3. However, the named entity recognition (NER) will always introduce some mistakes.
    Page 2, “Introduction”
