Bilingual Active Learning for Relation Classification via Pseudo Parallel Corpora
Qian, Longhua and Hui, Haotian and Hu, Ya'nan and Zhou, Guodong and Zhu, Qiaoming

Article Structure

Abstract

Active learning (AL) has been proven effective to reduce human annotation efforts in NLP.

Topics

relation instances

Appears in 19 sentences as: relation instance (10) relation instances (11)
In Bilingual Active Learning for Relation Classification via Pseudo Parallel Corpora
  1. However, there are cases when we may exploit relation extraction in multiple languages and there are corpora with relation instances annotated for more than one language, such as the ACE RDC 2005 English and Chinese corpora.
    Page 1, “Abstract”
  2. can be enhanced by relation instances translated from another language (e.g.
    Page 2, “Abstract”
  3. This demonstrates that there is some complementariness between relation instances in two languages, particularly when the training data is scarce.
    Page 2, “Abstract”
  4. This paper proposes a bilingual active learning (BAL) paradigm for relation classification with a small number of labeled relation instances and a large number of unlabeled instances in two languages (nonparallel).
    Page 2, “Abstract”
  5. As far as representation of relation instances is concerned, there are feature-based methods (Zhao et al., 2004; Zhou et
    Page 2, “Abstract”
  6. However, the mapping of two entities involved in a relation instance may lead to errors.
    Page 2, “Abstract”
  7. Specifically, with a sequence of K probabilities for a relation instance at some iteration, denoted as {p1, p2, ..., pK} in descending order, the LC metric of the relation instance can be simply picked as the first one, i.e.
    Page 3, “Abstract”
  8. An important issue for bilingual learning is how to obtain two language views for relation instances from multilingual resources.
    Page 4, “Abstract”
  9. Both the mentions of relation instances and the mentions of two involved entities are first translated into the other language via machine translation.
    Page 5, “Abstract”
  10. Then, two entities in the original instance are aligned with their counterparts in the translated instance in order to form an aligned bilingual relation instance pair.
    Page 5, “Abstract”
  11. The relation instance is represented as the word sequence between two entities.
    Page 5, “Abstract”
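The least-confidence (LC) metric quoted in sentence 7 can be sketched as follows. This is a minimal illustration, not the paper's code; `predict_proba` names a hypothetical per-class probability function supplied by whatever classifier is in use:

```python
def least_confidence(probs):
    """LC uncertainty score: with class probabilities sorted in
    descending order {p1, p2, ..., pK}, LC is simply p1, the
    probability of the most likely class. Lower p1 = more uncertain."""
    return max(probs)

def select_batch(instances, predict_proba, n):
    """Pick the n instances whose top class probability is lowest,
    i.e. the ones the current classifier is least confident about."""
    scored = [(least_confidence(predict_proba(x)), x) for x in instances]
    scored.sort(key=lambda pair: pair[0])  # ascending: least confident first
    return [x for _, x in scored[:n]]
```

For example, an instance with probabilities [0.4, 0.35, 0.25] scores 0.4 and would be queried before one scoring 0.9.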

See all papers in Proc. ACL 2014 that mention relation instances.

See all papers in Proc. ACL that mention relation instances.

relation extraction

Appears in 16 sentences as: relation extraction (19) relation extractors (1)
In Bilingual Active Learning for Relation Classification via Pseudo Parallel Corpora
  1. In the literature, the mainstream research on relation extraction adopts statistical machine learning methods, which can be grouped into supervised learning (Zelenko et al., 2003; Culotta and Sorensen, 2004; Zhou et al., 2005; Zhang et al., 2006; Qian et al., 2008; Chan and Roth, 2011), semi-supervised learning (Zhang et al., 2004; Chen et al., 2006; Zhou et al., 2008; Qian et al., 2010) and unsupervised learning (Hasegawa et al., 2004; Zhang et al., 2005) in terms of the amount of labeled training data they need.
    Page 1, “Abstract”
  2. It is trivial to validate, as we will do later in this paper, that active learning can also alleviate the annotation burden for relation extraction in one language while retaining the extraction performance.
    Page 1, “Abstract”
  3. However, there are cases when we may exploit relation extraction in multiple languages and there are corpora with relation instances annotated for more than one language, such as the ACE RDC 2005 English and Chinese corpora.
    Page 1, “Abstract”
  4. (2013) shows that supervised relation extraction in one language (e.g.
    Page 1, “Abstract”
  5. One natural question is: Can this characteristic be fully exploited so that active learning maximally benefits relation extraction in both languages?
    Page 2, “Abstract”
  6. Section 2 reviews the previous work on relation extraction while Section 3 describes our baseline systems.
    Page 2, “Abstract”
  7. While there are many studies in monolingual relation extraction, there are only a few on multilingual relation extraction in the literature.
    Page 2, “Abstract”
  8. Monolingual relation extraction: A wide range of studies on relation extraction focus on monolingual resources.
    Page 2, “Abstract”
  9. Both methods are also widely used in relation extraction in other languages, such as those in Chinese relation extraction (Che et al., 2005; Li et al., 2008; Yu et al., 2010).
    Page 2, “Abstract”
  10. Multilingual relation extraction: There are only two studies related to multilingual relation extraction.
    Page 2, “Abstract”
  11. While machine translation inherently deals with multilingual parallel corpora, our task focuses on relation extraction by pseudo parallel corpora in two languages.
    Page 3, “Abstract”

parallel corpora

Appears in 11 sentences as: parallel corpora (12)
In Bilingual Active Learning for Relation Classification via Pseudo Parallel Corpora
  1. Instead of using a parallel corpus, labeled and unlabeled instances in one language are translated into ones in the other language, and all instances in both languages are then fed into a bilingual active learning engine as pseudo parallel corpora.
    Page 1, “Abstract”
  2. Instead of using a parallel corpus which should have entity/relation alignment information and is thus difficult to obtain, this paper employs an off-the-shelf machine translator to translate both labeled and unlabeled instances from one language into the other language, forming pseudo parallel corpora .
    Page 2, “Abstract”
  3. (2010) propose a cross-lingual annotation projection approach which uses parallel corpora to acquire a relation detector on the target language.
    Page 2, “Abstract”
  4. Both studies transfer relation annotations via parallel corpora from the resource-rich language (English) to the resource-poor language (Korean), but not vice versa.
    Page 2, “Abstract”
  5. machine translation, which make use of multilingual corpora to decrease human annotation efforts by selecting highly informative sentences for a newly added language in multilingual parallel corpora.
    Page 3, “Abstract”
  6. While machine translation inherently deals with multilingual parallel corpora, our task focuses on relation extraction by pseudo parallel corpora in two languages.
    Page 3, “Abstract”
  7. parallel corpora (Lu et al., 2011), translated corpora (aka.
    Page 4, “Abstract”
  8. pseudo parallel corpora) (Wan, 2009), and bilingual lexicons (Oh et al., 2009).
    Page 4, “Abstract”
  9. We adopt the one with pseudo parallel corpora, using the machine translation method to generate instances from one language to the other in the BAL paradigm, as depicted in Fig.
    Page 4, “Abstract”
  10. In order to make full use of pseudo parallel corpora, translated labeled and unlabeled instances are augmented in the following two ways: • For labeled Chinese instances (LC) and Eng-
    Page 4, “Abstract”
  11. Given a small number of labeled relation instances and a large number of unlabeled relation instances in both languages, we translate both the labeled and unlabeled instances in one language to the other as pseudo parallel corpora.
    Page 9, “Abstract”
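The construction excerpted in sentences 1, 2, and 11 can be sketched as follows; `translate` is a hypothetical stand-in for an off-the-shelf MT service (the paper uses Google Translate), not a real API:

```python
def translate(text, src, tgt):
    """Placeholder for an external MT call; a real implementation
    would invoke an off-the-shelf translation service here."""
    raise NotImplementedError("plug in an MT backend")

def build_pseudo_parallel(instances, src, tgt, mt=translate):
    """Pair each relation instance with its machine translation,
    yielding a pseudo parallel corpus of (original, translated)
    tuples; applied to both labeled and unlabeled instances."""
    return [(inst, mt(inst, src, tgt)) for inst in instances]
```

Because both labeled and unlabeled pools pass through the same pairing step, each language view always has a translated counterpart available to the other classifier.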

machine translation

Appears in 10 sentences as: machine translation (9) machine translator (1)
In Bilingual Active Learning for Relation Classification via Pseudo Parallel Corpora
  1. Instead of using a parallel corpus which should have entity/relation alignment information and is thus difficult to obtain, this paper employs an off-the-shelf machine translator to translate both labeled and unlabeled instances from one language into the other language, forming pseudo parallel corpora.
    Page 2, “Abstract”
  2. Based on a small number of labeled instances and a large number of unlabeled instances in both languages, our method differs from theirs in that we adopt a bilingual active learning paradigm via machine translation and improve the performance for both languages simultaneously.
    Page 2, “Abstract”
  3. machine translation, which make use of multilingual corpora to decrease human annotation efforts by selecting highly informative sentences for a newly added language in multilingual parallel corpora.
    Page 3, “Abstract”
  4. While machine translation inherently deals with multilingual parallel corpora, our task focuses on relation extraction by pseudo parallel corpora in two languages.
    Page 3, “Abstract”
  5. The only exception is AL for machine translation (Haffari et al., 2009; Haffari and Sarkar, 2009), whose purpose is to select the most informative sentences in the source language to be manually translated into the target language.
    Page 4, “Abstract”
  6. Previous studies (Reichart et al., 2008; Haffari and Sarkar, 2009) show that multitask active learning (MTAL) can yield promising overall results, no matter whether they are two different tasks or the task of machine translation on multiple language pairs.
    Page 4, “Abstract”
  7. We adopt the one with pseudo parallel corpora, using the machine translation method to generate instances from one language to the other in the BAL paradigm, as depicted in Fig.
    Page 4, “Abstract”
  8. Among the several off-the-shelf machine translation services, we select the Google Translator because of its high quality and easy accessibility.
    Page 5, “Abstract”
  9. Both the mentions of relation instances and the mentions of two involved entities are first translated into the other language via machine translation.
    Page 5, “Abstract”
  10. Our lexicon is derived from the FBIS parallel corpus (#LDC2003E14), which is widely used in machine translation between English and Chinese.
    Page 5, “Abstract”

entity mention

Appears in 8 sentences as: entity mention (7) entity mentions (2)
In Bilingual Active Learning for Relation Classification via Pseudo Parallel Corpora
  1. (2005): a) Lexical features of entities and their contexts: WM1: bag-of-words in the 1st entity mention; HM1: headword of M1; WM2: bag-of-words in the 2nd entity mention; HM2: headword of M2; HM12: combination of HM1 and HM2; WBNULL: when no word in between; WBFL: the only one word in between; WBF: the first word in between when at least two words in between
    Page 3, “Abstract”
  2. c) Mention level: ML12: combination of entity mention levels; MT12: combination of LDC mention types
    Page 3, “Abstract”
  3. Put another way, entity alignment automatically marks the entity mentions in the translated instance, so that the feature vector corresponding to the translated instance can be constructed.
    Page 5, “Abstract”
  4. Entity alignment is vital in cross-language relation extraction; its difficulty lies in the fact that the same entity mention, as an isolated phrase and as an integral phrase in a relation instance, can be translated into different phrases.
    Page 5, “Abstract”
  5. For example, the Chinese entity mention “官员” (officer) is translated to “officer” in isolation; it is, however, translated to “officials” when in the relation instance “叙利亚官员” (Syrian officials).
    Page 5, “Abstract”
  6. - Me, entity mention in English
    Page 5, “Abstract”
  7. Therefore, we devise some heuristics to align entity mentions between Chinese and English.
    Page 5, “Abstract”
  8. Take entity alignment from English to Chinese as an example: given entity mention Me in relation instance Re in English and their respective translations Mct and Rc in Chinese, the objective of entity alignment is to find Mc, the counterpart of Me in Rc.
    Page 5, “Abstract”
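One plausible shape for the alignment heuristics in sentences 7-8 is: accept the isolated translation Mct if it occurs verbatim in the translated instance Rc, and otherwise fall back to alternative translations, e.g. from a bilingual lexicon. This is an illustrative sketch under those assumptions; the paper's actual heuristics are not reproduced here:

```python
def align_entity(mct, rc, lexicon_alts=()):
    """Find Mc, the counterpart in translated instance rc of a mention
    whose isolated translation is mct (e.g. 'officer' vs 'officials').
    Heuristic sketch: exact match first, then bilingual-lexicon
    alternatives; None signals that alignment failed."""
    if mct in rc:
        return mct
    for alt in lexicon_alts:   # alternative translations of the mention
        if alt in rc:
            return alt
    return None
```

On the paper's own example, "officer" does not occur in "Syrian officials", so only a lexicon alternative like "officials" recovers the alignment.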

cross-lingual

Appears in 5 sentences as: cross-lingual (5)
In Bilingual Active Learning for Relation Classification via Pseudo Parallel Corpora
  1. (2010) propose a cross-lingual annotation projection approach which uses parallel corpora to acquire a relation detector on the target language.
    Page 2, “Abstract”
  2. SL-CR (Supervised Learning with cross-lingual labeled instances): in addition to monolingual labeled instances (SL-MO), the training data for supervised learning contain labeled instances translated from the other language.
    Page 7, “Abstract”
  3. AL-CR (Active Learning with cross-lingual instances): both the manually labeled instances and their translated ones are added to the respective training data.
    Page 7, “Abstract”
  4. The table also shows the consistent utility of cross-lingual information for relation classification for both languages.
    Page 8, “Abstract”
  5. When cross-lingual information is augmented, SL-CR outperforms SL-MO and AL-CR outperforms AL-MO.
    Page 8, “Abstract”

SVM

Appears in 5 sentences as: SVM (5)
In Bilingual Active Learning for Relation Classification via Pseudo Parallel Corpora
  1. Input: L, labeled data set; U, unlabeled data set; n, batch size. Output: SVM, classifier. Repeat: 1.
    Page 3, “Abstract”
  2. Train a single classifier SVM on L.
    Page 3, “Abstract”
  3. The objective is to learn SVM classifiers in both languages, denoted as SVMc and SVMe respectively, in a BAL fashion to improve their classification performance.
    Page 4, “Abstract”
  4. The training parameter C (SVM) is set to 2.4 according to our previous work on relation extraction (Qian et al., 2010).
    Page 7, “Abstract”
  5. SL-MO (Supervised Learning with monolingual labeled instances): only the monolingual labeled instances are fed to the SVM classifiers for both Chinese and English relation classification respectively.
    Page 7, “Abstract”
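The pool-based loop excerpted in sentences 1-2 can be sketched generically. `train` and `select` are placeholders for SVM training and uncertainty sampling respectively, and oracle labeling is left implicit; this is a sketch of the scheme, not the paper's implementation:

```python
def active_learning_loop(L, U, n, train, select, rounds):
    """Pool-based active learning: repeatedly (1) train a classifier
    on labeled set L, (2) select the n most informative instances
    from unlabeled pool U, (3) have an oracle label them (implicit
    here), and (4) move them into L before retraining."""
    for _ in range(rounds):
        clf = train(L)                 # 1. train SVM (or any classifier) on L
        batch = select(clf, U, n)      # 2. query n most informative instances
        for x in batch:                # 3-4. oracle labels them; add to L
            U.remove(x)
            L.append(x)
    return train(L)                    # final classifier on the enlarged L
```

With dummy `train`/`select` functions, two rounds of batch size 2 drain a four-instance pool into the labeled set, which is exactly the bookkeeping the pseudocode above describes.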

Learning Algorithm

Appears in 4 sentences as: Learning Algorithm (2) learning algorithm (2)
In Bilingual Active Learning for Relation Classification via Pseudo Parallel Corpora
  1. This section first introduces the fundamental supervised learning method, and then describes a baseline active learning algorithm.
    Page 3, “Abstract”
  2. 3.2 Active Learning Algorithm
    Page 3, “Abstract”
  3. 4.4 Bilingual Active Learning Algorithm
    Page 5, “Abstract”
  4. Bilingual active learning algorithm
    Page 6, “Abstract”

Baseline Systems

Appears in 3 sentences as: baseline system (1) Baseline Systems (1) baseline systems (1)
In Bilingual Active Learning for Relation Classification via Pseudo Parallel Corpora
  1. Section 2 reviews the previous work on relation extraction while Section 3 describes our baseline systems.
    Page 2, “Abstract”
  2. 3 Baseline Systems
    Page 3, “Abstract”
  3. Particularly, SL-MO is used as the baseline system against which deficiency scores for other methods are computed.
    Page 7, “Abstract”

human annotation

Appears in 3 sentences as: human annotation (3)
In Bilingual Active Learning for Relation Classification via Pseudo Parallel Corpora
  1. Active learning (AL) has been proven effective to reduce human annotation efforts in NLP.
    Page 1, “Abstract”
  2. machine translation, which make use of multilingual corpora to decrease human annotation efforts by selecting highly informative sentences for a newly added language in multilingual parallel corpora.
    Page 3, “Abstract”
  3. For future work, on one hand, we plan to combine uncertainty sampling with diversity and informativeness measures; on the other hand, we intend to combine BAL with semi-supervised learning to further reduce human annotation efforts.
    Page 9, “Abstract”

labeled data

Appears in 3 sentences as: labeled data (3)
In Bilingual Active Learning for Relation Classification via Pseudo Parallel Corpora
  1. Usually the extraction performance depends heavily on the quality and quantity of the labeled data; however, the manual annotation of a large-scale corpus is labor-intensive and time-consuming.
    Page 1, “Abstract”
  2. During iterations, a batch of unlabeled instances is chosen according to their informativeness to the current classifier, labeled by an oracle, and in turn added to the labeled data to retrain the classifier.
    Page 3, “Abstract”
  3. Input: L, labeled data set; U, unlabeled data set; n, batch size. Output: SVM, classifier. Repeat: 1.
    Page 3, “Abstract”

named entity

Appears in 3 sentences as: name entity (1) named entity (2)
In Bilingual Active Learning for Relation Classification via Pseudo Parallel Corpora
  1. Moreover, the success of joint bilingual learning may lend itself to many inherently multilingual NLP tasks such as POS tagging (Yarowsky and Ngai, 2001), named entity recognition (Yarowsky et al., 2001), sentiment analysis (Wan, 2009), and semantic role labeling (Padó and Lapata, 2009) etc.
    Page 2, “Abstract”
  2. It has been successfully applied to many NLP applications, such as POS tagging (Engelson and Dagan, 1996; Ringger et al., 2007), word sense disambiguation (Chan and Ng, 2007; Zhu and Hovy, 2007), sentiment detection (Brew et al., 2010; Li et al., 2012), syntactical parsing (Hwa, 2004; Osborne and Baldridge, 2004), and named entity recognition (Shen et al., 2004; Tomanek et al., 2007; Tomanek and Hahn, 2009) etc.
    Page 2, “Abstract”
  3. named entity and syntactic parse tree).
    Page 2, “Abstract”

parallel corpus

Appears in 3 sentences as: parallel corpus (3)
In Bilingual Active Learning for Relation Classification via Pseudo Parallel Corpora
  1. Instead of using a parallel corpus, labeled and unlabeled instances in one language are translated into ones in the other language, and all instances in both languages are then fed into a bilingual active learning engine as pseudo parallel corpora.
    Page 1, “Abstract”
  2. Instead of using a parallel corpus which should have entity/relation alignment information and is thus difficult to obtain, this paper employs an off-the-shelf machine translator to translate both labeled and unlabeled instances from one language into the other language, forming pseudo parallel corpora.
    Page 2, “Abstract”
  3. Our lexicon is derived from the FBIS parallel corpus (#LDC2003E14), which is widely used in machine translation between English and Chinese.
    Page 5, “Abstract”

semi-supervised

Appears in 3 sentences as: semi-supervised (3)
In Bilingual Active Learning for Relation Classification via Pseudo Parallel Corpora
  1. In the literature, the mainstream research on relation extraction adopts statistical machine learning methods, which can be grouped into supervised learning (Zelenko et al., 2003; Culotta and Sorensen, 2004; Zhou et al., 2005; Zhang et al., 2006; Qian et al., 2008; Chan and Roth, 2011), semi-supervised learning (Zhang et al., 2004; Chen et al., 2006; Zhou et al., 2008; Qian et al., 2010) and unsupervised learning (Hasegawa et al., 2004; Zhang et al., 2005) in terms of the amount of labeled training data they need.
    Page 1, “Abstract”
  2. Therefore, Kim and Lee (2012) further employ a graph-based semi-supervised learning method, namely Label Propagation (LP), to indirectly propagate labels from the source language to the target language in an iterative fashion.
    Page 2, “Abstract”
  3. For future work, on one hand, we plan to combine uncertainty sampling with diversity and informativeness measures; on the other hand, we intend to combine BAL with semi-supervised learning to further reduce human annotation efforts.
    Page 9, “Abstract”
