Together We Can: Bilingual Bootstrapping for WSD
Khapra, Mitesh M. and Joshi, Salil and Chatterjee, Arindam and Bhattacharyya, Pushpak

Article Structure

Abstract

Recent work on bilingual Word Sense Disambiguation (WSD) has shown that a resource deprived language (L1) can benefit from the annotation work done in a resource rich language (L2) via parameter projection.

Introduction

The high cost of collecting sense annotated data for supervised approaches (Ng and Lee, 1996; Lee et al., 2004) has always remained a matter of concern for some of the resource deprived languages of the world.

Related Work

Bootstrapping for Word Sense Disambiguation was first discussed in (Yarowsky, 1995).

Synset Aligned Multilingual Dictionary

A novel and effective method of storage and use of dictionary in a multilingual setting was proposed by Mohanty et al.

Parameter Projection

Khapra et al.

Bilingual Bootstrapping

We now come to the main contribution of our work, i.e., bilingual bootstrapping.

Experimental Setup

We used the publicly available dataset described in Khapra et al.

Results

The results of our experiments are summarized in Figures 1 to 4.

Discussions

In this section we discuss the important observations made from Figures 1 to 4.

Conclusion

We presented a bilingual bootstrapping algorithm for Word Sense Disambiguation which allows two resource deprived languages to mutually benefit

Topics

Wordnet

Appears in 14 sentences as: Wordnet (11) wordnet (1) Wordnets (4)
In Together We Can: Bilingual Bootstrapping for WSD
  1. This is achieved with the help of a novel synset-aligned multilingual dictionary which facilitates the projection of parameters learned from the Wordnet and annotated corpus of L1 to L2.
    Page 1, “Introduction”
  2. They showed that it is possible to project the parameters learned from the annotation work of one language to another language provided aligned Wordnets for the two languages are available.
    Page 2, “Related Work”
  3. However, they do not address situations where two resource deprived languages have aligned Wordnets but neither has sufficient annotated data.
    Page 2, “Related Work”
  4. Wordnet-dependent parameters depend on the structure of the Wordnet whereas the Corpus-dependent parameters depend on various statistics learned from a sense marked corpora.
    Page 3, “Parameter Projection”
  5. Both the tasks of (a) constructing a Wordnet from scratch and (b) collecting sense marked corpora for multiple languages are tedious and expensive.
    Page 3, “Parameter Projection”
  6. (2009) observed that by projecting relations from the Wordnet of a language and by projecting corpus statistics from the sense marked corpora of the language to those of the target language, the effort required in constructing semantic graphs for multiple Wordnets and collecting sense marked corpora for multiple languages can be avoided or reduced.
    Page 4, “Parameter Projection”
  7. By linking with the synsets of a pivot resource rich language (Hindi, in our case), the cost of building Wordnets of other languages is partly reduced (semantic relations are inherited).
    Page 4, “Parameter Projection”
  8. The Wordnet parameters of Hindi Wordnet now become projectable to other languages.
    Page 4, “Parameter Projection”
  9. degree of Wordnet polysemy for polysemous words
    Page 5, “Experimental Setup”
  10. Table 4: Average degree of Wordnet polysemy per category in the 2 domains for Hindi
    Page 5, “Experimental Setup”
  11. degree of Wordnet polysemy for polysemous words
    Page 5, “Experimental Setup”
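The projection idea running through these excerpts can be sketched in a few lines. This is a minimal illustration, not the authors' code: the synset ids, relation names, and data structures below are hypothetical stand-ins for the synset-aligned dictionary, showing how a linked language inherits semantic relations from the Hindi pivot Wordnet instead of building its own semantic graph from scratch.

```python
# Pivot (Hindi) Wordnet: synset id -> semantic relations.
# Ids and relations are illustrative, not real Wordnet data.
hindi_relations = {
    "HIN_01": {"hypernym": ["HIN_07"], "hyponym": ["HIN_12"]},
}

# Synset alignment: Marathi synset id -> linked Hindi synset id.
marathi_to_hindi = {"MAR_01": "HIN_01"}

def inherited_relations(marathi_synset):
    """A Marathi synset inherits the relations of its linked Hindi
    synset, so the Marathi semantic graph need not be built anew."""
    pivot = marathi_to_hindi.get(marathi_synset)
    return hindi_relations.get(pivot, {})

print(inherited_relations("MAR_01"))  # relations projected from HIN_01
```

The same alignment table is what makes the corpus-dependent parameters projectable: statistics estimated over Hindi synsets can be read off for the linked Marathi synsets.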


synset

Appears in 13 sentences as: Synset (1) synset (9) Synsets (1) synsets (6)
  1. Section 3 describes the Synset aligned multilingual dictionary which facilitates parameter projection.
    Page 2, “Introduction”
  2. At the heart of our work lies parameter projection facilitated by a synset aligned
    Page 2, “Related Work”
  3. One important departure in this framework from the traditional dictionary is that synsets are linked, and after that the words inside the synsets are linked.
    Page 3, “Synset Aligned Multilingual Dictionary”
  4. The basic mapping is thus between synsets and thereafter between the words.
    Page 3, “Synset Aligned Multilingual Dictionary”
  5. After the synsets are linked, cross linkages are set up manually from the words of a synset to the words of a linked synset of the pivot language.
    Page 3, “Synset Aligned Multilingual Dictionary”
  6. For example, for the Marathi word मुलगा (mulgaa), “a youthful male person”, the correct lexical substitute from the corresponding Hindi synset is लड़का (ladkaa).
    Page 3, “Synset Aligned Multilingual Dictionary”
  7. The average number of such links per synset per language pair is approximately 3.
    Page 3, “Synset Aligned Multilingual Dictionary”
  8. However, since our work takes place in a semi-supervised setting, we do not assume the presence of these manual cross linkages between synset members.
    Page 3, “Synset Aligned Multilingual Dictionary”
  9. Instead, in the above example, we assume that all the words in the Hindi synset are equally probable translations of every word in the corresponding Marathi synset.
    Page 3, “Synset Aligned Multilingual Dictionary”
  10. Such cross-linkages between synset members facilitate parameter projection as explained in the next section.
    Page 3, “Synset Aligned Multilingual Dictionary”
  11. i ∈ Candidate Synsets; J := set of already disambiguated words; θi := BelongingnessToDominantConcept(Si); Vi = P(Si | word); Wij = CorpusCooccurrence(Si, Sj) × 1/WNConceptualDistance(Si, Sj) × 1/WNSemanticGraphDistance(Si, Sj)
    Page 3, “Parameter Projection”
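The parameter list in the last excerpt suggests a greedy scoring step: each candidate sense of the target word is scored by its own evidence (θi·Vi) plus weighted links Wij to the senses already assigned to context words. The sketch below is a hedged reading of that combination, not necessarily the authors' exact scoring function, and all numeric values are illustrative.

```python
def best_sense(candidates, context, theta, V, W):
    """Pick the candidate sense maximising theta[i]*V[i] plus the sum of
    W[(i, j)]*V[j] over senses j already assigned to context words.
    theta[i]: belongingness of sense i to the dominant concept.
    V[i]: P(S_i | word). W[(i, j)]: corpus co-occurrence of S_i and S_j,
    scaled by inverse Wordnet conceptual and semantic-graph distances."""
    def score(i):
        return theta[i] * V[i] + sum(W.get((i, j), 0.0) * V[j] for j in context)
    return max(candidates, key=score)

# Illustrative values: sense s1 wins on both its own evidence and its
# co-occurrence link to the context sense c1.
theta = {"s1": 0.8, "s2": 0.3}
V = {"s1": 0.6, "s2": 0.4, "c1": 0.9}
W = {("s1", "c1"): 0.5}
print(best_sense(["s1", "s2"], ["c1"], theta, V, W))  # -> s1
```

Because every quantity here is defined per synset, a model trained with these parameters in one language can be applied to a linked language once the synsets are aligned.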


F-score

Appears in 12 sentences as: F-score (14)
  1. Seed Size v/s F-score
    Page 6, “Experimental Setup”
  2. Figure 1: Comparison of BiBoot, MonoBoot, OnlySeed and WFS on Hindi Health data (F-score (%) vs. seed size in words, 0 to 5000)
    Page 6, “Experimental Setup”
  3. Seed Size v/s F-score
    Page 6, “Experimental Setup”
  4. Seed Size v/s F-score
    Page 6, “Experimental Setup”
  5. F-score (%)
    Page 6, “Experimental Setup”
  6. a. BiBoot: This curve represents the F-score obtained after 10 iterations by using bilingual bootstrapping with different amounts of seed data.
    Page 7, “Results”
  7. b. MonoBoot: This curve represents the F-score obtained after 10 iterations by using monolingual bootstrapping with different amounts of seed data.
    Page 7, “Results”
  8. c. OnlySeed: This curve represents the F-score obtained by training on the seed data alone without using any bootstrapping.
    Page 7, “Results”
  9. d. WFS: This curve represents the F-score obtained by simply selecting the first sense from Wordnet, a typically reported baseline.
    Page 7, “Results”
  10. For small seed sizes, the F-score of bilingual bootstrapping is consistently better than the F-score obtained by training only on the seed data without using any bootstrapping.
    Page 7, “Discussions”
  11. To further illustrate this, we take some sample points from the graph and compare the number of tagged words needed by BiBoot and OnlySeed to reach the same (or nearly the same) F-score.
    Page 7, “Discussions”


Labeled Data

Appears in 5 sentences as: Labeled Data (4) labeled data (3)
  1. Algorithm 1 Bilingual Bootstrapping: LD1 := Seed Labeled Data from L1; LD2 := Seed Labeled Data from L2; UD1 := Unlabeled Data from L1; UD2 := Unlabeled Data from L2
    Page 4, “Bilingual Bootstrapping”
  2. These projected models are then applied to the untagged data of L1 and L2 and the instances which get labeled with a high confidence are added to the labeled data of the respective languages.
    Page 4, “Bilingual Bootstrapping”
  3. Algorithm 2 Monolingual Bootstrapping: LD1 := Seed Labeled Data from L1; LD2 := Seed Labeled Data from L2; UD1 := Unlabeled Data from L1; UD2 := Unlabeled Data from L2
    Page 5, “Bilingual Bootstrapping”
  4. In each iteration only those words for which P(assigned_sense|word) > 0.6 get moved to the labeled data.
    Page 6, “Experimental Setup”
  5. Hence, we used a fixed threshold of 0.6 so that in each iteration only those words get moved to the labeled data for which the assigned sense is clearly a majority sense (P > 0.6).
    Page 6, “Experimental Setup”
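The excerpts above describe one iteration of Algorithm 1 with the 0.6 confidence threshold used in the experiments: each language's model is projected onto the other language's untagged data, and only high-confidence taggings are promoted. The sketch below is an illustration under those assumptions; `train_model` and `project` are hypothetical stand-ins, not the authors' implementation.

```python
THRESHOLD = 0.6  # P(assigned_sense | word) must exceed this

def bootstrap_once(labeled_l1, labeled_l2, unlabeled_l1, unlabeled_l2,
                   train_model, project):
    """One bilingual bootstrapping iteration: a model trained on each
    language's labeled pool is projected (via the synset-aligned
    dictionary) and applied to the OTHER language's untagged words;
    instances tagged above the threshold join that language's pool."""
    model_for_l2 = project(train_model(labeled_l1))  # L1 model tags L2 data
    model_for_l1 = project(train_model(labeled_l2))  # L2 model tags L1 data
    for pool, unlabeled, model in ((labeled_l1, unlabeled_l1, model_for_l1),
                                   (labeled_l2, unlabeled_l2, model_for_l2)):
        for word in list(unlabeled):
            sense, confidence = model(word)
            if confidence > THRESHOLD:
                pool.append((word, sense))
                unlabeled.remove(word)
```

In the paper the process runs for 10 iterations; a loop with that iteration count (or a fixed point on the unlabeled pools) would wrap this function.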


Sense Disambiguation

Appears in 4 sentences as: Sense Disambiguation (4)
  1. Recent work on bilingual Word Sense Disambiguation (WSD) has shown that a resource deprived language (L1) can benefit from the annotation work done in a resource rich language (L2) via parameter projection.
    Page 1, “Abstract”
  2. Bootstrapping for Word Sense Disambiguation was first discussed in (Yarowsky, 1995).
    Page 2, “Related Work”
  3. (2009) proposed that the various parameters essential for domain-specific Word Sense Disambiguation can be broadly classified into two categories:
    Page 3, “Parameter Projection”
  4. We presented a bilingual bootstrapping algorithm for Word Sense Disambiguation which allows two resource deprived languages to mutually benefit
    Page 8, “Conclusion”


Word Sense

Appears in 4 sentences as: Word Sense (4)
  1. Recent work on bilingual Word Sense Disambiguation (WSD) has shown that a resource deprived language (L1) can benefit from the annotation work done in a resource rich language (L2) via parameter projection.
    Page 1, “Abstract”
  2. Bootstrapping for Word Sense Disambiguation was first discussed in (Yarowsky, 1995).
    Page 2, “Related Work”
  3. (2009) proposed that the various parameters essential for domain-specific Word Sense Disambiguation can be broadly classified into two categories:
    Page 3, “Parameter Projection”
  4. We presented a bilingual bootstrapping algorithm for Word Sense Disambiguation which allows two resource deprived languages to mutually benefit
    Page 8, “Conclusion”


language pair

Appears in 3 sentences as: language pair (3)
  1. Our experiments show that such a bilingual bootstrapping algorithm when evaluated on two different domains with small seed sizes using Hindi (L1) and Marathi (L2) as the language pair performs better than monolingual bootstrapping and significantly reduces annotation cost.
    Page 1, “Abstract”
  2. Such a bilingual bootstrapping strategy when tested on two domains, viz, Tourism and Health using Hindi (L1) and Marathi (L2) as the language pair, consistently does better than a baseline strategy which uses only seed data for training without performing any bootstrapping.
    Page 2, “Introduction”
  3. The average number of such links per synset per language pair is approximately 3.
    Page 3, “Synset Aligned Multilingual Dictionary”


model trained

Appears in 3 sentences as: model trained (5)
  1. We then use bilingual bootstrapping, wherein, a model trained using the seed annotated data of L1 is used to annotate the untagged data of L2 and vice versa using parameter projection.
    Page 1, “Abstract”
  2. repeat: θ1 := model trained using LD1; θ2 := model trained using LD2
    Page 4, “Bilingual Bootstrapping”
  3. repeat: θ1 := model trained using LD1; θ2 := model trained using LD2; for all u1 ∈ UD1 do: s := sense assigned by θ1 to u1; if confidence(s) > ε then LD1 := LD1 + u1; UD1 := UD1 − u1; end if; end for
    Page 5, “Bilingual Bootstrapping”
