Combining Lexical Semantic Resources with Question & Answer Archives for Translation-Based Answer Finding
Bernhard, Delphine and Gurevych, Iryna

Article Structure

Abstract

Monolingual translation probabilities have recently been introduced in retrieval models to solve the lexical gap problem.

Introduction

The lexical gap (or lexical chasm) often observed between queries and documents or questions and answers is a pervasive problem both in Information Retrieval (IR) and Question Answering (QA).

Related Work

2.1 Statistical Translation Models for Retrieval

Parallel Datasets

In order to obtain parallel training data for the translation models, we collected three different datasets: manually-tagged question reformulations and question-answer pairs from the WikiAnswers social Q&A site (Section 3.1), and glosses from WordNet, Wiktionary, Wikipedia and Simple Wikipedia (Section 3.2).

Semantic Relatedness Experiments

The aim of this first experiment is to perform an intrinsic evaluation of the word translation probabilities obtained by comparing them to traditional semantic relatedness measures on the task of ranking word pairs.

Answer Finding Experiments

5.1 Retrieval based on Translation Models

Conclusion and Future Work

We have presented three datasets for training statistical word translation models for use in answer finding: question-answer pairs, manually-tagged question reformulations and glosses for the same term extracted from several lexical semantic resources.

Topics

translation models

Appears in 29 sentences as: Translation Model (1) translation model (3) Translation Models (2) translation models (26) translation models: (1)
In Combining Lexical Semantic Resources with Question & Answer Archives for Translation-Based Answer Finding
  1. They can be obtained by training statistical translation models on parallel monolingual corpora, such as question-answer pairs, where answers act as the “source” language and questions as the “target” language.
    Page 1, “Abstract”
  2. We compare monolingual translation models built from lexical semantic resources with two other kinds of datasets: manually-tagged question reformulations and question-answer pairs.
    Page 1, “Abstract”
  3. Berger and Lafferty (1999) have formulated a further solution to the lexical gap problem, which consists in integrating monolingual statistical translation models into the retrieval process.
    Page 1, “Introduction”
  4. Monolingual translation models encode statistical word associations which are trained on parallel monolingual corpora.
    Page 1, “Introduction”
  5. While collection-specific translation models effectively encode statistical word associations for the target document collection, this also introduces a bias in the evaluation and makes it difficult to assess the quality of the translation model per se, independently of a specific task and document collection.
    Page 1, “Introduction”
  6. In this paper, we propose new kinds of datasets for training domain-independent monolingual translation models.
    Page 1, “Introduction”
  7. We use the definitions and glosses provided for the same term by different lexical semantic resources to automatically train the translation models.
    Page 1, “Introduction”
  8. Thanks to the combination of several resources, it is possible to obtain monolingual parallel corpora which are large enough to train domain-independent translation models.
    Page 1, “Introduction”
  9. We use these datasets to build further translation models.
    Page 1, “Introduction”
  10. We then use the translation models in an answer finding task based on a new question-answer dataset which is totally independent of the resources used for training the translation models.
    Page 2, “Introduction”
  11. This extrinsic evaluation shows that our translation models significantly improve the results over the query likelihood and the vector-space model.
    Page 2, “Introduction”
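
The sentences above describe training monolingual translation models on parallel data (answers acting as the "source" language, questions as the "target") and plugging the resulting word associations into retrieval. As a rough illustration of how such word translation probabilities can be used at retrieval time, here is a minimal Python sketch of a translation-based query-likelihood scorer in the spirit of Berger and Lafferty (1999); the dictionary names, the smoothing scheme and the parameter lam are illustrative assumptions, not the paper's exact model.

    import math
    from collections import Counter

    def score_answer(question_tokens, answer_tokens, trans_prob, collection_prob, lam=0.5):
        # trans_prob[(q_word, a_word)]: P(q_word | a_word) learned from a parallel corpus.
        # collection_prob[word]: background unigram probability P(word | collection).
        # lam: illustrative mixing weight between translation model and background model.
        counts = Counter(answer_tokens)
        length = len(answer_tokens)
        log_score = 0.0
        for w in question_tokens:
            # P_tr(w | answer) = sum over answer words v of P(w | v) * P_ml(v | answer)
            p_tr = sum(trans_prob.get((w, v), 0.0) * (count / length)
                       for v, count in counts.items())
            p_bg = collection_prob.get(w, 1e-6)
            log_score += math.log(lam * p_tr + (1.0 - lam) * p_bg)
        return log_score

Candidate answers can then be ranked by this log score for a given question.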

translation probabilities

Appears in 23 sentences as: Translation probabilities (1) translation probabilities (22)
In Combining Lexical Semantic Resources with Question & Answer Archives for Translation-Based Answer Finding
  1. Monolingual translation probabilities have recently been introduced in retrieval models to solve the lexical gap problem.
    Page 1, “Abstract”
  2. We also show that the monolingual translation probabilities obtained (i) are comparable to traditional semantic relatedness measures and (ii) significantly improve the results over the query likelihood and the vector-space model for answer finding.
    Page 1, “Abstract”
  3. To do so, we compare translation probabilities with concept vector based semantic relatedness measures with respect to human relatedness rankings for reference word pairs.
    Page 2, “Introduction”
  4. Section 3 presents the monolingual parallel datasets we used for obtaining monolingual translation probabilities.
    Page 2, “Introduction”
  5. The main drawback lies in the availability of suitable training data for the translation probabilities.
    Page 2, “Related Work”
  6. The rationale behind translation-based retrieval models is that monolingual translation probabilities encode some form of semantic knowledge.
    Page 2, “Related Work”
  7. While classical measures of semantic relatedness have been extensively studied and compared, based on comparisons with human relatedness judgements or word-choice problems, there is no comparable intrinsic study of the relatedness measures obtained through word translation probabilities.
    Page 3, “Related Work”
  8. In this study, we use the correlation with human rankings for reference word pairs to investigate how word translation probabilities compare with traditional semantic relatedness measures.
    Page 3, “Related Work”
  9. To our knowledge, this is the first time that word-to-word translation probabilities are used for ranking word-pairs with respect to their semantic relatedness.
    Page 3, “Related Work”
  10. We used the GIZA++ SMT Toolkit (Och and Ney, 2003) in order to obtain word-to-word translation probabilities from the parallel datasets described above.
    Page 4, “Parallel Datasets”
  11. The first method consists in a linear combination of the word-to-word translation probabilities after training:
    Page 5, “Parallel Datasets”
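
Sentence 11 above mentions the first combination method: a linear combination of the word-to-word translation probabilities after training. A minimal sketch of that interpolation step, assuming each table is a Python dict mapping (target_word, source_word) pairs to probabilities and that the weights sum to one (the equal weights in the usage comment are only an example):

    from collections import defaultdict

    def interpolate_tables(tables, weights):
        # Combined probability: P(w | v) = sum_i weight_i * P_i(w | v).
        combined = defaultdict(float)
        for table, weight in zip(tables, weights):
            for pair, prob in table.items():
                combined[pair] += weight * prob
        return dict(combined)

    # Hypothetical usage with three tables trained on the WAQA, WAQ and gloss corpora:
    # combined = interpolate_tables([waqa_table, waq_table, gloss_table], [1/3, 1/3, 1/3])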

semantic relatedness

Appears in 13 sentences as: Semantic Relatedness (1) Semantic relatedness (1) semantic relatedness (12)
In Combining Lexical Semantic Resources with Question & Answer Archives for Translation-Based Answer Finding
  1. We also show that the monolingual translation probabilities obtained (i) are comparable to traditional semantic relatedness measures and (ii) significantly improve the results over the query likelihood and the vector-space model for answer finding.
    Page 1, “Abstract”
  2. To do so, we compare translation probabilities with concept vector based semantic relatedness measures with respect to human relatedness rankings for reference word pairs.
    Page 2, “Introduction”
  3. Section 2 discusses related work on semantic relatedness and statistical translation models for retrieval.
    Page 2, “Introduction”
  4. Semantic relatedness experiments are detailed in Section 4.
    Page 2, “Introduction”
  5. 2.2 Semantic Relatedness
    Page 2, “Related Work”
  6. While classical measures of semantic relatedness have been extensively studied and compared, based on comparisons with human relatedness judgements or word-choice problems, there is no comparable intrinsic study of the relatedness measures obtained through word translation probabilities.
    Page 3, “Related Work”
  7. In this study, we use the correlation with human rankings for reference word pairs to investigate how word translation probabilities compare with traditional semantic relatedness measures.
    Page 3, “Related Work”
  8. To our knowledge, this is the first time that word-to-word translation probabilities are used for ranking word-pairs with respect to their semantic relatedness.
    Page 3, “Related Work”
  9. the different kinds of data encode different types of information, including semantic relatedness and similarity, as well as morphological relatedness.
    Page 5, “Parallel Datasets”
  10. The aim of this first experiment is to perform an intrinsic evaluation of the word translation probabilities obtained by comparing them to traditional semantic relatedness measures on the task of ranking word pairs.
    Page 5, “Semantic Relatedness Experiments”
  11. Human judgements of semantic relatedness can be used to evaluate how well semantic relatedness measures reflect human rankings by correlating their ranking results with Spearman’s rank correlation coefficient.
    Page 5, “Semantic Relatedness Experiments”
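
Sentence 11 above describes evaluating relatedness measures by correlating their rankings with human judgements using Spearman's rank correlation coefficient. A small sketch of that evaluation loop, assuming SciPy is available and that relatedness is any scoring function (for instance a translation probability lookup or a concept-vector similarity):

    from scipy.stats import spearmanr

    def evaluate_against_humans(word_pairs, human_scores, relatedness):
        # word_pairs: list of (w1, w2) tuples, e.g. from the Fin1-153 or Fin2-200 datasets.
        # human_scores: human relatedness judgements aligned with word_pairs.
        # relatedness: callable returning a score for a word pair.
        model_scores = [relatedness(w1, w2) for w1, w2 in word_pairs]
        rho, p_value = spearmanr(human_scores, model_scores)
        return rho, p_value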

lexical semantic

Appears in 10 sentences as: Lexical Semantic (1) lexical semantic (9)
In Combining Lexical Semantic Resources with Question & Answer Archives for Translation-Based Answer Finding
  1. In this paper, we propose to use as a parallel training dataset the definitions and glosses provided for the same term by different lexical semantic resources.
    Page 1, “Abstract”
  2. We compare monolingual translation models built from lexical semantic resources with two other kinds of datasets: manually-tagged question reformulations and question-answer pairs.
    Page 1, “Abstract”
  3. We use the definitions and glosses provided for the same term by different lexical semantic resources to automatically train the translation models.
    Page 1, “Introduction”
  4. This approach has been very recently made possible by the emergence of new kinds of lexical semantic and encyclopedic resources such as Wikipedia and Wiktionary.
    Page 1, “Introduction”
  5. We therefore propose a new approach for building monolingual translation models relying on domain-independent lexical semantic resources.
    Page 2, “Related Work”
  6. Knowledge-based measures rely on lexical semantic resources such as WordNet and comprise path length based measures (Rada et al., 1989) and concept vector based measures (Qiu and Frei, 1993).
    Page 3, “Related Work”
  7. 3.2 Lexical Semantic Resources
    Page 3, “Parallel Datasets”
  8. Glosses and definitions for the same lexeme in different lexical semantic and encyclopedic resources can actually be considered as near-paraphrases, since they define the same terms and hence have the same meaning.
    Page 3, “Parallel Datasets”
  9. We have presented three datasets for training statistical word translation models for use in answer finding: question-answer pairs, manually-tagged question reformulations and glosses for the same term extracted from several lexical semantic resources.
    Page 8, “Conclusion and Future Work”
  10. question-answer pairs, and external knowledge, as contained in lexical semantic resources.
    Page 8, “Conclusion and Future Work”
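
Sentences 1, 8 and 9 above describe treating the glosses provided for the same lexeme by different resources as near-paraphrases and pairing them into a monolingual parallel corpus. A minimal sketch of that pairing step; the data structures and the decision to pool both directions follow the description above but are otherwise illustrative assumptions:

    from itertools import combinations

    def gloss_parallel_corpus(glosses_by_resource, seed_lexemes):
        # glosses_by_resource: dict resource_name -> {lexeme: gloss_text},
        # e.g. for WordNet, Wiktionary, Wikipedia and Simple Wikipedia.
        # seed_lexemes: lexemes for which glosses are collected.
        pairs = []
        for lexeme in seed_lexemes:
            glosses = [res[lexeme] for res in glosses_by_resource.values() if lexeme in res]
            # Every pair of glosses defining the same lexeme becomes a training pair,
            # pooled in both directions.
            for a, b in combinations(glosses, 2):
                pairs.append((a, b))
                pairs.append((b, a))
        return pairs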

word pairs

Appears in 10 sentences as: Word pairs (1) word pairs (10)
In Combining Lexical Semantic Resources with Question & Answer Archives for Translation-Based Answer Finding
  1. To do so, we compare translation probabilities with concept vector based semantic relatedness measures with respect to human relatedness rankings for reference word pairs.
    Page 2, “Introduction”
  2. In this study, we use the correlation with human rankings for reference word pairs to investigate how word translation probabilities compare with traditional semantic relatedness measures.
    Page 3, “Related Work”
  3. The aim of this first experiment is to perform an intrinsic evaluation of the word translation probabilities obtained by comparing them to traditional semantic relatedness measures on the task of ranking word pairs.
    Page 5, “Semantic Relatedness Experiments”
  4. This dataset comprises two subsets, which have been annotated by different annotators: Fin1-153, containing 153 word pairs, and Fin2-200, containing 200 word pairs.
    Page 5, “Semantic Relatedness Experiments”
  5. In order to ensure a fair evaluation, we limit the comparison to the word pairs which are contained in all resources and translation tables.
    Page 5, “Semantic Relatedness Experiments”
  6. Dataset: Fin1-153 / Fin2-200; word pairs used: 46 / 42 (concept vector results follow in the original table).
    Page 5, “Semantic Relatedness Experiments”
  7. due to the natural absence of many word pairs in the translation tables.
    Page 6, “Semantic Relatedness Experiments”
  8. It is therefore possible to measure relatedness for a far greater number of word pairs, as long as they share some concept vector dimensions.
    Page 6, “Semantic Relatedness Experiments”
  9. The second observation is that, on the restricted subset of word pairs considered, the results obtained by word-to-word translation probabilities are most of the time better than those of concept vector measures.
    Page 6, “Semantic Relatedness Experiments”
  10. We have also provided the first intrinsic evaluation of word translation probabilities with respect to human relatedness rankings for reference word pairs.
    Page 8, “Conclusion and Future Work”
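
Sentences 5 and 9 above mention restricting the evaluation to word pairs covered by all resources and translation tables and then ranking the pairs by their word-to-word translation probabilities. A sketch of that ranking step; the symmetric max-of-both-directions score is an assumption for illustration, not necessarily the paper's exact choice:

    def rank_word_pairs(word_pairs, trans_prob):
        # trans_prob: dict (target_word, source_word) -> P(target | source).
        # Drop pairs absent from the translation table, mirroring the restriction to
        # pairs contained in all resources and translation tables.
        covered = [(w1, w2) for w1, w2 in word_pairs
                   if (w1, w2) in trans_prob or (w2, w1) in trans_prob]

        def score(pair):
            w1, w2 = pair
            return max(trans_prob.get((w1, w2), 0.0), trans_prob.get((w2, w1), 0.0))

        return sorted(covered, key=score, reverse=True)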

WordNet

Appears in 10 sentences as: WordNet (10) Wordnet (1)
In Combining Lexical Semantic Resources with Question & Answer Archives for Translation-Based Answer Finding
  1. Murdock and Croft (2005) created a first parallel corpus of synonym pairs extracted from WordNet, and an additional parallel corpus of English words translating to the same Arabic term in a parallel English-Arabic corpus.
    Page 2, “Related Work”
  2. Knowledge-based measures rely on lexical semantic resources such as WordNet and comprise path length based measures (Rada et al., 1989) and concept vector based measures (Qiu and Frei, 1993).
    Page 3, “Related Work”
  3. In order to obtain parallel training data for the translation models, we collected three different datasets: manually-tagged question reformulations and question-answer pairs from the WikiAnswers social Q&A site (Section 3.1), and glosses from WordNet, Wiktionary, Wikipedia and Simple Wikipedia (Section 3.2).
    Page 3, “Parallel Datasets”
  4. • Wordnet (sense 1): the natural satellite of the Earth.
    Page 4, “Parallel Datasets”
  5. • WordNet (Fellbaum, 1998).
    Page 4, “Parallel Datasets”
  6. We use a freely available API for WordNet (JWNL) to access WordNet 3.0.
    Page 4, “Parallel Datasets”
  7. Given a list of 86,584 seed lexemes extracted from WordNet, we collected the glosses for each lexeme from the four English resources described above.
    Page 4, “Parallel Datasets”
  8. The method consists in representing words as a concept vector, where concepts correspond to WordNet synsets, Wikipedia article titles or Wiktionary entry names.
    Page 5, “Semantic Relatedness Experiments”
  9. glosses in WordNet, the full article or the first paragraph of the article in Wikipedia or the full contents of a Wiktionary entry.
    Page 5, “Semantic Relatedness Experiments”
  10. WordNet: .26 / .46; Wikipedia: .27 / .03; WikipediaFirst: .30 / .38; Wiktionary: .39 / .58 (translation probability results follow in the original table).
    Page 5, “Semantic Relatedness Experiments”
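
Sentences 8 and 9 above describe the concept vector baseline: a word is represented as a vector over concepts (WordNet synsets, Wikipedia article titles or Wiktionary entry names), using the associated gloss, article or entry as the concept's text, and relatedness is computed by comparing the vectors. A minimal sketch with a simple term-frequency weighting and cosine similarity; the weighting scheme is an assumption rather than the exact measure used in the paper:

    import math
    from collections import Counter

    def concept_vector(word, concept_texts):
        # concept_texts: dict concept_id -> associated text (gloss, article or entry).
        # The word is represented by how often it occurs in each concept's text.
        vector = Counter()
        for concept, text in concept_texts.items():
            freq = text.lower().split().count(word)
            if freq:
                vector[concept] = freq
        return vector

    def cosine(u, v):
        # Cosine similarity between two sparse concept vectors.
        shared = set(u) & set(v)
        dot = sum(u[c] * v[c] for c in shared)
        norm = math.sqrt(sum(x * x for x in u.values())) * math.sqrt(sum(x * x for x in v.values()))
        return dot / norm if norm else 0.0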

parallel corpus

Appears in 7 sentences as: parallel corpus (7) parallel corpus: (1)
In Combining Lexical Semantic Resources with Question & Answer Archives for Translation-Based Answer Finding
  1. Murdock and Croft (2005) created a first parallel corpus of synonym pairs extracted from WordNet, and an additional parallel corpus of English words translating to the same Arabic term in a parallel English-Arabic corpus.
    Page 2, “Related Work”
  2. Question-Answer Pairs (WAQA): In this setting, question-answer pairs are considered as a parallel corpus.
    Page 3, “Parallel Datasets”
  3. (2008) has shown that the best results are obtained by pooling the question-answer pairs {(q, a)_1, ..., (q, a)_n} and the answer-question pairs {(a, q)_1, ..., (a, q)_n} for training, so that we obtain the following parallel corpus: {(q, a)_1, ..., (q, a)_n} ∪ {(a, q)_1, ..., (a, q)_n}. Overall, this corpus contains 1,227,362 parallel pairs and will be referred to as WAQA (WikiAnswers Question-Answers) in the rest of the paper.
    Page 3, “Parallel Datasets”
  4. Question Reformulations (WAQ): In this setting, question and question reformulation pairs are considered as a parallel corpus, e.g.
    Page 3, “Parallel Datasets”
  5. For a given user question q_1, we retrieve its stored reformulations from the WikiAnswers dataset: q_{11}, q_{12}, ... The original question and reformulations are subsequently combined and pooled to obtain a parallel corpus of question reformulations.
    Page 3, “Parallel Datasets”
  6. We use glosses and definitions contained in the following resources to build a parallel corpus:
    Page 4, “Parallel Datasets”
  7. The final pooled parallel corpus contains 307,136 pairs and is hence much smaller than the previous datasets extracted from WikiAnswers.
    Page 4, “Parallel Datasets”
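
Sentences 2 and 3 above describe pooling question-answer and answer-question pairs into a single parallel corpus before training. The sketch below builds such a pooled corpus and writes it out as plain-text source/target files, one segment per line, which is a common starting point for word-alignment toolkits such as GIZA++ (which then applies its own preprocessing); the file names in the usage comment are placeholders.

    def pool_question_answer_pairs(qa_pairs):
        # Pool both directions: {(q, a)_1, ..., (q, a)_n} U {(a, q)_1, ..., (a, q)_n}.
        return [(q, a) for q, a in qa_pairs] + [(a, q) for q, a in qa_pairs]

    def write_parallel_files(pairs, source_path, target_path):
        # One segment per line; line i of the source file is parallel to line i of the target file.
        with open(source_path, "w", encoding="utf-8") as src, \
             open(target_path, "w", encoding="utf-8") as tgt:
            for source_text, target_text in pairs:
                src.write(" ".join(source_text.split()) + "\n")
                tgt.write(" ".join(target_text.split()) + "\n")

    # Hypothetical usage for the WAQA corpus:
    # write_parallel_files(pool_question_answer_pairs(qa_pairs), "waqa.source", "waqa.target")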

parallel corpora

Appears in 4 sentences as: parallel corpora (4)
In Combining Lexical Semantic Resources with Question & Answer Archives for Translation-Based Answer Finding
  1. Thanks to the combination of several resources, it is possible to obtain monolingual parallel corpora which are large enough to train domain-independent translation models.
    Page 1, “Introduction”
  2. These models attempt to address synonymy and polysemy problems by encoding statistical word associations trained on monolingual parallel corpora.
    Page 2, “Related Work”
  3. Table 1 gives some examples of word-to-word translations obtained for the different parallel corpora used (the column ALLp001 will be described in the next section).
    Page 4, “Parallel Datasets”
  4. concatenating the parallel corpora, before training.
    Page 5, “Parallel Datasets”
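
Sentence 4 above refers to the second combination strategy: concatenating the parallel corpora into a single training set before training, as opposed to interpolating the resulting translation tables afterwards. A trivial sketch of that step, with the corpus variable names in the usage comment as placeholders:

    def concatenate_corpora(*corpora):
        # Each corpus is a list of (source_text, target_text) pairs; the result is a
        # single pooled training set passed to translation model training.
        combined = []
        for corpus in corpora:
            combined.extend(corpus)
        return combined

    # Hypothetical usage: combined = concatenate_corpora(waqa_pairs, waq_pairs, gloss_pairs)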

significantly improve

Appears in 4 sentences as: significant improvement (1) significantly improve (3)
In Combining Lexical Semantic Resources with Question & Answer Archives for Translation-Based Answer Finding
  1. We also show that the monolingual translation probabilities obtained (i) are comparable to traditional semantic relatedness measures and (ii) significantly improve the results over the query likelihood and the vector-space model for answer finding.
    Page 1, “Abstract”
  2. This extrinsic evaluation shows that our translation models significantly improve the results over the query likelihood and the vector-space model.
    Page 2, “Introduction”
  3. All in all, translation models have been shown to significantly improve the retrieval results over traditional baselines for document retrieval (Berger and Lafferty, 1999), question retrieval in Question & Answer archives (Jeon et al., 2005; Lee et al., 2008; Xue et al., 2008) and for sentence retrieval (Murdock and Croft, 2005).
    Page 2, “Related Work”
  4. Moreover, models based on translation probabilities yield significant improvement over baseline approaches for answer finding, especially when different types of training data are combined.
    Page 8, “Conclusion and Future Work”
