Large Scale Acquisition of Paraphrases for Learning Surface Patterns
Bhagat, Rahul and Ravichandran, Deepak

Article Structure

Abstract

Paraphrases have proved to be useful in many applications, including Machine Translation, Question Answering, Summarization, and Information Retrieval.

Introduction

Paraphrases are textual expressions that convey the same meaning using different surface words.

Related Work

Most recent work in paraphrase acquisition is based on automatic acquisition.

Acquiring Paraphrases

This section describes our model for acquiring paraphrases from text.

Learning Surface Patterns

Let r be a target relation.

Experimental Methodology

In this section, we describe experiments to validate the main claims of the paper.

Experimental Results

In this section, we present the results of the experiments and analyze them.

Conclusion

Paraphrases are an important technique to handle variations in language.

Topics

relation extraction

Appears in 8 sentences as: Relation Extraction (3) relation extraction (5)
In Large Scale Acquisition of Paraphrases for Learning Surface Patterns
  1. We further show that we can use these paraphrases to generate surface patterns for relation extraction.
    Page 1, “Abstract”
  2. Claim 2: These paraphrases can then be used for generating high precision surface patterns for relation extraction.
    Page 2, “Introduction”
  3. Another task related to our work is relation extraction.
    Page 2, “Related Work”
  4. 5.3 Relation Extraction
    Page 5, “Experimental Methodology”
  5. Relation Extraction
    Page 6, “Experimental Results”
  6. Relation Extraction
    Page 6, “Experimental Results”
  7. Moving to the task of relation extraction, we see from table 5 that our system has a much lower relative recall compared to the baseline.
    Page 8, “Experimental Results”
  8. While we believe that more work needs to be done to improve the system recall (some of which we are investigating), this seems to be a good first step towards developing a minimally supervised, easy to implement, and scalable relation extraction system.
    Page 8, “Conclusion”
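
The snippets above describe the paper's pattern-based approach to relation extraction: a few seed surface patterns for a target relation are expanded with their paraphrases, and the resulting patterns are matched against text to extract relation instances. A minimal sketch of that matching step, where the patterns and the example sentences are invented for illustration and not taken from the paper:

```python
import re

# Hypothetical surface patterns for an "acquisition"-style relation;
# the named groups X and Y are slots for the relation's two arguments.
# "bought" stands in as a paraphrase of "acquired".
patterns = [
    r"(?P<X>\w+) acquired (?P<Y>\w+)",
    r"(?P<X>\w+) bought (?P<Y>\w+)",
]

def extract(sentence, patterns):
    """Return (X, Y) argument pairs matched by any surface pattern."""
    instances = []
    for pat in patterns:
        for m in re.finditer(pat, sentence):
            instances.append((m.group("X"), m.group("Y")))
    return instances

print(extract("Google acquired YouTube in 2006.", patterns))
# [('Google', 'YouTube')]
```

Because paraphrase patterns like "bought" are applied alongside the seed pattern, the same instance can be recovered from sentences that never use the seed wording, which is the source of the precision/recall trade-off discussed in snippet 7.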

distributional similarity

Appears in 5 sentences as: Distributional Similarity (1) distributional similarity (4)
  1. A popular method, the so-called distributional similarity, is based on the dictum of Zellig Harris “you shall know the words by the company they keep”: given highly discriminating left and right contexts, only words with very similar meaning will be found to fit in between them.
    Page 1, “Introduction”
  2. Our method however, pre-computes paraphrases for a large set of surface patterns using distributional similarity over a large corpus and then obtains patterns for a relation by simply finding paraphrases (offline) for a few seed patterns.
    Page 2, “Related Work”
  3. Using distributional similarity avoids the problem of obtaining overly general patterns and the pre-computation of paraphrases means that we can obtain the set of patterns for any relation instantaneously.
    Page 2, “Related Work”
  4. 3.1 Distributional Similarity
    Page 3, “Acquiring Paraphrases”
  5. We have shown that high precision surface paraphrases can be obtained by using distributional similarity on a large corpus.
    Page 8, “Conclusion”
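
The snippets above describe the paper's core idea: represent each phrase by the contexts it occurs in, and treat phrases with highly similar context vectors as paraphrases. A toy sketch of that distributional-similarity computation, with an invented mini-corpus (not the paper's data) and raw context counts as features:

```python
from collections import Counter
from math import sqrt

# Toy corpus occurrences as (left context, phrase, right context) triples.
# Phrases that share discriminating contexts should come out as paraphrases.
occurrences = [
    ("Google", "acquired", "YouTube"),
    ("Google", "bought", "YouTube"),
    ("Oracle", "acquired", "Sun"),
    ("Oracle", "bought", "Sun"),
    ("Paris", "is the capital of", "France"),
]

def context_vector(phrase):
    """Feature vector of a phrase: counts of its (left, right) context pairs."""
    return Counter((l, r) for l, p, r in occurrences if p == phrase)

def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u)
    norm = lambda w: sqrt(sum(c * c for c in w.values()))
    return dot / (norm(u) * norm(v)) if u and v else 0.0

# "acquired" and "bought" share all their contexts -> similarity 1.0;
# "is the capital of" shares none -> similarity 0.0.
print(cosine(context_vector("acquired"), context_vector("bought")))
print(cosine(context_vector("acquired"), context_vector("is the capital of")))
```

The paper scales this idea to a large corpus and pre-computes the paraphrase sets offline, so that patterns for any new relation can be obtained by looking up paraphrases of a few seed patterns.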

relation instances

Appears in 5 sentences as: relation instances (4) relation instantaneously (1)
  1. Using distributional similarity avoids the problem of obtaining overly general patterns and the pre-computation of paraphrases means that we can obtain the set of patterns for any relation instantaneously.
    Page 2, “Related Work”
  2. While procedurally different, both methods depend heavily on the performance of the syntax parser and require complex syntax tree matching to extract the relation instances.
    Page 2, “Related Work”
  3. We first describe paraphrase acquisition, we then summarize our method for learning surface patterns, and finally describe the use of patterns for extracting relation instances.
    Page 4, “Experimental Methodology”
  4. To compare the quality of the extraction patterns, and relation instances, we use the method presented by Ravichandran and Hovy (2002) as the baseline.
    Page 5, “Experimental Results”
  5. The intuition is that applying the vague patterns for extracting target relation instances might find some good instances, but will also find many bad ones.
    Page 6, “Experimental Results”

cosine similarity

Appears in 4 sentences as: cosine similarity (4)
  1. We use cosine similarity, which
    Page 3, “Acquiring Paraphrases”
  2. As described in Section 3.2, we find paraphrases of a phrase p_i by finding its nearest neighbors based on cosine similarity between the feature vector of p_i and other phrases.
    Page 3, “Acquiring Paraphrases”
  3. If n is the number of vectors and d is the dimensionality of the vector space, finding cosine similarity between each pair of vectors has time complexity O(n²d).
    Page 3, “Acquiring Paraphrases”
  4. It represents a d dimensional vector by a stream of b bits (b < d) and has the property of preserving the cosine similarity between vectors, which is exactly what we want.
    Page 4, “Acquiring Paraphrases”
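
Snippet 4 refers to locality-sensitive hashing with random hyperplanes, a scheme commonly attributed to Charikar (2002), which maps a d-dimensional vector to a b-bit signature whose Hamming distance estimates the angle between vectors. A minimal sketch; the sizes d = 100 and b = 256 and the test vectors are illustrative, not the paper's settings:

```python
import random
from math import cos, pi

random.seed(0)
d, b = 100, 256  # vector dimensionality and number of signature bits

# b random Gaussian hyperplanes; each contributes one bit of the signature.
planes = [[random.gauss(0, 1) for _ in range(d)] for _ in range(b)]

def signature(v):
    """b-bit LSH signature: the sign of v's projection on each hyperplane."""
    return [int(sum(p_i * v_i for p_i, v_i in zip(p, v)) >= 0) for p in planes]

def hamming(s1, s2):
    return sum(x != y for x, y in zip(s1, s2))

def approx_cosine(s1, s2):
    """Estimate cos(angle): the fraction of differing bits estimates angle/pi."""
    return cos(pi * hamming(s1, s2) / len(s1))

u = [random.gauss(0, 1) for _ in range(d)]
v = [0.9 * u_i + 0.1 * random.gauss(0, 1) for u_i in u]  # v is close to u
print(approx_cosine(signature(u), signature(v)))  # close to 1.0
```

Comparing b-bit signatures instead of d-dimensional vectors is what makes the all-pairs nearest-neighbor search over millions of phrases tractable, while (approximately) preserving exactly the cosine similarity the paraphrase model needs.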

gold standard

Appears in 3 sentences as: Gold Standard (1) gold standard (2)
  1. 6.3 Gold Standard
    Page 6, “Experimental Results”
  2. In this section, we describe the creation of gold standard for the different tasks.
    Page 6, “Experimental Results”
  3. We created the gold standard paraphrase test set by randomly selecting 50 phrases and their corresponding paraphrases from our collection of 2.5 million
    Page 6, “Experimental Results”

random sample

Appears in 3 sentences as: random sample (2) randomly sampled (1)
  1. We estimate the quality of paraphrases by annotating a random sample as correct/incorrect and calculating the accuracy.
    Page 6, “Experimental Results”
  2. We estimate the precision (P) of the extracted instances by annotating a random sample of instances as correct/incorrect.
    Page 6, “Experimental Results”
  3. We randomly sampled 50 instances of the “acquisition” and “birthplace” relations from the system and the baseline outputs.
    Page 6, “Experimental Results”
