Align, Disambiguate and Walk: A Unified Approach for Measuring Semantic Similarity
Mohammad Taher Pilehvar, David Jurgens, and Roberto Navigli

Article Structure

Abstract

Semantic similarity is an essential component of many Natural Language Processing applications.

Introduction

Semantic similarity is a core technique for many topics in Natural Language Processing such as Textual Entailment (Berant et al., 2012), Semantic Role Labeling (Fürstenau and Lapata, 2012), and Question Answering (Surdeanu et al., 2011).

A Unified Semantic Representation

We propose a representation of any lexical item as a distribution over a set of word senses, referred to as the item’s semantic signature.

Experiment 1: Textual Similarity

Measuring semantic similarity of textual items has applications in a wide variety of NLP tasks.

Experiment 2: Word Similarity

We now proceed from the sentence level to the word level.

Experiment 3: Sense Similarity

WordNet is known to be a fine-grained sense inventory with many related word senses (Palmer et al., 2007).

Related Work

Due to the wide applicability of semantic similarity, significant efforts have been made at different lexical levels.

Conclusions

This paper presents a unified approach for computing semantic similarity at multiple lexical levels, from word senses to texts.

Topics

WordNet

Appears in 26 sentences as: WordNet (29)
In Align, Disambiguate and Walk: A Unified Approach for Measuring Semantic Similarity
  1. As our sense inventory, we use WordNet 3.0 (Fellbaum, 1998).
    Page 2, “A Unified Semantic Representation”
  2. The WordNet ontology provides a rich network structure of semantic relatedness, connecting senses directly with their hypernyms, and providing information on semantically similar senses by virtue of their nearby locality in the network.
    Page 2, “A Unified Semantic Representation”
  3. To extend beyond a single sense, the random walk may be initialized and restarted from a set of senses (seed nodes), rather than just one; this multi-seed walk produces a multinomial distribution over all the senses in WordNet with higher probability assigned to senses that are frequently visited from the seeds.
    Page 2, “A Unified Semantic Representation”
  4. Prior work has demonstrated that multinomials generated from random walks over WordNet can be successfully applied to linguistic tasks such as word similarity (Hughes and Ramage, 2007).
    Page 2, “A Unified Semantic Representation”
  5. Formally, we define the semantic signature of a lexical item as the multinomial distribution generated from the random walks over WordNet 3.0 where the set of seed nodes is the set of senses present in the item (a runnable sketch of this walk follows the list below).
    Page 2, “A Unified Semantic Representation”
  6. Let M be the adjacency matrix for the WordNet network, where edges connect senses according to the relations defined in WordNet (e.g., hypernymy and meronymy).
    Page 2, “A Unified Semantic Representation”
  7. We follow Navigli (2009) and denote with w_p^i the i-th sense of w in WordNet with part of speech p.
    Page 3, “A Unified Semantic Representation”
  8. However, a semantic signature is, in essence, a weighted ranking of the importance of WordNet senses for each lexical item.
    Page 4, “A Unified Semantic Representation”
  9. Given that the WordNet graph has a nonuniform structure, and also given that different lexical items may be of different sizes, the magnitudes of the probabilities obtained may differ significantly between the two multinomial distributions.
    Page 4, “A Unified Semantic Representation”
  10. Additionally, because the texts often contain named entities which are not present in WordNet, we incorporated the similarity values produced by four string-based measures, which were used by other teams in the STS task: (1) longest common substring, which takes into account the length of the longest overlapping contiguous sequence of characters (substring) across two strings (Gusfield, 1997); (2) longest common subsequence, which, instead, finds the longest overlapping subsequence of two strings (Allison and Dix, 1986); (3) Greedy String Tiling, which allows reordering in strings (Wise, 1993); and (4) the character/word n-gram similarity proposed by Barrón-Cedeño et al.
    Page 5, “Experiment 1: Textual Similarity”
  11. • Explicit Semantic Analysis (Gabrilovich and Markovitch, 2007), where the high-dimensional vectors are obtained on WordNet, Wikipedia and Wiktionary.
    Page 6, “Experiment 1: Textual Similarity”
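
The excerpts above fully determine how a signature is built: seed the walk at the item's senses, then iterate topic-sensitive PageRank over the WordNet adjacency matrix M. Below is a minimal Python sketch under those definitions; the toy graph, restart probability, and iteration count are illustrative assumptions (the paper itself relies on the off-the-shelf UKB implementation, per the PageRank excerpts further down).

```python
import numpy as np

def semantic_signature(M, seeds, alpha=0.15, iters=50):
    """Multi-seed topic-sensitive PageRank over a sense graph.

    M     : column-stochastic adjacency matrix of the sense network
    seeds : indices of the senses present in the lexical item
    alpha : restart probability (illustrative value, not the paper's)
    Returns a multinomial distribution over all senses.
    """
    n = M.shape[0]
    v = np.zeros(n)
    v[list(seeds)] = 1.0 / len(seeds)   # restart mass split over the seeds
    p = v.copy()
    for _ in range(iters):
        p = (1.0 - alpha) * (M @ p) + alpha * v
    return p

# Toy 4-sense network; columns are normalised so M is column-stochastic.
A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
M = A / A.sum(axis=0)
print(semantic_signature(M, seeds=[0, 1]))  # more mass near the seeds
```

With a single seed the same routine yields the signature of one sense, which is what the sense-level experiment compares.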

semantic similarity

Appears in 20 sentences as: semantic similarities (2) Semantic similarity (2) semantic similarity (14) semantically similar (2)
In Align, Disambiguate and Walk: A Unified Approach for Measuring Semantic Similarity
  1. Semantic similarity is an essential component of many Natural Language Processing applications.
    Page 1, “Abstract”
  2. However, prior methods for computing semantic similarity often operate at different levels, e.g., single words or entire documents, which requires adapting the method for each data type.
    Page 1, “Abstract”
  3. We present a unified approach to semantic similarity that operates at multiple levels, all the way from comparing word senses to comparing text documents.
    Page 1, “Abstract”
  4. Semantic similarity is a core technique for many topics in Natural Language Processing such as Textual Entailment (Berant et al., 2012), Semantic Role Labeling (Fürstenau and Lapata, 2012), and Question Answering (Surdeanu et al., 2011).
    Page 1, “Introduction”
  5. Approaches to semantic similarity have often operated at separate levels: methods for word similarity are rarely applied to documents or even single sentences (Budanitsky and Hirst, 2006; Radinsky et al., 2011; Halawi et al., 2012), while document-based similarity methods require more
    Page 1, “Introduction”
  6. Despite the potential advantages, few approaches to semantic similarity operate at the sense level due to the challenge in sense-tagging text (Navigli, 2009); for example, none of the top four systems in the recent SemEval-2012 task on textual similarity compared semantic representations that incorporated sense information (Agirre et al., 2012).
    Page 1, “Introduction”
  7. We propose a unified approach to semantic similarity across multiple representation levels from senses to documents, which offers two significant advantages.
    Page 1, “Introduction”
  8. Second, by operating at the sense level, a unified approach is able to identify the semantic similarities that exist independently of the text’s lexical forms and any semantic ambiguity therein.
    Page 1, “Introduction”
  9. The WordNet ontology provides a rich network structure of semantic relatedness, connecting senses directly with their hypernyms, and providing information on semantically similar senses by virtue of their nearby locality in the network.
    Page 2, “A Unified Semantic Representation”
  10. Measuring semantic similarity of textual items has applications in a wide variety of NLP tasks.
    Page 4, “Experiment 1: Textual Similarity”
  11. As our benchmark, we selected the recent SemEval-2012 task on Semantic Textual Similarity (STS), which was concerned with measuring the semantic similarity of sentence pairs.
    Page 4, “Experiment 1: Textual Similarity”

similarity measures

Appears in 13 sentences as: Similarity Measure (1) Similarity measure (1) similarity measure (4) similarity measurement (1) similarity measures (6)
In Align, Disambiguate and Walk: A Unified Approach for Measuring Semantic Similarity
  1. In order to compare semantic signatures, we adopt the cosine similarity measure as a baseline method (a minimal sketch follows this list).
    Page 3, “A Unified Semantic Representation”
  2. The top-ranking participating systems in the SemEval-2012 task were generally supervised systems utilizing a variety of lexical resources and similarity measurement techniques.
    Page 4, “Experiment 1: Textual Similarity”
  3. 3.3 Similarity Measure Analysis
    Page 5, “Experiment 1: Textual Similarity”
  4. In addition, we present in the table correlation scores for four other similarity measures reported by Bär et al.
    Page 6, “Experiment 1: Textual Similarity”
  5. • Pairwise Word Similarity, which comprises a set of WordNet-based similarity measures proposed by Resnik (1995), Jiang and Conrath (1997), and Lin (1998b).
    Page 6, “Experiment 1: Textual Similarity”
  6. The aggregation strategy proposed by Corley and Mihalcea (2005) has been utilized to extend these word-to-word similarity measures to text-to-text similarities.
    Page 6, “Experiment 1: Textual Similarity”
  7. Similarity measure
    Page 6, “Experiment 1: Textual Similarity”
  8. Table 3: Performance of our main-feature system with conventional WSD (DW) and with the alignment-based disambiguation approach (ADW-MF) vs. four other similarity measures, using 10-fold cross-validation on the training datasets MSRpar (Mpar), MSRVid (Mvid), and SMTeuroparl (SMTe).
    Page 6, “Experiment 1: Textual Similarity”
  9. Except for the MSRpar (Mpar) dataset, our system (ADW-MF) outperforms all other similarity measures.
    Page 6, “Experiment 1: Textual Similarity”
  10. Different evaluation methods exist in the literature for evaluating the performance of a word-level semantic similarity measure; we adopted two well-established benchmarks: synonym recognition and correlating word similarity judgments with those from human annotators.
    Page 6, “Experiment 2: Word Similarity”
  11. We adopt this task as a way of evaluating our similarity measure at the sense level.
    Page 7, “Experiment 3: Sense Similarity”
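
Excerpt 1 names cosine similarity as the baseline for comparing two signatures. A minimal sketch with toy vectors (excerpt 8 of the WordNet topic notes that signatures can also be treated as weighted rankings, which this baseline ignores):

```python
import numpy as np

def cosine_sim(p, q):
    """Cosine similarity between two semantic signatures."""
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    return float(p @ q / (np.linalg.norm(p) * np.linalg.norm(q)))

sig_a = np.array([0.50, 0.30, 0.15, 0.05])  # toy signatures over 4 senses
sig_b = np.array([0.45, 0.35, 0.10, 0.10])
print(round(cosine_sim(sig_a, sig_b), 3))
```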

word senses

Appears in 12 sentences as: word sense (4) word senses (8)
In Align, Disambiguate and Walk: A Unified Approach for Measuring Semantic Similarity
  1. We present a unified approach to semantic similarity that operates at multiple levels, all the way from comparing word senses to comparing text documents.
    Page 1, “Abstract”
  2. Our method leverages a common probabilistic representation over word senses in order to compare different types of linguistic data.
    Page 1, “Abstract”
  3. This unified representation shows state-of-the-art performance on three tasks: semantic textual similarity, word similarity, and word sense coarsening.
    Page 1, “Abstract”
  4. Second, we propose a novel alignment-based method for word sense disambiguation.
    Page 1, “Introduction”
  5. We propose a representation of any lexical item as a distribution over a set of word senses, referred to as the item’s semantic signature.
    Page 2, “A Unified Semantic Representation”
  6. However, traditional forms of word sense disambiguation are difficult for short texts and single words because little or no contextual information is present to perform the disambiguation task.
    Page 2, “A Unified Semantic Representation”
  7. In addition, the system utilizes techniques such as Explicit Semantic Analysis (Gabrilovich and Markovitch, 2007) and makes use of resources such as Wiktionary and Wikipedia, a lexical substitution system based on supervised word sense disambiguation (Biemann, 2013), and a statistical machine translation system.
    Page 4, “Experiment 1: Textual Similarity”
  8. that of Rapp (2003) uses word senses, an approach that is outperformed by our method.
    Page 7, “Experiment 2: Word Similarity”
  9. WordNet is known to be a fine-grained sense inventory with many related word senses (Palmer et al., 2007).
    Page 7, “Experiment 3: Sense Similarity”
  10. word senses (Agirre and Lopez, 2003; McCarthy, 2006).
    Page 8, “Experiment 3: Sense Similarity”
  11. We benchmark the accuracy of our similarity measure in grouping word senses against those of Navigli (2006) and Snow et al. (2007).
    Page 8, “Experiment 3: Sense Similarity”

sense disambiguation

Appears in 9 sentences as: sense disambiguated (1) Sense Disambiguation (1) sense disambiguation (7)
In Align, Disambiguate and Walk: A Unified Approach for Measuring Semantic Similarity
  1. However, traditional forms of word sense disambiguation are difficult for short texts and single words because little or no contextual information is present to perform the disambiguation task.
    Page 2, “A Unified Semantic Representation”
  2. alignment-based sense disambiguation that leverages the content of the paired item in order to disambiguate each element.
    Page 3, “A Unified Semantic Representation”
  3. Leveraging the paired item enables our approach to disambiguate where traditional sense disambiguation methods cannot due to insufficient context.
    Page 3, “A Unified Semantic Representation”
  4. We view sense disambiguation as an alignment problem.
    Page 3, “A Unified Semantic Representation”
  5. Algorithm 1 formalizes the alignment process, which produces a sense-disambiguated representation as a result.
    Page 3, “A Unified Semantic Representation”
  6. Algorithm 1 Alignment-based Sense Disambiguation
    Page 3, “A Unified Semantic Representation”
  7. In addition, the system utilizes techniques such as Explicit Semantic Analysis (Gabrilovich and Markovitch, 2007) and makes use of resources such as Wiktionary and Wikipedia, a lexical substitution system based on supervised word sense disambiguation (Biemann, 2013), and a statistical machine translation system.
    Page 4, “Experiment 1: Textual Similarity”
  8. Our alignment-based sense disambiguation transforms the task of comparing individual words into that of calculating the similarity of the best-matching sense pair across the two words (a simplified sketch follows this list).
    Page 7, “Experiment 2: Word Similarity”
  9. However, unlike our approach, their method does not perform sense disambiguation prior to building the representation and therefore potentially suffers from ambiguity.
    Page 9, “Related Work”
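
Excerpts 4 and 8 cast disambiguation as an alignment: each word keeps the sense that best matches some sense of the paired item. The sketch below is a simplified reading of that idea, not a transcription of the paper's Algorithm 1; senses() and sim() are assumed helpers (e.g., a WordNet lookup and cosine similarity over single-seed signatures).

```python
def align_disambiguate(item1, item2, senses, sim):
    """For each word of item1, keep the sense that best matches any
    sense of any word in the paired item2.

    senses(w)   : candidate senses of word w     (assumed helper)
    sim(s1, s2) : similarity between two senses  (assumed helper)
    """
    chosen = {}
    for w in item1:
        candidates = [
            (sim(s1, s2), s1)
            for s1 in senses(w)
            for w2 in item2
            for s2 in senses(w2)
        ]
        # Fall back to None when the paired item offers no context.
        chosen[w] = max(candidates, key=lambda t: t[0])[1] if candidates else None
    return chosen
```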

human judgments

Appears in 6 sentences as: human judges (1) human judgments (5)
In Align, Disambiguate and Walk: A Unified Approach for Measuring Semantic Similarity
  1. Third, we demonstrate that this single representation can achieve state-of-the-art performance on three similarity tasks, each operating at a different lexical level: (1) surpassing the highest scores on the SemEval-2012 task on textual similarity (Agirre et al., 2012) that compares sentences, (2) achieving a near-perfect performance on the TOEFL synonym selection task proposed by Landauer and Dumais (1997), which measures word pair similarity, and also obtaining state-of-the-art performance in terms of the correlation with human judgments on the RG-65 dataset (Rubenstein and Goodenough, 1965), and finally (3) surpassing the performance of Snow et al.
    Page 2, “Introduction”
  2. Each sentence pair in the datasets was given a score from 0 to 5 (low to high similarity) by human judges , with a high inter-annotator agreement of around 0.90 when measured using the Pearson correlation coefficient.
    Page 4, “Experiment 1: Textual Similarity”
  3. Three evaluation metrics are provided by the organizers of the SemEval-2012 STS task, all of which are based on the Pearson correlation r of human judgments with system outputs: (1) the correlation value for the concatenation of all five datasets (ALL), (2) a correlation value obtained on a concatenation of the outputs, separately normalized by least squares (ALLnrm), and (3) the weighted average of Pearson correlations across datasets (Mean).
    Page 5, “Experiment 1: Textual Similarity”
  4. MSRpar (MPar) is the only dataset in which TLsim (Šarić et al., 2012) achieves a higher correlation with human judgments.
    Page 5, “Experiment 1: Textual Similarity”
  5. Table 6 shows the Spearman’s ρ rank correlation coefficients with human judgments on the RG-65 dataset (both correlation statistics are sketched after this list).
    Page 7, “Experiment 2: Word Similarity”
  6. Table 6: Spearman’s ρ correlation coefficients with human judgments on the RG-65 dataset.
    Page 8, “Experiment 3: Sense Similarity”
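
Two standard statistics recur in these excerpts: Pearson's r for the STS sentence datasets and Spearman's ρ for the RG-65 word dataset. A minimal sketch with toy scores, using scipy rather than the tasks' official scorers:

```python
from scipy.stats import pearsonr, spearmanr

gold   = [4.0, 2.5, 0.5, 3.5, 1.0]   # toy human judgments
system = [3.8, 2.9, 0.2, 3.1, 1.5]   # toy system scores

r, _   = pearsonr(gold, system)      # Pearson r: STS sentence datasets
rho, _ = spearmanr(gold, system)     # Spearman rho: RG-65 word dataset
print(f"r = {r:.3f}, rho = {rho:.3f}")
```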

named entities

Appears in 4 sentences as: named entities (3) Named entity (1) named entity (1)
In Align, Disambiguate and Walk: A Unified Approach for Measuring Semantic Similarity
  1. The TLsyn system also uses Google Book Ngrams, as well as dependency parsing and named entity recognition.
    Page 4, “Experiment 1: Textual Similarity”
  2. Additionally, because the texts often contain named entities which are not present in WordNet, we incorporated the similarity values produced by four string-based measures, which were used by other teams in the STS task: (1) longest common substring, which takes into account the length of the longest overlapping contiguous sequence of characters (substring) across two strings (Gusfield, 1997); (2) longest common subsequence, which, instead, finds the longest overlapping subsequence of two strings (Allison and Dix, 1986); (3) Greedy String Tiling, which allows reordering in strings (Wise, 1993); and (4) the character/word n-gram similarity proposed by Barrón-Cedeño et al. (the first of these measures is sketched after this list).
    Page 5, “Experiment 1: Textual Similarity”
  3. Named entity features used by the TLsim system could be the reason for its better performance on the MSRpar dataset, which contains a large number of named entities.
    Page 5, “Experiment 1: Textual Similarity”
  4. Specifically, we plan to investigate higher coverage inventories such as BabelNet (Navigli and Ponzetto, 2012a), which will handle texts with named entities and rare senses that are not in WordNet, and will also enable cross-lingual semantic similarity.
    Page 9, “Conclusions”
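
Of the four string-based fallback measures in excerpt 2, the first is the simplest to make concrete. Below is the classic dynamic programme for longest common substring; turning the raw length into a similarity is done here by length normalisation, which is an assumption since the excerpt does not specify that step.

```python
def longest_common_substring(a: str, b: str) -> int:
    """Length of the longest contiguous substring shared by a and b
    (the standard dynamic programme; cf. Gusfield, 1997)."""
    best = 0
    prev = [0] * (len(b) + 1)
    for ch_a in a:
        cur = [0] * (len(b) + 1)
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                cur[j] = prev[j - 1] + 1
                best = max(best, cur[j])
        prev = cur
    return best

def lcs_substring_sim(a: str, b: str) -> float:
    """Normalised to [0, 1]; the normalisation scheme is an assumption."""
    if not a or not b:
        return 0.0
    return longest_common_substring(a, b) / max(len(a), len(b))

print(lcs_substring_sim("semantic textual similarity", "semantic similarity"))
```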

PageRank

Appears in 4 sentences as: PageRank (4)
In Align, Disambiguate and Walk: A Unified Approach for Measuring Semantic Similarity
  1. To construct each semantic signature, we use the iterative method for calculating topic-sensitive PageRank (Haveliwala, 2002).
    Page 2, “A Unified Semantic Representation”
  2. The PageRank may then be computed using an iterative update (reproduced in its standard form after this list).
    Page 2, “A Unified Semantic Representation”
  3. For our semantic signatures we used the UKB off-the-shelf implementation of topic-sensitive PageRank.
    Page 2, “A Unified Semantic Representation”
  4. As our WSD system, we used UKB, a state-of-the-art knowledge-based WSD system that is based on the same topic-sensitive PageRank algorithm used by our approach.
    Page 6, “Experiment 1: Textual Similarity”
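
The equation referenced in excerpt 2 was lost in extraction. The standard topic-sensitive PageRank update of Haveliwala (2002), on which the excerpts say the paper builds, is reproduced below; the symbols here are assumed, and the paper's own notation may differ.

```latex
\mathbf{p}^{(t)} = (1-\alpha)\, M\, \mathbf{p}^{(t-1)} + \alpha\, \mathbf{v},
\qquad
v_i =
\begin{cases}
  1/|S| & \text{if sense } i \text{ belongs to the seed set } S,\\
  0     & \text{otherwise,}
\end{cases}
```

where M is the adjacency matrix of the WordNet network, α is the restart probability, and the fixed point of the iteration is the semantic signature.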

sentence pairs

Appears in 4 sentences as: sentence pair (1) sentence pairs (3)
In Align, Disambiguate and Walk: A Unified Approach for Measuring Semantic Similarity
  1. Commonly, semantic comparisons are between word pairs or sentence pairs that do not have their lexical content sense-annotated, despite the potential utility of sense annotation in making semantic comparisons.
    Page 2, “A Unified Semantic Representation”
  2. As our benchmark, we selected the recent SemEval-2012 task on Semantic Textual Similarity (STS), which was concerned with measuring the semantic similarity of sentence pairs.
    Page 4, “Experiment 1: Textual Similarity”
  3. Each sentence pair in the datasets was given a score from 0 to 5 (low to high similarity) by human judges, with a high inter-annotator agreement of around 0.90 when measured using the Pearson correlation coefficient.
    Page 4, “Experiment 1: Textual Similarity”
  4. Table 1 lists the number of sentence pairs in training and test portions of each dataset.
    Page 4, “Experiment 1: Textual Similarity”

binary classification

Appears in 3 sentences as: binary classification (3)
In Align, Disambiguate and Walk: A Unified Approach for Measuring Semantic Similarity
  1. Snow et al. (2007) considered sense grouping as a binary classification task whereby for each word every possible pairing of senses has to be classified
    Page 8, “Experiment 3: Sense Similarity”
  2. We constructed a simple threshold-based classifier to perform the same binary classification .
    Page 8, “Experiment 3: Sense Similarity”
  3. For a binary classification task, we can directly calculate precision, recall and F-score by constructing a contingency table (see the sketch after this list).
    Page 8, “Experiment 3: Sense Similarity”
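
Excerpts 2 and 3 describe a threshold-based classifier scored by precision, recall, and F-score from a contingency table. A self-contained sketch; the threshold and toy data are illustrative:

```python
def prf_at_threshold(scores, gold, tau):
    """Precision, recall and F-score for a threshold-based classifier
    that merges a sense pair whenever its similarity >= tau.

    scores : similarity score per sense pair
    gold   : True where annotators merged the pair
    """
    pred = [s >= tau for s in scores]
    tp = sum(p and g for p, g in zip(pred, gold))
    fp = sum(p and not g for p, g in zip(pred, gold))
    fn = sum(g and not p for p, g in zip(pred, gold))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f_score = (2 * precision * recall / (precision + recall)
               if precision + recall else 0.0)
    return precision, recall, f_score

print(prf_at_threshold([0.9, 0.7, 0.4, 0.2], [True, True, False, True], 0.5))
```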

evaluation metrics

Appears in 3 sentences as: evaluation metrics (3)
In Align, Disambiguate and Walk: A Unified Approach for Measuring Semantic Similarity
  1. Three evaluation metrics are provided by the organizers of the SemEval-2012 STS task, all of which are based on the Pearson correlation r of human judgments with system outputs: (1) the correlation value for the concatenation of all five datasets (ALL), (2) a correlation value obtained on a concatenation of the outputs, separately normalized by least squares (ALLnrm), and (3) the weighted average of Pearson correlations across datasets (Mean); all three are sketched after this list.
    Page 5, “Experiment 1: Textual Similarity”
  2. Table 2 shows the scores obtained by ADW for the three evaluation metrics, as well as the Pearson correlation values obtained on each of the five test sets (rightmost columns).
    Page 5, “Experiment 1: Textual Similarity”
  3. As can be seen from Table 2, our system (ADW) outperforms all the 88 participating systems according to all the evaluation metrics.
    Page 5, “Experiment 1: Textual Similarity”
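
A sketch of the three metrics from excerpt 1. ALL and Mean follow directly from the description; the excerpt only says ALLnrm is "separately normalized by least squares", which is interpreted here as a per-dataset linear fit of system scores onto the gold scale, so treat that step as an assumption.

```python
import numpy as np
from scipy.stats import pearsonr

def sts_metrics(datasets):
    """datasets : list of (gold, system) score arrays, one per dataset."""
    gold_all = np.concatenate([g for g, _ in datasets])
    sys_all = np.concatenate([s for _, s in datasets])
    ALL = pearsonr(gold_all, sys_all)[0]

    normed = []
    for g, s in datasets:
        a, b = np.polyfit(s, g, 1)            # least-squares fit (assumed form)
        normed.append(a * np.asarray(s) + b)
    ALLnrm = pearsonr(gold_all, np.concatenate(normed))[0]

    rs = [pearsonr(g, s)[0] for g, s in datasets]
    ns = [len(g) for g, _ in datasets]
    Mean = float(np.average(rs, weights=ns))  # weighted by dataset size
    return ALL, ALLnrm, Mean

d1 = (np.array([4.0, 2.0, 1.0]), np.array([3.5, 2.4, 0.8]))
d2 = (np.array([5.0, 0.0, 3.0]), np.array([4.1, 0.7, 2.6]))
print(sts_metrics([d1, d2]))
```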

F-score

Appears in 3 sentences as: F-score (3)
In Align, Disambiguate and Walk: A Unified Approach for Measuring Semantic Similarity
  1. Table 7: F-score sense merging evaluation on three hand-labeled datasets: OntoNotes (Onto), Senseval-2 (SE-2), and combined (Onto+SE-2).
    Page 8, “Experiment 3: Sense Similarity”
  2. For a binary classification task, we can directly calculate precision, recall and F-score by constructing a contingency table.
    Page 8, “Experiment 3: Sense Similarity”
  3. In addition, we show in Table 7 the F-score results provided by Snow et al. (2007).
    Page 8, “Experiment 3: Sense Similarity”

n-grams

Appears in 3 sentences as: n-grams (3)
In Align, Disambiguate and Walk: A Unified Approach for Measuring Semantic Similarity
  1. • Character n-grams, which were also used as one of our additional features (sketched after this list).
    Page 6, “Experiment 1: Textual Similarity”
  2. Another interesting point is the high scores achieved by the Character n-grams
    Page 6, “Experiment 1: Textual Similarity”
  3. Dataset                     Mpar   Mvid   SMTe
     DW                          0.448  0.820  0.660
     ADW-MF                      0.485  0.842  0.721
     Explicit Semantic Analysis  0.427  0.781  0.619
     Pairwise Word Similarity    0.564  0.835  0.527
     Distributional Thesaurus    0.494  0.481  0.365
     Character n-grams           0.658  0.771  0.554
    Page 6, “Experiment 1: Textual Similarity”
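
The character n-gram feature from these excerpts can be realised in a few lines. Jaccard overlap of character n-gram sets is used here as one simple instantiation; the exact formulation of Barrón-Cedeño et al. that the system adopts may differ.

```python
def char_ngrams(text: str, n: int = 3) -> set:
    """Set of character n-grams of a string."""
    return {text[i:i + n] for i in range(len(text) - n + 1)}

def ngram_jaccard(a: str, b: str, n: int = 3) -> float:
    """Jaccard overlap of character n-gram sets (one simple variant)."""
    A, B = char_ngrams(a, n), char_ngrams(b, n)
    return len(A & B) / len(A | B) if A | B else 0.0

print(round(ngram_jaccard("a man is playing a guitar",
                          "a man plays the guitar"), 3))
```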

semantic representation

Appears in 3 sentences as: semantic representation (2) semantic representations (1)
In Align, Disambiguate and Walk: A Unified Approach for Measuring Semantic Similarity
  1. Despite the potential advantages, few approaches to semantic similarity operate at the sense level due to the challenge in sense-tagging text (Navigli, 2009); for example, none of the top four systems in the recent SemEval-2012 task on textual similarity compared semantic representations that incorporated sense information (Agirre et al., 2012).
    Page 1, “Introduction”
  2. Ramage et al. (2009) used a similar semantic representation of short texts from random walks on WordNet, which was applied to paraphrase recognition and textual entailment.
    Page 9, “Related Work”
  3. We demonstrate that our semantic representation achieves state-of-the-art performance in three experiments using semantic similarity at different lexical levels (i.e., sense, word, and text), surpassing the performance of previous similarity measures that are often specifically targeted for each level.
    Page 9, “Conclusions”

word pairs

Appears in 3 sentences as: word pair (1) word pairs (2)
In Align, Disambiguate and Walk: A Unified Approach for Measuring Semantic Similarity
  1. Third, we demonstrate that this single representation can achieve state-of-the-art performance on three similarity tasks, each operating at a different lexical level: (1) surpassing the highest scores on the SemEval-2012 task on textual similarity (Agirre et al., 2012) that compares sentences, (2) achieving a near-perfect performance on the TOEFL synonym selection task proposed by Landauer and Dumais (1997), which measures word pair similarity, and also obtaining state-of-the-art performance in terms of the correlation with human judgments on the RG-65 dataset (Rubenstein and Goodenough, 1965), and finally (3) surpassing the performance of Snow et al.
    Page 2, “Introduction”
  2. Commonly, semantic comparisons are between word pairs or sentence pairs that do not have their lexical content sense-annotated, despite the potential utility of sense annotation in making semantic comparisons.
    Page 2, “A Unified Semantic Representation”
  3. The dataset contains 65 word pairs judged by 51 human subjects on a scale of 0 to 4 according to their semantic similarity.
    Page 7, “Experiment 2: Word Similarity”
