Abstract | We evaluate our proposed method on two end-to-end SMT tasks (phrase table pruning and decoding with phrasal semantic similarities) which need to measure semantic similarity between a source phrase and its translation candidates. |
Experiments | With the semantic phrase embeddings and the vector space transformation function, we apply the BRAE to measure the semantic similarity between a source phrase and its translation candidates in the phrase-based SMT. |
Experiments | Two tasks are involved in the experiments: phrase table pruning, which discards entries whose semantic similarity is very low, and decoding with phrasal semantic similarities as additional features.
Experiments | To avoid the situation in which all the translation candidates for a source phrase are pruned, we always keep the 10 best according to the semantic similarity.
Introduction | With the learned model, we can accurately measure the semantic similarity between a source phrase and a translation candidate. |
Introduction | Accordingly, we evaluate the BRAE model on two end-to-end SMT tasks (phrase table pruning and decoding with phrasal semantic similarities) which need to check whether a translation candidate and the source phrase have the same meaning.
Introduction | In phrase table pruning, we discard the phrasal translation rules with low semantic similarity.
Abstract | We adopt three cohesion measures: clue words, semantic similarity and cosine similarity as the weight of the edges. |
Empirical Evaluation | In Section 3.3, we developed three ways to compute the weight of an edge in the sentence quotation graph, i.e., clue words, semantic similarity based on WordNet and cosine similarity. |
Empirical Evaluation | Table 1 shows the aggregated pyramid precision over all five summary lengths for CWS, CWS-Cosine, and the two semantic similarity variants, CWS-lesk and CWS-jcn.
Extracting Conversations from Multiple Emails | 3.3.2 Semantic Similarity Based on WordNet |
Extracting Conversations from Multiple Emails | Based on this observation, we propose to use semantic similarity to measure the cohesion between two sentences. |
Extracting Conversations from Multiple Emails | We use the well-known lexical database WordNet to get the semantic similarity of two words. |
Introduction | (Carenini et al., 2007), semantic similarity and cosine similarity. |
Related Work | Second, we only adopted one cohesion measure (clue words that are based on stemming), and did not consider more sophisticated ones such as semantically similar words. |
Summarization Based on the Sentence Quotation Graph | In the rest of this paper, let CWS denote the Generalized ClueWordSummarizer when the edge weight is based on clue words, and let CWS-Cosine and CWS-Semantic denote the summarizer when the edge weight is cosine similarity and semantic similarity respectively. |
Abstract | As a consequence, improving such a thesaurus is an important issue that is mainly tackled indirectly through the improvement of semantic similarity measures.
Introduction | The distinction between these two interpretations reflects the distinction between the notions of semantic similarity and semantic relatedness, as drawn in (Budanitsky and Hirst, 2006) or in (Zesch and Gurevych, 2010), for instance.
Introduction | However, the boundary between these two notions is sometimes hard to find in existing work, as the terms semantic similarity and semantic relatedness are often used interchangeably.
Introduction | Moreover, semantic similarity is frequently considered as included into semantic relatedness and the two problems are often tackled by using the same methods. |
Principles | of its bad semantic neighbors, that is to say the neighbors of the entry that are actually not semantically similar to the entry. |
Principles | As a consequence, two words are considered as semantically similar if they occur in a large enough set of shared contexts. |
Principles | in a sentence, from all other words and more particularly, from those of its neighbors in a distributional thesaurus that are likely to be actually not semantically similar to it. |
Applications | Turney and Littman (2003) proposed a method in which the SO of a word is calculated based on its semantic similarity with seven positive words minus its similarity with seven negative words as shown in Figure 5. |
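The Turney and Littman scheme above can be sketched as follows. This is a toy illustration: the seed lists are truncated and the context vectors are invented, and a stub cosine stands in for the PMI-based similarity the original work computed over web hits.

```python
# Sketch of Turney & Littman-style semantic orientation (SO):
# SO(w) = sum of similarity to positive seeds - sum of similarity to negative seeds.
# The similarity function here is a toy cosine over hand-made context vectors;
# the original used a PMI-based measure and seven seeds per polarity.

import math

POS_SEEDS = ["good", "excellent", "superior"]   # subset, for illustration
NEG_SEEDS = ["bad", "poor", "inferior"]

# Hypothetical context-count vectors (word -> {context word: count}).
VECTORS = {
    "good":      {"great": 4, "happy": 3, "quality": 2},
    "excellent": {"great": 5, "quality": 3},
    "superior":  {"quality": 4, "great": 1},
    "bad":       {"awful": 4, "sad": 3},
    "poor":      {"awful": 2, "sad": 2, "quality": 1},
    "inferior":  {"awful": 1, "quality": 1},
    "wonderful": {"great": 4, "happy": 2},
}

def cosine(u, v):
    dot = sum(u.get(k, 0) * v.get(k, 0) for k in u)
    nu = math.sqrt(sum(x * x for x in u.values()))
    nv = math.sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def semantic_orientation(word):
    w = VECTORS[word]
    return (sum(cosine(w, VECTORS[p]) for p in POS_SEEDS)
            - sum(cosine(w, VECTORS[n]) for n in NEG_SEEDS))

print(semantic_orientation("wonderful"))  # positive-leaning word -> SO > 0
```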
Evaluation and Results | PMI extracts the semantic similarity between words using their co-occurrences.
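A minimal sketch of PMI as a co-occurrence-based similarity signal, computed over an invented four-sentence corpus with sentence-level co-occurrence counts (real systems estimate these probabilities from large collections):

```python
# PMI(x, y) = log2( P(x, y) / (P(x) * P(y)) ), estimated from how often two
# words co-occur in the same sentence of a toy corpus.

import math
from itertools import combinations
from collections import Counter

corpus = [
    "the car sped down the road",
    "the automobile sped along the road",
    "the car and the automobile are vehicles",
    "the dog barked at the mailman",
]

word_count = Counter()
pair_count = Counter()
for sent in corpus:
    words = set(sent.split())
    word_count.update(words)
    pair_count.update(frozenset(p) for p in combinations(sorted(words), 2))

n = len(corpus)

def pmi(x, y):
    p_xy = pair_count[frozenset((x, y))] / n
    p_x, p_y = word_count[x] / n, word_count[y] / n
    return math.log2(p_xy / (p_x * p_y)) if p_xy else float("-inf")

# "car" and "automobile" co-occur; "car" and "dog" never do.
print(pmi("car", "automobile"), pmi("car", "dog"))
```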
Evaluation and Results | The second row of Table 4 shows the results of using a popular semantic similarity measure, PMI, as the sentiment similarity (SS) measure in Figure 4.
Hidden Emotional Model | For this purpose, we utilize the semantic similarity between each two words and create an enriched matrix. |
Hidden Emotional Model | To compute the semantic similarity between word senses, we utilize their synsets as follows: |
Introduction | Semantic similarity measures such as Latent Semantic Analysis (LSA) (Landauer et al., 1998) can effectively capture the similarity between semantically related words like "car" and "automobile", but they are less effective in relating words with similar sentiment orientation like "excellent" and "superior". |
Introduction | For example, the following relations show the semantic similarity between some sentiment words computed by LSA: |
Introduction | We show that our approach effectively outperforms the semantic similarity measures in two NLP tasks: Indirect yes/no Question Answer Pairs (IQAPs) Inference and Sentiment Orientation (SO) prediction that are described as follows:
Related Works | Most previous works employed semantic similarity of word pairs to address SO prediction and IQAP inference tasks. |
Sentiment Similarity through Hidden Emotions | As we discussed above, semantic similarity measures are less effective at inferring sentiment similarity between word pairs.
A Unified Semantic Representation | The WordNet ontology provides a rich network structure of semantic relatedness, connecting senses directly with their hypernyms, and providing information on semantically similar senses by virtue of their nearby locality in the network.
Abstract | Semantic similarity is an essential component of many Natural Language Processing applications. |
Abstract | However, prior methods for computing semantic similarity often operate at different levels, e.g., single words or entire documents, which requires adapting the method for each data type. |
Abstract | We present a unified approach to semantic similarity that operates at multiple levels, all the way from comparing word senses to comparing text documents. |
Experiment 1: Textual Similarity | Measuring semantic similarity of textual items has applications in a wide variety of NLP tasks. |
Experiment 1: Textual Similarity | As our benchmark, we selected the recent SemEval-2012 task on Semantic Textual Similarity (STS), which was concerned with measuring the semantic similarity of sentence pairs. |
Introduction | Semantic similarity is a core technique for many topics in Natural Language Processing such as Textual Entailment (Berant et al., 2012), Semantic Role Labeling (Furstenau and Lapata, 2012), and Question Answering (Surdeanu et al., 2011). |
Introduction | Approaches to semantic similarity have often operated at separate levels: methods for word similarity are rarely applied to documents or even single sentences (Budanitsky and Hirst, 2006; Radinsky et al., 2011; Halawi et al., 2012), while document-based similarity methods require more
Introduction | Despite the potential advantages, few approaches to semantic similarity operate at the sense level due to the challenge in sense-tagging text (Navigli, 2009); for example, none of the top four systems in the recent SemEval-2012 task on textual similarity compared semantic representations that incorporated sense information (Agirre et al., 2012). |
Abstract | We build a semantic similarity graph to encode lexical semantic clue, and employ a convolutional neural model to capture contextual semantic clue. |
Introduction | Then, based on the assumption that terms that are more semantically similar to the seeds are more likely to be product features, a graph which measures semantic similarities between terms is built to capture lexical semantic clue. |
Introduction | It exploits semantic similarity between words to capture lexical clues, which is shown to be more effective than co-occurrence relation between words and syntactic patterns.
Introduction | In addition, experiments show that the semantic similarity has the advantage of mining infrequent product features, which is crucial for this task. |
The Proposed Method | Then, a semantic similarity graph is created to capture lexical semantic clue, and a Convolutional Neural Network (CNN) (Collobert et al., 2011) is trained in each bootstrapping iteration to encode contextual semantic clue. |
The Proposed Method | 3.2 Capturing Lexical Semantic Clue in a Semantic Similarity Graph |
The Proposed Method | 3.2.2 Building the Semantic Similarity Graph |
Abstract | By providing an empirical measure of semantic similarity between words derived from lexical co-occurrences, distributional semantics not only reliably captures how the verbs in the distribution of a construction are related, but also enables the use of visualization techniques and statistical modeling to analyze the semantic development of a construction over time and identify the semantic determinants of syntactic productivity in naturally occurring data. |
Application of the vector-space model | One of the advantages conferred by the quantification of semantic similarity is that lexical items can be precisely considered in relation to each other, and by aggregating the similarity information for all items in the distribution, we can produce a visual representation of the structure of the semantic domain of the construction in order to observe how verbs in that domain are related to each other, and to immediately identify the regions of the semantic space that are densely populated (with tight clusters of verbs), and those that are more sparsely populated (fewer and/or more scattered verbs). |
Application of the vector-space model | With the quantification of semantic similarity provided by the distributional semantic model, it is also possible to properly test the hypothesis that productivity is tied to the structure of the semantic space. |
Conclusion | This paper reports the first attempt at using a distributional measure of semantic similarity derived from a vector-space model for the study of syntactic productivity in diachrony. |
Conclusion | Not only does distributional semantics provide an empirically-based measure of semantic similarity that appropriately captures semantic distinctions, it also enables the use of methods for which quantification is necessary, such as data visualization and statistical analysis. |
Distributional measure of semantic similarity | One benefit of the distributional semantics approach is that it allows semantic similarity between words to be quantified by measuring the similarity in their distribution. |
Distributional measure of semantic similarity | According to Sahlgren (2008), this kind of model captures to what extent words can be substituted for each other, which is a good measure of semantic similarity between verbs. |
Distributional measure of semantic similarity | In order to make sure that enough distributional information is available to reliably assess semantic similarity, verbs with fewer than 2,000 occurrences were excluded, which left 92 usable items (out of 105).
Introduction | In this paper, I present a third alternative that takes advantage of advances in computational linguistics and draws on a distributionally-based measure of semantic similarity.
The hell-construction | To answer these questions, I will analyze the distribution of the construction from a semantic point of view by using a measure of semantic similarity derived from distributional information. |
Computational Structures for RE | Combining syntax with semantics has a clear advantage: it generalizes lexical information encapsulated in syntactic parse trees, while at the same time syntax guides semantics in order to obtain an effective semantic similarity.
Computational Structures for RE | We exploit this idea here for domain adaptation (DA): if words are generalized by semantic similarity LS, then in a hypothetical world changing LS such that it reflects the target domain would |
Computational Structures for RE | The question remains how to establish a link between the semantic similarity in the source and target domain. |
Conclusions and Future Work | We proposed syntactic tree kernels enriched by lexical semantic similarity to tackle the portability of a relation extractor to different domains. |
Introduction | In the empirical evaluation on Automatic Content Extraction (ACE) data, we evaluate the impact of convolution tree kernels embedding lexical semantic similarities.
Results | Since we focus on evaluating the impact of semantic similarity in tree kernels, we think our system is very competitive. |
Semantic Syntactic Tree Kernels | After introducing related work, we will discuss computational structures for RE and their extension with semantic similarity.
Framework Overview | 1) Using T_t, we obtain a set of relation translations with a semantic similarity score T(r_E, r_C) for an English relation r_E and a Chinese relation r_C (Figure 2 (b), Section 4.3) (e.g., r_E = visit and its Chinese counterpart).
Framework Overview | 2) Using T_R and T_t, we identify a set of semantically similar document pairs that describe the same event with a similarity score T(d_E, d_C), where d_E is an English document and d_C is a Chinese document (Figure 2 (c), Section 4.4).
Introduction | In particular, our approach leverages semantically similar document pairs to exclude incomparable parts that appear in one language only. |
Methods | H(r_i, r_j) = H_b(r_i, r_j) if r_i ∈ R_E and r_j ∈ R_C; H_b(r_j, r_i) if r_j ∈ R_E and r_i ∈ R_C; H_m(r_i, r_j) otherwise. Intuitively, |H(r_i, r_j)| indicates the strength of the semantic similarity of two relations r_i and r_j of any languages.
Methods | However, as shown in Table 2, we cannot use this value directly to measure the similarity because the support intersection of semantically similar bilingual relations (e.g., |H(head to,
Methods | We consider that a pair of an English entity e_E and a Chinese entity e_C are likely to indicate the same real-world entity if they have 1) semantically similar relations to the same entity 2) under the same context.
Related Work | Semantically similar relation mining |
Related Work | In automatically constructed knowledge bases, finding semantically similar relations can improve understanding of the Web, which describes content with many different expressions.
Related Work | NELL (Mohamed et al., 2011) finds related relations using seed pairs of one given relation; then, using K-means clustering, it finds relations that are semantically similar to the given relation.
Experimental Setup | Next, for each word we randomly selected 30 pairs under the assumption that they are representative of the full variation of semantic similarity.
Experimental Setup | Participants were asked to rate a pair on two dimensions, visual and semantic similarity, using a Likert scale of 1 (highly dissimilar) to 5 (highly similar).
Experimental Setup | For semantic similarity, the mean correlation was 0.76 (Min = 0.34, Max
Results | We would expect the textual modality to be more dominant when modeling semantic similarity and conversely the perceptual modality to be stronger with respect to visual similarity. |
Results | The textual SAE correlates better with semantic similarity judgments (ρ = 0.65) than its visual equivalent (ρ = 0.60).
Results | It yields a correlation coefficient of ρ = 0.70 on semantic similarity and ρ = 0.64 on visual similarity.
Introduction | The semantic similarity of words is a longstanding topic in computational linguistics because it is theoretically intriguing and has many applications in the field. |
Introduction | A number of semantic similarity measures have been proposed based on this hypothesis (Hindle, 1990; Grefenstette, 1994; Dagan et al., 1994; Dagan et al., 1995; Lin, 1998; Dagan et al., 1999). |
Introduction | In general, most semantic similarity measures have the following form: |
Conclusion | We showed that our system outperforms two baselines and sometimes approaches human-level performance, especially because it can exploit the sequential structure of the script descriptions to separate clusters of semantically similar events. |
Evaluation | with a weighted edge; the weight reflects the semantic similarity of the nodes’ event descriptions as described in Section 5.2. |
Evaluation | Levenshtein Baseline: This system follows the same steps as our system, but using Levenshtein distance as the measure of semantic similarity for MSA and for node merging (cf. |
Evaluation | The clustering system, which can’t exploit the sequential information from the ESDs, has trouble distinguishing semantically similar phrases (high recall, low precision). |
Introduction | Crucially, our algorithm exploits the sequential structure of the ESDs to distinguish event descriptions that occur at different points in the script storyline, even when they are semantically similar . |
Temporal Script Graphs | 5.2 Semantic similarity |
Temporal Script Graphs | Intuitively, we want the MSA to prefer the alignment of two phrases if they are semantically similar, i.e.
Abstract | We combine several graph alignment features with lexical semantic similarity measures using machine learning techniques and show that the student answers can be more accurately graded than if the semantic measures were used in isolation. |
Answer Grading System | Of these, 36 are based upon the semantic similarity |
Answer Grading System | 3.3 Lexical Semantic Similarity |
Answer Grading System | In order to address this, we combine the graph alignment scores, which encode syntactic knowledge, with the scores obtained from semantic similarity measures. |
Analysis and Discussion | Our aim in this paper is to characterize the semantic similarity of bilingual hierarchical rules. |
Experiments | The improved similarity function Alg2 makes it possible to incorporate monolingual semantic similarity on top of the bilingual semantic similarity, and thus may improve the accuracy of the similarity estimate.
Introduction | The source and target sides of the rules marked with (*) are not semantically equivalent; it seems likely that measuring the semantic similarity between the source and target sides of rules from their contexts might be helpful to machine translation.
Related Work | Our work is different from all the above approaches in that we attempt to discriminate among hierarchical rules based on: 1) the degree of bilingual semantic similarity between source and target translation units; and 2) the monolingual semantic similarity between occurrences of source or target units as part of the given rule, and in general. |
Similarity Functions | A common way to calculate semantic similarity is by vector space cosine distance; we will also |
Similarity Functions | Therefore, on top of the degree of bilingual semantic similarity between a source and a target translation unit, we have also incorporated the monolingual semantic similarity between all occurrences of a source or target unit, and that unit’s occurrence as part of the given rule, into the sense similarity measure. |
Abstract | judging the semantic similarity of natural-language sentences), and show that PSL gives improved results compared to a previous approach based on Markov Logic Networks (MLNs) and a purely distributional approach. |
Background | Distributional models (Turney and Pantel, 2010), on the other hand, use statistics on contextual data from large corpora to predict semantic similarity of words and phrases (Landauer and Dumais, 1997; Mitchell and Lapata, 2010). |
Background | Distributional models are motivated by the observation that semantically similar words occur in similar contexts, so words can be represented as vectors in high dimensional spaces generated from the contexts in which they occur (Landauer and Dumais, 1997; Lund and Burgess, 1996). |
Background | (2013) use MLNs to represent the meaning of natural language sentences and judge textual entailment and semantic similarity, but they were unable to scale the approach beyond short sentences due to the complexity of MLN inference.
Evaluation | More specifically, they strongly indicate that PSL is a more effective probabilistic logic for judging semantic similarity than MLNs. |
Background | (2007) compute the semantic similarity using WordNet. |
Background | The term pairs with semantic similarity higher than a predefined threshold will be grouped together. |
Our Approach 3.1 Wiki Concepts | We measure the semantic similarity between two concepts by using cosine distance between their wiki articles, which are represented as the vectors of wiki concepts as well. |
Our Approach 3.1 Wiki Concepts | For computational efficiency, we calculate semantic similarities between all promising concept pairs beforehand, and then retrieve the values directly from a hash table.
Our Approach 3.1 Wiki Concepts | Merge concepts whose semantic similarity is larger than a predefined threshold (0.35 in our experiments) into the one with the largest idf.
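The precompute-then-merge step above can be sketched as follows; the concept vectors, idf values, and the dictionary standing in for the hash table are all invented for illustration:

```python
# Precompute pairwise cosine similarities into a dictionary (the "hash table"),
# then merge any concept pair scoring above the 0.35 threshold into the member
# with the larger idf.

import math
from itertools import combinations

VEC = {
    "automobile": {"engine": 3, "road": 2, "wheel": 2},
    "car":        {"engine": 2, "road": 3, "wheel": 1},
    "banana":     {"fruit": 4, "yellow": 2},
}
IDF = {"automobile": 5.1, "car": 2.3, "banana": 4.0}
THRESHOLD = 0.35

def cosine(u, v):
    dot = sum(u.get(k, 0) * v.get(k, 0) for k in u)
    norm = lambda w: math.sqrt(sum(x * x for x in w.values()))
    return dot / (norm(u) * norm(v)) if dot else 0.0

# Precompute once; look up later in O(1).
sim_cache = {frozenset(p): cosine(VEC[p[0]], VEC[p[1]])
             for p in combinations(VEC, 2)}

merged = {}  # dropped concept -> surviving representative
for a, b in combinations(VEC, 2):
    if sim_cache[frozenset((a, b))] > THRESHOLD:
        keep, drop = (a, b) if IDF[a] >= IDF[b] else (b, a)
        merged[drop] = keep

print(merged)  # {'car': 'automobile'}
```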
Query Expansion in Axiomatic Retrieval Model | where s(q, d) is a semantic similarity function between two terms q and d, and f is a monotonically increasing function defined as |
Query Expansion in Axiomatic Retrieval Model | where β is a parameter that regulates the weighting of the original query terms and the semantically similar terms.
Query Expansion in Axiomatic Retrieval Model | In our previous study (Fang and Zhai, 2006), term similarity function s is derived based on the mutual information of terms over collections that are constructed under the guidance of a set of term semantic similarity constraints.
Term Similarity based on Lexical Resources | Since the definition provides valuable information about the semantic meaning of a term, we can use the definitions of the terms to measure their semantic similarity.
Term Similarity based on Lexical Resources | Thus, we can compute the term semantic similarity based on synset definitions in the following way:
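The paper's actual formula is not reproduced here; as one plausible instantiation, definition-based similarity can be sketched by scoring the overlap of gloss content words with the Dice coefficient. The glosses and stopword list below are invented; a real system would read glosses from WordNet synsets.

```python
# Represent each term by the content words of its dictionary gloss and score
# the overlap of two glosses with the Dice coefficient.

GLOSS = {
    "car":  "a motor vehicle with four wheels used to carry people",
    "auto": "a self propelled motor vehicle for carrying people",
    "bank": "a financial institution that accepts deposits",
}
STOPWORDS = {"a", "the", "with", "to", "for", "that", "of"}

def gloss_words(term):
    return {w for w in GLOSS[term].split() if w not in STOPWORDS}

def definition_similarity(t1, t2):
    w1, w2 = gloss_words(t1), gloss_words(t2)
    return 2 * len(w1 & w2) / (len(w1) + len(w2))

# Shared gloss words ("motor", "vehicle", "people") make car/auto score high.
print(definition_similarity("car", "auto") > definition_similarity("car", "bank"))
```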
Approach | We exploit this semantic similarity across languages by defining a bilingual (and trivially multilingual) energy as follows. |
Experiments | Even though the model did not use any parallel French-German data during training, it learns semantic similarity between these two languages using English as a pivot, and semantically clusters words across all languages. |
Related Work | Very simple composition functions have been shown to suffice for tasks such as judging bi-gram semantic similarity (Mitchell and Lapata, 2008). |
Related Work | Their architecture optimises the cosine similarity of documents, using relative semantic similarity scores during learning.
Introduction | Instead of procuring explicit representations, the kernel paradigm directly focuses on the larger goal of quantifying semantic similarity of larger linguistic units. |
Introduction | Figure 1: Tokenwise syntactic and semantic similarities don’t imply sentential semantic similarity
Introduction | With such neighbourhood contexts, the distributional paradigm posits that semantic similarity between a pair of motifs can be given by a sense of ‘distance’ between the two distributions. |
Conclusion and Future Work | Both the Meta-path-based and the social-correlation-based semantic similarity measurements are proven powerful and complementary.
Introduction | We model social user behaviors and use social correlation to assist in measuring semantic similarities because the users who posted a morph and its corresponding target tend to share similar interests and opinions.
Related Work | In this paper we exploit cross-genre information and social correlation to measure semantic similarity.
Target Candidate Ranking | 4.2.3 Meta-Path-Based Semantic Similarity Measurements |
Experiments | (4) To understand the effect of utilizing syntactic structure and semantic similarity for constructing the summarization graph, we ran the experiments using just the unigrams and bigrams; we obtained a ROUGE-1 F-score of 37.1. |
Using the Framework | We identify similar views/opinions by computing semantic similarity rather than using standard similarity measures (such as cosine similarity based on exact lexical matches between different nodes in the graph). |
Using the Framework | For each pair of nodes (u, v) in the graph, we compute the semantic similarity score (using WordNet) between every pair of dependency relations (rel: a, b) in u and v as: s(u,v) = Σ WN(a_i, a_j) × WN(b_i, b_j),
Using the Framework | Using the syntactic structure along with semantic similarity helps us identify useful (valid) nuggets of information within comments (or documents), avoid redundancies, and identify similar views in a semantic space. |
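The node score above can be sketched with a stub standing in for the WordNet measure WN; the synonym table and dependency triples below are invented for illustration:

```python
# For every pair of dependency relations (rel: a, b) in nodes u and v,
# accumulate WN(a_i, a_j) * WN(b_i, b_j). WN() is stubbed with a toy synonym
# table in place of a real WordNet similarity measure.

SYNONYMS = {("car", "automobile"), ("buy", "purchase")}  # assumed pairs

def wn(w1, w2):
    if w1 == w2:
        return 1.0
    if (w1, w2) in SYNONYMS or (w2, w1) in SYNONYMS:
        return 0.8
    return 0.0

def node_similarity(u, v):
    # u, v: lists of dependency relations as (rel, a, b) triples
    return sum(wn(a1, a2) * wn(b1, b2)
               for _, a1, b1 in u
               for _, a2, b2 in v)

u = [("dobj", "buy", "car"), ("nsubj", "buy", "John")]
v = [("dobj", "purchase", "automobile")]
print(node_similarity(u, v))  # ≈ 0.64 (0.8 * 0.8)
```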
Context and Answer Detection | here, we use the product of sim(x_u, Q_i) and sim(x_v, ·) to estimate the possibility of being a context-answer pair for (u, v), where sim(·, ·) is the semantic similarity calculated on WordNet as described in Section 3.5.
Context and Answer Detection | The similarity feature is to capture the word similarity and semantic similarity between candidate contexts and answers. |
Context and Answer Detection | The semantic similarity between words is computed based on Wu and Palmer’s measure (Wu and Palmer, 1994) using WordNet (Fellbaum, 1998). The similarity between contiguous sentences will be used to capture the dependency for CRFs.
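Wu and Palmer's measure can be sketched over a toy is-a taxonomy: wup(a, b) = 2 · depth(lcs(a, b)) / (depth(a) + depth(b)), where lcs is the lowest common subsumer. The hierarchy below is invented; real implementations read it from WordNet.

```python
# Wu & Palmer (1994) similarity over a toy is-a hierarchy.

PARENT = {               # child -> parent ("entity" is the root)
    "vehicle": "entity",
    "animal": "entity",
    "car": "vehicle",
    "truck": "vehicle",
    "dog": "animal",
}

def path_to_root(node):
    path = [node]
    while node in PARENT:
        node = PARENT[node]
        path.append(node)
    return path  # node, ..., root

def depth(node):                      # root has depth 1
    return len(path_to_root(node))

def lcs(a, b):
    ancestors_a = set(path_to_root(a))
    for node in path_to_root(b):      # first shared node walking up from b
        if node in ancestors_a:
            return node
    return None

def wup(a, b):
    return 2 * depth(lcs(a, b)) / (depth(a) + depth(b))

print(wup("car", "truck"))  # share "vehicle": 2*2/(3+3) ≈ 0.667
print(wup("car", "dog"))    # share only "entity": 2*1/(3+3) ≈ 0.333
```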
Discussion | The resulting vector is sparser but expresses more succinctly the meaning of the predicate-argument structure, and thus allows semantic similarity to be modelled more accurately. |
Evaluation Setup | We assessed a wide range of semantic similarity measures using the WordNet similarity package (Pedersen et al., 2004).
Evaluation Setup | Following previous work (Bullinaria and Levy, 2007), we optimized its parameters on a word-based semantic similarity task. |
Introduction | The appeal of these models lies in their ability to represent meaning simply by using distributional information under the assumption that words occurring within similar contexts are semantically similar (Harris, 1968). |
Introduction | 4 describe previous and our models for syntactic and semantic similarity, respectively, Sec.
Related work | as a direct object is in fact a strong characterization for semantic similarity , as all the nouns m similar to n tend to collocate with the same verbs. |
Related work | By inducing geometrical notions of vectors and norms through corpus analysis, they provide a topological definition of semantic similarity , i.e., distance in a space. |
Structural Similarity Functions | 3.2 Tree Kernels driven by Semantic Similarity To our knowledge, two main types of tree kernels exploit lexical similarity: the syntactic semantic tree kernel defined in (Bloehdorn and Moschitti, 2007a) applied to constituency trees and the smoothed partial tree kernels (SPTKs) defined in (Croce et al., 2011), which generalize the former.
Experiments | In general, the errors are produced by two different causes acting together: (i) imbalanced distribution of the relations, and (ii) semantic similarity between the relations. |
Experiments | The most frequent relation, Elaboration, tends to mislead others, especially the ones which are semantically similar (e.g., Explanation, Background) and less frequent (e.g., Summary, Evaluation).
Experiments | The relations which are semantically similar mislead each other (e.g., Temporal:Background, Cause:Explanation). |
The Summarization Framework | Similarity to Question: Semantic similarity to the question and question context. |
The Summarization Framework | We compute the semantic similarity (Simpson and Crowe, 2005) between sentences or sub ques- |
The Summarization Framework | We use the semantic similarity of Equation 2 for all our similarity measurements in this paper.
Experiments | All of these vectors capture broad semantic similarities . |
Our Model | To capture semantic similarities among words, we derive a probabilistic model of documents which learns word representations. |
Our Model | 3.1 Capturing Semantic Similarities |
Methodology 2.1 The Problem | When two sentences in S or T are not too short, or their content is not divergent in meaning, their semantic similarity can be estimated in terms of common words.
Methodology 2.1 The Problem | Although semantic similarity estimation is a straightforward approach to deriving the two affinity matrices, other approaches are also feasible. |
Methodology 2.1 The Problem | To demonstrate the validity of the monolingual consistency, the semantic similarity defined by is evaluated as follows. |
Abstract | In a quantitative evaluation on the task of judging geographically informed semantic similarity between representations learned from 1.1 billion words of geo-located tweets, our joint model outperforms comparable independent models that learn meaning in isolation. |
Evaluation | We evaluate our model by confirming its face validity in a qualitative analysis and estimating its accuracy at the quantitative task of judging geographically-informed semantic similarity . |
Evaluation | As a quantitative measure of our model’s performance, we consider the task of judging semantic similarity among words whose meanings are likely to evoke strong geographical correlations. |
Experimental Evaluation | semantic similarity, reply, and quotation.
System Design | On the one hand, the semantic similarity between two nodes can be measured with any commonly adopted metric, such as cosine similarity and Jaccard coefficient (Baeza-Yates and Ribeiro-Neto, 1999).
System Design | Considering the semantic similarity between nodes, we use another variant of the PageRank algorithm to calculate the weight of comment |
Substructure Spaces for BTKs | In this section, we define seven lexical features to measure semantic similarity of a given subtree pair. |
Substructure Spaces for BTKs | baseline only assesses semantic similarity using the lexical features. |
Substructure Spaces for BTKs | In other words, to capture the semantic similarity, structure features require lexical features to cooperate.
Abstract | We experimentally demonstrate that the discourse structure of non-factoid answers provides information that is complementary to lexical semantic similarity between question and answer, improving performance up to 24% (relative) over a state-of-the-art model that exploits lexical semantic similarity alone. |
CR + LS + DMM + DPM 39.32* +24% 47.86* +20% | This way, the DMM and DPM features jointly capture discourse structures and semantic similarity between answer segments and question. |
CR + LS + DMM + DPM 39.32* +24% 47.86* +20% | Empirically we show that modeling answer discourse structures is complementary to modeling lexical semantic similarity and that the best performance is obtained when they are tightly integrated. |
Model | Turkers were required to type in their best guess, and the number of semantically similar guesses was counted by an average of 6 other turkers.
Model | A ratio of the median of semantically similar guesses to the total number of guesses was then taken as the score representing “predictability” of the word being guessed in the given context. |
Model | Turkers that judged the semantic similarity of the guesses of other turkers achieved an average Cohen’s kappa agreement of 0.44, indicating fair to poor agreement. |
Abstract | We first set up a human annotation of semantic links with or without contextual information to show the importance of the textual context in evaluating the relevance of semantic similarity, and to assess the prevalence of actual semantic relations between word tokens.
Conclusion | We proposed a method to reliably evaluate distributional semantic similarity in a broad sense by considering the validation of lexical pairs in contexts where they both appear. |
Introduction | We hypothesize that evaluating and filtering semantic relations in texts where lexical items occur would help tasks that naturally make use of semantic similarity relations, but assessing this goes beyond the present work.
Integrating Semantic Constraint into Surprisal | This can be achieved by turning a vector model of semantic similarity into a probabilistic language model. |
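One common way to do this (a sketch, not necessarily the construction used in the cited work) is to exponentiate and renormalize the similarity scores, i.e., a softmax over cosine similarities, so they form a language-model-style distribution P(w | context). The similarity numbers below are invented.

```python
# Turn similarity scores into a probability distribution via a softmax.

import math

def softmax_lm(similarities):
    """similarities: word -> cosine similarity with the context vector."""
    exp = {w: math.exp(s) for w, s in similarities.items()}
    z = sum(exp.values())
    return {w: e / z for w, e in exp.items()}

sims = {"pizza": 0.7, "salad": 0.4, "bolt": -0.2}
probs = softmax_lm(sims)
assert abs(sum(probs.values()) - 1.0) < 1e-9  # a proper distribution
print(max(probs, key=probs.get))  # highest similarity -> highest probability
```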
Models of Processing Difficulty | Semantic similarities are then modeled in terms of geometric similarities within the space. |
Models of Processing Difficulty | Despite its simplicity, the above semantic space (and variants thereof) has been used to successfully simulate lexical priming (e.g., McDonald 2000), human judgments of semantic similarity (Bullinaria and Levy 2007), and synonymy tests (Pado and Lapata 2007) such as those included in the Test of English as Foreign Language (TOEFL). |
Introduction | Deeper approaches leverage semantic similarity to go beyond the surface realization of definitions (Navigli, 2006; Meyer and Gurevych, 2011; Niemann and Gurevych, 2011). |
Resource Alignment | These two scores are then combined into an overall score (part (e) of Figure 1) which quantifies the semantic similarity of the two input concepts c1 and c2.
Resource Alignment | PPR has been previously used in a wide variety of tasks such as definition similarity-based resource alignment (Niemann and Gurevych, 2011), textual semantic similarity (Hughes and Ramage, 2007; Pilehvar et al., 2013), Word Sense Disambiguation (Agirre and Soroa, 2009; Faralli and Navigli, 2012) and semantic text categorization (Navigli et al., 2011). |
System Implementation | The Pruning algorithm uses this dictionary to retrieve semantically similar questions. |
System Implementation | To retrieve answers for SMS queries that are semantically similar but lexically different from questions in the FAQ corpus we use the Synonym dictionary described in Section 5.2. |
System Implementation | Figure 4: Semantically similar SMS and questions |
Using Translation Probability | FAQ Finder (Burke et al., 1997) heuristically combines statistical similarities and semantic similarities between questions to rank FAQs. |
Using Translation Probability | Conventional vector space models are used to calculate the statistical similarity and WordNet (Fellbaum, 1998) is used to estimate the semantic similarity.
Using Translation Probability | In contrast to that, question search retrieves answers for an unlimited range of questions by focusing on finding semantically similar questions in an archive. |