Index of papers in Proc. ACL that mention
  • distributional semantics
Cheung, Jackie Chi Kit and Penn, Gerald
Abstract
In contrast, vector space models of distributional semantics are trained on large corpora, but are typically applied to domain-general lexical disambiguation tasks.
Abstract
We introduce Distributional Semantic Hidden Markov Models, a novel variant of a hidden Markov model that integrates these two approaches by incorporating contextualized distributional semantic vectors into a generative model as observed emissions.
Distributional Semantic Hidden Markov Models
Unlike in most applications of HMMs in text processing, in which the representation of a token is simply its word or lemma identity, tokens in DSHMM are also associated with a vector representation of their meaning in context according to a distributional semantic model (Section 3.1).
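To make the emission structure concrete, here is a minimal sketch (not the authors' DSHMM implementation) in which hidden states emit continuous distributional vectors modelled as Gaussians, using hmmlearn; the toy vectors, dimensionality, and state count are all illustrative assumptions.

```python
# Minimal sketch of an HMM over distributional vectors (not the paper's DSHMM):
# each token is observed as its contextualized semantic vector, and hidden
# states play the role of domain events/slots. Requires: pip install hmmlearn
import numpy as np
from hmmlearn import hmm

rng = np.random.default_rng(0)

# Toy stand-ins for the contextualized distributional vectors of two documents
# (rows = tokens, columns = semantic dimensions).
doc1 = rng.normal(loc=0.0, scale=1.0, size=(12, 10))
doc2 = rng.normal(loc=2.0, scale=1.0, size=(8, 10))
X = np.vstack([doc1, doc2])
lengths = [len(doc1), len(doc2)]

# Hidden states emit Gaussian-distributed vectors; 4 states is arbitrary here.
model = hmm.GaussianHMM(n_components=4, covariance_type="diag",
                        n_iter=50, random_state=0)
model.fit(X, lengths)

# Decode the most likely state sequence for one document.
print(model.predict(doc1))
```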
Introduction
By contrast, distributional semantic models are trained on large, domain-general corpora.
Introduction
In this paper, we propose to inject contextualized distributional semantic vectors into generative probabilistic models, in order to combine their complementary strengths for domain modelling.
Introduction
There are a number of potential advantages that distributional semantic models offer.
Related Work
Our work is similar in that we assume much of the same structure within a domain and consequently in the model as well (Section 3), but whereas PROFINDER focuses on finding the “correct” number of frames, events, and slots with a nonparametric method, this work focuses on integrating global knowledge in the form of distributional semantics into a probabilistic model.
distributional semantics is mentioned in 19 sentences in this paper.
Topics mentioned in this paper:
Lazaridou, Angeliki and Marelli, Marco and Zamparelli, Roberto and Baroni, Marco
Abstract
This is a major cause of data sparseness for corpus-based approaches to lexical semantics, such as distributional semantic models of word meaning.
Abstract
Our results constitute a novel evaluation of the proposed composition methods, in which the full additive model achieves the best performance, and demonstrate the usefulness of a compositional morphology component in distributional semantics.
Composition methods
Distributional semantic models (DSMs), also known as vector-space models, semantic spaces, or by the names of famous incarnations such as Latent Semantic Analysis or Topic Models, approximate the meaning of words with vectors that record their patterns of co-occurrence with corpus context features (often, other words).
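For illustration, here is a minimal sketch of the count-based construction this definition describes (toy corpus and window size are assumptions; real DSMs use large corpora plus weighting schemes such as PMI):

```python
# Minimal count-based DSM: each word's vector records co-occurrence counts
# with context words inside a symmetric window.
from collections import Counter, defaultdict

corpus = "the cat sat on the mat the dog sat on the rug".split()
window = 2

cooc = defaultdict(Counter)
for i, target in enumerate(corpus):
    for j in range(max(0, i - window), min(len(corpus), i + window + 1)):
        if j != i:
            cooc[target][corpus[j]] += 1

print(cooc["cat"])  # Counter({'the': 1, 'sat': 1, 'on': 1})
```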
Composition methods
Since the very inception of distributional semantics, there have been attempts to compose meanings for sentences and larger passages (Landauer and Dumais, 1997), but interest in compositional DSMs has skyrocketed in the last few years, particularly since the influential work of Mitchell and Lapata (2008; 2009; 2010).
Experimental setup
4.2 Distributional semantic space
Experimental setup
This result is of practical importance for distributional semantics, as it paves the way to address one of the main causes of data sparseness, and it confirms the usefulness of the compositional approach in a new domain.
Experimental setup
We would also like to apply composition to inflectional morphology (that currently lies outside the scope of distributional semantics), to capture the nuances of meaning that, for example, distinguish singular and plural nouns (consider, e.g., the difference between the mass singular tea and the plural teas, which coerces the noun into a count interpretation (Katz and Zamparelli, 2012)).
Introduction
Distributional semantic models (DSMs) in particular represent the meaning of a word by a vector, the dimensions of which encode corpus-extracted co-occurrence statistics, under the assumption that words that are semantically similar will occur in similar contexts (Turney and Pantel, 2010).
Introduction
Compositional distributional semantic models (cDSMs) of word units aim at handling, compositionally, the high productivity of phrases and consequent data sparseness.
Related work
Our system, given re- and build, predicts the (distributional semantic) meaning of rebuild.
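Their best-performing composition method, the full additive model (following Guevara, 2010), maps an affix vector u and a stem vector v to a derived-word vector via two learned matrices, derived ≈ Au + Bv. Below is a minimal sketch, with randomly generated stand-in vectors and a least-squares fit for the matrices (all data is illustrative):

```python
# Sketch of a "full additive" composition model: derived = A @ affix + B @ stem,
# with A and B fit by least squares on (affix, stem, derived) vector triples.
import numpy as np

rng = np.random.default_rng(0)
d, n = 10, 200                      # vector dimensionality, training triples

U = rng.normal(size=(n, d))         # affix vectors (e.g., re-)
V = rng.normal(size=(n, d))         # stem vectors (e.g., build)
A_true, B_true = rng.normal(size=(d, d)), rng.normal(size=(d, d))
P = U @ A_true.T + V @ B_true.T     # observed derived-word vectors (e.g., rebuild)

# Stack [u; v] and solve P ≈ [U V] @ W for W = [A; B] jointly.
X = np.hstack([U, V])
W, *_ = np.linalg.lstsq(X, P, rcond=None)

u_new, v_new = rng.normal(size=d), rng.normal(size=d)
predicted = np.concatenate([u_new, v_new]) @ W   # composed vector for an unseen form
print(predicted.shape)               # (10,)
```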
Related work
Another emerging line of research uses distributional semantics to model human intuitions about the semantic transparency of morphologically derived or compound expressions and how these impact various lexical processing tasks (Kuperman, 2009; Wang et al., 2012).
distributional semantics is mentioned in 11 sentences in this paper.
Topics mentioned in this paper:
Dinu, Georgiana and Baroni, Marco
Abstract
We introduce the problem of generation in distributional semantics: Given a distributional vector representing some meaning, how can we generate the phrase that best expresses that meaning?
Conclusion
In this paper we have outlined a framework for the task of generation with distributional semantic models.
Conclusion
From a more theoretical point of view, our work fills an important gap in distributional semantics, making it a bidirectional theory of the connection between language and meaning.
Conclusion
Some research has already established a connection between neural and distributional semantic vector spaces (Mitchell et al., 2008; Murphy et al., 2012).
Introduction
If distributional semantics is to be considered a proper semantic theory, then it must deal not only with synthesis (going from words to vectors), but also with generation (from vectors to words).
Introduction
Distributional semantics assumes a lexicon of atomic expressions (that, for simplicity, we take to be words), each associated to a vector.
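The most direct realization of the vector-to-word direction (a sketch of the general idea, not the paper's full system) is nearest-neighbour decoding over that lexicon:

```python
# Sketch of generation as nearest-neighbour search: map a meaning vector back
# to the lexicon word whose vector is most cosine-similar. Toy vectors only.
import numpy as np

rng = np.random.default_rng(0)
lexicon = ["dog", "cat", "car", "tree"]
vectors = rng.normal(size=(len(lexicon), 5))   # stand-ins for DSM vectors

def generate(meaning, vectors, lexicon):
    # Normalize rows and the query so dot products equal cosine similarities.
    V = vectors / np.linalg.norm(vectors, axis=1, keepdims=True)
    q = meaning / np.linalg.norm(meaning)
    return lexicon[int(np.argmax(V @ q))]

print(generate(vectors[2] + 0.1 * rng.normal(size=5), vectors, lexicon))  # likely "car"
```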
Introduction
In this paper, we introduce a more direct approach to phrase generation, inspired by the work in compositional distributional semantics.
Noun phrase generation
Compositional distributional semantic systems are often evaluated on phrase and sentence paraphrasing data sets (Blacoe and Lapata, 2012; Mitchell and Lapata, 2010; Socher et al., 2011; Turney, 2012).
Noun phrase generation
This is a much more challenging task and it paves the way to more realistic applications of distributional semantics in generation scenarios.
Related work
To the best of our knowledge, we are the first to explicitly and systematically pursue the generation problem in distributional semantics.
Related work
They introduce a bidirectional language-to-meaning model for compositional distributional semantics that is similar in spirit to ours.
distributional semantics is mentioned in 11 sentences in this paper.
Topics mentioned in this paper:
Baroni, Marco and Dinu, Georgiana and Kruszewski, Germán
Abstract
Context-predicting models (more commonly known as embeddings or neural language models) are the new kids on the distributional semantics block.
Abstract
Despite the buzz surrounding these models, the literature is still lacking a systematic comparison of the predictive models with classic, count-vector-based distributional semantic approaches.
Conclusion
As seasoned distributional semanticists with thorough experience in developing and using count vectors, we set out to conduct this study because we were annoyed by the triumphalist overtones often surrounding predict models, despite the almost complete lack of a proper comparison to count vectors.
Conclusion
To give just one last example, distributional semanticists have looked at whether certain properties of vectors reflect semantic relations in the expected way: e.g., whether the vectors of hypernyms “distributionally include” the vectors of hyponyms in some mathematically precise sense.
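One concrete instance of such an inclusion measure (an illustrative choice; the excerpt does not name a specific formula) is Weeds precision, the share of the hyponym's co-occurrence mass on features the hypernym also has:

```python
# Sketch of a distributional inclusion measure (Weeds precision): how much of
# a hyponym's feature mass falls on features the hypernym also has.
# Vectors are assumed nonnegative (e.g., raw counts or positive PMI).
import numpy as np

def weeds_precision(hypo, hyper):
    return hypo[hyper > 0].sum() / hypo.sum()

dog    = np.array([4.0, 2.0, 0.0, 1.0])   # toy count vectors over 4 context features
animal = np.array([5.0, 3.0, 2.0, 0.0])

print(weeds_precision(dog, animal))        # 6/7: most of "dog" lies inside "animal"
```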
Conclusion
Does all of this even matter, or are we on the cusp of discovering radically new ways to tackle the same problems that have been approached as we just sketched in traditional distributional semantics?
Evaluation materials
The ESSLLI 2008 Distributional Semantic Workshop shared-task set (esslli) contains 44 concepts to be clustered into 6 categories (Baroni et al., 2008) (we ignore here the 3- and 2-way higher-level partitions coming with this set).
Introduction
Concretely, distributional semantic models (DSMs) use vectors that keep track of the contexts (e.g., co-occurring words) in which target terms appear in a large corpus as proxies for meaning representations, and apply geometric techniques to these vectors to measure the similarity in meaning of the corresponding words (Clark, 2013; Erk, 2012; Turney and Pantel, 2010).
distributional semantics is mentioned in 7 sentences in this paper.
Topics mentioned in this paper:
Srivastava, Shashank and Hovy, Eduard
Abstract
Traditional models of distributional semantics suffer from computational issues such as data sparsity for individual lexemes and complexities of modeling semantic composition when dealing with structures larger than single lexical items.
Abstract
In this work, we present a frequency-driven paradigm for robust distributional semantics in terms of semantically cohesive lineal constituents, or motifs.
Introduction
In particular, such a perspective can be especially advantageous for distributional semantics for reasons we outline below.
Introduction
Distributional semantic models (DSMs) that represent words as distributions over neighbouring contexts have been particularly effective in capturing fine-grained lexical semantics (Turney et al., 2010).
Introduction
In this section, we define our frequency-driven framework for distributional semantics in detail.
distributional semantics is mentioned in 7 sentences in this paper.
Topics mentioned in this paper:
Abend, Omri and Cohen, Shay B. and Steedman, Mark
Abstract
To our knowledge, this is the first work to address lexical relations between MWPs of varying degrees of compositionality within distributional semantics.
Background and Related Work
Compositional Distributional Semantics.
Background and Related Work
Several works have used compositional distributional semantics (CDS) representations to assess the compositionality of MWEs, such as noun compounds (Reddy et al., 2011) or verb-noun combinations (Kiela and Clark, 2013).
Discussion
Much recent work subsumed under the title Compositional Distributional Semantics addressed the distributional representation of multi-word phrases (see Section 2).
Introduction
This work addresses the modelling of MWPs within the context of distributional semantics (Turney and Pantel, 2010), in which predicates are represented through the distribution of arguments they may take.
Introduction
To our knowledge, this is the first work to address lexical relations between MWPs of varying degrees of compositionality within distributional semantics.
distributional semantics is mentioned in 6 sentences in this paper.
Topics mentioned in this paper:
Perek, Florent
Abstract
This paper describes an application of distributional semantics to the study of syntactic productivity in diachrony, i.e., the property of grammatical constructions to attract new lexical items over time.
Abstract
By providing an empirical measure of semantic similarity between words derived from lexical co-occurrences, distributional semantics not only reliably captures how the verbs in the distribution of a construction are related, but also enables the use of visualization techniques and statistical modeling to analyze the semantic development of a construction over time and identify the semantic determinants of syntactic productivity in naturally occurring data.
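A minimal sketch of this kind of pipeline (the study's actual data and settings differ): compute pairwise cosine distances between verb vectors and project them to two dimensions for plotting, here with multidimensional scaling from scikit-learn:

```python
# Sketch: visualize the semantic space of a construction's verbs by projecting
# pairwise cosine distances into 2D with multidimensional scaling (MDS).
import numpy as np
from sklearn.metrics.pairwise import cosine_distances
from sklearn.manifold import MDS

rng = np.random.default_rng(0)
verbs = ["beat", "kick", "scare", "surprise"]
vectors = rng.normal(size=(len(verbs), 20))   # toy stand-ins for distributional vectors

dist = cosine_distances(vectors)
coords = MDS(n_components=2, dissimilarity="precomputed",
             random_state=0).fit_transform(dist)
for verb, (x, y) in zip(verbs, coords):
    print(f"{verb}: ({x:.2f}, {y:.2f})")
```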
Application of the vector-space model
With the quantification of semantic similarity provided by the distributional semantic model, it is also possible to properly test the hypothesis that productivity is tied to the structure of the semantic space.
Conclusion
Not only does distributional semantics provide an empirically-based measure of semantic similarity that appropriately captures semantic distinctions, it also enables the use of methods for which quantification is necessary, such as data visualization and statistical analysis.
Distributional measure of semantic similarity
One benefit of the distributional semantics approach is that it allows semantic similarity between words to be quantified by measuring the similarity in their distribution.
Introduction
On the basis of a case study of the construction “V the hell out of NP”, I show how distributional semantics can profitably be applied to the study of syntactic productivity.
distributional semantics is mentioned in 6 sentences in this paper.
Topics mentioned in this paper:
Lazaridou, Angeliki and Bruni, Elia and Baroni, Marco
Experimental Setup
For constructing the text-based vectors, we follow a standard pipeline in distributional semantics (Turney and Pantel, 2010) without tuning its parameters and collect co-occurrence statistics from the concatenation of ukWaC and Wikipedia, amounting to 2.7 billion tokens in total.
Experimental Setup
Singular Value Decomposition (SVD) SVD is the most widely used dimensionality reduction technique in distributional semantics (Turney and Pantel, 2010), and it has recently been exploited to combine visual and linguistic dimensions in the multimodal distributional semantic model of Bruni et al.
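As an illustration of the technique (a toy matrix, not the paper's setup), SVD-based reduction of a word-by-context count matrix to k dimensions:

```python
# Sketch of SVD dimensionality reduction for a word-by-context count matrix:
# keep the top-k singular components as the reduced semantic space.
import numpy as np

rng = np.random.default_rng(0)
counts = rng.poisson(lam=1.0, size=(500, 2000)).astype(float)  # toy word-context matrix
k = 300                                                         # common choice in DSMs

U, S, Vt = np.linalg.svd(counts, full_matrices=False)
reduced = U[:, :k] * S[:k]      # k-dimensional word vectors
print(reduced.shape)            # (500, 300)
```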
Experimental Setup
The cosine has been widely used in the distributional semantic literature, and it has been shown to outperform Euclidean distance (Bullinaria and Levy, 2007). Parameters were estimated with standard backpropagation and L-BFGS.
Results
For the SVD model, we set the number of dimensions to 300, a common choice in distributional semantics, coherent with the settings we used for the visual and linguistic spaces.
Zero-shot learning and fast mapping
Concretely, we assume that concepts, denoted for convenience by word labels, are represented in linguistic terms by vectors in a text-based distributional semantic space (see Section 4.3).
distributional semantics is mentioned in 5 sentences in this paper.
Topics mentioned in this paper:
Beltagy, Islam and Erk, Katrin and Mooney, Raymond
Background
2.2 Distributional Semantics
Background
Distributional semantic knowledge is then encoded as weighted inference rules in the MLN.
PSL for STS
Given the logical forms for a pair of sentences, a text T and a hypothesis H, and given a set of weighted rules derived from the distributional semantics (as explained in section 2.6) composing the knowledge base KB, we build a PSL model that supports determining the truth value of H in the most probable interpretation (i.e.
PSL for STS
KB: The knowledge base is a set of lexical and phrasal rules generated from distributional semantics, along with a similarity score for each rule (section 2.6).
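The rule-generation step can be sketched as follows (an illustrative simplification, not the paper's implementation): pair words across T and H, score each pair by cosine similarity, and emit one weighted rule per pair:

```python
# Sketch: derive weighted lexical inference rules from distributional similarity.
# For each (text word, hypothesis word) pair, emit "t(x) -> h(x)" weighted by
# the cosine of their vectors; toy vectors stand in for a real DSM.
import numpy as np

rng = np.random.default_rng(0)
vecs = {w: rng.normal(size=10) for w in ["man", "guy", "walk", "stroll"]}

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

text_words, hyp_words = ["man", "walk"], ["guy", "stroll"]
kb = [(f"{t}(x) -> {h}(x)", cosine(vecs[t], vecs[h]))
      for t in text_words for h in hyp_words]
for rule, weight in kb:
    print(f"{weight:+.2f}  {rule}")
```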
distributional semantics is mentioned in 4 sentences in this paper.
Topics mentioned in this paper:
Lippincott, Thomas and Korhonen, Anna and Ó Séaghdha, Diarmuid
Conclusions and future work
In a sense this is encouraging, as it motivates our most exciting future work: augmenting this simple model to explicitly capture complementary information such as distributional semantics (Blei et al., 2003), diathesis alternations (McCarthy, 2000) and selectional preferences (Ó Séaghdha, 2010).
Conclusions and future work
By combining the syntactic classes with unsupervised POS tagging (Teichert and Daumé III, 2009) and the selectional preferences with distributional semantics (Ó Séaghdha, 2010), we hope to produce more accurate results on these complementary tasks while avoiding the use of any supervised learning.
Previous work
Graphical models have been increasingly popular for a variety of tasks such as distributional semantics (Blei et al., 2003) and unsupervised POS tagging (Finkel et al., 2007), and sampling methods allow efficient estimation of full joint distributions (Neal, 1993).
distributional semantics is mentioned in 3 sentences in this paper.
Topics mentioned in this paper: