Vector spaces for historical linguistics: Using distributional semantics to study syntactic productivity in diachrony
Perek, Florent

Article Structure

Abstract

This paper describes an application of distributional semantics to the study of syntactic productivity in diachrony, i.e., the property of grammatical constructions to attract new lexical items over time.

Introduction

Language change does not exclusively consist of drastic shifts in ‘core’ aspects of grammar, such as changes in word order.

The hell-construction

The case study presented in this paper considers the syntactic pattern “V the hell out of NP”, as exemplified by the following sentences from the Corpus of Contemporary American English (COCA; Davies, 2008):

Distributional measure of semantic similarity

Drawing on the observation that words occurring in similar contexts tend to have related meanings (Miller and Charles, 1991), distributional approaches to semantics seek to capture the meaning of words through their distribution in large text corpora (Lenci, 2008; Turney and Pantel, 2010; Erk, 2012).

Application of the vector-space model

4.1 Semantic plots

Conclusion

This paper reports the first attempt at using a distributional measure of semantic similarity derived from a vector-space model for the study of syntactic productivity in diachrony.

Topics

semantic space

Appears in 14 sentences as: semantic space (14)
In Vector spaces for historical linguistics: Using distributional semantics to study syntactic productivity in diachrony
  1. Coverage relates to how the semantic domain of a construction is populated in the vicinity of a given target coinage, and in particular to the density of the semantic space.
    Page 1, “Introduction”
  2. The resulting matrix, which contains the distributional information (in 4,683 columns) for 92 verbs occurring in the hell-construction, constitutes the semantic space under consideration in this case study.
    Page 3, “Distributional measure of semantic similarity”
  3. Besides, using the same data presents the advantage that the distribution is modeled with the same semantic space in all time periods, which makes it easier to visualize changes.
    Page 3, “Distributional measure of semantic similarity”
  4. One of the advantages conferred by the quantification of semantic similarity is that lexical items can be precisely considered in relation to each other, and by aggregating the similarity information for all items in the distribution, we can produce a visual representation of the structure of the semantic domain of the construction in order to observe how verbs in that domain are related to each other, and to immediately identify the regions of the semantic space that are densely populated (with tight clusters of verbs), and those that are more sparsely populated (fewer and/or more scattered verbs).
    Page 3, “Application of the vector-space model”
  5. Outside of these two clusters, the semantic space is much more sparsely populated.
    Page 4, “Application of the vector-space model”
  6. In sum, the semantic plots show that densely populated regions of the semantic space appear to be the most likely to attract new members.
    Page 4, “Application of the vector-space model”
  7. With the quantification of semantic similarity provided by the distributional semantic model, it is also possible to properly test the hypothesis that productivity is tied to the structure of the semantic space.
    Page 4, “Application of the vector-space model”
  8. In view of the observations collected on the semantic plots and in line with previous research (especially Suttle and Goldberg’s notion of coverage), I suggest that the occurrence of a new item in the construction in a given period is related to the density of the semantic space around that item in the previous period.
    Page 4, “Application of the vector-space model”
  9. If the semantic space around the novel item is dense, i.e., if there is a high number of similar items, the coinage will be very likely.
    Page 4, “Application of the vector-space model”
  10. The sparser the semantic space around a given item, the less likely this item can be used.
    Page 4, “Application of the vector-space model”
  11. The measure of density used in this study considers the set of the N nearest neighbors of a given item in the semantic space, and is defined by the following formula:
    Page 4, “Application of the vector-space model”
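
The formula itself is not reproduced in this excerpt. As a rough illustration only, the sketch below assumes one plausible definition: one minus the mean distance from an item to its N nearest neighbors, so that tighter neighborhoods score higher. The function name `density`, the verbs, and the distance values are all hypothetical and are not taken from the paper.

```python
def density(distances, n):
    """Assumed density measure: 1 minus the mean distance to the n nearest neighbors.

    `distances` holds the distances from one target item to every other item
    attested in the construction (e.g. cosine distances in the semantic space).
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    nearest = sorted(distances)[:n]  # the n smallest distances
    if not nearest:
        raise ValueError("no neighbors to measure density against")
    return 1.0 - sum(nearest) / len(nearest)


# Hypothetical distances from a candidate verb to three attested verbs.
distances_to_attested = {"beat": 0.2, "kick": 0.3, "love": 0.9}
print(density(distances_to_attested.values(), n=2))
```

With N = 2 only the two closest verbs contribute, so the distant item does not lower the score; raising N makes the measure reflect a wider neighborhood.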

See all papers in Proc. ACL 2014 that mention semantic space.

semantic similarity

Appears in 10 sentences as: semantic similarity (10)
In Vector spaces for historical linguistics: Using distributional semantics to study syntactic productivity in diachrony
  1. By providing an empirical measure of semantic similarity between words derived from lexical co-occurrences, distributional semantics not only reliably captures how the verbs in the distribution of a construction are related, but also enables the use of visualization techniques and statistical modeling to analyze the semantic development of a construction over time and identify the semantic determinants of syntactic productivity in naturally occurring data.
    Page 1, “Abstract”
  2. In this paper, I present a third alternative that takes advantage of advances in computational linguistics and draws on a distributionally-based measure of semantic similarity.
    Page 1, “Introduction”
  3. To answer these questions, I will analyze the distribution of the construction from a semantic point of view by using a measure of semantic similarity derived from distributional information.
    Page 2, “The hell-construction”
  4. One benefit of the distributional semantics approach is that it allows semantic similarity between words to be quantified by measuring the similarity in their distribution.
    Page 2, “Distributional measure of semantic similarity”
  5. According to Sahlgren (2008), this kind of model captures to what extent words can be substituted for each other, which is a good measure of semantic similarity between verbs.
    Page 2, “Distributional measure of semantic similarity”
  6. In order to make sure that enough distributional information is available to reliably assess semantic similarity, verbs with fewer than 2,000 occurrences were excluded, which left 92 usable items (out of 105).
    Page 3, “Distributional measure of semantic similarity”
  7. One of the advantages conferred by the quantification of semantic similarity is that lexical items can be precisely considered in relation to each other, and by aggregating the similarity information for all items in the distribution, we can produce a visual representation of the structure of the semantic domain of the construction in order to observe how verbs in that domain are related to each other, and to immediately identify the regions of the semantic space that are densely populated (with tight clusters of verbs), and those that are more sparsely populated (fewer and/or more scattered verbs).
    Page 3, “Application of the vector-space model”
  8. With the quantification of semantic similarity provided by the distributional semantic model, it is also possible to properly test the hypothesis that productivity is tied to the structure of the semantic space.
    Page 4, “Application of the vector-space model”
  9. This paper reports the first attempt at using a distributional measure of semantic similarity derived from a vector-space model for the study of syntactic productivity in diachrony.
    Page 5, “Conclusion”
  10. Not only does distributional semantics provide an empirically-based measure of semantic similarity that appropriately captures semantic distinctions, it also enables the use of methods for which quantification is necessary, such as data visualization and statistical analysis.
    Page 5, “Conclusion”
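
To make "measuring the similarity in their distribution" concrete, here is a minimal sketch that compares two verbs by the cosine of the angle between their collocate-weight vectors. Cosine is a common choice in vector-space models, but this excerpt does not state which similarity function the paper uses, so treat it as an assumption. The verbs, collocate labels, and weights are invented for illustration.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two sparse vectors stored as {collocate: weight}."""
    shared = set(u) & set(v)  # only shared collocates contribute to the dot product
    dot = sum(u[k] * v[k] for k in shared)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # convention chosen here for empty vectors
    return dot / (norm_u * norm_v)


# Hypothetical weighted vectors for three verbs.
vectors = {
    "beat": {"opponent_n": 2.0, "team_n": 1.0},
    "kick": {"opponent_n": 1.0, "ball_n": 2.0},
    "love": {"family_n": 3.0},
}

print(cosine_similarity(vectors["beat"], vectors["kick"]))
print(cosine_similarity(vectors["beat"], vectors["love"]))
```

Verbs that share heavily weighted collocates score close to 1, while verbs with no collocates in common score 0.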

distributional semantics

Appears in 6 sentences as: distributional semantic (1) distributional semantics (5)
In Vector spaces for historical linguistics: Using distributional semantics to study syntactic productivity in diachrony
  1. This paper describes an application of distributional semantics to the study of syntactic productivity in diachrony, i.e., the property of grammatical constructions to attract new lexical items over time.
    Page 1, “Abstract”
  2. By providing an empirical measure of semantic similarity between words derived from lexical co-occurrences, distributional semantics not only reliably captures how the verbs in the distribution of a construction are related, but also enables the use of visualization techniques and statistical modeling to analyze the semantic development of a construction over time and identify the semantic determinants of syntactic productivity in naturally occurring data.
    Page 1, “Abstract”
  3. On the basis of a case study of the construction “V the hell out of NP”, I show how distributional semantics can profitably be applied to the study of syntactic productivity.
    Page 1, “Introduction”
  4. One benefit of the distributional semantics approach is that it allows semantic similarity between words to be quantified by measuring the similarity in their distribution.
    Page 2, “Distributional measure of semantic similarity”
  5. With the quantification of semantic similarity provided by the distributional semantic model, it is also possible to properly test the hypothesis that productivity is tied to the structure of the semantic space.
    Page 4, “Application of the vector-space model”
  6. Not only does distributional semantics provide an empirically-based measure of semantic similarity that appropriately captures semantic distinctions, it also enables the use of methods for which quantification is necessary, such as data visualization and statistical analysis.
    Page 5, “Conclusion”

co-occurrence

Appears in 3 sentences as: co-occurrence (4)
In Vector spaces for historical linguistics: Using distributional semantics to study syntactic productivity in diachrony
  1. A wide range of distributional information can be employed in vector-based models; the present study uses the ‘bag of words’ approach, which is based on the frequency of co-occurrence of words within a given context window.
    Page 2, “Distributional measure of semantic similarity”
  2. The part-of-speech annotated lemma of each collocate within a 5-word window was extracted from the COCA data to build the co-occurrence matrix recording the frequency of co-occurrence of each verb with its collocates.
    Page 3, “Distributional measure of semantic similarity”
  3. The co-occurrence matrix was transformed by applying a Point-wise Mutual Information weighting scheme, using the DISSECT toolkit (Dinu et al., 2013), to turn the raw frequencies into weights that reflect how distinctive a collocate is for a given target word with respect to the other target words under consideration.
    Page 3, “Distributional measure of semantic similarity”
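
The two steps described above, counting collocates in a context window and reweighting the counts with pointwise mutual information, can be sketched as follows. This is a generic illustration, not the DISSECT implementation: the function names are hypothetical, the window is assumed to extend the same number of tokens on each side of the target, and the plain PMI variant log2(f(t,c) * N / (f(t) * f(c))) is assumed.

```python
import math
from collections import Counter, defaultdict


def cooccurrence_counts(tokens, targets, window=5):
    """Count how often each collocate occurs within `window` tokens of each target."""
    counts = defaultdict(Counter)
    for i, tok in enumerate(tokens):
        if tok not in targets:
            continue
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:  # skip the target token itself
                counts[tok][tokens[j]] += 1
    return counts


def pmi_weights(counts):
    """Turn raw co-occurrence counts into PMI weights, relative to the matrix itself."""
    total = sum(sum(row.values()) for row in counts.values())
    row_sums = {t: sum(row.values()) for t, row in counts.items()}
    col_sums = Counter()
    for row in counts.values():
        col_sums.update(row)  # adds the counts per collocate
    return {
        t: {c: math.log2(f * total / (row_sums[t] * col_sums[c])) for c, f in row.items()}
        for t, row in counts.items()
    }


# Toy corpus; a real study would use lemmatized, part-of-speech tagged corpus data.
tokens = "they beat the team and we love the team".split()
counts = cooccurrence_counts(tokens, targets={"beat", "love"}, window=2)
weights = pmi_weights(counts)
print(weights["beat"])
```

Because the marginals are computed over the target verbs only, a collocate shared by every target (here "the" and "team") receives a lower weight than one that is specific to a single verb (here "they").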

logistic regression

Appears in 3 sentences as: logistic regression (3)
In Vector spaces for historical linguistics: Using distributional semantics to study syntactic productivity in diachrony
  1. This measure of density was used as a factor in logistic regression to predict the first occurrence of a verb in the construction, coded as the binary variable OCCURRENCE, set to 1 for the first period in which the verb is attested in the construction, and to 0 for all preceding periods (later periods were discarded).
    Page 5, “Application of the vector-space model”
  2. Table 1: Summary of logistic regression results for different values of N. Model formula: OCCURRENCE ~ DENSITY + (1 + DENSITY|VERB).
    Page 5, “Application of the vector-space model”
  3. Using multidimensional scaling and logistic regression, it was shown that the occurrence of new items throughout the history of the construction can be predicted by the density of the semantic space in the neighborhood of these items in prior usage.
    Page 5, “Conclusion”
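
The coding of the response variable described in item 1 can be sketched as below: for each verb, every period before its first attestation is coded 0, the period of first attestation is coded 1, and later periods are dropped. The function name and the period labels are hypothetical, and fitting the mixed-effects model itself is left out of this sketch.

```python
def code_occurrence(periods, first_attested):
    """Return (period, OCCURRENCE) rows for one verb.

    `periods` must be in chronological order; periods after the first
    attestation are discarded, as in the description above.
    """
    if first_attested not in periods:
        raise ValueError("first_attested must be one of the periods")
    rows = []
    for period in periods:
        if period == first_attested:
            rows.append((period, 1))
            break  # later periods are not part of the data set
        rows.append((period, 0))
    return rows


# Hypothetical period labels; a verb first attested in the third period.
print(code_occurrence(["P1", "P2", "P3", "P4"], first_attested="P3"))
```

Each verb therefore contributes at most one row coded 1, preceded by one row coded 0 for every earlier period in which it was still unattested.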
