Index of papers in Proc. ACL 2013 that mention
  • co-occurrence
Chong, Tze Yuang and E. Banchs, Rafael and Chng, Eng Siong and Li, Haizhou
Abstract
In this paper, we explore the use of distance and co-occurrence information of word-pairs for language modeling.
Introduction
the distance is described regardless of the actual frequency of the history-word, while the co-occurrence is described regardless of the actual position of the history-word.
Language Modeling with TD and TO
In Eq.3, we have decoupled the observation of a word-pair into the events of distance and co-occurrence.
Language Modeling with TD and TO
The TD likelihood for a distance k given the co-occurrence of the word-pair (w_{t-k}, w_t) can be estimated from counts as follows:
Language Modeling with TD and TO
zero co-occurrence count C(w_{t-k} ∈ h_t, w_t) = 0, which results in a division by zero.
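The zero-count problem noted above is the usual motivation for smoothing. A minimal sketch (not the authors' implementation; function and variable names are hypothetical) of a count-based distance-likelihood estimate with add-alpha smoothing, which keeps the denominator nonzero even for unseen word-pairs:

```python
from collections import Counter

def td_likelihood(pair_distance_counts, pair, k, max_dist=10, alpha=1.0):
    """Estimate P(distance = k | co-occurrence of word pair) from counts.

    pair_distance_counts: Counter mapping (pair, distance) -> count.
    Add-alpha smoothing avoids the division by zero that arises when
    the pair has a zero co-occurrence count.
    """
    numer = pair_distance_counts[(pair, k)] + alpha
    denom = (sum(pair_distance_counts[(pair, d)] for d in range(1, max_dist + 1))
             + alpha * max_dist)
    return numer / denom

# Toy counts: ("cat", "sat") seen 3 times at distance 1, once at distance 2.
counts = Counter({(("cat", "sat"), 1): 3, (("cat", "sat"), 2): 1})
p = td_likelihood(counts, ("cat", "sat"), 1)
```

An unseen pair falls back to the uniform distribution 1/max_dist, which is the standard behavior of this smoothing scheme.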
Motivation of the Proposed Approach
The attributes of distance and co-occurrence are exploited and modeled differently in each language modeling approach.
Motivation of the Proposed Approach
Both the conventional trigger model and the latent-semantic model capture the co-occurrence information while ignoring the distance information.
Motivation of the Proposed Approach
On the other hand, distant-bigram models and distance-dependent trigger models make use of both distance and co-occurrence information, up to window sizes of ten to twenty.
co-occurrence is mentioned in 11 sentences in this paper.
Topics mentioned in this paper:
Beigman Klebanov, Beata and Flor, Michael
Application to Essay Scoring
It is also possible that some of the instances with very high PMI are pairs that contain low-frequency words for which the database predicts a spuriously high PMI based on a single (and atypical) co-occurrence that happens to repeat in an essay, similar to the Schwartz eschews example in (Manning and Schütze, 1999, Table 5.16, p. 181).
Introduction
Thus, co-occurrence of words in n-word windows, syntactic structures, sentences, paragraphs, and even whole documents is captured in vector-space models built from text corpora (Turney and Pantel, 2010; Basili and Pennacchiotti, 2010; Erk and Pado, 2008; Mitchell and Lapata, 2008; Bullinaria and Levy, 2007; Jones and Mewhort, 2007; Pado and Lapata, 2007; Lin, 1998; Landauer and Dumais, 1997; Lund and Burgess, 1996; Salton et al., 1975).
Introduction
However, little is known about typical profiles of texts in terms of co-occurrence behavior of their words.
Introduction
The cited approaches use topic models that are in turn estimated using word co-occurrence.
Methodology
The first decision is how to quantify the extent of co-occurrence between two words; we will use point-wise mutual information (PMI) estimated from a large and diverse corpus of texts.
Methodology
The third decision is how to represent the co-occurrence profiles; we use a histogram where each bin represents the proportion of word pairs in the given interval of PMI values.
Methodology
To obtain comprehensive information about typical co-occurrence behavior of words of English, we build a first-order co-occurrence word-space model (Turney and Pantel, 2010; Baroni and Lenci, 2010).
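The three methodological decisions quoted above (PMI as the co-occurrence measure, estimated from corpus counts, summarized as a histogram over PMI intervals) can be sketched as follows. This is a minimal illustration, not the authors' code; window size, bin edges, and all names are assumptions:

```python
import math
from collections import Counter

def cooccurrence_pmi(texts, window=5):
    """Estimate PMI(x, y) from word co-occurrence within a token window."""
    word_counts, pair_counts = Counter(), Counter()
    total = 0
    for tokens in texts:
        total += len(tokens)
        word_counts.update(tokens)
        for i, w in enumerate(tokens):
            for v in tokens[i + 1:i + window]:
                pair_counts[tuple(sorted((w, v)))] += 1
    n = sum(pair_counts.values())
    return {
        (x, y): math.log2((c / n) /
                          ((word_counts[x] / total) * (word_counts[y] / total)))
        for (x, y), c in pair_counts.items()
    }

def pmi_profile(word_pairs, pmi, bin_edges=(-4.0, 0.0, 4.0, 8.0)):
    """Histogram profile: proportion of word pairs per PMI interval.
    Pairs unseen in the reference corpus are skipped."""
    bins = [0] * (len(bin_edges) + 1)
    for x, y in word_pairs:
        value = pmi.get(tuple(sorted((x, y))))
        if value is None:
            continue
        bins[sum(value >= edge for edge in bin_edges)] += 1
    n = sum(bins)
    return [b / n for b in bins] if n else bins

pmi = cooccurrence_pmi([["a", "b", "a", "c"]], window=2)
profile = pmi_profile([("a", "b"), ("a", "c"), ("x", "y")], pmi)
```

In practice the PMI table would be estimated once from a large reference corpus, and a profile computed per essay.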
co-occurrence is mentioned in 8 sentences in this paper.
Topics mentioned in this paper:
Maxwell, K. Tamsin and Oberlander, Jon and Croft, W. Bruce
Experimental setup
Co-occurrence features (C)
Experimental setup
Co-occurrence and IR effectiveness prediction features (CI) formed the most influential class, accounting for 70% of all features in the model.
Related work
In ad hoc IR, most models of term dependence use word co-occurrence and proximity (Song and Croft, 1999; Metzler and Croft, 2005; Srikanth and Srihari, 2002; van Rijsbergen, 1993).
Selection method for catenae
Co-occurrence features: A governor w_1 tends to subcategorize for its dependents w_n.
Selection method for catenae
We conclude that co-occurrence is an important feature of dependency relations (Mel’čuk, 2003).
Selection method for catenae
In addition, term frequencies and inverse document frequencies calculated using word co-occurrence measures are commonly used in IR.
co-occurrence is mentioned in 8 sentences in this paper.
Topics mentioned in this paper:
Ogura, Yukari and Kobayashi, Ichiro
Experiment
Thus, constructing a graph based on word co-occurrence within each window of three sentences in a document works well for ranking important words, taking the context of each word into account.
Introduction
In our study, we express the relation of word co-occurrence in the form of a graph.
Related studies
(2011) detected topics in a document by constructing a word co-occurrence graph and applying the PageRank algorithm to it.
Related studies
The graph used in our method is constructed based on word co-occurrence so that important words which are sensitive to latent information can be extracted by the PageRank algorithm.
Techniques for text classification
According to (Newman et al., 2010), topic coherence is related to word co-occurrence.
Techniques for text classification
The refined documents are composed of the important sentences extracted from the viewpoint of latent information, i.e., word co-occurrence, so they are well suited to classification based on latent information.
Techniques for text classification
In our study, we construct a graph based on word co-occurrence.
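The pipeline described in these excerpts (build a co-occurrence graph over sentence windows, then rank words with PageRank) can be sketched as below. This is an illustrative reconstruction under stated assumptions, not the paper's implementation; window size and all names are hypothetical:

```python
from collections import defaultdict
from itertools import combinations

def cooccurrence_graph(sentences, chunk=3):
    """Link every pair of words that co-occur within each chunk of sentences."""
    graph = defaultdict(set)
    for i in range(0, len(sentences), chunk):
        words = {w for sent in sentences[i:i + chunk] for w in sent}
        for u, v in combinations(sorted(words), 2):
            graph[u].add(v)
            graph[v].add(u)
    return graph

def pagerank(graph, damping=0.85, iters=50):
    """Plain PageRank over an undirected co-occurrence graph."""
    nodes = list(graph)
    rank = {n: 1 / len(nodes) for n in nodes}
    for _ in range(iters):
        rank = {
            n: (1 - damping) / len(nodes)
               + damping * sum(rank[m] / len(graph[m]) for m in graph[n])
            for n in nodes
        }
    return rank

sents = [["a", "b"], ["b", "c"], ["a", "c"]]
rank = pagerank(cooccurrence_graph(sents))
```

On this toy triangle graph the ranks are uniform; on real text, words co-occurring with many distinct neighbors accumulate higher rank.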
co-occurrence is mentioned in 8 sentences in this paper.
Topics mentioned in this paper:
Pereira, Lis and Manguilimotan, Erlyn and Matsumoto, Yuji
Related Work
Table 2 Context of a particular noun represented as a co-occurrence vector
Related Work
Context is represented as co-occurrence vectors that are based on syntactic dependencies.
Related Work
Table 3 Context of a particular noun represented as a co-occurrence vector
co-occurrence is mentioned in 6 sentences in this paper.
Topics mentioned in this paper:
Tian, Zhenhua and Xiang, Hengheng and Liu, Ziqi and Zheng, Qinghua
RSP: A Random Walk Model for SP
We initiate the links E with the raw co-occurrence counts of seen predicate-argument pairs in the given generalization data.
RSP: A Random Walk Model for SP
But in SP, the preferences between the predicates and arguments are implicit: their co-occurrence counts follow a power-law distribution and vary greatly.
RSP: A Random Walk Model for SP
investigate the correlations between the co-occurrence counts (CT) C(q, a), or smoothed counts, and human plausibility judgements (Lapata et al., 1999; Lapata et al., 2001).
Related Work 2.1 WordNet-based Approach
(1999) introduce a general similarity-based model for word co-occurrence probabilities, which can be interpreted for SP.
co-occurrence is mentioned in 6 sentences in this paper.
Topics mentioned in this paper:
Lazaridou, Angeliki and Marelli, Marco and Zamparelli, Roberto and Baroni, Marco
Composition methods
Distributional semantic models (DSMs), also known as vector-space models, semantic spaces, or by the names of famous incarnations such as Latent Semantic Analysis or Topic Models, approximate the meaning of words with vectors that record their patterns of co-occurrence with corpus context features (often, other words).
Experimental setup
We collect co-occurrence statistics for the top 20K content words (adjectives, adverbs, nouns, verbs)
Experimental setup
Due to differences in co-occurrence weighting schemes (we use a logarithmically scaled measure, they do not), their multiplicative model is closer to our additive one.
Introduction
Distributional semantic models (DSMs) in particular represent the meaning of a word by a vector, the dimensions of which encode corpus-extracted co-occurrence statistics, under the assumption that words that are semantically similar will occur in similar contexts (Turney and Pantel, 2010).
Introduction
Trying to represent the meaning of arbitrarily long constructions by directly collecting co-occurrence statistics is obviously ineffective and thus methods have been developed to derive the meaning of larger constructions as a function of the meaning of their constituents (Baroni and Zamparelli, 2010; Coecke et al., 2010; Mitchell and Lapata, 2008; Mitchell and Lapata, 2010; Socher et al., 2012).
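The additive composition of log-scaled co-occurrence vectors mentioned in these excerpts can be sketched as follows. A minimal illustration only, not the authors' model; the toy vectors and all function names are assumptions:

```python
import math

def log_scale(vec):
    """Logarithmically scaled co-occurrence counts (cf. the weighting
    scheme contrast noted in the excerpt above)."""
    return {ctx: math.log1p(c) for ctx, c in vec.items()}

def add_compose(u, v):
    """Additive composition: the phrase vector is the sum of its
    constituents' vectors over shared context dimensions."""
    out = dict(u)
    for ctx, c in v.items():
        out[ctx] = out.get(ctx, 0.0) + c
    return out

def cosine(u, v):
    dot = sum(u.get(k, 0.0) * v.get(k, 0.0) for k in u)
    nu = math.sqrt(sum(x * x for x in u.values()))
    nv = math.sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

# Toy co-occurrence counts over context words (hypothetical data).
red = {"wine": 4, "colour": 2}
car = {"wheel": 3, "colour": 1}
phrase = add_compose(log_scale(red), log_scale(car))
```

Real DSMs would use tens of thousands of context dimensions and an association weighting such as PMI rather than raw counts.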
co-occurrence is mentioned in 5 sentences in this paper.
Topics mentioned in this paper:
Razmara, Majid and Siahbani, Maryam and Haffari, Reza and Sarkar, Anoop
Collocational Lexicon Induction
A distributional profile (DP) of a word or phrase type is a co-occurrence vector created by combining all co-occurrence vectors of the tokens of that phrase type.
Collocational Lexicon Induction
These co-occurrence counts are converted to an association measure (Section 2.2) that encodes the relatedness of each pair of words or phrases.
Collocational Lexicon Induction
A(·,·) is an association measure and can simply be defined as co-occurrence counts within sliding windows.
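The simplest association measure named above, raw co-occurrence counts within a sliding window, can be sketched as follows (an illustrative stand-in, not the paper's code; window size and names are assumptions):

```python
from collections import Counter

def window_cooccurrence(tokens, window=2):
    """A(x, y) as symmetric raw co-occurrence counts: for each token,
    count every token within the next `window` positions as a co-occurrence."""
    counts = Counter()
    for i, w in enumerate(tokens):
        for v in tokens[i + 1:i + 1 + window]:
            counts[(w, v)] += 1
            counts[(v, w)] += 1
    return counts

tokens = "the cat sat on the mat".split()
A = window_cooccurrence(tokens, window=2)
```

These raw counts are what would then be converted into a stronger association measure (Section 2.2 of the paper) encoding the relatedness of each pair.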
Related work
They used a graph based on context similarity as well as co-occurrence graph in propagation process.
co-occurrence is mentioned in 4 sentences in this paper.
Topics mentioned in this paper:
Huang, Hongzhao and Wen, Zhen and Yu, Dian and Ji, Heng and Sun, Yizhou and Han, Jiawei and Li, He
Experiments
Retweets and redundant web documents are filtered to ensure more reliable frequency counting of co-occurrence relations.
Introduction
Thus, the co-occurrence of a morph and its target is quite low in the vast amount of information in social media.
Target Candidate Ranking
After applying the same annotation techniques as tweets for uncensored data sets, sentence-level co-occurrence relations are extracted and integrated into the network as shown in Figure 3.
co-occurrence is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Weller, Marion and Fraser, Alexander and Schulte im Walde, Sabine
Using subcategorization information
In contrast, noun modification in a noun-nounGen construction is represented by co-occurrence frequencies.
Using subcategorization information
and also f = 0: this representation allows for a more fine-grained distinction in the low-to-mid frequency range, providing a good basis for deciding whether a given noun-noun pair is a true noun-nounGen structure or just a random co-occurrence of two nouns.
Using subcategorization information
The word Technologie (technology) has been marked as a candidate for the genitive in a noun-nounGen construction; the co-occurrence frequency of the tuple Einführung-Technologie (introduction - technology) lies in bucket 11.
co-occurrence is mentioned in 3 sentences in this paper.
Topics mentioned in this paper: