Index of papers in Proc. ACL that mention
  • clusterings
Kawahara, Daisuke and Peterson, Daniel W. and Palmer, Martha
Experiments and Evaluations
We first describe our experimental settings and define evaluation metrics to evaluate induced soft clusterings of verb classes.
Experiments and Evaluations
This kind of normalization for soft clusterings was performed for other evaluation metrics as in Springorum et al.
Experiments and Evaluations
(2003) evaluated hard clusterings based on a gold standard with multiple classes per verb.
Introduction
Moreover, to the best of our knowledge, none of the following approaches attempt to quantitatively evaluate soft clusterings of verb classes induced by polysemy-aware unsupervised approaches (Korhonen et al., 2003; Lapata and Brew, 2004; Li and Brew, 2007; Schulte im Walde et al., 2008).
clusterings is mentioned in 9 sentences in this paper.
Kim, Young-Bum and Snyder, Benjamin
Analysis
Figure 4 shows MAP transition Dirichlet hyperparameters of the CLUST model, when trained
Analysis
Finally, we examine the relationship between the induced clusters and language families in Table 3, for the trigram consonant vs. vowel CLUST model with 20 clusters.
Experiments
(Table residue: two accuracy columns per model in three settings. Setting 1: EM 93.37/74.59, SYMM 95.99/80.72, MERGE 97.14/86.13, CLUST 98.85/89.37. Setting 2: EM 94.50/74.53, SYMM 96.18/78.13, MERGE 97.66/86.47, CLUST 98.55/89.07. Setting 3: EM 92.93/78.26, SYMM 95.90/79.04, MERGE 96.06/83.78, CLUST 97.03/85.79.)
Experiments
Finally, we consider the full version of our model, CLUST , with 20 language clusters.
Results
Figure 3: Confusion matrix for CLUST (left) and EM (right).
Results
Both MERGE and CLUST break symmetries over tags by way of the asymmetric posterior over transition Dirichlet parameters.
clusterings is mentioned in 6 sentences in this paper.
Uszkoreit, Jakob and Brants, Thorsten
Abstract
The resulting clusterings are then used in training partially class-based language models.
Distributed Clustering
The clusterings generated in each iteration as well as the initial clustering are stored as the set of words in each cluster, the total number of occurrences of each cluster in the training corpus, and the list of words preceding each cluster.
Distributed Clustering
The quality of class-based models trained using the resulting clusterings did not differ noticeably from those trained using clusterings for which the full vocabulary was considered in each iteration.
Experiments
We trained a number of predictive class-based language models on different Arabic and English corpora using clusterings trained on the complete data of the same corpus.
Experiments
For the first experiment we trained predictive class-based 5-gram models using clusterings with 64, 128, 256 and 512 clusters on the en target data.
clusterings is mentioned in 5 sentences in this paper.
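The predictive class-based factorization referenced in the Uszkoreit and Brants snippets above can be illustrated in miniature. This is a bigram-order, in-memory sketch under the usual factorization p(w_i | w_{i-1}) = p(c(w_i) | w_{i-1}) * p(w_i | c(w_i)); the paper's models are 5-gram and trained with a distributed clustering pipeline, and all names and data below are hypothetical.

```python
from collections import defaultdict

def train_predictive_class_bigram(sentences, word2cluster):
    """Relative-frequency estimates for p(c(w_i) | w_{i-1}) and p(w_i | c(w_i)),
    the two factors of a predictive class-based bigram model."""
    cluster_given_prev = defaultdict(lambda: defaultdict(int))
    word_given_cluster = defaultdict(lambda: defaultdict(int))
    for sent in sentences:
        prev_words = ["<s>"] + sent[:-1]
        for prev, word in zip(prev_words, sent):
            cluster = word2cluster[word]
            cluster_given_prev[prev][cluster] += 1
            word_given_cluster[cluster][word] += 1
    return cluster_given_prev, word_given_cluster

def prob(word, prev, word2cluster, cluster_given_prev, word_given_cluster):
    """p(word | prev) = p(cluster(word) | prev) * p(word | cluster(word))."""
    cluster = word2cluster[word]
    c_counts = cluster_given_prev[prev]
    w_counts = word_given_cluster[cluster]
    p_cluster = c_counts[cluster] / sum(c_counts.values()) if c_counts else 0.0
    p_word = w_counts[word] / sum(w_counts.values()) if w_counts else 0.0
    return p_cluster * p_word

# Toy usage with a hypothetical two-cluster vocabulary.
word2cluster = {"the": 0, "a": 0, "dog": 1, "cat": 1}
sentences = [["the", "dog"], ["a", "cat"], ["the", "cat"]]
cgp, wgc = train_predictive_class_bigram(sentences, word2cluster)
print(prob("cat", "the", word2cluster, cgp, wgc))  # 2/2 * 2/3 = 0.666...
```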
Lin, Dekang and Wu, Xiaoyun
Discussion and Related Work
One advantage of the two-stage approach is that the same clusterings may be used for different problems or different components of the same system.
Discussion and Related Work
One nagging issue with K-Means clustering is how to set k. We show that this question may not need to be answered because we can use clusterings with different k’s at the same time and let the discriminative classifier cherry-pick the clusters at different granularities according to the supervised data.
Named Entity Recognition
We can easily use multiple clusterings in feature extraction.
Query Classification
When we extract features from multiple clusterings, the selection of the top-N clusters is done separately for each clustering.
Query Classification
The best result is achieved with multiple phrasal clusterings.
clusterings is mentioned in 5 sentences in this paper.
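The idea quoted above from Lin and Wu, running several clusterings with different k and letting the discriminative classifier pick the useful granularity, reduces to emitting one cluster-membership feature per clustering. A minimal sketch, assuming cluster assignments are available as dictionaries; the clustering names and ids are hypothetical.

```python
def cluster_features(token, clusterings):
    """Emit one feature per clustering (e.g. K-Means runs with different k),
    so a downstream discriminative classifier can cherry-pick granularities.

    `clusterings` maps a clustering name to a token -> cluster-id dict."""
    feats = {}
    for name, assignment in clusterings.items():
        if token in assignment:
            feats[f"{name}={assignment[token]}"] = 1.0
    return feats

clusterings = {
    "kmeans-64":  {"apple": 12, "pear": 12, "paris": 40},
    "kmeans-512": {"apple": 301, "pear": 305, "paris": 77},
}
print(cluster_features("apple", clusterings))
# {'kmeans-64=12': 1.0, 'kmeans-512=301': 1.0}
```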
Bamman, David and O'Connor, Brendan and Smith, Noah A.
Evaluation
We evaluate our methods in two quantitative ways by measuring the degree to which we recover two different sets of gold-standard clusterings.
Evaluation
To measure the similarity between the two clusterings of movie characters, gold clusters Q and induced latent persona clusters C, we calculate the variation of information (Meila, 2007):
Evaluation
VI measures the information-theoretic distance between the two clusterings: a lower value means greater similarity, and VI = 0 if they are identical.
clusterings is mentioned in 5 sentences in this paper.
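Variation of information, the metric the Bamman et al. snippets above quote from Meila (2007), is straightforward to compute from two parallel label lists: VI = H(gold) + H(induced) - 2·I(gold, induced). A small sketch; the character labels in the usage example are invented.

```python
import math
from collections import Counter

def variation_of_information(gold, induced):
    """VI between two clusterings of the same items, given as parallel label
    lists. Lower is better; 0 iff the two clusterings are identical."""
    n = len(gold)
    p_g = Counter(gold)
    p_i = Counter(induced)
    p_gi = Counter(zip(gold, induced))
    def entropy(counts):
        return -sum((c / n) * math.log(c / n) for c in counts.values())
    mutual_info = sum((c / n) * math.log((c / n) / ((p_g[g] / n) * (p_i[i] / n)))
                      for (g, i), c in p_gi.items())
    return entropy(p_g) + entropy(p_i) - 2 * mutual_info

gold    = ["hero", "hero", "villain", "villain"]
induced = [0, 0, 1, 0]
print(variation_of_information(gold, induced))  # > 0; would be 0.0 if identical
```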
Schütze, Hinrich
Experimental Setup
We run four different clusterings for each base set size (except for the large sets, see below).
Experimental Setup
The unique-event clusterings are motivated by the fact that in the Dupont-Rosenfeld model, frequent events are handled by discounted ML estimates.
Experimental Setup
As we will see below, rare-event clusterings perform better than all-event clusterings.
Results
When comparing all-event and unique-event clusterings, a clear tendency is apparent.
clusterings is mentioned in 4 sentences in this paper.
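One way to read the all-event versus unique-event contrast in the Schütze snippets above is token counts versus once-per-distinct-event counts when building the co-occurrence vectors that feed the clustering. The sketch below follows that reading, which is an assumption for illustration rather than the paper's exact construction.

```python
from collections import defaultdict

def cooccurrence_vectors(bigrams, unique_events=False):
    """Co-occurrence vectors for clustering histories. With unique_events=True
    each distinct (history, word) event is counted once; otherwise token
    counts are accumulated. Illustrative assumption, not the paper's setup."""
    vectors = defaultdict(lambda: defaultdict(int))
    seen = set()
    for history, word in bigrams:
        if unique_events:
            if (history, word) in seen:
                continue
            seen.add((history, word))
        vectors[history][word] += 1
    return {h: dict(v) for h, v in vectors.items()}

bigrams = [("new", "york")] * 50 + [("new", "jersey"), ("new", "idea")]
print(cooccurrence_vectors(bigrams, unique_events=False)["new"])  # {'york': 50, 'jersey': 1, 'idea': 1}
print(cooccurrence_vectors(bigrams, unique_events=True)["new"])   # {'york': 1, 'jersey': 1, 'idea': 1}
```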
Zollmann, Andreas and Vogel, Stephan
Clustering phrase pairs directly using the K-means algorithm
Using multiple word clusterings simultaneously, each based on a different number of classes, could turn this global, hard tradeoff into a local, soft one, informed by the number of phrase pair instances available for a given granularity.
Clustering phrase pairs directly using the K-means algorithm
In the same fashion, we can incorporate multiple tagging schemes (e.g., word clusterings of different granularities) into the same feature vector.
Experiments
Figure 1 (left) shows the performance of the distributional clustering model (‘Clust’) and its morphology-sensitive extension (‘Clust-morph’) according to this score for varying values of N = 1, . . .
Experiments
. . . , 36 (the number of Penn Treebank POS tags, used for the ‘POS’ models, is 36). For ‘Clust’, we see a comfortably wide plateau of nearly-identical scores from N = 7, . . .
clusterings is mentioned in 4 sentences in this paper.
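Folding several tagging schemes, such as word clusterings of different granularities, into one feature vector (as the Zollmann and Vogel snippet above describes) can look like the following sketch; the scheme names, tags, and fallback handling are illustrative assumptions.

```python
def multi_granularity_features(words, taggings):
    """Combine several tagging schemes (e.g. coarse and fine word clusterings)
    into one sparse feature dict for a phrase. `taggings` maps a scheme name
    to a word -> tag dict; unknown words fall back to an "UNK" tag."""
    feats = {}
    for scheme, tags in taggings.items():
        for position, word in enumerate(words):
            tag = tags.get(word, "UNK")
            feats[f"{scheme}:{position}:{tag}"] = 1.0
    return feats

taggings = {
    "clust-10":  {"house": "C3", "red": "C7"},
    "clust-100": {"house": "C41", "red": "C88"},
}
print(multi_granularity_features(["red", "house"], taggings))
# one indicator per (scheme, position, tag) triple
```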
Titov, Ivan and Klementiev, Alexandre
Inference
For each pair of predicates, we search for clusterings to maximize the sum of the log-probability and the negated penalty term.
Introduction
For predicates present in both sides of a bitext, we guide models in both languages to prefer clusterings which maximize agreement between predicate argument structures predicted for each aligned predicate pair.
Monolingual Model
Now, when parameters and argument key clusterings are chosen, we can summarize the remainder of the generative story as follows.
Problem Definition
The objective of this work is to improve argument key clusterings by inducing them simultaneously in two languages.
clusterings is mentioned in 4 sentences in this paper.
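The inference objective quoted above from Titov and Klementiev, the sum of the monolingual log-probabilities and a negated penalty for cross-lingual disagreement, can be sketched with a toy penalty that counts aligned argument pairs co-clustered in one language but not the other. The penalty form and weight here are illustrative assumptions, not the paper's exact term.

```python
from itertools import combinations

def agreement_penalty(clust_src, clust_tgt, aligned_pairs):
    """Count aligned argument pairs that are co-clustered in one language but
    not in the other - a simple proxy for disagreement between clusterings."""
    penalty = 0
    for (a_src, a_tgt), (b_src, b_tgt) in combinations(aligned_pairs, 2):
        same_src = clust_src[a_src] == clust_src[b_src]
        same_tgt = clust_tgt[a_tgt] == clust_tgt[b_tgt]
        if same_src != same_tgt:
            penalty += 1
    return penalty

def bilingual_objective(logprob_src, logprob_tgt, clust_src, clust_tgt,
                        aligned_pairs, weight=1.0):
    """Monolingual log-probabilities plus the negated penalty, the quantity
    the inference step above searches to maximize."""
    return logprob_src + logprob_tgt - weight * agreement_penalty(
        clust_src, clust_tgt, aligned_pairs)

clust_en = {"a1": 0, "a2": 0, "a3": 1}
clust_de = {"b1": 0, "b2": 1, "b3": 1}
pairs = [("a1", "b1"), ("a2", "b2"), ("a3", "b3")]
print(bilingual_objective(-10.0, -12.0, clust_en, clust_de, pairs))  # -24.0
```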
Durrett, Greg and Hall, David and Klein, Dan
Experiments
Finally, we consider mention properties derived from unsupervised clusterings; these properties are designed to target semantic properties of nominals that should behave more like the oracle features than the phi features do.
Experiments
We consider clusterings that take as input pairs (n, r) of a noun head n and a string r which contains the semantic role of n (or some approximation thereof) conjoined with its governor.
Experiments
We use four different clusterings in our experiments, each with twenty clusters: dependency-parse-derived NAIVEBAYES clusters, semantic-role-derived CONDITIONAL clusters, SRL-derived NAIVEBAYES clusters generating a NOVERB token when r cannot be determined, and SRL-derived NAIVEBAYES clusters with all pronoun tuples discarded.
Related Work
Their system could be extended to handle property information like we do, but our system has many other advantages, such as freedom from a pre-specified list of entity types, the ability to use multiple input clusterings, and discriminative projection of clusters.
clusterings is mentioned in 4 sentences in this paper.
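The NAIVEBAYES clusterings of (n, r) pairs mentioned in the Durrett et al. snippets above can be sketched as an EM-trained Naive Bayes mixture p(c) p(n|c) p(r|c) over noun heads and role strings. The paper uses twenty clusters and several input variants; the tiny unsmoothed trainer below only illustrates the model form, and the data are invented.

```python
import random
from collections import defaultdict

def naive_bayes_cluster(pairs, num_clusters=2, iters=25, seed=0):
    """EM for a Naive Bayes mixture over (noun_head, role) pairs:
    p(c) * p(n | c) * p(r | c). Unsmoothed, toy-sized sketch."""
    rng = random.Random(seed)
    # Random soft initialization of cluster responsibilities.
    resp = []
    for _ in pairs:
        row = [rng.random() for _ in range(num_clusters)]
        total = sum(row)
        resp.append([r / total for r in row])
    for _ in range(iters):
        # M-step: weighted relative-frequency estimates.
        prior = [sum(row[c] for row in resp) for c in range(num_clusters)]
        p_n = [defaultdict(float) for _ in range(num_clusters)]
        p_r = [defaultdict(float) for _ in range(num_clusters)]
        for (n, r), row in zip(pairs, resp):
            for c in range(num_clusters):
                p_n[c][n] += row[c]
                p_r[c][r] += row[c]
        grand = sum(prior)
        # E-step: responsibilities proportional to p(c) p(n|c) p(r|c).
        for i, (n, r) in enumerate(pairs):
            scores = [(prior[c] / grand) * (p_n[c][n] / prior[c]) * (p_r[c][r] / prior[c])
                      for c in range(num_clusters)]
            z = sum(scores)
            resp[i] = [s / z for s in scores]
    # Hard assignment: most probable cluster for each pair.
    return [max(range(num_clusters), key=lambda c: row[c]) for row in resp]

# Hypothetical (noun head, role-plus-governor) pairs.
pairs = [("dog", "nsubj:barked"), ("cat", "nsubj:barked"),
         ("ball", "dobj:threw"), ("rock", "dobj:threw")]
print(naive_bayes_cluster(pairs))  # one hard cluster id per (n, r) pair
```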
Wang, Lu and Cardie, Claire
Experimental Setup
In the True Clusterings setting, we use the annotations to create perfect partitions of the DAs for input to the system; in the System
Experimental Setup
Clusterings setting, we employ a hierarchical agglomerative clustering algorithm used for this task in (Wang and Cardie, 2011).
Results
Table 3 indicates that, with both true clusterings and system clusterings, our system trained on out-of-domain data achieves comparable performance with the same system trained on in-domain data.
Results
We randomly select 15 decision and 15 problem DA clusters (true clusterings).
clusterings is mentioned in 4 sentences in this paper.
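The System Clusterings condition above groups dialogue acts (DAs) with hierarchical agglomerative clustering rather than using the gold partitions. A minimal sketch with SciPy; the toy vectors, linkage settings, and cluster cut-off are assumptions, not the settings of Wang and Cardie (2011).

```python
# Group dialogue acts by agglomerative clustering over toy feature vectors.
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

# Toy DA vectors (e.g. bag-of-words rows); real systems use richer features.
da_vectors = np.array([
    [1, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 0, 1, 1],
    [0, 1, 1, 1],
], dtype=float)

Z = linkage(da_vectors, method="average", metric="cosine")
system_clustering = fcluster(Z, t=2, criterion="maxclust")
print(system_clustering)  # e.g. [1 1 2 2]: one cluster id per DA
```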
Andrews, Nicholas and Eisner, Jason and Dredze, Mark
Consensus Clustering
Our model gives a distribution over phylogenies p (given observations x and learned parameters Φ), and thus gives a posterior distribution over clusterings e, which can be used to answer various queries.
Consensus Clustering
More similar clusterings achieve larger R, with R(e', e) = 1 iff e' = e. In all cases, 0 ≤ R(e', e) = R(e, e') ≤ 1.
Consensus Clustering
As explained above, the s_ij are coreference probabilities that can be estimated from a sample of clusterings e.
Experiments
For PHYLO, the entity clustering is the result of (1) training the model using EM, (2) sampling from the posterior to obtain a distribution over clusterings, and (3) finding a consensus clustering.
clusterings is mentioned in 4 sentences in this paper.
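The consensus step described in the Andrews et al. snippets above (estimate pairwise coreference probabilities s_ij from posterior samples, then choose a clustering that agrees with them) can be sketched as follows. Picking the best sample by expected Rand agreement is a simple stand-in for whatever search the paper actually performs; mention and entity ids are invented.

```python
from itertools import combinations

def pairwise_probs(samples):
    """Estimate s_ij from posterior clustering samples: the fraction of
    samples in which mentions i and j share an entity."""
    items = sorted(samples[0])
    s = {}
    for i, j in combinations(items, 2):
        s[i, j] = sum(c[i] == c[j] for c in samples) / len(samples)
    return s

def expected_rand(clustering, s):
    """Expected (unadjusted) Rand agreement of `clustering` with a clustering
    drawn from the posterior, computed from the pairwise probabilities."""
    agree = sum(p if clustering[i] == clustering[j] else 1 - p
                for (i, j), p in s.items())
    return agree / len(s)

def consensus_from_samples(samples):
    """Return the sampled clustering with the highest expected agreement."""
    s = pairwise_probs(samples)
    return max(samples, key=lambda c: expected_rand(c, s))

# Each sample maps a mention id to an entity id.
samples = [
    {"m1": 0, "m2": 0, "m3": 1},
    {"m1": 0, "m2": 0, "m3": 1},
    {"m1": 0, "m2": 0, "m3": 0},
    {"m1": 0, "m2": 1, "m3": 1},
]
print(consensus_from_samples(samples))  # {'m1': 0, 'm2': 0, 'm3': 1}
```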
Koo, Terry and Carreras, Xavier and Collins, Michael
Background 2.1 Dependency parsing
By using prefixes of various lengths, we can produce clusterings of different granularities (Miller et al., 2004).
Feature design
(2004), we use prefixes of the Brown cluster hierarchy to produce clusterings of varying granularity.
Feature design
One possible explanation is that the clusterings generated by the Brown algorithm can be noisy or only weakly relevant to syntax; thus, the clusters are best exploited when “anchored” to words or parts of speech.
clusterings is mentioned in 3 sentences in this paper.
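As the Koo et al. snippets above describe, clusterings of different granularities fall out of one Brown clustering by taking prefixes of each word's bit string. A small sketch; the bit strings and prefix lengths are hypothetical, and Koo et al. choose their own prefix set.

```python
def prefix_clusters(bitstring, lengths=(4, 6, 10)):
    """Derive coarser or finer cluster ids from one Brown cluster bit string
    by taking prefixes of the chosen lengths."""
    return {n: bitstring[:n] for n in lengths if len(bitstring) >= n}

# Hypothetical Brown cluster bit strings.
brown = {"bank": "1010110100", "broker": "1010110111", "river": "0110011010"}
for word, bits in brown.items():
    print(word, prefix_clusters(bits))
# "bank" and "broker" share the 4- and 6-bit clusters but differ at 10 bits:
# merged under the coarse clusterings, split under the fine one.
```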
Falk, Ingrid and Gardent, Claire and Lamirel, Jean-Charles
Clustering Methods, Evaluation Metrics and Experimental Setup
We make use of these processes in all our experiments and systematically compute cluster labelling and feature maximisation on the output clusterings .
Clustering Methods, Evaluation Metrics and Experimental Setup
As we shall see, this permits distinguishing between clusterings with similar F-measure but lower “linguistic plausibility” (cf.
Clustering Methods, Evaluation Metrics and Experimental Setup
Following Sun et al. (2010), we use modified purity (mPUR), weighted class accuracy (ACC) and F-measure to evaluate the clusterings produced.
clusterings is mentioned in 3 sentences in this paper.
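The evaluation measures listed above (mPUR, ACC and their F-measure) can be computed from two verb-to-label mappings. The sketch follows the commonly used definitions from the Sun et al. (2010) line of work, including the convention that a cluster whose prevalent class has fewer than two members contributes nothing to mPUR; the paper's exact conventions may differ, and the verbs and classes below are invented.

```python
from collections import Counter, defaultdict

def mpur_acc_f(induced, gold):
    """Modified purity (mPUR), weighted class accuracy (ACC) and their harmonic
    mean, for hard clusterings given as verb -> label dictionaries."""
    n = len(gold)
    by_cluster = defaultdict(list)
    by_class = defaultdict(list)
    for verb in gold:
        by_cluster[induced[verb]].append(verb)
        by_class[gold[verb]].append(verb)
    # mPUR: verbs belonging to the prevalent gold class of their cluster,
    # ignoring clusters whose prevalent class has fewer than two members.
    prevalent = [Counter(gold[v] for v in verbs).most_common(1)[0][1]
                 for verbs in by_cluster.values()]
    mpur = sum(size for size in prevalent if size >= 2) / n
    # ACC: verbs sitting in the dominant induced cluster of their gold class.
    dominant = [Counter(induced[v] for v in verbs).most_common(1)[0][1]
                for verbs in by_class.values()]
    acc = sum(dominant) / n
    f = 2 * mpur * acc / (mpur + acc) if mpur + acc else 0.0
    return mpur, acc, f

gold = {"give": "TRANSFER", "send": "TRANSFER", "run": "MOTION", "walk": "MOTION"}
induced = {"give": 0, "send": 0, "run": 0, "walk": 1}
print(mpur_acc_f(induced, gold))  # (0.5, 0.75, 0.6)
```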
Faruqui, Manaal and Dyer, Chris
Word Clustering
This joint minimization for the clusterings for both languages clearly has no benefit since the two terms of the objective are independent.
Word Clustering
Using this weighted vocabulary alignment, we state an objective that encourages clusterings to have high average mutual information when alignment links are followed; that is, on average how much information does knowing the cluster of a word x ∈ Σ impart about the clustering of y ∈ Ω, and vice-versa?
Word Clustering
We compare two different clusterings of a two-sentence Arabic-English parallel corpus (the English half of the corpus contains the same sentence, twice, while the Arabic half has two variants with the same meaning).
clusterings is mentioned in 3 sentences in this paper.
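The cross-lingual term quoted above from Faruqui and Dyer asks how much the cluster of a source word, reached through an alignment link, tells us about the cluster of the aligned target word. Below is a sketch that computes this mutual information from alignment links and two fixed clusterings; it is not the paper's joint clustering procedure, and the words and links are invented.

```python
import math
from collections import Counter

def aligned_cluster_mi(links, clust_src, clust_tgt):
    """Mutual information, over word-alignment links, between the cluster of a
    source word and the cluster of the target word it is aligned to."""
    joint = Counter((clust_src[x], clust_tgt[y]) for x, y in links)
    n = sum(joint.values())
    marg_src, marg_tgt = Counter(), Counter()
    for (cs, ct), count in joint.items():
        marg_src[cs] += count
        marg_tgt[ct] += count
    return sum((count / n) * math.log2((count / n) /
               ((marg_src[cs] / n) * (marg_tgt[ct] / n)))
               for (cs, ct), count in joint.items())

# Hypothetical clusterings and alignment links.
clust_en = {"house": 0, "home": 0, "dog": 1, "cat": 1}
clust_de = {"haus": 0, "heim": 0, "hund": 1, "katze": 1}
links = [("house", "haus"), ("home", "heim"), ("dog", "hund"), ("cat", "katze")]
print(aligned_cluster_mi(links, clust_en, clust_de))  # 1.0 bit: clusters match across links
```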
Tian, Zhenhua and Xiang, Hengheng and Liu, Ziqi and Zheng, Qinghua
Related Work 2.1 WordNet-based Approach
The key idea is to use the latent clusterings to take the place of WordNet semantic classes.
Related Work 2.1 WordNet-based Approach
The latent clusterings are automatically derived from distributional data using the EM algorithm.
Related Work 2.1 WordNet-based Approach
Recently, more sophisticated methods have been developed for SP based on topic models, where the latent variables (topics) take the place of semantic classes and distributional clusterings (Seaghdha, 2010; Ritter et al., 2010).
clusterings is mentioned in 3 sentences in this paper.