Analysis | Figure 4 shows the MAP transition Dirichlet hyperparameters of the CLUST model, when trained
Analysis | Finally, we examine the relationship between the induced clusters and language families in Table 3, for the trigram consonant vs. vowel CLUST model with 20 clusters. |
Experiments |
  Method   (1)          (2)          (3)
  EM       93.37 74.59  94.50 74.53  92.93 78.26
  SYMM     95.99 80.72  96.18 78.13  95.90 79.04
  MERGE    97.14 86.13  97.66 86.47  96.06 83.78
  CLUST    98.85 89.37  98.55 89.07  97.03 85.79
Experiments | Finally, we consider the full version of our model, CLUST , with 20 language clusters. |
Results | Figure 3: Confusion matrix for CLUST (left) and EM (right). |
Results | Both MERGE and CLUST break symmetries over tags by way of the asymmetric posterior over transition Dirichlet parameters. |
Evaluation | We evaluate our methods in two quantitative ways, measuring the degree to which we recover two different sets of gold-standard clusterings.
Evaluation | To measure the similarity between the two clusterings of movie characters, gold clusters Q and induced latent persona clusters C, we calculate the variation of information (Meila, 2007): |
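The formula itself did not survive extraction; the standard definition from Meila (2007), in terms of entropy H and mutual information I, is:

\[
\mathrm{VI}(Q, C) \;=\; H(Q) + H(C) - 2\,I(Q; C) \;=\; H(Q \mid C) + H(C \mid Q)
\]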
Evaluation | VI measures the information-theoretic distance between the two clusterings : a lower value means greater similarity, and VI = 0 if they are identical. |
Experiments | Finally, we consider mention properties derived from unsupervised clusterings; these properties are designed to target semantic properties of nominals that should behave more like the oracle features than the phi features do.
Experiments | We consider clusterings that take as input pairs (n, r) of a noun head n and a string r which contains the semantic role of n (or some approximation thereof) conjoined with its governor.
Experiments | We use four different clusterings in our experiments, each with twenty clusters: dependency-parse-derived NAIVEBAYES clusters, semantic-role-derived CONDITIONAL clusters, SRL-derived NAIVEBAYES clusters generating a NOVERB token when r cannot be determined, and SRL-derived NAIVEBAYES clusters with all pronoun tuples discarded.
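A Naive Bayes clustering of such pairs treats the cluster as a latent variable z generating the noun and the role string independently, P(z, n, r) = P(z) P(n|z) P(r|z), trained with EM. The following is a minimal sketch under those assumptions, not the paper's implementation; all names and the smoothing scheme are my own:

```python
import random

def nb_cluster(pairs, k, iters=30, seed=0, smooth=0.01):
    """EM for a K-component Naive Bayes mixture over (noun, role) pairs.

    Model: P(z, n, r) = P(z) P(n|z) P(r|z). Returns one hard cluster
    id per input pair (argmax of the final soft responsibilities).
    """
    rng = random.Random(seed)
    nouns = {n for n, _ in pairs}
    roles = {r for _, r in pairs}
    # Random soft responsibilities to break the symmetry between clusters.
    resp = []
    for _ in pairs:
        row = [rng.random() + 0.1 for _ in range(k)]
        s = sum(row)
        resp.append([v / s for v in row])
    for _ in range(iters):
        # M-step: expected counts under the current responsibilities.
        cz = [0.0] * k
        cn = [dict() for _ in range(k)]
        cr = [dict() for _ in range(k)]
        for (n, r), row in zip(pairs, resp):
            for z in range(k):
                cz[z] += row[z]
                cn[z][n] = cn[z].get(n, 0.0) + row[z]
                cr[z][r] = cr[z].get(r, 0.0) + row[z]
        # E-step: responsibilities proportional to P(z) P(n|z) P(r|z),
        # with additive smoothing over the observed vocabularies.
        for i, (n, r) in enumerate(pairs):
            scores = []
            for z in range(k):
                pz = (cz[z] + smooth) / (len(pairs) + smooth * k)
                pn = (cn[z].get(n, 0.0) + smooth) / (cz[z] + smooth * len(nouns))
                pr = (cr[z].get(r, 0.0) + smooth) / (cz[z] + smooth * len(roles))
                scores.append(pz * pn * pr)
            s = sum(scores)
            resp[i] = [v / s for v in scores]
    return [max(range(k), key=row.__getitem__) for row in resp]
```

Because the E-step depends only on the observed (n, r) values, identical pairs always end up in the same cluster, which is the behavior needed when the cluster id is then used as a mention feature.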
Related Work | Their system could be extended to handle property information as we do, but our system has several other advantages, such as freedom from a pre-specified list of entity types, the ability to use multiple input clusterings, and discriminative projection of clusters.
Experimental Setup | In the True Clusterings setting, we use the annotations to create perfect partitions of the DAs for input to the system; in the System Clusterings setting, we employ a hierarchical agglomerative clustering algorithm used for this task in (Wang and Cardie, 2011).
Results | Table 3 indicates that, with both true clusterings and system clusterings, our system trained on out-of-domain data achieves comparable performance with the same system trained on in-domain data.
Results | We randomly select 15 decision and 15 problem DA clusters (true clusterings).
Word Clustering | This joint minimization over the clusterings of the two languages clearly has no benefit, since the two terms of the objective are independent.
Word Clustering | Using this weighted vocabulary alignment, we state an objective that encourages the clusterings to have high average mutual information when alignment links are followed; that is, on average, how much information does knowing the cluster of a word x ∈ Σ impart about the cluster of a word y ∈ Ω, and vice versa?
Word Clustering | We compare two different clusterings of a two-sentence Arabic-English parallel corpus (the English half of the corpus contains the same sentence, twice, while the Arabic half has two variants with the same meaning). |
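The objective above reduces to the mutual information between source-side and target-side cluster labels under the distribution induced by the weighted alignment links. A minimal sketch of that computation (function name and the link/weight input format are assumptions, not the paper's API):

```python
import math
from collections import Counter

def aligned_cluster_mi(links, cluster_src, cluster_tgt):
    """Mutual information between source and target cluster labels,
    under the joint distribution induced by weighted alignment links.

    links: iterable of (src_word, tgt_word, weight) triples.
    cluster_src / cluster_tgt: dicts mapping words to cluster ids.
    """
    joint = Counter()
    total = 0.0
    for x, y, w in links:
        joint[(cluster_src[x], cluster_tgt[y])] += w
        total += w
    # Marginal cluster masses on each side of the alignment.
    ps, pt = Counter(), Counter()
    for (cs, ct), w in joint.items():
        ps[cs] += w
        pt[ct] += w
    mi = 0.0
    for (cs, ct), w in joint.items():
        p = w / total
        mi += p * math.log(p / ((ps[cs] / total) * (pt[ct] / total)))
    return mi
```

When aligned words always fall in corresponding clusters the score reaches its maximum (log K for K equiprobable clusters); when the two sides' clusters are independent it is zero, which is what makes it a useful cross-lingual clustering objective.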
Related Work 2.1 WordNet-based Approach | The key idea is to use the latent clusterings to take the place of WordNet semantic classes. |
Related Work 2.1 WordNet-based Approach | The latent clusterings are automatically derived from distributional data using the EM algorithm.
Related Work 2.1 WordNet-based Approach | Recently, more sophisticated methods for SP have been developed based on topic models, where the latent variables (topics) take the place of semantic classes and distributional clusterings (Ó Séaghdha, 2010; Ritter et al., 2010).