Abstract | In this paper, we propose a graph model to enrich intra-sentence features with inter-sentence features from both lexical and topic perspectives. |
Abstract | Evaluation on the *SEM 2012 shared task corpus indicates the usefulness of contextual discourse information in negation focus identification and justifies the effectiveness of our graph model in capturing such global information. |
Baselines | In this paper, we first propose a graph model to gauge the importance of contextual discourse information. |
Baselines | 4.1 Graph Model |
Baselines | Graph models have been proven successful in many NLP applications, especially in representing the link relationships between words or sentences (Wan and Yang, 2008; Li et al., 2009). |
Introduction | In this paper, to accommodate such contextual discourse information in negation focus identification, we propose a graph model to enrich normal intra-sentence features with various kinds of inter-sentence features from both lexical and topic perspectives. |
Introduction | In addition, the standard PageRank algorithm is employed to optimize the graph model. |
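For reference, a minimal sketch of standard PageRank power iteration follows. The paper's graph construction (its node set, edge definitions, and any topic-driven weighting) is not reproduced here; the function and the toy word graph are illustrative assumptions only.

```python
# Minimal PageRank power-iteration sketch (illustrative, not the paper's setup).
# The graph is a dict mapping each node to the list of nodes it links to.
def pagerank(graph, damping=0.85, iterations=50):
    nodes = list(graph)
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        new_rank = {n: (1.0 - damping) / len(nodes) for n in nodes}
        for node, neighbors in graph.items():
            if not neighbors:
                continue
            share = damping * rank[node] / len(neighbors)
            for m in neighbors:
                new_rank[m] += share
        rank = new_rank
    return rank

# Toy word graph: edges might encode, e.g., co-occurrence within a window.
print(pagerank({"w1": ["w2", "w3"], "w2": ["w3"], "w3": ["w1"]}))
```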
Introduction | Section 4 introduces our topic-driven word-based graph model with contextual discourse information. |
Abstract | This paper presents a graphical model that embeds two directional aligners into a single model. |
Conclusion | We have presented a graphical model that combines two classical HMM-based alignment models. |
Introduction | This result is achieved by embedding two directional HMM-based alignment models into a larger bidirectional graphical model. |
Model Definition | Our bidirectional model $\mathcal{G} = (\mathcal{V}, \mathcal{D})$ is a globally normalized, undirected graphical model of the word alignment for a fixed sentence pair $(e, f)$. Each vertex in the vertex set $\mathcal{V}$ corresponds to a model variable $V_i$, and each undirected edge in the edge set $\mathcal{D}$ corresponds to a pair of variables $(V_i, V_j)$. Each vertex has an associated potential function $\omega_i(v_i)$ that assigns a real-valued potential to each possible value $v_i$ of $V_i$. Likewise, each edge has an associated potential function $\mu_{ij}(v_i, v_j)$ that scores pairs of values. |
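Given these definitions, the joint distribution of a globally normalized model of this form presumably takes the standard product-of-potentials form (a reconstruction from the stated definitions, not a quotation from the paper):

$$ p(v) = \frac{1}{Z} \prod_{V_i \in \mathcal{V}} \omega_i(v_i) \prod_{(V_i, V_j) \in \mathcal{D}} \mu_{ij}(v_i, v_j), \qquad Z = \sum_{v'} \prod_{V_i \in \mathcal{V}} \omega_i(v'_i) \prod_{(V_i, V_j) \in \mathcal{D}} \mu_{ij}(v'_i, v'_j). $$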
Model Definition | Figure 1: The structure of our graphical model for a simple sentence pair. |
Model Inference | In general, graphical models admit efficient, exact inference algorithms if they do not contain cycles. |
Model Inference | While the entire graphical model has loops, there are two overlapping subgraphs that are cycle-free. |
Model Inference | To describe a dual decomposition inference procedure for our model, we first restate the inference problem under our graphical model in terms of the two overlapping subgraphs that admit tractable inference. |
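As a reference point, the generic dual decomposition scheme for two overlapping subproblems (following Rush et al., 2010; the paper's exact decomposition may differ) relaxes agreement between the two subgraphs with Lagrange multipliers $u$:

$$ L(u) = \max_{v^{(1)}} \left[ f_1(v^{(1)}) + u^\top v^{(1)} \right] + \max_{v^{(2)}} \left[ f_2(v^{(2)}) - u^\top v^{(2)} \right], $$

where each inner maximization is tractable on its cycle-free subgraph; $L(u)$ upper-bounds the joint objective and is minimized by subgradient steps $u \leftarrow u - \alpha\,(v^{(1)*} - v^{(2)*})$ until the two copies agree.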
Related Work | Although differing in both model and inference, our work and theirs both find improvements from defining graphical models for alignment that do not admit exact polynomial-time inference algorithms. |
Abstract | In this paper, motivated by a key equivalence of two decoding algorithms, we propose a joint graph model to globally optimize PTC and typo correction for IME. |
Conclusion | In this paper, we have developed a joint graph model for pinyin-to-Chinese conversion with typo correction. |
Pinyin Input Method Model | Inspired by Yang et al. (2012b) and Jia et al. (2013), we adopt the graph model used for Chinese spell checking to perform pinyin segmentation and typo correction; it is based on the shortest-path word segmentation algorithm (Casey and Lecolinet, 1996). |
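To illustrate the underlying idea, here is a minimal shortest-path segmentation sketch over a toy syllable inventory. The cited systems use a full pinyin syllable list, typo-correction edges, and language-model edge weights, none of which are reproduced here.

```python
# Shortest-path pinyin segmentation sketch: find the segmentation of the
# input into legal syllables that uses the fewest segments (toy inventory).
SYLLABLES = {"ni", "hao", "ha", "o", "n", "i"}

def segment(pinyin):
    n = len(pinyin)
    # best[i] = (segment count, segmentation) for the prefix pinyin[:i]
    best = [None] * (n + 1)
    best[0] = (0, [])
    for i in range(n):
        if best[i] is None:
            continue
        for j in range(i + 1, n + 1):
            piece = pinyin[i:j]
            if piece in SYLLABLES:
                cand = (best[i][0] + 1, best[i][1] + [piece])
                if best[j] is None or cand[0] < best[j][0]:
                    best[j] = cand
    return best[n]

print(segment("nihao"))  # -> (2, ['ni', 'hao'])
```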
Pinyin Input Method Model | Figure 2: Graph model for pinyin segmentation |
Pinyin Input Method Model | Figure 3: Graph model for pinyin typo correction |
Related Works | Various approaches have been applied to the task, including language model (LM) based methods (Chen et al., 2013), ME models (Han and Chang, 2013), CRFs (Wang et al., 2013d; Wang et al., 2013a), SMT (Chiu et al., 2013; Liu et al., 2013), and graph models (Jia et al., 2013). |
Abstract | We observe that NER label information can be used to correct alignment mistakes, and present a graphical model that performs bilingual NER tagging jointly with word alignment, by combining two monolingual tagging models with two unidirectional alignment models. |
Bilingual NER by Agreement | In order to model this uncertainty, we extend the two previously independent CRF models into a larger undirected graphical model, by introducing a cross-lingual edge factor $\phi(i, j)$ for every pair of word positions $(i, j) \in A$. |
Bilingual NER by Agreement | The way DD algorithms work in decomposing undirected graphical models is analogous to other message passing algorithms such as loopy belief propagation, but DD gives a stronger optimality guarantee upon convergence (Rush et al., 2010). |
Conclusion | We introduced a graphical model that combines two HMM word aligners and two CRF NER taggers into a joint model, and presented a dual decomposition inference method for performing efficient decoding over this model. |
Introduction | In this work, we first develop a bilingual NER model (denoted as BI-NER) by embedding two monolingual CRF-based NER models into a larger undirected graphical model, and introduce additional edge factors based on word alignment (WA). |
Introduction | Unlike previous applications of the DD method in NLP, where the model typically factors over two components and agreement is to be sought between the two (Rush et al., 2010; Koo et al., 2010; DeNero and Macherey, 2011; Chieu and Teow, 2012), our method decomposes the larger graphical model into many overlapping components, where each alignment edge forms a separate factor. |
Joint Alignment and NER Decoding | We introduce a cross-lingual edge factor $C(i, j)$ in the undirected graphical model for every pair of word indices $(i, j)$, which predicts a binary variable. |
Abstract | We present a novel approach for building verb subcategorization lexicons using a simple graphical model. |
Abstract | We discuss the advantages of graphical models for this task, in particular the ease of integrating semantic information about verbs and arguments in a principled fashion. |
Conclusions and future work | Our initial attempt at applying graphical models to subcategorization also suggested several ways to extend and improve the method. |
Methodology | In this section we describe the basic components of our study: feature sets, graphical model, inference, and evaluation. |
Methodology | Our graphical modeling approach uses the Bayesian network shown in Figure 1. |
Previous work | Graphical models have been increasingly popular for a variety of tasks such as distributional semantics (Blei et al., 2003) and unsupervised POS tagging (Finkel et al., 2007), and sampling methods allow efficient estimation of full joint distributions (Neal, 1993). |
Results | This is an example of how bad decisions made by the parser cannot be fixed by the graphical model, and an area where pGR features have an advantage. |
Introduction | 4) Learning in Graphical Models: Michael Jordan. |
Introduction | For example, as shown in Figure 1, with the background knowledge that both Learning and Graphical models are topics related to Machine learning, while Machine learning is a subdomain of Computer science, a human can easily determine that the two Michael Jordans in the 1st and 4th observations represent the same person. |
Introduction | 4) [Learning] in [Graphical Models]: Michael Jordan |
The Structural Semantic Relatedness Measure | Figure 3: A semantic graph linking the nodes Researcher, Graphical Model, Learning, and Computer Science with weighted edges (e.g., 0.28, 0.48, 0.41). |
The Structural Semantic Relatedness Measure | For demonstration, Table 4 shows some structural semantic relatedness values of the semantic graph in Figure 3 (CS represents Computer science and GM represents Graphical model). |
A Multigraph Model | Figure 1: An example graph modeling relations between mentions. |
A Multigraph Model | Many graph models for coreference resolution operate on A = V x V. Our multigraph model allows us to have multiple edges with different labels between mentions. |
A Multigraph Model | In contrast to previous work on similar graph models, we do not learn any edge weights from training data. |
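For illustration only, a labeled multigraph over mentions can be represented as lists of labels keyed by mention pairs, so that one pair may carry several relations at once. The relation labels below are hypothetical, not the paper's inventory.

```python
from collections import defaultdict

# Sketch of a labeled multigraph over mentions: several edges with
# different relation labels may connect the same pair of mentions.
class Multigraph:
    def __init__(self):
        self.edges = defaultdict(list)  # (u, v) -> list of edge labels

    def add_edge(self, u, v, label):
        self.edges[(u, v)].append(label)

g = Multigraph()
g.add_edge("mention_1", "mention_2", "string_match")   # hypothetical label
g.add_edge("mention_1", "mention_2", "same_sentence")  # hypothetical label
print(g.edges[("mention_1", "mention_2")])  # ['string_match', 'same_sentence']
```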
Introduction | Our approach belongs to a class of recently proposed graph models for coreference resolution (Cai and Strube, 2010; |
Relations | The graph model described in Section 3 is based on expressing relations between pairs of mentions via edges built from such relations. |
Image Clustering with Annotated Auxiliary Data | Figure 2: Graphical model representation of the PLSA model. |
Image Clustering with Annotated Auxiliary Data | The graphical model representation of PLSA is shown in Figure 2. |
Image Clustering with Annotated Auxiliary Data | Figure 3: Graphical model representation of the aPLSA model. |
Abstract | We associate each sentence with an undirected latent tree graphical model , which is a tree consisting of both observed variables (corresponding to the words in the sentence) and an additional set of latent variables that are unobserved in the data. |
Abstract | Unlike in phylogenetics and graphical models , where a single latent tree is constructed for all the data, in our case, each part of speech sequence is associated with its own parse tree. |
Abstract | Following this intuition, we propose to model the distribution over the latent bracketing states and words for each tag sequence $x$ as a latent tree graphical model, which encodes conditional independences among the words given the latent states. |
Background | Markov Logic Networks (MLNs) (Richardson and Domingos, 2006) are a framework for probabilistic logic that employs weighted formulas in first-order logic to compactly encode complex undirected probabilistic graphical models (i.e., Markov networks). |
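For reference, an MLN defines the distribution

$$ P(X = x) = \frac{1}{Z} \exp\Big( \sum_i w_i\, n_i(x) \Big), $$

where $n_i(x)$ is the number of true groundings of formula $i$ in world $x$ and $w_i$ is that formula's weight (Richardson and Domingos, 2006).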
Background | It uses logical representations to compactly define large graphical models with continuous variables, and includes methods for performing efficient probabilistic inference for the resulting models. |
Background | Given a set of weighted logical formulas, PSL builds a graphical model defining a probability distribution over the continuous space of values of the random variables in the model. |
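A minimal sketch of this distribution, as commonly presented for PSL (a hinge-loss Markov random field; the exponent and other details are configurable, so treat the exact form as an assumption): each grounded rule $r$ contributes a distance to satisfaction $d_r(y) \in [0, 1]$ under the Lukasiewicz relaxation, and the density over continuous truth values $y$ is

$$ f(y) = \frac{1}{Z} \exp\Big( -\sum_{r} \lambda_r \, d_r(y)^{p} \Big), \qquad p \in \{1, 2\}. $$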
PSL for STS | Grounding is the process of instantiating the variables in the quantified rules with concrete constants in order to construct the nodes and links in the final graphical model. |
Background: Pairwise Coreference | For higher accuracy, a graphical model such as a conditional random field (CRF) is constructed from the compatibility functions to jointly reason about the pairwise decisions (McCallum and Wellner, 2004). |
Background: Pairwise Coreference | The pairwise compatibility functions become the factors in the graphical model . |
Background: Pairwise Coreference | Figure 2: Pairwise model on six mentions: Open circles are the binary coreference decision variables, shaded circles are the observed mentions, and the black boxes are the factors of the graphical model that encode the pairwise compatibility functions. |
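The factorization sketched in Figure 2 presumably takes the form

$$ p(y \mid m) \propto \prod_{i < j} \psi\big(y_{ij}, m_i, m_j\big), $$

where $y_{ij} \in \{0, 1\}$ is the coreference decision for the mention pair $(m_i, m_j)$ and $\psi$ is the corresponding pairwise compatibility factor.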
Related Work | Techniques such as lifted inference (Singla and Domingos, 2008) for graphical models exploit redundancy in the data, but typically do not achieve any significant compression on coreference data. |
Related Work | For example, in (Stolcke et al., 2000), Hidden Markov Models (HMMs) were used for DA tagging; in (Ji and Bilmes, 2005), different types of graphical models were explored. |
Thread Structure Tagging | Linear-chain CRFs are a type of undirected graphical model. |
Thread Structure Tagging | The distribution of a set of variables in an undirected graphical model can be written as follows. |
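The formula itself was elided in extraction; presumably it is the standard clique factorization, with $Z$ the partition function and $\psi_C$ a potential over clique $C$:

$$ p(\mathbf{x}) = \frac{1}{Z} \prod_{C} \psi_C(\mathbf{x}_C). $$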
Thread Structure Tagging | CRFs are a special case of undirected graphical models in which the potentials $\psi$ are log-linear functions: |
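The missing equation is presumably the standard linear-chain conditional form, with feature functions $f_k$ and weights $\lambda_k$:

$$ p(\mathbf{y} \mid \mathbf{x}) = \frac{1}{Z(\mathbf{x})} \exp\Big( \sum_{t} \sum_{k} \lambda_k\, f_k(y_{t-1}, y_t, \mathbf{x}, t) \Big). $$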
Introduction | Other previous work attempts to address some of the above concerns by mapping coreference to inference on an undirected graphical model (Culotta et al., 2007; Poon et al., 2008; Wellner et al., 2004; Wick et al., 2009a). |
Introduction | In this work we first distribute MCMC-based inference for the graphical model representation of coreference. |
Related Work | Representing the problem as an undirected graphical model, and performing distributed inference on it, provides a combination of advantages not available in any of these approaches. |
Related Work | In addition to representing features from all of the related work, graphical models can also use more complex entity-wide features (Culotta et al., 2007; Wick et al., 2009a), and parameters can be learned using supervised (Collins, 2002) or semi-supervised techniques (Mann and McCallum, 2008). |
Baselines | To ensure a meaningful comparison with the joint model, our two baselines are both implemented in the same graphical model framework, and trained with the same machine-learning algorithm. |
Baselines | The tagger is a graphical model with the WORD and TAG variables, connected by the local factors TAG-UNIGRAM, TAG-BIGRAM, and TAG-CONSISTENCY, all used in the joint model (§3). |
Experimental Setup | To illustrate the effect, the graphical model of the sentence in Table 1, whose six words are all covered by the database, has 1,866 factors; without the benefit of the database, the full model would have 31,901 factors. |
Joint Model | It will be presented as a graphical model. |
Introduction | In summary, our proposed model is based on the probabilistic inference of these random variables using graphical models. |
Prior Work | In our work we use graphical models to extract context sentences. |
Prior Work | Graphical models offer well-studied properties and corresponding inference techniques, and have been used before on information retrieval tasks. |
Proposed Method | A particular class of graphical models known as Markov Random Fields (MRFs) is suited for solving inference problems with uncertainty in observed data. |
Bayesian Logic Programs | Bayesian logic programs (BLPs) (Kersting and De Raedt, 2007; Kersting and De Raedt, 2008) can be considered as templates for constructing directed graphical models (Bayes nets). |
Related Work | Unlike BLPs, this approach does not use a well-founded probabilistic graphical model to compute coherent probabilities for inferred facts. |
Related Work | However, MLNs include all possible type-consistent groundings of the rules in the corresponding Markov net, which, for larger datasets, can result in an intractably large graphical model . |
The computational model | Figure 1 shows the graphical model for our joint Bigram model (the Unigram case is trivially recovered by generating the $U_{i,j}$s directly from $L$ rather than from $L_{U_{i,j-1}}$). |
The computational model | Figure 2 gives the mathematical description of the graphical model and Table 1 provides a key to the variables of our model. |
The computational model | Figure 1: The graphical model for our joint model of word-final /t/-deletion and Bigram word segmentation. |
Abstract | We propose a novel graphical model to simultaneously conduct NER and NEN on multiple tweets to address these challenges. |
Introduction | We propose jointly conducting NER and NEN on multiple tweets using a graphical model, to address these challenges. |
Introduction | We adopt a factor graph as our graphical model , which is constructed in the following manner. |
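As a generic scaffold only (the paper's actual variables and factors for joint NER/NEN over multiple tweets are not reproduced, and the factor below is hypothetical), a factor graph can be represented as a set of variables plus potential functions over subsets of them:

```python
# Generic factor-graph scaffold, for illustration only.
class FactorGraph:
    def __init__(self):
        self.variables = {}  # variable name -> list of possible values
        self.factors = []    # (scope tuple, potential function)

    def add_variable(self, name, domain):
        self.variables[name] = domain

    def add_factor(self, scope, potential):
        self.factors.append((tuple(scope), potential))

    def score(self, assignment):
        # Unnormalized score of a full assignment: product of factor potentials.
        total = 1.0
        for scope, potential in self.factors:
            total *= potential(*(assignment[v] for v in scope))
        return total

fg = FactorGraph()
fg.add_variable("tag_1", ["B-PER", "I-PER", "O"])
fg.add_variable("tag_2", ["B-PER", "I-PER", "O"])
# Hypothetical cross-tweet consistency factor: the same entity string in two
# tweets should prefer identical NER tags.
fg.add_factor(["tag_1", "tag_2"], lambda a, b: 2.0 if a == b else 1.0)
print(fg.score({"tag_1": "B-PER", "tag_2": "B-PER"}))  # 2.0
```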
Background | Here, we define a binary indicator variable for each candidate setting of each factor in the graphical model . |
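This is the standard indicator construction, sketched here in generic notation rather than necessarily the paper's: for each factor $\alpha$ and each candidate setting $v$ of its variables, introduce $z_{\alpha,v} \in \{0, 1\}$ with $\sum_{v} z_{\alpha,v} = 1$, so that MAP inference becomes the linear objective

$$ \max_{z} \sum_{\alpha} \sum_{v} z_{\alpha,v}\, \theta_\alpha(v), $$

subject to consistency constraints requiring overlapping factors to agree on shared variables.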
Citation Extraction Data | There are multiple previous examples of augmenting chain-structured sequence models with terms capturing global relationships by expanding the chain to a more complex graphical model with nonlocal dependencies between the outputs. |
Citation Extraction Data | Soft constraints can be implemented inefficiently using hard constraints and dual decomposition, by introducing copies of output variables and an auxiliary graphical model, as in Rush et al. |
Conclusion | Type candidates are collected from patterns and modeled as hidden variables in graphical models . |
Extending the Model | We can thus move from a sequential model to a general graphical model by adding transitions and rearranging the structure. |
Results | Moving from the HMMs to a general graphical model structure (Figures 3c and d) creates a sparser distribution and significantly improves accuracy across the board. |
Introduction | • MULTIR introduces a probabilistic graphical model of multi-instance learning which handles overlapping relations. |
Modeling Overlapping Relations | We define an undirected graphical model that allows joint reasoning about aggregate (corpus-level) and sentence-level extraction decisions. |
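A common way to tie the two levels together, and roughly the linkage MULTIR uses (stated here as an assumption, not a quotation), is a deterministic-OR factor: a corpus-level fact $y^r$ holds exactly when at least one sentence-level extraction variable asserts relation $r$:

$$ y^r = \bigvee_i \mathbb{1}\big[ z_i = r \big]. $$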
Related Work | Riedel et al. (2010) combine weak supervision and multi-instance learning in a more sophisticated manner, training a graphical model which assumes only that at least one of the matches between the arguments of a Freebase fact and sentences in the corpus is a true relational mention. |
Relation Extraction | The unique nature of the open extraction task has led us to develop O-CRF, an open extraction system that uses the power of graphical models to identify relations in text. |
Relation Extraction | Whereas classifiers predict the label of a single variable, graphical models model multiple, interdependent variables. |
Relation Extraction | Conditional Random Fields (CRFs) (Lafferty et al., 2001) are undirected graphical models trained to maximize the conditional probability of a finite set of labels Y given a set of input observations X. |