Abstract | One of the main obstacles to producing high-quality joint models is the lack of jointly annotated data.
Abstract | Joint modeling of multiple natural language processing tasks outperforms single-task models learned from the same data, but still underperforms single-task models learned on the more abundant quantities of available single-task annotated data.
Abstract | In this paper we present a novel model which makes use of additional single-task annotated data to improve the performance of a joint model.
Introduction | Joint models can be particularly useful for producing analyses of sentences which are used as input for higher-level, more semantically-oriented systems, such as question answering and machine translation. |
Introduction | However, designing joint models which actually improve performance has proven challenging. |
Introduction | There have been some recent successes with joint modeling.
Abstract | In evaluations on various highly inflected languages, this joint model outperforms both a baseline tagger in morphological disambiguation, and a pipeline parser in head selection.
Baselines | To ensure a meaningful comparison with the joint model, our two baselines are both implemented in the same graphical model framework and trained with the same machine-learning algorithm.
Baselines | Roughly speaking, they divide up the variables and factors of the joint model and train them separately. |
Baselines | The tagger is a graphical model with the WORD and TAG variables, connected by the local factors TAG-UNIGRAM, TAG-BIGRAM, and TAG-CONSISTENCY, all used in the joint model (§3). |
Experimental Results | We compare the performance of the pipeline model (§4) and the joint model (§3) on morphological disambiguation and unlabeled dependency parsing. |
Experimental Setup | The output of the joint model is the assignment to the TAG and LINK variables. |
Experimental Setup | In principle, the joint model should consider every possible combination of morphological attributes for every word. |
Introduction | After a description of previous work (§2), the joint model (§3) will be contrasted with the baseline pipeline model (§4). |
Joint Model | If fully implemented in our joint model, these features would necessitate two separate families of link factors: O(n³m³) factors for the POS trigrams, and O(n²m⁴) factors for the POS 4-grams.
Previous Work | Goldberg and Tsarfaty (2008) propose a generative joint model.
Abstract | We propose the first joint model for word segmentation, POS tagging, and dependency parsing for Chinese. |
Abstract | Based on an extension of the incremental joint model for POS tagging and dependency parsing (Hatori et al., 2011), we propose an efficient character-based decoding method that can combine features from state-of-the-art segmentation, POS tagging, and dependency parsing models. |
Abstract | We also perform comparison experiments with the partially joint models.
Introduction | Based on these observations, we aim at building a joint model that simultaneously processes word segmentation, POS tagging, and dependency parsing, trying to capture global interaction among |
Introduction | We also perform comparison experiments with partially joint models, and investigate the tradeoff between running speed and model performance.
Model | (2011), we build our joint model to solve word segmentation, POS tagging, and dependency parsing within a single framework. |
Model | In our joint model, the early update is invoked by mistakes in any of word segmentation, POS tagging, or dependency parsing.
Model | The list of the features used in our joint model is presented in Table 1, where S01-S05, W01-W21, and T01-T05 are taken from Zhang and Clark (2010), and P01-P28 are taken from Huang and Sagae (2010).
Related Works | In contrast, we built a joint model based on a dependency-based framework, with a rich set of structural features. |
Related Works | Because we found that even an incremental approach with beam search is intractable if we perform word-based decoding, we take a character-based approach to produce our joint model.
Abstract | This paper presents a joint model for template filling, where the goal is to automatically specify the fields of target relations such as seminar announcements or corporate acquisition events. |
Abstract | The approach models mention detection, unification and field extraction in a flexible, feature-rich model that allows for joint modeling of interdependencies at all levels and across fields. |
Corporate Acquisitions | Unfortunately, we cannot directly compare against a generative joint model evaluated on this dataset (Haghighi and Klein, 2010). The best results per attribute are shown in boldface.
Introduction | In this paper, we present a joint modeling and learning approach for the combined tasks of mention detection, unification, and template filling, as described above. |
Introduction | We also demonstrate, through ablation studies on the feature set, the need for joint modeling and the relative importance of the different types of joint constraints. |
Seminar Extraction Task | An important question to be addressed in evaluation is to what extent the joint modeling approach contributes to performance. |
Seminar Extraction Task | This is largely due to erroneous assignments of named entities of other types (mainly, person) as titles; such errors are avoided in the full joint model, where tuple validity is enforced.
Seminar Extraction Task | As argued before, joint modeling is especially important for irregular fields, such as title; we provide first results on this field. |
Summary and Future Work | This approach allows for joint modeling of interdependencies at all levels and across fields.
Summary and Future Work | Finally, it is worth exploring scaling the approach to unrestricted event extraction, and jointly modeling the extraction of more than one relation per document.
Abstract | In particular, we extend the monolingual infinite tree model (Finkel et al., 2007) to a bilingual scenario: each hidden state (POS tag) of a source-side dependency tree emits a source word together with its aligned target word, either jointly (joint model) or independently (independent model).
Abstract | Our independent model gains over 1 point in BLEU by resolving the sparseness problem introduced in the joint model . |
Bilingual Infinite Tree Model | This paper proposes two types of models that differ in their processes for generating observations: the joint model and the independent model. |
Bilingual Infinite Tree Model | 3.1 Joint Model |
Bilingual Infinite Tree Model | The joint model is a simple application of the infinite tree model under a bilingual scenario. |
Introduction | We investigate two types of models: (i) a joint model and (ii) an independent model. |
Introduction | In the joint model, each hidden state jointly emits both a source word and its aligned target word as an observation.
Related Work | Figure 4: An Example of the Joint Model |
A Joint Model with Unlabeled Parallel Text | 3.2 The Joint Model |
A Joint Model with Unlabeled Parallel Text | Since previous work (Banea et al., 2008; 2010; Wan, 2009) has shown that it could be useful to automatically translate the labeled data from the source language into the target language, we can further incorporate such translated labeled data into the joint model by adding the following component into Equation 6: |
Conclusion | In this paper, we study bilingual sentiment classification and propose a joint model to simultaneously learn better monolingual sentiment classifiers for each language by exploiting an unlabeled parallel corpus together with the labeled data available for each language. |
Experimental Setup 4.1 Data Sets and Preprocessing | In our experiments, the proposed joint model is compared with the following baseline methods. |
Introduction | In Section 3, the proposed joint model is described. |
Related Work | Another notable approach is the work of Boyd-Graber and Resnik (2010), which presents a generative model, supervised multilingual latent Dirichlet allocation, that jointly models topics that are consistent across languages, and employs them to better predict sentiment ratings.
Results and Analysis | We first compare the proposed joint model (Joint) with the baselines in Table 2. |
Results and Analysis | Overall, the unlabeled parallel data improves classification accuracy for both languages when using our proposed joint model and Co-SVM. |
Results and Analysis | The joint model makes better use of the unlabeled parallel data than Co-SVM or TSVMs presumably because of its attempt to jointly optimize the two monolingual models via soft (probabilistic) assignments of the unlabeled instances to classes in each iteration, instead of the hard assignments in Co-SVM and TSVMs. |
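The soft-versus-hard distinction can be sketched in a few lines. This is a toy illustration with made-up scores, not the paper's actual training procedure:

```python
# Toy contrast between hard and soft (probabilistic) label assignments
# for unlabeled instances shared between two classifiers. All scores
# are invented for illustration.

def hard_assign(p_positive):
    """Co-SVM/TSVM-style: commit to the most likely class (0 or 1)."""
    return 1 if p_positive >= 0.5 else 0

def soft_assign(p_positive):
    """Joint-model-style: keep the class probability as a fractional weight."""
    return p_positive

# Suppose one monolingual classifier scores three unlabeled sentences:
scores = [0.55, 0.95, 0.45]

hard = [hard_assign(p) for p in scores]  # [1, 1, 0]: confidence is discarded
soft = [soft_assign(p) for p in scores]  # [0.55, 0.95, 0.45]: confidence kept
```

Under hard assignment a borderline sentence (0.55) carries the same weight in the next iteration as a confident one (0.95); soft assignment lets each unlabeled instance contribute in proportion to the model's confidence.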
Approaches | Figure 2 shows the factor graph for this joint model.
Discussion and Future Work | We find that we can outperform prior work in the low-resource setting by coupling the selection of feature templates based on information gain with a joint model that marginalizes over latent syntax. |
Discussion and Future Work | Our discriminative joint models treat latent syntax as a structured feature to be optimized for the end-task of SRL, while our other grammar induction techniques optimize for unlabeled data likelihood, optionally with distant supervision.
Experiments | This highlights an important advantage of the pipeline-trained model: the features can consider any part of the syntax (e.g., arbitrary sub-trees), whereas the joint model is limited to those features over which it can efficiently marginalize (e.g., short dependency paths).
Experiments | In the low-resource setting of the CoNLL-2009 Shared Task without syntactic supervision, our joint model (Joint) with marginalized syntax obtains state-of-the-art results with the IGC features described in § 4.2.
Experiments | These results begin to answer a key research question in this work: The joint models outperform the pipeline models in the low-resource setting. |
Introduction | Comparison of pipeline and joint models for SRL.
Introduction | The joint models use a non-loopy conditional random field (CRF) with a global factor constraining latent syntactic edge variables to form a tree. |
Introduction | Even at the expense of no dependency path features, the joint models best pipeline-trained models for state-of-the-art performance in the low-resource setting (§ 4.4). |
Related Work | In both pipeline and joint models, we use features adapted from state-of-the-art approaches to SRL.
Abstract | This paper introduces a graph-based semi-supervised joint model of Chinese word segmentation and part-of-speech tagging. |
Abstract | An inductive character-based joint model is obtained eventually. |
Introduction | In the past years, several proposed supervised joint models (Ng and Low, 2004; Zhang and Clark, 2008; Jiang et al., 2009; Zhang and Clark, 2010) achieved reasonably accurate results, but the outstanding problem among these models is that they rely heavily on a large amount of labeled data, i.e., segmented texts with POS tags. |
Method | It is directed to maximize the conditional likelihood of hidden states with the derived label distributions on unlabeled data, i.e., p(y, v|x), where y and v are jointly modeled but
Method | Firstly, as expected, for the two supervised baselines, the joint model outperforms the pipeline one, especially on segmentation. |
Method | This outcome verifies the commonly accepted fact that the joint model can substantially improve the pipeline one, since POS tags provide additional information to word segmentation (Ng and Low, 2004). |
Related Work | The state-of-the-art joint models include reranking approaches (Shi and Wang, 2007), hybrid approaches (Nakagawa and Uchimoto, 2007; Jiang et al., 2008; Sun, 2011), and single-model approaches (Ng and Low, 2004; Zhang and Clark, 2008; Kruengkrai et al., 2009; Zhang and Clark, 2010). |
Bottom-up tree-building | However, the major distinction between our models and theirs is that we do not jointly model the structure and the relation; rather, we use two linear- |
Bottom-up tree-building | Although joint modeling has been shown to be effective in various NLP and computer vision applications (Sutton et al., 2007; Yang et al., 2009; Wojek and Schiele, 2008), our choice of using two separate models is for the following reasons:
Bottom-up tree-building | Then, in the tree-building process, we will have to deal with situations where the joint model yields conflicting predictions: it is possible that the model predicts S_j = 1 and R_j = NO-REL, or vice versa, and we will have to decide which node to trust (and thus, in some sense, the structure and the relation are no longer jointly modeled).
Related work | 2.2 Joty et al.’s joint model |
Related work | Second, they jointly modeled the structure and the relation for a given pair of discourse units. |
Related work | The strength of Joty et al.'s model is their joint modeling of the structure and the relation, such that information from each aspect can interact with the other.
Abstract | Experiments on Automatic Content Extraction (ACE) corpora demonstrate that our joint model significantly outperforms a strong pipelined baseline, which attains better performance than the best-reported end-to-end system.
Conclusions and Future Work | In addition, we aim to incorporate other IE components such as event extraction into the joint model.
Experiments | We compare our proposed method (Joint w/ Global) with the pipelined system (Pipeline), the joint model with only local features (Joint w/ Local), and two human annotators who annotated 73 documents in the ACE'05 corpus.
Experiments | Our joint model correctly identified the entity mentions and their relation. |
Experiments | Figure 7 shows the details when the joint model is applied to this sentence. |
Introduction | This is the first work to incrementally predict entity mentions and relations using a single joint model (Section 3). |
Abstract | We present a joint model for Chinese word segmentation and new word detection. |
Abstract | We present high-dimensional new features, including word-based features and enriched edge (label-transition) features, for the joint modeling.
Introduction | In this paper, we present high-dimensional new features, including word-based features and enriched edge (label-transition) features, for the joint modeling of Chinese word segmentation (CWS) and new word detection (NWD).
Introduction | While most state-of-the-art CWS systems use semi-Markov conditional random fields or latent-variable conditional random fields, we simply use a single first-order conditional random field (CRF) for the joint modeling.
Introduction | We propose a joint model for Chinese word segmentation and new word detection.
System Architecture | 3.1 A Joint Model Based on CRFs |
System Architecture | In this paper, we presented a joint model for Chinese word segmentation and new word detection. |
System Architecture | We presented new features, including word-based features and enriched edge features, for the joint modeling.
Abstract | We describe a joint model for understanding user actions in natural language utterances. |
Background | Only recent research has focused on the joint modeling of SLU (Jeong and Lee, 2008; Wang, 2010) taking into account the dependencies at learning time. |
Background | Our joint model can discover domain D and user's act A as higher-layer latent concepts of utterances in relation to lower-layer latent semantic topics (slots) S, such as named entities ("New York") or context-bearing non-named entities ("vegan").
Data and Approach Overview | Here we define several abstractions of our joint model as depicted in Fig. |
Experiments | * Tri-CRF: We used Triangular Chain CRF (Jeong and Lee, 2008) as our supervised joint model baseline.
Experiments | We evaluate the performance of our joint model on two experiments using two metrics. |
Experiments | The results show that our joint modeling approach has an advantage over the other joint models (i.e., Tri-CRF) in that it can leverage unlabeled NL utterances.
Introduction | Recent work on SLU (Jeong and Lee, 2008; Wang, 2010) presents joint modeling of two components, i.e., the domain and slot or dialog act and slot components together. |
Abstract | Here, we present a novel formulation for a neural network joint model (NNJM), which augments the NNLM with a source context window. |
Introduction | Specifically, we introduce a novel formulation for a neural network joint model (NNJM), which augments an n-gram target language model with an m-word source window.
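As a rough sketch of how such an input could be assembled, the target n-gram history is concatenated with a source window centered on the aligned source word. The embedding function, dimensions, and indices below are invented for illustration and are not the paper's implementation:

```python
# Toy sketch: an NNJM-style input concatenates embeddings of the
# (n-1)-word target history with a (2m+1)-word source window centered
# on the aligned source word. The embedding function below is a fake
# deterministic stand-in for a learned embedding table.
EMBED_DIM = 4

def embed(word_id):
    """Hypothetical embedding lookup: returns a fixed 4-dim vector."""
    return [float(word_id * (k + 1)) for k in range(EMBED_DIM)]

def nnjm_input(target_history, source_window):
    """Concatenate all word embeddings into a single input vector."""
    vec = []
    for w in target_history + source_window:
        vec.extend(embed(w))
    return vec

# 3 target history words (n = 4) and a 5-word source window (m = 2):
x = nnjm_input([4, 17, 23], [8, 9, 10, 11, 12])
print(len(x))  # (3 + 5) * 4 = 32
```

In the real model this concatenated vector would feed a feed-forward network that predicts the next target word; here the point is only the shape of the joint conditioning context.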
Introduction | Unlike previous approaches to joint modeling (Le et al., 2012), our feature can be easily integrated into any statistical machine translation (SMT) decoder, which leads to substantially larger improvements than k-best rescoring only. |
Model Variations | Although there has been a substantial amount of past work in lexicalized joint models (Marino et al., 2006; Crego and Yvon, 2010), nearly all of these papers have used older statistical techniques such as Kneser-Ney or Maximum Entropy. |
Model Variations | This is consistent with our rescoring-only result, which indicates that k-best rescoring is too shallow to take advantage of the power of a joint model.
Model Variations | We have described a novel formulation for a neural network-based machine translation joint model , along with several simple variations of this model. |
Neural Network Joint Model (NNJM) | To make this a joint model, we also condition on the source context vector s_i:
Abstract | We extend a nonparametric model of word segmentation by adding phonological rules that map from underlying forms to surface forms to produce a mathematically well-defined joint model as a first step towards handling variation and segmentation in a single model. |
Abstract | We analyse how our model handles /t/-deletion on a large corpus of transcribed speech, and show that the joint model can perform word segmentation and recover underlying /t/s. |
Background and related work | However, as they point out, combining the segmentation and the variation model into one joint model is not straightforward and usual inference procedures are infeasible, which requires the use of several heuristics. |
Background and related work | They do not aim for a joint model that also handles word segmentation, however, and rather than training their model on an actual corpus, they evaluate on constructed lists of examples, mimicking frequencies of real data. |
Conclusion and outlook | We presented a joint model for word segmentation and the learning of phonological rule probabilities from a corpus of transcribed speech. |
The computational model | Figure 1: The graphical model for our joint model of word-final /t/-deletion and Bigram word segmentation.
The computational model | (2009) segmentation models, exact inference is infeasible for our joint model.
Abstract | In a quantitative evaluation on the task of judging geographically informed semantic similarity between representations learned from 1.1 billion words of geo-located tweets, our joint model outperforms comparable independent models that learn meaning in isolation. |
Evaluation | To illustrate how the model described above can learn geographically-informed semantic representations of words, Table 1 displays the terms with the highest cosine similarity to wicked in Kansas and Massachusetts after running our joint model on the full 1.1 billion words of Twitter data; while wicked in Kansas is close to other evaluative terms like evil and pure and religious terms like gods and spirit, in Massachusetts it is most similar to other intensifiers like super, ridiculously and insanely.
Evaluation | As one concrete example of these differences between individual data points, the cosine similarity between city and seattle in the -GEO model is 0.728 (seattle is ranked as the 188th most similar term to city overall); in the INDIVIDUAL model using only tweets from Washington state, 6WA(city, seattle) = 0.780 (rank #32); and in the JOINT model, using information from the entire United States with deviations for Washington, 6WA(city, seattle) = 0.858 (rank #6).
Evaluation | While the two models that include geographical information naturally outperform the model that does not, the JOINT model generally far outperforms the INDIVIDUAL models trained on state-specific subsets of the data. A model that can exploit all of the information in the data, learning core vector-space representations for all words along with deviations for each contextual variable, is able to learn more geographically-informed representations for this task than strict geographical models alone.
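The core-plus-deviation idea can be sketched minimally: a word's representation in a region is its shared main vector plus a region-specific deviation. The vectors and names below are made up for illustration; the real model learns both parts jointly from data:

```python
# Sketch of a "shared base vector plus per-region deviation" lookup.
# main[word] is the core representation; deviations[(word, region)] is
# a learned offset, defaulting to zero where no deviation exists.

def region_vector(word, region, main, deviations):
    base = main[word]
    delta = deviations.get((word, region), [0.0] * len(base))
    return [b + d for b, d in zip(base, delta)]

main = {"city": [1.0, 0.0, 0.5]}                  # invented core vector
deviations = {("city", "WA"): [0.1, 0.3, -0.2]}   # invented WA offset

v_wa = region_vector("city", "WA", main, deviations)  # base + WA deviation
v_ks = region_vector("city", "KS", main, deviations)  # falls back to the base
```

Because every region shares the base vector, data from all regions shapes the core meaning, while sparse regions degrade gracefully to the shared representation.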
Model | A joint model has three a priori advantages over independent models: (i) sharing data across variable values encourages representations across those values to be similar; e.g., while city may be closer to Boston in Massachusetts and Chicago in Illinois, in both places it still generally connotes a municipality; (ii) such sharing can mitigate data sparseness for less-witnessed areas; and (iii) with a joint model, all representations are guaranteed to
Conclusion | In addition, the joint model is efficient enough for practical use. |
Experiments | Other evaluation metrics are also proposed by Zheng et al. (2011a), which are only suitable for their system, since our system uses a joint model.
Experiments | The selection of K also directly determines the running time of the joint model.
Experiments | using the proposed joint model are shown in Table 3 and Table 4. |
Pinyin Input Method Model | To make typo correction better, we consider integrating it with FTC conversion using a joint model.
Related Works | As we will propose a joint model |
Conclusions and Future Work | A natural extension of our unified framework is to construct a joint model in which the predictions for all three tasks inform each other at all stages of the prediction process. |
Introduction | (2012) presented a joint model for inducing simple syntactic frames and VCs. |
Introduction | (2012) introduced a joint model for SCF and SP acquisition. |
Previous Work | Joint Modeling A small number of works have recently investigated joint approaches to SCFs, SPs and VCs. |
Previous Work | Although evaluation of these recent joint models has been partial, the results have been encouraging and fur- |
The Unified Framework | DPPs are particularly suitable for joint modeling as they come with various simple and intuitive ways to combine individual model kernel matrices into a joint kernel. |
Introduction | Joint models of sentence extraction and compression have a great benefit in that they allow a large degree of freedom in controlling redundancy.
Introduction | In contrast, conventional two-stage approaches (Zajic et al., 2006), which first generate candidate compressed sentences and then use them to generate a summary, have less computational complexity than joint models.
Introduction | Joint models can prune unimportant or redundant descriptions without resorting to enumeration. |
Joint Model of Extraction and Compression | Therefore, the joint model can extract an arbitrarily compressed sentence as a subtree without enumerating all candidates. |
Joint Model of Extraction and Compression | The joint model can remove the redundant part as well as the irrelevant part of a sentence, because the model simultaneously extracts and compresses sentences. |
Joint Model of Extraction and Compression | In this joint model , we generate a compressed sentence by extracting an arbitrary subtree from a dependency tree of a sentence. |
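The subtree-extraction mechanism can be illustrated with a minimal sketch. The sentence, tree, and helper below are invented for illustration; the actual model scores candidate subtrees rather than choosing them by hand:

```python
# Sketch: generating a compressed sentence by extracting a subtree of a
# dependency tree. heads[i] is the head index of word i (-1 for root).

def extract_subtree(heads, root):
    """Return sorted indices of all words dominated by `root` (inclusive)."""
    keep = {root}
    changed = True
    while changed:
        changed = False
        for i, h in enumerate(heads):
            if h in keep and i not in keep:
                keep.add(i)
                changed = True
    return sorted(keep)

words = ["the", "committee", "quickly", "approved", "the", "new", "budget"]
heads = [1, 3, 3, -1, 6, 6, 3]  # invented dependency tree over the sentence

# Take the subtree rooted at "approved", then delete the subtree rooted
# at the adverb "quickly" (index 2) to form a compressed sentence.
full = extract_subtree(heads, 3)
drop = set(extract_subtree(heads, 2))
compressed = [words[i] for i in full if i not in drop]
print(" ".join(compressed))  # the committee approved the new budget
```

Deleting any subtree leaves a well-formed smaller tree, which is why compression candidates need not be enumerated explicitly.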
Abstract | In this paper, we propose Two-Neighbor Orientation (TNO) model that jointly models the orientation decisions between anchors and two neighboring multi-unit chunks which may cross phrase or rule boundaries. |
Conclusion | Our approach, which we formulate as a Two-Neighbor Orientation model, includes the joint modeling of two orientation decisions and the modeling of the maximal span of the reordered chunks through the concept of Maximal Orientation Span. |
Introduction | Then, we jointly model the orientations of chunks that immediately precede and follow the anchors (hence, the name “two-neighbor”) along with the maximal span of these chunks, to which we refer as Maximal Orientation Span (MOS). |
Introduction | To show the effectiveness of our model, we integrate our TNO model into a state-of-the-art syntax-based SMT system, which uses synchronous context-free grammar (SCFG) rules to jointly model reordering and lexical translation. |
Two-Neighbor Orientation Model | Our Two-Neighbor Orientation model (TNO) designates A C A(@) as anchors and jointly models the orientation of chunks that appear immediately to the left and to the right of the anchors as well as the identities of these chunks. |
Analysis using a joint model | According to our joint model , these effects still hold even after controlling for other features. |
Analysis using a joint model | Our joint model controls for the first two of these factors, suggesting that the third factor or some other explanation must account for the remaining differences between males and females. |
Analysis using a joint model | In the joint model , we see the same effect of pitch mean and an even stronger effect for intensity, with the predicted odds of an error dramatically higher for extreme intensity values. |
Conclusion | Using IWER, we analyzed the effects of various word-level lexical and prosodic features, both individually and in a joint model . |
Abstract | We jointly model the interplay between latent user intents that govern queries and unobserved entity types, leveraging observed signals from query formulations and document clicks. |
Conclusion | Jointly modeling the interplay between the underlying user intents and entity types in web search queries shows significant improvements over the current state of the art on the task of resolving entity types in head queries. |
Evaluation Methodology | In order to learn type distributions by jointly modeling user intents and a large number of types, we require a large set of training examples containing tagged entities and their potential types. |
Introduction | We show that jointly modeling user intent and entity type significantly outperforms the current state of the art on the task of entity type resolution in queries. |
Related Work | Our models also expand upon theirs by jointly modeling |
Introduction | We propose a novel joint event extraction algorithm to predict the triggers and arguments simultaneously, and use the structured perceptron (Collins, 2002) to train the joint model . |
Joint Framework for Event Extraction | Unfortunately, it is intractable to perform the exact search in our framework because: (1) by jointly modeling the trigger labeling and argument labeling, the search space becomes much more complex. |
Related Work | To the best of our knowledge, our work is the first attempt to jointly model these two ACE event subtasks. |
Related Work | There has been some previous work on joint modeling for biomedical events (Riedel and McCallum, 2011a; Riedel et al., 2009; McClosky et al., 2011; Riedel and McCallum, 2011b). |
Introduction | Our models also jointly model both aspects and aspect-specific sentiments.
Introduction | Our models are related to topic models in general (Blei et al., 2003) and joint models of aspects and sentiments in sentiment analysis in specific (e.g., Zhao et al., 2010). |
Introduction | First of all, we jointly model aspect and sentiment, while DF-LDA is only for topics/aspects. |
Experiments and Results | Finally, the Joint model combines the document and year mention classifiers as described in Section 4.3.
Experiments and Results | Table 4 shows the F1 scores of the Joint model by year. |
Experiments and Results | Table 4: Yearly results for the Joint model.
Learning Time Constraints | Finally, given the document classifiers of Section 3 and the constraint classifier just defined in Section 4, we create a joint model combining the two with the following linear interpolation: |
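A linear interpolation of two classifiers' probabilities can be sketched generically as follows. The weight 0.7 and the scores are arbitrary illustrative choices, not the paper's tuned values:

```python
# Generic sketch of linearly interpolating a document classifier's
# probability with a constraint classifier's probability. The weight
# lam is a made-up illustrative value.

def interpolate(p_doc, p_constraint, lam=0.7):
    """Convex combination of two classifiers' class probabilities."""
    return lam * p_doc + (1.0 - lam) * p_constraint

score = interpolate(0.8, 0.4)  # 0.7*0.8 + 0.3*0.4 = 0.68
```

In practice the weight would be tuned on held-out data; a convex combination keeps the result a valid probability whenever both inputs are.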
Experiments | We can see that both character-level joint models outperform the pipelined system; our model with annotated word structures gives an improvement of 0.97% in tagging accuracy and 2.17% in phrase-structure parsing accuracy. |
Experiments | The results also demonstrate that the annotated word structures are highly effective for syntactic parsing, giving an absolute improvement of 0.82% in phrase-structure parsing accuracy over the joint model with flat word structures. |
Experiments | (2011), which additionally uses the Chinese Gigaword Corpus; Li ’11 denotes a generative model that can perform word segmentation, POS tagging and phrase-structure parsing jointly (Li, 2011); Li+ ’12 denotes a unified dependency parsing model that can perform joint word segmentation, POS tagging and dependency parsing (Li and Zhou, 2012); Li ’11 and Li+ ’12 exploited annotated morphological-level word structures for Chinese; Hatori+ ’12 denotes an incremental joint model for word segmentation, POS tagging and dependency parsing (Hatori et al., 2012); they use external dictionary resources including HowNet Word List and page names from the Chinese Wikipedia; Qian+ ’12 denotes a joint segmentation, POS tagging and parsing system using a unified framework for decoding, incorporating a word segmentation model, a POS tagging model and a phrase-structure parsing model together (Qian and Liu, 2012); their word segmentation model is a combination of character-based model and word-based model. |
Related Work | Their work demonstrates that a joint model can improve the performance of the three tasks, particularly for POS tagging and dependency parsing. |
Abstract | We learn a joint model of sentence extraction and compression for multi-document summarization. |
Efficient Prediction | By solving the following ILP we can compute the arg max required for prediction in the joint model: |
Experiments | Figure 4: Example summaries produced by our learned joint model of extraction and compression. |
Joint Model | Learning weights for Objective 2, where Y(x) is the set of compressive summaries and C(y) is the set of broken edges that produce subtree deletions, gives our LEARNED COMPRESSIVE system, which is our joint model of extraction and compression.
Character-Level Dependency Tree | (2012) proposed a joint model for Chinese word segmentation, POS-tagging and dependency parsing, studying the influence of the joint model and character features for parsing. Their model is extended from the arc-standard transition-based model, and can be regarded as an alternative to the arc-standard model of our work when pseudo intra-word dependencies are used.
Character-Level Dependency Tree | (2012) investigate a joint model using pseudo intra-word dependencies. |
Character-Level Dependency Tree | To our knowledge, we are the first to apply the arc-eager system to joint models and achieve comparable performance to the arc-standard model.
FrameNet — Wiktionary Alignment | In Table 2, we report on the results of the best single models and the best joint model . |
FrameNet — Wiktionary Alignment | For the joint model, we employed the best single PPR configuration, and a COS configuration that uses sense glosses extended by Wiktionary hypernyms, synonyms, and the FrameNet frame name and frame definition, to achieve the highest score, an F1-score of 0.739.
FrameNet — Wiktionary Alignment | The BEST JOINT model performs well on nouns, slightly better on adjectives, and worse on verbs; see Table 2.
Joint Translation Model | Phrase pairs are emitted jointly, and the overall probabilistic SCFG is a joint model over parallel strings.
Joint Translation Model | By splitting the joint model into a hierarchical structure model and a lexical emission model, we facilitate estimating the two models separately.
Related Work | We show that a translation system based on such a joint model can perform competitively in comparison with conditional probability models, when it is augmented with a rich latent hierarchical structure trained adequately to avoid overfitting. |
Previous Work | It also focuses on jointly modeling the generation of both predicate and argument, and evaluation is performed on a set of human-plausibility judgments obtaining impressive results against Keller and Lapata’s (2003) Web hit-count based system. |
Topic Models for Selectional Prefs. | One weakness of IndependentLDA is that it doesn't jointly model a1 and a2.
Topic Models for Selectional Prefs. | On the one hand, JointLDA jointly models the generation of both arguments in an extracted tuple.
A Distributional Model for Argument Classification | 3.2 A Joint Model for Argument Classification |
Related Work | It incorporates strong dependencies within a comprehensive statistical joint model with a rich set of features over multiple argument phrases. |
Related Work | First, local models are applied to produce role labels over individual arguments; then the joint model is used to decide the entire argument sequence among the n-best competing solutions.
Corpus Details | gender/age) based on the prior and joint modeling of the partner speaker’s gender/age in the same discourse. |
Corpus Details | We employ several varieties of classifier stacking and joint modeling to be effectively sensitive to these differences. |
Corpus Details | A novel partner-sensitive model shows performance gains from the joint modeling of speaker attributes along with partner speaker attributes, given the differences in lexical usage and discourse style such as those observed between same-gender and mixed-gender conversations.