Index of papers in Proc. ACL 2011 that mention
  • joint model
Lee, John and Naradowsky, Jason and Smith, David A.
Abstract
In evaluations on various highly-inflected languages, this joint model outperforms both a baseline tagger in morphological disambiguation, and a pipeline parser in head selection.
Baselines
To ensure a meaningful comparison with the joint model, our two baselines are both implemented in the same graphical model framework, and trained with the same machine-learning algorithm.
Baselines
Roughly speaking, they divide up the variables and factors of the joint model and train them separately.
Baselines
The tagger is a graphical model with the WORD and TAG variables, connected by the local factors TAG-UNIGRAM, TAG-BIGRAM, and TAG-CONSISTENCY, all used in the joint model (§3).
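For readers unfamiliar with factor-graph taggers, the following minimal Python sketch shows how one tagging of a sentence could be scored by unigram, bigram, and word-tag consistency factors; the function, weight keys, and feature names are illustrative and are not the authors' implementation.

```python
# Minimal sketch (not the authors' code): score a tag sequence with log-linear
# potentials for TAG-UNIGRAM, TAG-BIGRAM, and TAG-CONSISTENCY style factors.
def score_tagging(words, tags, w):
    """Sum hypothetical log-potentials over the factors touching each TAG variable."""
    s = 0.0
    for i, (word, tag) in enumerate(zip(words, tags)):
        s += w.get(("unigram", tag), 0.0)                 # TAG-UNIGRAM factor
        s += w.get(("consistency", word, tag), 0.0)       # TAG-CONSISTENCY: word/tag compatibility
        if i > 0:
            s += w.get(("bigram", tags[i - 1], tag), 0.0)  # TAG-BIGRAM factor
    return s

weights = {("unigram", "NOUN"): 0.3, ("bigram", "DET", "NOUN"): 1.1,
           ("consistency", "dog", "NOUN"): 0.8}  # made-up weights
print(score_tagging(["the", "dog"], ["DET", "NOUN"], weights))  # 2.2
```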
Experimental Results
We compare the performance of the pipeline model (§4) and the joint model (§3) on morphological disambiguation and unlabeled dependency parsing.
Experimental Setup
The output of the joint model is the assignment to the TAG and LINK variables.
Experimental Setup
In principle, the joint model should consider every possible combination of morphological attributes for every word.
Introduction
After a description of previous work (§2), the joint model (§3) will be contrasted with the baseline pipeline model (§4).
Joint Model
If fully implemented in our joint model, these features would necessitate two separate families of link factors: O(n³m³) factors for the POS trigrams, and O(n²m⁴) factors for the POS 4-grams.
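To see why such factor families are prohibitive, a rough count under the stated asymptotics (n words, m candidate tags per word) can be sketched as follows; the concrete n and m below are hypothetical.

```python
# Rough factor counts implied by the O(n^3 m^3) and O(n^2 m^4) families above.
# Only meant to illustrate the combinatorial blow-up for a single sentence.
def factor_counts(n, m):
    pos_trigram_links = n**3 * m**3  # 3 positions, 3 tag values each
    pos_4gram_links = n**2 * m**4    # 2 positions, 4 tag values
    return pos_trigram_links, pos_4gram_links

print(factor_counts(20, 10))  # (8000000, 4000000): millions of factors per sentence
```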
Previous Work
Goldberg and Tsarfaty (2008) propose a generative joint model.
joint model is mentioned in 19 sentences in this paper.
Topics mentioned in this paper:
Lu, Bin and Tan, Chenhao and Cardie, Claire and Tsou, Benjamin K.
A Joint Model with Unlabeled Parallel Text
3.2 The Joint Model
A Joint Model with Unlabeled Parallel Text
Since previous work (Banea et al., 2008; 2010; Wan, 2009) has shown that it could be useful to automatically translate the labeled data from the source language into the target language, we can further incorporate such translated labeled data into the joint model by adding the following component into Equation 6:
Conclusion
In this paper, we study bilingual sentiment classification and propose a joint model to simultaneously learn better monolingual sentiment classifiers for each language by exploiting an unlabeled parallel corpus together with the labeled data available for each language.
Experimental Setup 4.1 Data Sets and Preprocessing
In our experiments, the proposed joint model is compared with the following baseline methods.
Introduction
In Section 3, the proposed joint model is described.
Related Work
Another notable approach is the work of Boyd-Graber and Resnik (2010), which presents a generative model, supervised multilingual latent Dirichlet allocation, that jointly models topics that are consistent across languages and employs them to better predict sentiment ratings.
Results and Analysis
We first compare the proposed joint model (Joint) with the baselines in Table 2.
Results and Analysis
Overall, the unlabeled parallel data improves classification accuracy for both languages when using our proposed joint model and Co-SVM.
Results and Analysis
The joint model makes better use of the unlabeled parallel data than Co-SVM or TSVMs presumably because of its attempt to jointly optimize the two monolingual models via soft (probabilistic) assignments of the unlabeled instances to classes in each iteration, instead of the hard assignments in Co-SVM and TSVMs.
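The soft-versus-hard distinction the authors point to can be illustrated with a tiny sketch; the arrays and threshold below are made up and this is not the paper's algorithm. Co-SVM/TSVM-style training commits each unlabeled instance to a single class per iteration, whereas the joint model keeps fractional class probabilities as weights.

```python
import numpy as np

def hard_assign(probs):
    # Co-SVM / TSVM style: commit each unlabeled instance to its argmax class
    return (probs >= 0.5).astype(float)

def soft_assign(probs):
    # Joint-model style: keep class probabilities as fractional training weights
    return probs

probs = np.array([0.55, 0.95, 0.40])  # hypothetical positive-class probabilities
print(hard_assign(probs))  # [1. 1. 0.]
print(soft_assign(probs))  # [0.55 0.95 0.4 ]
```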
joint model is mentioned in 10 sentences in this paper.
Topics mentioned in this paper:
Berg-Kirkpatrick, Taylor and Gillick, Dan and Klein, Dan
Abstract
We learn a joint model of sentence extraction and compression for multi-document summarization.
Efficient Prediction
By solving the following ILP we can compute the arg max required for prediction in the joint model:
Experiments
Figure 4: Example summaries produced by our learned joint model of extraction and compression.
Joint Model
Learning weights for Objective 2, where Y(x) is the set of compressive summaries and C(y) the set of broken edges that produce subtree deletions, gives our LEARNED COMPRESSIVE system, which is our joint model of extraction and compression.
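As a rough illustration of how such an arg max can be posed as an ILP over binary selection variables under a length budget, here is a toy sketch using PuLP; the scores, lengths, and budget are invented, and this is not the paper's Objective 2.

```python
# Toy extraction ILP in the spirit of the prediction step quoted above.
from pulp import LpProblem, LpMaximize, LpVariable, lpSum, value

scores = [3.0, 1.5, 2.0, 0.5]   # hypothetical per-unit scores (e.g. sentences or kept subtrees)
lengths = [12, 8, 10, 4]        # hypothetical word counts per unit
budget = 20                     # summary length limit in words

prob = LpProblem("toy_extraction", LpMaximize)
x = [LpVariable(f"x_{i}", cat="Binary") for i in range(len(scores))]
prob += lpSum(scores[i] * x[i] for i in range(len(scores)))             # maximize total score
prob += lpSum(lengths[i] * x[i] for i in range(len(scores))) <= budget  # length constraint
prob.solve()
print([int(value(v)) for v in x])  # which units the solver selects
```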
joint model is mentioned in 4 sentences in this paper.
Topics mentioned in this paper:
Mylonakis, Markos and Sima'an, Khalil
Joint Translation Model
Phrase-pairs are emitted jointly and the overall probabilistic SCFG is a joint model over parallel strings.
Joint Translation Model
By splitting the joint model in a hierarchical structure model and a lexical emission one we facilitate estimating the two models separately.
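A generic factorization of this kind, in our own illustrative notation rather than the paper's, might be written as:

```latex
% A derivation d emits phrase-pairs <e_i, f_i> at its leaves; the joint probability
% splits into a hierarchical structure model and a lexical emission model.
P(d, e, f) = \underbrace{\prod_{r \in d} P(r \mid \mathrm{parent}(r))}_{\text{hierarchical structure}}
             \times
             \underbrace{\prod_{\langle e_i, f_i \rangle \in d} P(e_i, f_i \mid r_i)}_{\text{lexical emission}}
```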
Related Work
We show that a translation system based on such a joint model can perform competitively in comparison with conditional probability models, when it is augmented with a rich latent hierarchical structure trained adequately to avoid overfitting.
joint model is mentioned in 3 sentences in this paper.
Topics mentioned in this paper: