Abstract | Most current data selection methods rely solely on language models trained on small-scale in-domain data to select domain-relevant sentence pairs from a general-domain parallel corpus. |
Experiments | The in-domain data is collected from CWMT09, which consists of spoken dialogues in a travel setting, containing approximately 50,000 parallel sentence pairs in English and Chinese. |
Experiments | Bilingual corpus statistics: In-domain — 50K (Eng) / 50K (Chn) sentences, 360K (Eng) / 310K (Chn) tokens; General-domain — 16M (Eng) / 16M (Chn) sentences, 3933M (Eng) / 3602M (Chn) tokens. |
Experiments | Our work relies on the use of in-domain language models and translation models to rank the sentence pairs from the general-domain bilingual training set. |
Introduction | Current data selection methods mostly use language models trained on small-scale in-domain data to measure domain relevance and select domain-relevant parallel sentence pairs to expand training corpora. |
Related Work | (2010) ranked the sentence pairs in the general-domain corpus according to the perplexity scores of sentences, which are computed with respect to in-domain language models. |
Training Data Selection Methods | These methods are based on a language model and a translation model, both trained on small in-domain parallel data. |
Training Data Selection Methods | t(e_j|f_i) is the translation probability of word e_j conditioned on word f_i and is estimated from the small in-domain parallel data. |
Training Data Selection Methods | A sentence pair with a higher score is more likely to be generated by the in-domain translation model; thus, it is more relevant to the in-domain corpus and will be retained to expand the training data. |
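The scoring described above can be sketched as follows. This is a minimal illustration, not the authors' exact implementation: it assumes an IBM-Model-1-style lexical translation table t(e|f) estimated from the small in-domain parallel data, and the toy table values below are hypothetical.

```python
import math

def model1_log_score(src_words, tgt_words, t):
    """Log probability of a target sentence given a source sentence
    under an IBM Model 1 style lexical translation table t[(e, f)],
    assumed to be estimated from small in-domain parallel data."""
    eps = 1e-12  # probability floor for unseen word pairs
    log_p = 0.0
    for e in tgt_words:
        # Sum the translation probability over all source positions
        # (plus a NULL token), normalized by the source length.
        p_e = sum(t.get((e, f), 0.0) for f in src_words + ["<NULL>"])
        log_p += math.log(max(p_e / (len(src_words) + 1), eps))
    return log_p

# Hypothetical in-domain translation table (travel dialogue domain)
t = {("hotel", "旅馆"): 0.9, ("a", "<NULL>"): 0.5, ("a", "旅馆"): 0.05}
score = model1_log_score(["旅馆"], ["a", "hotel"], t)
```

Pairs in the general-domain corpus would be ranked by this score, and the highest-scoring pairs retained as additional training data.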
Experiments | We test our system on two scenarios: In-domain : the system is trained and evaluated on the source domain (bn+nw, 5-fold cross validation); Out-of-domain: the system is trained on the source domain and evaluated on the target development set of bc (bc dev). |
Experiments | 5All the in-domain improvement in rows 2, 6, 7 of Table 2 are significant at confidence levels 2 95%. |
Experiments | HLBL embeddings of 50 and 100 dimensions, the most effective way to introduce word embeddings is to add embeddings to the heads of the two mentions (row 2; both in-domain and out-of-domain) although it is less pronounced for HLBL embedding with 50 dimensions. |
Language model adaptation | Many methods (Lin et al., 1997; Gao et al., 2002; Klakow, 2000; Moore and Lewis, 2010; Axelrod et al., 2011) rank sentences in the general-domain data according to their similarity to the in-domain data and select only those with scores higher than some threshold. |
Language model adaptation | However, sometimes it is hard to say whether a sentence is totally in-domain or out-of-domain; for example, quoted speech in a news report might be partly in-domain if the domain of interest is broadcast conversation. |
Language model adaptation | They first train two language models: p_in on a set of in-domain data, and p_out on a set of general-domain data. |
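A minimal sketch of this two-language-model selection scheme (the cross-entropy-difference criterion of Moore and Lewis, 2010) is below. As a simplifying assumption, add-one-smoothed unigram models stand in for the n-gram models used in practice, and the toy sentences are invented for illustration:

```python
import math
from collections import Counter

def train_unigram_lm(corpus, vocab):
    """Add-one-smoothed unigram LM; a stand-in for the n-gram
    models used in practice (a simplifying assumption)."""
    counts = Counter(w for sent in corpus for w in sent)
    total = sum(counts.values())
    return {w: (counts[w] + 1) / (total + len(vocab)) for w in vocab}

def cross_entropy(sent, lm):
    return -sum(math.log(lm[w]) for w in sent) / len(sent)

def moore_lewis_score(sent, lm_in, lm_out):
    """Cross-entropy difference H_in(s) - H_out(s); lower means
    more in-domain-like. Sentences below a threshold are kept."""
    return cross_entropy(sent, lm_in) - cross_entropy(sent, lm_out)

in_domain = [["book", "a", "hotel"], ["the", "train", "to", "beijing"]]
general = [["stock", "prices", "fell"],
           ["the", "train", "to", "beijing"],
           ["parliament", "met"]]
vocab = {w for s in in_domain + general for w in s}
lm_in = train_unigram_lm(in_domain, vocab)
lm_out = train_unigram_lm(general, vocab)

# Rank general-domain sentences; the most in-domain-like come first.
ranked = sorted(general, key=lambda s: moore_lewis_score(s, lm_in, lm_out))
```

In the toy data, the travel-related sentence receives the lowest (best) cross-entropy difference and would be selected first.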
Experiments | In-domain multiclass classifier This is a support vector machine (SVM; Fan et al., 2008) using one-versus-rest decoding without removing positive labeled data (Jiang and Zhai, 2007b) from the target domain. |
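One-versus-rest decoding trains one binary classifier per class and, at prediction time, picks the class whose classifier scores highest. The sketch below illustrates the scheme only; a tiny perceptron (and invented toy labels) stands in for the SVM of Fan et al. (2008) so the example stays dependency-free:

```python
class Perceptron:
    """Tiny binary linear classifier; a stand-in for an SVM
    purely so this sketch has no external dependencies."""
    def __init__(self, n_features, epochs=20):
        self.w = [0.0] * n_features
        self.b = 0.0
        self.epochs = epochs

    def fit(self, X, y):  # y in {-1, +1}
        for _ in range(self.epochs):
            for x, t in zip(X, y):
                if t * self.decision(x) <= 0:  # misclassified: update
                    self.w = [wi + t * xi for wi, xi in zip(self.w, x)]
                    self.b += t
        return self

    def decision(self, x):
        return sum(wi * xi for wi, xi in zip(self.w, x)) + self.b

def one_vs_rest_train(X, labels):
    """Train one binary classifier per class (its positives vs the rest)."""
    return {c: Perceptron(len(X[0])).fit(X, [1 if l == c else -1 for l in labels])
            for c in set(labels)}

def one_vs_rest_predict(models, x):
    """Decode by picking the class whose classifier scores highest."""
    return max(models, key=lambda c: models[c].decision(x))

# Toy features and class labels (hypothetical)
X = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 0, 0]]
labels = ["per", "org", "loc", "per"]
models = one_vs_rest_train(X, labels)
pred = one_vs_rest_predict(models, [1, 0, 0])
```

Keeping the positive labeled data from the target domain, as in the setup above, simply means those examples remain in each classifier's training set.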
Experiments | From Table 3 and Table 5, we see that the proposed method achieves the best F1 among all compared methods, except for the supervised upper bound (In-domain). |
Experiments | Performance Gap From Tables 2 to 4, we observe that the smallest performance gap between RDA and the in-domain settings is still large (about 12% with k = 5) on ACE 2004. |
Experiments | We start off by presenting the results for the traditional in-domain setting, where both TRAIN and TEST come from the same domain, e.g., AUTO or TABLETS. |
Experiments | 5.3.1 In-domain experiments |
Experiments | Figure 2: In-domain learning curves. |
CR + LS + DMM + DPM 39.32* +24% 47.86* +20% | The ensemble model without LS (third line) has a P@1 score nearly identical to that of the equivalent in-domain model (line 13 in Table 1), while slightly surpassing its in-domain MRR performance. |
CR + LS + DMM + DPM 39.32* +24% 47.86* +20% | The in-domain performance of the ensemble model is similar to that of the single classifier in both YA and Bio HOW, so we omit these results here for simplicity. |
Related Work | Inspired by this previous work and recent work in discourse parsing (Feng and Hirst, 2012), our work is the first to systematically explore structured discourse features driven by several discourse representations, combine discourse with lexical semantic models, and evaluate these representations on thousands of questions using both in-domain and cross-domain experiments. |
Introduction | While most previous work focuses on in-domain sequential labelling or cross-domain classification tasks, we are the first to learn representations for web-domain structured prediction. |
Introduction | Our results suggest that while both strategies improve in-domain tagging accuracies, keeping the learned representation unchanged consistently results in better cross-domain accuracies. |
Related Work | (2010) learn word embeddings to improve the performance of in-domain POS tagging, named entity recognition, chunking and semantic role labelling. |