Abstract | This study presents a novel approach to the problem of system portability across different domains: a sentiment annotation system that integrates a corpus-based classifier trained on a small set of annotated in-domain data and a lexicon-based system trained on WordNet. |
Abstract | The paper explores the challenges of system portability across domains and text genres (movie reviews, news, blogs, and product reviews), highlights the factors affecting system performance on out-of-domain and small-set in-domain data, and presents a new system consisting of an ensemble of two classifiers with precision-based vote weighting that provides significant gains in accuracy and recall over the corpus-based classifier and the lexicon-based system taken individually. |
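The precision-based weighting of the two classifiers' votes can be sketched as follows; the label encoding (+1/-1/0), the function name, and the precision values are illustrative assumptions, not taken from the paper:

```python
def weighted_vote(pred_a, prec_a, pred_b, prec_b):
    """Combine two classifiers' polarity votes, weighting each vote by
    that classifier's measured precision for the label it predicts.
    Labels: +1 positive, -1 negative, 0 abstain."""
    score = 0.0
    for pred, prec in ((pred_a, prec_a), (pred_b, prec_b)):
        if pred != 0:                   # abstentions contribute nothing
            score += pred * prec[pred]  # vote scaled by per-label precision
    if score > 0:
        return 1
    if score < 0:
        return -1
    return 0

# Hypothetical per-label precisions measured on held-out data:
corpus_prec = {1: 0.80, -1: 0.70}    # corpus-based classifier
lexicon_prec = {1: 0.60, -1: 0.85}   # lexicon-based system

# Corpus classifier votes positive, lexicon system votes negative;
# the lexicon's more precise negative vote wins (0.80 - 0.85 < 0):
label = weighted_vote(1, corpus_prec, -1, lexicon_prec)  # -> -1
```

When the two systems agree, the weighting is irrelevant; it only matters on disagreements, where the classifier that is historically more precise for its predicted label carries the vote.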
Domain Adaptation in Sentiment Research | There are two alternatives to supervised machine learning that can be used to get around this problem: on the one hand, general lists of sentiment clues/features can be acquired from domain-independent sources such as dictionaries or the Internet; on the other hand, unsupervised and weakly-supervised approaches can be used to take advantage of a small number of annotated in-domain examples and/or of unlabelled in-domain data.
Domain Adaptation in Sentiment Research | But such general word lists were shown to perform worse than statistical models built on sufficiently large in-domain training sets of movie reviews (Pang et al., 2002). |
Domain Adaptation in Sentiment Research | For instance, Aue and Gamon (2005) proposed training on a small number of labeled examples and large quantities of unlabelled in-domain data.
Introduction | Many applications require reliable processing of heterogeneous corpora, such as the World Wide Web, where the diversity of genres and domains present on the Internet limits the feasibility of in-domain training.
Introduction | A number of methods have been proposed in order to overcome this system portability limitation by using out-of-domain data, unlabelled in-domain corpora or a combination of in-domain and out-of-domain examples (Aue and Gamon, 2005; Bai et al., 2005; Dredze et al., 2007; Tan et al., 2007).
Introduction | The information contained in lexicographical sources, such as WordNet, reflects a lay person’s general knowledge about the world, while domain-specific knowledge can be acquired through classifier training on a small set of in-domain data. |
Abstract | Most current data selection methods solely use language models trained on small-scale in-domain data to select domain-relevant sentence pairs from a general-domain parallel corpus.
Experiments | The in-domain data is collected from CWMT09, which consists of spoken dialogues in a travel setting, containing approximately 50,000 parallel sentence pairs in English and Chinese. |
Experiments | Bilingual corpus statistics: In-domain, 50K sentence pairs (360K English tokens, 310K Chinese tokens); General-domain, 16M sentence pairs (3,933M English tokens, 3,602M Chinese tokens).
Experiments | Our work relies on the use of in-domain language models and translation models to rank the sentence pairs from the general-domain bilingual training set. |
Introduction | Current data selection methods mostly use language models trained on small-scale in-domain data to measure domain relevance and select domain-relevant parallel sentence pairs to expand training corpora.
Related Work | (2010) ranked the sentence pairs in the general-domain corpus according to the perplexity scores of sentences, which are computed with respect to in-domain language models. |
Training Data Selection Methods | These methods are based on a language model and a translation model, both trained on small in-domain parallel data.
Training Data Selection Methods | t(ej|fi) is the translation probability of word ej conditioned on word fi, and is estimated from the small in-domain parallel data.
Training Data Selection Methods | A sentence pair with a higher score is more likely to have been generated by the in-domain translation model; it is therefore more relevant to the in-domain corpus and will be retained to expand the training data.
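The translation-model score described here can be sketched as a length-normalized IBM Model 1-style sentence probability built from the lexical probabilities t(ej|fi); the function, smoothing constant, and toy lexicon below are assumptions for illustration:

```python
import math

def model1_logprob(src_words, tgt_words, t):
    """Length-normalized IBM Model 1-style log-probability of a target
    sentence given a source sentence, using a lexical translation table
    t: dict mapping (tgt_word, src_word) -> probability.  Unseen pairs
    receive a small smoothing probability."""
    eps = 1e-6
    logp = 0.0
    for e in tgt_words:
        # Sum over all source words (plus a NULL token) of t(e | f).
        s = sum(t.get((e, f), eps) for f in src_words + ["NULL"])
        logp += math.log(s / (len(src_words) + 1))
    return logp / len(tgt_words)  # normalize by target length

# Toy in-domain (travel) lexicon with hypothetical probabilities:
t = {("ticket", "piao"): 0.9, ("train", "huoche"): 0.8}

in_dom = model1_logprob(["piao", "huoche"], ["train", "ticket"], t)
out_dom = model1_logprob(["gupiao", "shichang"], ["stock", "market"], t)
# in_dom > out_dom: the travel-domain pair would be retained.
```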
Experiments | We have introduced the out-of-domain data and the electronic in-domain lexicon in Section 3. |
Experiments | Besides the in-domain lexicon, we have collected 1 million monolingual sentences each in the electronics domain from the web.
Experiments | We construct two kinds of phrase-based models using Moses (Koehn et al., 2007): one uses out-of-domain data and the other uses in-domain data. |
Introduction | Finally, they used the learned translation model directly to translate unseen data (Ravi and Knight, 2011; Nuhn et al., 2012) or incorporated the learned bilingual lexicon as a new in-domain translation resource into the phrase-based model which is trained with out-of-domain data to improve the domain adaptation performance in machine translation (Dou and Knight, 2012). |
Introduction | Since many researchers have studied bilingual lexicon induction, in this paper we mainly concentrate on phrase pair induction given a probabilistic bilingual lexicon and two large in-domain monolingual corpora (source and target language).
Probabilistic Bilingual Lexicon Acquisition | In order to induce the phrase pairs from the in-domain monolingual data for domain adaptation, the probabilistic bilingual lexicon is essential. |
Probabilistic Bilingual Lexicon Acquisition | In this paper, we acquire the probabilistic bilingual lexicon from two approaches: 1) build a bilingual lexicon from large-scale out-of-domain parallel data; 2) adopt a manually collected in-domain lexicon. |
Probabilistic Bilingual Lexicon Acquisition | This paper uses Chinese-to-English translation as a case study, and the electronics domain is the in-domain data we focus on.
Related Work | One is using an in-domain probabilistic bilingual lexicon to extract sub-sentential parallel fragments from comparable corpora (Munteanu and Marcu, 2006; Quirk et al., 2007; Cettolo et al., 2010). |
Related Work | Munteanu and Marcu (2006) first extract the candidate parallel sentences from the comparable corpora and further extract the accurate sub-sentential bilingual fragments from the candidate parallel sentences using the in-domain probabilistic bilingual lexicon. |
Conclusion | However, our results are also positive in that we find that nearly all model summary caseframes can be found in the source text together with some in-domain documents. |
Conclusion | Domain inference, on the other hand, and a greater use of in-domain documents as a knowledge source for it, are very promising indeed.
Experiments | Traditional systems that perform semantic inference do so from a set of known facts about the domain in the form of a knowledge base, but as we have seen, most extractive summarization systems do not make much use of in-domain corpora. |
Experiments | We examine adding in-domain text to the source text to see how this would affect coverage. |
Experiments | As shown in Table 6, the effect of adding more in-domain text on caseframe coverage is substantial, and noticeably more than using out-of-domain text. |
Introduction | Third, we consider how domain knowledge may be useful as a resource for an abstractive system, by showing that key parts of model summaries can be reconstructed from the source plus related in-domain documents. |
Related Work | They were originally proposed in the context of single-document summarization, where they were calculated using in-domain (relevant) vs. out-of-domain (irrelevant) text. |
Related Work | In multi-document summarization, the in-domain text has been replaced by the source text cluster (Conroy et al., 2006), thus they are now |
Experiments | We test our system on two scenarios: In-domain: the system is trained and evaluated on the source domain (bn+nw, 5-fold cross validation); Out-of-domain: the system is trained on the source domain and evaluated on the target development set of bc (bc dev).
Experiments | All the in-domain improvements in rows 2, 6, and 7 of Table 2 are significant at confidence levels ≥ 95%.
Experiments | For HLBL embeddings of 50 and 100 dimensions, the most effective way to introduce word embeddings is to add embeddings to the heads of the two mentions (row 2; both in-domain and out-of-domain), although the effect is less pronounced for the 50-dimensional HLBL embeddings.
Introduction | In recent semantic role labeling (SRL) competitions such as the shared tasks of CoNLL 2005 and CoNLL 2008, supervised SRL systems have been trained on newswire text, and then tested on both an in-domain test set (Wall Street Journal text) and an out-of-domain test set (fiction). |
Introduction | We aim to build an open-domain supervised SRL system; that is, one whose performance on out-of-domain tests approaches the same level of performance as that of state-of-the-art systems on in-domain tests. |
Introduction | Experiments on out-of-domain test sets show that our learned representations can dramatically improve out-of-domain performance, and narrow the gap between in-domain and out-of-domain performance by half.
Abstract | We study the polarity-bearing topics extracted by JST and show that by augmenting the original feature space with polarity-bearing topics, the in-domain supervised classifiers learned from augmented feature representation achieve the state-of-the-art performance of 95% on the movie review data and an average of 90% on the multi-domain sentiment dataset.
Introduction | We study the polarity-bearing topics extracted by the JST model and show that by augmenting the original feature space with polarity-bearing topics, the performance of in-domain supervised classifiers learned from augmented feature representation improves substantially, reaching the state-of-the-art results of 95% on the movie review data and an average of 90% on the multi-domain sentiment dataset. |
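The augmentation step, concatenating topic features onto the original feature space, can be sketched as follows; the key scheme and toy topic proportions are illustrative assumptions, not the paper's actual feature encoding:

```python
def augment(bow_features, polarity_topics):
    """Return a copy of a bag-of-words feature dict augmented with
    polarity-bearing topic features, e.g. the proportion of a document's
    tokens assigned to each (sentiment, topic) pair.  Keys are prefixed
    so the two feature spaces cannot collide."""
    features = dict(bow_features)
    for (sentiment, topic), weight in polarity_topics.items():
        features["topic_%s_%d" % (sentiment, topic)] = weight
    return features

# Toy document: unigram counts plus hypothetical JST-style topic proportions.
doc_bow = {"great": 2, "plot": 1}
doc_topics = {("pos", 0): 0.6, ("neg", 3): 0.1}
x = augment(doc_bow, doc_topics)
# x now holds both word features and topic features:
# {"great": 2, "plot": 1, "topic_pos_0": 0.6, "topic_neg_3": 0.1}
```

A supervised classifier is then trained on the augmented vectors exactly as it would be on plain bag-of-words features.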
Joint Sentiment-Topic (JST) Model | The adaptation loss is calculated with respect to the in-domain gold standard classification result.
Joint Sentiment-Topic (JST) Model | For example, the in-domain gold standard for the Book domain is 79.96%.
Joint Sentiment-Topic (JST) Model | Table 3: Adaptation loss with respect to the in-domain gold standard.
Abstract | We achieve strong gains in NER performance on news, in-domain and out-of-domain, and on web queries. |
Conclusion | Even in-domain, we were able to get a smaller, but still noticeable improvement of 4.2% due to piggyback features.
Experimental data | In our experiments, we train an NER classifier on an in-domain data set and test it on two different out-of-domain data sets. |
Experimental data | A reviewer points out that we use the terms in-domain and out-of-domain somewhat liberally.
Experimental setup | For our in-domain evaluation, we tune T on a 10% development sample of the CoNLL data and test on the remaining 10%. |
Related work | Another source of world knowledge for NER is Wikipedia: Kazama and Torisawa (2007) show that pseudocategories extracted from Wikipedia help for in-domain NER. |
Results and discussion | Even though the emphasis of this paper is on cross-domain robustness, we can see that our approach also has clear in-domain benefits. |
Results and discussion | While the improvement due to piggyback features increases as out-of-domain data become more different from the in-domain training set, performance declines in absolute terms from .930 (CoNLL) to .681 (IEER) and .438 (KDD-T).
Language model adaptation | Many methods (Lin et al., 1997; Gao et al., 2002; Klakow, 2000; Moore and Lewis, 2010; Axelrod et al., 2011) rank sentences in the general-domain data according to their similarity to the in-domain data and select only those with score higher than some threshold.
Language model adaptation | However, sometimes it is hard to say whether a sentence is totally in-domain or out-of-domain; for example, quoted speech in a news report might be partly in-domain if the domain of interest is broadcast conversation. |
Language model adaptation | They first train two language models: p_in on a set of in-domain data, and p_out on a set of general-domain data.
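The standard way to combine the two models is the cross-entropy difference score of Moore and Lewis (2010), ranking each sentence by H_in(s) - H_out(s), where lower means more in-domain. A minimal sketch with add-one-smoothed unigram models (the counts and vocabulary size are toy assumptions):

```python
import math

def cross_entropy(sentence, counts, vocab_size):
    """Per-word cross-entropy of a sentence under an add-one-smoothed
    unigram model given as a dict of word counts."""
    total = sum(counts.values())
    h = 0.0
    for w in sentence:
        p = (counts.get(w, 0) + 1.0) / (total + vocab_size)
        h -= math.log(p)
    return h / len(sentence)

def ce_difference(sentence, in_counts, out_counts, vocab_size):
    """Moore-Lewis score H_in(s) - H_out(s): lower = more in-domain."""
    return (cross_entropy(sentence, in_counts, vocab_size)
            - cross_entropy(sentence, out_counts, vocab_size))

# Toy unigram counts for p_in and p_out, with an assumed vocabulary size:
p_in = {"flight": 5, "ticket": 4, "hotel": 3}
p_out = {"the": 10, "market": 5, "flight": 1, "said": 4}
V = 1000

travel = ce_difference(["flight", "ticket"], p_in, p_out, V)
news = ce_difference(["market", "said"], p_in, p_out, V)
# travel < news: the travel sentence ranks as more in-domain.
```

Subtracting the general-domain cross-entropy penalizes sentences that are merely frequent everywhere, rather than specifically in-domain.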
Predicting politeness | Classification results We evaluate the classifiers both in an in-domain setting, with a standard leave-one-out cross validation procedure, and in a cross-domain setting, where we train on one domain and test on the other (Table 4). |
Predicting politeness | For both our development and our test domains, and in both the in-domain and cross-domain settings, the linguistically informed features give 3-4% absolute improvement over the bag of words model. |
Predicting politeness | While the in-domain results are within 3% of human performance, the greater room for improvement in the cross-domain setting motivates further research on linguistic cues of politeness. |
Relation to social factors | Encouraged by the close-to-human performance of our in-domain classifiers, we use them to assign politeness labels to our full dataset and then compare these labels to independent measures of power and status in our data. |
Relation to social factors | Table 4 settings: In-domain (train on Wiki, test on Wiki; train on SE, test on SE); Cross-domain (train on Wiki, test on SE; train on SE, test on Wiki).
Relation to social factors | Table 4: Accuracies of our two classifiers for Wikipedia (Wiki) and Stack Exchange (SE), for in-domain and cross-domain settings. |
Abstract | Pure statistical parsing systems achieve high in-domain accuracy but perform poorly out-domain.
Dependency Parsing with HPSG | With the extra features, we hope that the training of the statistical model will not overfit the in-domain data, but be able to deal with domain independent linguistic phenomena as well. |
Experiment Results & Error Analyses | With both parsers, we see slight performance drops with both HPSG feature models on in-domain tests (WSJ), compared with the original models.
Experiment Results & Error Analyses | When we look at the performance difference between in-domain and out-domain tests for each feature model, we observe that the drop is significantly smaller for the extended models with HPSG features.
Experiment Results & Error Analyses | Admittedly, the results on PCHEMTB are lower than the best reported results in the CoNLL 2007 Shared Task, but we note that we are not using any in-domain unlabeled data.
Abstract | The general idea is first to create a vector profile for the in-domain development (“dev”) set. |
Experiments | The other is a linear combination of TMs trained on each subcorpus, with the weights of each model learned with an EM algorithm to maximize the likelihood of joint empirical phrase pair counts for in-domain dev data.
Introduction | In transductive learning, an MT system trained on general domain data is used to translate in-domain monolingual data. |
Introduction | Data selection approaches (Zhao et al., 2004; Hildebrand et al., 2005; Lu et al., 2007; Moore and Lewis, 2010; Axelrod et al., 2011) search for bilingual sentence pairs that are similar to the in-domain “dev” data, then add them to the training data. |
Vector space model adaptation | For the in-domain dev set, we first run word alignment and phrase extraction in the usual way, then sum the distribution of each phrase pair (fj, ek) extracted from the dev data across subcorpora to represent its domain information.
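The resulting dev-set profile can then be compared against each phrase pair's count distribution across subcorpora, e.g. with cosine similarity; the subcorpus names and counts below are toy assumptions, not the paper's data:

```python
import math

def cosine(u, v):
    """Cosine similarity between two sparse vectors represented as
    dicts keyed by subcorpus name."""
    dot = sum(u[k] * v.get(k, 0.0) for k in u)
    nu = math.sqrt(sum(x * x for x in u.values()))
    nv = math.sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

# Hypothetical phrase-pair counts across three subcorpora; the dev
# profile is the summed distribution of dev-set phrase pairs.
dev_profile = {"news": 1.0, "web": 8.0, "subs": 1.0}
phrase_a = {"web": 20.0, "subs": 2.0}   # distributed like the dev data
phrase_b = {"news": 30.0, "subs": 1.0}  # mostly newswire

# phrase_a's distribution is closer to the dev profile than phrase_b's:
sim_a = cosine(phrase_a, dev_profile)
sim_b = cosine(phrase_b, dev_profile)
```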
Introduction | The resulting systems yield results comparable to those from the same system trained on in-domain data, and statistically significantly outperform supervised extractive summarization approaches trained on in-domain data. |
Results | Table 3 indicates that, with both true clusterings and system clusterings, our system trained on out-of-domain data achieves comparable performance with the same system trained on in-domain data.
Results | for OUR SYSTEM trained on IN-domain data and OUT-of-domain data, and for the utterance-level extraction system (SVM-DA) trained on in-domain data.
Conclusions | This paper has presented a constraint-based approach to in-domain relation discovery. |
Experimental Setup | methods ultimately aim to capture domain-specific relations expressed with varying verbalizations, and both operate over in-domain input corpora supplemented with syntactic information. |
Introduction | In this paper, we introduce a novel approach for the unsupervised learning of relations and their instantiations from a set of in-domain documents. |
Introduction | Clusters of similar in-domain documents are |
Model | Our work performs in-domain relation discovery by leveraging regularities in relation expression at the lexical, syntactic, and discourse levels. |
Experiments | In-domain multiclass classifier: This is a support vector machine (SVM; Fan et al., 2008) using one-versus-rest decoding, without removing positive labeled data (Jiang and Zhai, 2007b) from the target domain.
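The one-versus-rest decoding step can be sketched as follows; the relation labels, threshold, and decision values are illustrative assumptions:

```python
def one_vs_rest_decode(scores, threshold=0.0):
    """One-versus-rest decoding: each class has its own binary scorer;
    return the class with the highest decision value, or "NONE" when no
    scorer exceeds the threshold."""
    best_label, best_score = "NONE", threshold
    for label, score in scores.items():
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical per-relation decision values from the binary SVMs:
label1 = one_vs_rest_decode({"PART-WHOLE": 0.3, "EMP-ORG": 0.7})  # EMP-ORG
label2 = one_vs_rest_decode({"PART-WHOLE": -0.2})                  # NONE
```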
Experiments | From Table 3 and Table 5, we see that the proposed method has the best F1 among all methods, except for the supervised upper bound (In-domain).
Experiments | Performance Gap: From Tables 2 to 4, we observe that the smallest performance gap between RDA and the in-domain settings is still large (about 12% with k = 5) on ACE 2004.
Inferring a learning curve from mostly monolingual data | Given a small “seed” parallel corpus, the translation system can be used to train small in-domain models and the evaluation score can be measured at a few initial sample sizes {(x1, y1), (x2, y2), ..., (xp, yp)}.
Inferring a learning curve from mostly monolingual data | For the cases where a slightly larger in-domain “seed” parallel corpus is available, we introduced an extrapolation method and a combined method yielding high-precision predictions: using models trained on up to 20K sentence pairs we can predict performance on a given test set with a root mean squared error in the order of 1 BLEU point at 75K sentence pairs, and in the order of 2-4 BLEU points at 500K.
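The extrapolation idea can be sketched by fitting a simple growth curve to the measured (size, score) points and evaluating it at larger sizes; the logarithmic curve family and the BLEU numbers below are assumptions for illustration, not the paper's actual curve family or results:

```python
import math

def fit_log_curve(points):
    """Least-squares fit of y = a + b * ln(x) to (size, score) points.
    This is one simple growth-curve family; the actual family used for
    BLEU extrapolation may differ."""
    xs = [math.log(x) for x, _ in points]
    ys = [y for _, y in points]
    n = float(len(points))
    mx, my = sum(xs) / n, sum(ys) / n
    b = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    a = my - b * mx
    return a, b

# Hypothetical BLEU scores measured at small seed-corpus sample sizes:
seed_points = [(5000, 18.0), (10000, 20.1), (20000, 22.2)]
a, b = fit_log_curve(seed_points)

# Extrapolate to 75K sentence pairs:
predicted_75k = a + b * math.log(75000)
```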
Introduction | This prediction, or more generally the prediction of the learning curve of an SMT system as a function of available in-domain parallel data, is the objective of this paper. |
Introduction | In the second scenario (S2), an additional small seed parallel corpus is given that can be used to train small in-domain models and measure (with some variance) the evaluation score at a few points on the initial portion of the learning curve. |
Empirical Evaluation | We compare them with two supervised methods: a supervised model (Base) which is trained on the source domain data only, and another supervised model (In-domain) which is learned on the labeled data from the target domain.
Empirical Evaluation | The Base model can be regarded as a natural baseline model, whereas the In-domain model is essentially an upper-bound for any domain-adaptation method. |
Empirical Evaluation | First, observe that the total drop in the accuracy when moving to the target domain is 8.9%: from 84.6% demonstrated by the In-domain classifier to 75.6% shown by the non-adapted Base classifier. |
Related Work | The drop in accuracy for the SCL method in Table 1 is computed with respect to the less accurate supervised in-domain classifier considered in Blitzer et al.
Comparative Evaluation | Next, for each domain and language, we manually calculated the fraction of terms for which an in-domain definition was provided by Google Define and GlossBoot. |
Results and Discussion | In Table 5 we show examples of the possible scenarios for terms: in-domain extracted terms which are also found in the gold standard (column 2), in-domain extracted terms not in the gold standard (column 3), out-of-domain extracted terms (column 4), and domain terms in the gold standard but not extracted by our approach (column 5).
Results and Discussion | Table 6), because the retrieved glosses of domain terms are usually in-domain too and, since they come from glossaries, follow a definitional style.
Empirical Analysis | The aim of the evaluation is to measure the reachable accuracy of the simple model proposed and to compare its impact over in-domain and out-of-domain semantic role labeling tasks. |
Empirical Analysis | The in-domain test has been run over the FrameNet annotated corpus, derived from the British National Corpus (BNC). |
Empirical Analysis | Corpus statistics: training (FN-BNC): 134,697 / 271,560; in-domain test (FN-BNC): 14,952 / 30,173.
Experiments | We start off by presenting the results for the traditional in-domain setting, where both TRAIN and TEST come from the same domain, e.g., AUTO or TABLETS. |
Experiments | 5.3.1 In-domain experiments |
Experiments | Figure 2: In-domain learning curves. |
Introduction | Domain adaptation techniques aim at finding ways to adjust an out-of-domain (OUT) model to represent a target domain (in-domain or IN).
Introduction | In addition to the basic approach of concatenation of in-domain and out-of-domain data, we also trained a log-linear mixture model (Foster and Kuhn, 2007) |
Related Work 5.1 Domain Adaptation | Other methods include using self-training techniques to exploit monolingual in-domain data (Ueffing et al., 2007; |
Introduction | — We used phrase-table merging (Nakov and Ng, 2009) to utilize MSA/English parallel data with the available in-domain parallel data. |
Previous Work | In contrast, we showed that training on in-domain dialectal data irrespective of its small size is better than training on large MSA/English data. |
Previous Work | Our LM experiments also affirmed the importance of in-domain English LMs. |
CR + LS + DMM + DPM 39.32* +24% 47.86* +20% | The ensemble model without LS (third line) has a nearly identical P@1 score as the equivalent in-domain model (line 13 in Table 1), while slightly surpassing in-domain MRR performance. |
CR + LS + DMM + DPM 39.32* +24% 47.86* +20% | The in-domain performance of the ensemble model is similar to that of the single classifier in both YA and Bio HOW so we omit these results here for simplicity. |
Related Work | Inspired by this previous work and recent work in discourse parsing (Feng and Hirst, 2012), our work is the first to systematically explore structured discourse features driven by several discourse representations, combine discourse with lexical semantic models, and evaluate these representations on thousands of questions using both in-domain and cross-domain experiments. |
Introduction | While most previous work focuses on in-domain sequential labelling or cross-domain classification tasks, we are the first to learn representations for web-domain structured prediction.
Introduction | Our results suggest that while both strategies improve in-domain tagging accuracies, keeping the learned representation unchanged consistently results in better cross-domain accuracies. |
Related Work | (2010) learn word embeddings to improve the performance of in-domain POS tagging, named entity recognition, chunking and semantic role labelling. |
Experimental Evaluation | Listed together with their PARSEVAL F-measures these are: gold-standard parses from the treebank (GoldSyn, 100%), a parser trained on WSJ plus a small number of in-domain training sentences required to achieve good performance, 20 for CLANG (Syn20, 88.21%) and 40 for GEOQUERY (Syn40, 91.46%), and a parser trained on no in-domain data (Syn0, 82.15% for CLANG and 76.44% for GEOQUERY). |
Experimental Evaluation | Parsers trained on more in-domain data improved our approach.
Experimental Evaluation | Table 5: Performance on GEO25 0 (20 in-domain sentences are used in SYN20 to train the syntactic parser). |