Index of papers in Proc. ACL 2013 that mention
  • in-domain
Cheung, Jackie Chi Kit and Penn, Gerald
Conclusion
However, our results are also positive in that we find that nearly all model summary caseframes can be found in the source text together with some in-domain documents.
Conclusion
Domain inference, on the other hand, and a greater use of in-domain documents as a knowledge source for domain inference, are very promising indeed.
Experiments
Traditional systems that perform semantic inference do so from a set of known facts about the domain in the form of a knowledge base, but as we have seen, most extractive summarization systems do not make much use of in-domain corpora.
Experiments
We examine adding in-domain text to the source text to see how this would affect coverage.
Experiments
As shown in Table 6, the effect of adding more in-domain text on caseframe coverage is substantial, and noticeably more than using out-of-domain text.
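A minimal sketch of the coverage measure these excerpts refer to, i.e. the fraction of model-summary caseframes that also occur in the source text plus any added documents; extract_caseframes() is a hypothetical helper, not code from the paper:

    # Sketch: coverage = |summary caseframes also found in available text| / |summary caseframes|
    def caseframe_coverage(model_summaries, source_docs, added_docs=()):
        summary_cf = set()
        for doc in model_summaries:
            summary_cf.update(extract_caseframes(doc))  # hypothetical extractor
        available_cf = set()
        for doc in list(source_docs) + list(added_docs):
            available_cf.update(extract_caseframes(doc))
        if not summary_cf:
            return 0.0
        return len(summary_cf & available_cf) / len(summary_cf)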
Introduction
Third, we consider how domain knowledge may be useful as a resource for an abstractive system, by showing that key parts of model summaries can be reconstructed from the source plus related in-domain documents.
Related Work
They were originally proposed in the context of single-document summarization, where they were calculated using in-domain (relevant) vs. out-of-domain (irrelevant) text.
Related Work
In multi-document summarization, the in-domain text has been replaced by the source text cluster (Conroy et al., 2006), thus they are now …
in-domain is mentioned in 11 sentences in this paper.
Zhang, Jiajun and Zong, Chengqing
Experiments
We have introduced the out-of-domain data and the electronic in-domain lexicon in Section 3.
Experiments
Besides the in-domain lexicon, we have collected 1 million monolingual sentences each (source and target) in the electronics domain from the web.
Experiments
We construct two kinds of phrase-based models using Moses (Koehn et al., 2007): one uses out-of-domain data and the other uses in-domain data.
Introduction
Finally, they used the learned translation model directly to translate unseen data (Ravi and Knight, 2011; Nuhn et al., 2012) or incorporated the learned bilingual lexicon as a new in-domain translation resource into the phrase-based model which is trained with out-of-domain data to improve the domain adaptation performance in machine translation (Dou and Knight, 2012).
Introduction
Since many researchers have studied bilingual lexicon induction, in this paper we mainly concentrate on phrase pair induction given a probabilistic bilingual lexicon and two large in-domain monolingual corpora (source and target language).
Probabilistic Bilingual Lexicon Acquisition
In order to induce the phrase pairs from the in-domain monolingual data for domain adaptation, the probabilistic bilingual lexicon is essential.
Probabilistic Bilingual Lexicon Acquisition
In this paper, we acquire the probabilistic bilingual lexicon from two approaches: 1) build a bilingual lexicon from large-scale out-of-domain parallel data; 2) adopt a manually collected in-domain lexicon.
Probabilistic Bilingual Lexicon Acquisition
This paper uses Chinese-to-English translation as a case study, and electronics-domain data is the in-domain data we focus on.
Related Work
One is using an in-domain probabilistic bilingual lexicon to extract sub-sentential parallel fragments from comparable corpora (Munteanu and Marcu, 2006; Quirk et al., 2007; Cettolo et al., 2010).
Related Work
Munteanu and Marcu (2006) first extract the candidate parallel sentences from the comparable corpora and further extract the accurate sub-sentential bilingual fragments from the candidate parallel sentences using the in-domain probabilistic bilingual lexicon.
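A rough sketch of the lexicon-driven fragment detection idea described here, under simplifying assumptions (word-level lexicon lookup and contiguous runs of covered words); this is a simplification for illustration, not the cited paper's algorithm:

    # lexicon: {src_word: {tgt_word: prob}}; returns candidate target-side fragments.
    def candidate_fragments(src_words, tgt_words, lexicon, min_len=3):
        translated = {t for s in src_words for t in lexicon.get(s, {})}
        covered = [w in translated for w in tgt_words]
        fragments, start = [], None
        for i, ok in enumerate(covered + [False]):  # sentinel closes the last run
            if ok and start is None:
                start = i
            elif not ok and start is not None:
                if i - start >= min_len:
                    fragments.append(tgt_words[start:i])
                start = None
        return fragments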
in-domain is mentioned in 20 sentences in this paper.
Danescu-Niculescu-Mizil, Cristian and Sudhof, Moritz and Jurafsky, Dan and Leskovec, Jure and Potts, Christopher
Predicting politeness
Classification results: We evaluate the classifiers both in an in-domain setting, with a standard leave-one-out cross validation procedure, and in a cross-domain setting, where we train on one domain and test on the other (Table 4).
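A minimal sketch of these two evaluation settings, assuming bag-of-words features and a generic scikit-learn classifier; the data arguments (Wikipedia and Stack Exchange requests with binary politeness labels) are placeholders, not the paper's exact setup:

    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import LeaveOneOut, cross_val_score
    from sklearn.pipeline import make_pipeline

    def evaluate(wiki_texts, wiki_labels, se_texts, se_labels):
        clf = make_pipeline(CountVectorizer(), LogisticRegression(max_iter=1000))
        # In-domain: leave-one-out cross-validation within a single domain.
        in_domain_acc = cross_val_score(clf, wiki_texts, wiki_labels,
                                        cv=LeaveOneOut()).mean()
        # Cross-domain: train on one domain, test on the other.
        clf.fit(wiki_texts, wiki_labels)
        cross_domain_acc = clf.score(se_texts, se_labels)
        return in_domain_acc, cross_domain_acc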
Predicting politeness
For both our development and our test domains, and in both the in-domain and cross-domain settings, the linguistically informed features give 3-4% absolute improvement over the bag of words model.
Predicting politeness
While the in-domain results are within 3% of human performance, the greater room for improvement in the cross-domain setting motivates further research on linguistic cues of politeness.
Relation to social factors
Encouraged by the close-to-human performance of our in-domain classifiers, we use them to assign politeness labels to our full dataset and then compare these labels to independent measures of power and status in our data.
Relation to social factors
             In-domain        Cross-domain
    Train:   Wiki    SE       Wiki    SE
    Test:    Wiki    SE       SE      Wiki
Relation to social factors
Table 4: Accuracies of our two classifiers for Wikipedia (Wiki) and Stack Exchange (SE), for in-domain and cross-domain settings.
in-domain is mentioned in 6 sentences in this paper.
Chen, Boxing and Kuhn, Roland and Foster, George
Abstract
The general idea is first to create a vector profile for the in-domain development (“dev”) set.
Experiments
The other is a linear combination of TMs trained on each subcorpus, with the weights of each model learned with an EM algorithm to maximize the likelihood of joint empirical phrase pair counts for in-domain dev data.
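A small sketch of EM for such linear mixture weights, assuming each subcorpus model is given as a dictionary of phrase-pair probabilities and the dev data as joint phrase-pair counts; names are illustrative, not the paper's implementation:

    def em_mixture_weights(p_sub, dev_counts, iterations=50):
        # p_sub[i][(f, e)]: translation probability from subcorpus model i
        # dev_counts[(f, e)]: joint empirical count of the pair in the dev data
        k = len(p_sub)
        lam = [1.0 / k] * k  # uniform initialization
        for _ in range(iterations):
            expected = [0.0] * k
            for pair, count in dev_counts.items():
                probs = [lam[i] * p_sub[i].get(pair, 0.0) for i in range(k)]
                total = sum(probs)
                if total == 0.0:
                    continue
                # E-step: posterior responsibility of each subcorpus model
                for i in range(k):
                    expected[i] += count * probs[i] / total
            # M-step: renormalize expected counts into new weights
            z = sum(expected)
            if z > 0:
                lam = [e / z for e in expected]
        return lam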
Introduction
In transductive learning, an MT system trained on general domain data is used to translate in-domain monolingual data.
Introduction
Data selection approaches (Zhao et al., 2004; Hildebrand et al., 2005; Lu et al., 2007; Moore and Lewis, 2010; Axelrod et al., 2011) search for bilingual sentence pairs that are similar to the in-domain “dev” data, then add them to the training data.
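One of the cited approaches, Moore and Lewis (2010), selects sentences by the cross-entropy difference between an in-domain and a general language model; a hedged sketch with a placeholder LM interface (cross_entropy() is assumed, not a specific toolkit's API):

    def select_similar_sentences(candidates, in_domain_lm, general_lm, threshold=0.0):
        selected = []
        for sent in candidates:
            # Lower (more negative) difference = closer to the in-domain data.
            diff = in_domain_lm.cross_entropy(sent) - general_lm.cross_entropy(sent)
            if diff < threshold:
                selected.append(sent)
        return selected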
Vector space model adaptation
For the in-domain dev set, we first run word alignment and phrase extraction in the usual way for the dev set, then sum the distribution of each phrase pair (f_j, e_k) extracted from the dev data across subcorpora to represent its domain information.
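A simplified sketch of building such a dev-set domain profile, assuming per-subcorpus phrase-pair counts are available; variable names are illustrative:

    def dev_profile(dev_pairs, counts):
        # counts[i][(f, e)]: how often pair (f, e) was extracted from subcorpus i
        num_subcorpora = len(counts)
        profile = [0.0] * num_subcorpora
        for pair in dev_pairs:
            total = sum(counts[i].get(pair, 0) for i in range(num_subcorpora))
            if total == 0:
                continue
            # Normalize each dev phrase pair across subcorpora, then sum.
            for i in range(num_subcorpora):
                profile[i] += counts[i].get(pair, 0) / total
        return profile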
in-domain is mentioned in 5 sentences in this paper.
Wang, Lu and Cardie, Claire
Introduction
The resulting systems yield results comparable to those from the same system trained on in-domain data, and statistically significantly outperform supervised extractive summarization approaches trained on in-domain data.
Results
Table 3 indicates that, with both true clusterings and system clusterings, our system trained on out-of-domain data achieves comparable performance with the same system trained on in-domain data.
Results
… for OUR SYSTEM trained on in-domain data and out-of-domain data, and for the utterance-level extraction system (SVM-DA) trained on in-domain data.
in-domain is mentioned in 5 sentences in this paper.
De Benedictis, Flavio and Faralli, Stefano and Navigli, Roberto
Comparative Evaluation
Next, for each domain and language, we manually calculated the fraction of terms for which an in-domain definition was provided by Google Define and GlossBoot.
Results and Discussion
In Table 5 we show examples of the possible scenarios for terms: in-domain extracted terms which are also found in the gold standard (column 2), in-domain extracted terms but not in the gold standard (column 3), out-of-domain extracted terms (column 4), and domain terms in the gold standard but not extracted by our approach (column 5).
Results and Discussion
… Table 6), because the retrieved glosses of domain terms are usually in-domain too, and follow a definitional style because they come from glossaries.
in-domain is mentioned in 4 sentences in this paper.
Sajjad, Hassan and Darwish, Kareem and Belinkov, Yonatan
Introduction
We used phrase-table merging (Nakov and Ng, 2009) to utilize MSA/English parallel data with the available in-domain parallel data.
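A hedged sketch of one simple way to merge an in-domain and an out-of-domain phrase table with origin indicator features; the table format and feature layout here are assumptions, not the exact Nakov and Ng (2009) recipe:

    def merge_phrase_tables(in_domain, out_of_domain):
        # Each table maps (src_phrase, tgt_phrase) -> list of feature scores.
        merged = {}
        for pair, feats in in_domain.items():
            merged[pair] = feats + [1.0, 0.0]      # indicator: in-domain entry
        for pair, feats in out_of_domain.items():
            if pair not in merged:                  # in-domain entries take priority
                merged[pair] = feats + [0.0, 1.0]   # indicator: out-of-domain entry
        return merged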
Previous Work
In contrast, we showed that training on in-domain dialectal data, despite its small size, is better than training on large MSA/English data.
Previous Work
Our LM experiments also affirmed the importance of in-domain English LMs.
in-domain is mentioned in 3 sentences in this paper.