This paper presents an approach to query construction to detect multilingual dictionaries for predetermined language combinations on the web, based on the identification of terms which are likely to occur in bilingual dictionaries but not in general web documents.
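The term-selection idea can be illustrated with a minimal sketch. It assumes each document is already reduced to a set of terms and scores a term by the ratio of its smoothed document frequency in known dictionaries to that in general web documents. The function names, the add-one smoothing, and the ratio score are illustrative assumptions, not the scoring used in the paper.

```python
from collections import Counter
from typing import List, Sequence, Set


def document_frequency(docs: Sequence[Set[str]]) -> Counter:
    """Count, for each term, the number of documents that contain it."""
    df: Counter = Counter()
    for terms in docs:
        df.update(terms)
    return df


def rank_query_terms(
    dictionary_docs: Sequence[Set[str]],
    general_docs: Sequence[Set[str]],
    top_k: int = 5,
) -> List[str]:
    """Return the top_k terms most characteristic of dictionary documents.

    Hypothetical scoring (an assumption for illustration): ratio of
    add-one-smoothed document frequencies, dictionary over general.
    """
    dict_df = document_frequency(dictionary_docs)
    gen_df = document_frequency(general_docs)
    n_dict = len(dictionary_docs)
    n_gen = len(general_docs)

    def score(term: str) -> float:
        p_dict = (dict_df[term] + 1) / (n_dict + 2)
        p_gen = (gen_df[term] + 1) / (n_gen + 2)
        return p_dict / p_gen

    # Sort by descending score; break ties alphabetically for determinism.
    ranked = sorted(dict_df, key=lambda t: (-score(t), t))
    return ranked[:top_k]
```

The highest-ranked terms could then be joined into a query string and issued against an index or search engine. A term that is frequent everywhere (such as a common function word) scores low because its general-web frequency is high.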
Translation dictionaries and other multilingual lexical resources are valuable in a myriad of contexts, from language preservation (Thieberger and Berez, 2012) to language learning (Laufer and Hadar, 1997), cross-language information retrieval (Nie, 2010) and machine translation (Munteanu and Marcu, 2005; Soderland et al., 2009).
This research seeks to identify documents of a particular type on the web, namely multilingual dictionaries.
Our method is based on a query formulation approach, and querying against a preexisting index of a document collection (e.g.
We evaluate our proposed methodology in two ways:
First, we present results over the synthetic dataset in Table 3.
We have described initial results for a method designed to automatically detect multilingual dictionaries on the web, and attained highly credible results over both a synthetic dataset and an experiment over the open web using a web search engine.
See all papers in Proc. ACL 2014 that mention language pairs.