Automatic Detection of Multilingual Dictionaries on the Web
Grigonyte, Gintare and Baldwin, Timothy

Article Structure


This paper presents a query construction approach for detecting multilingual dictionaries on the web for predetermined language combinations, based on identifying terms that are likely to occur in bilingual dictionaries but not in general web documents.
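The term-selection idea can be illustrated with a minimal sketch. It assumes a smoothed document-frequency ratio as the scoring function; the text here does not state the actual weighting, so the scoring formula, the function names (`document_frequency`, `rank_query_terms`) and the whitespace tokenisation are all hypothetical choices made for illustration.

```python
import math
from collections import Counter


def document_frequency(docs):
    """Count, for each term, the number of documents that contain it."""
    df = Counter()
    for doc in docs:
        # Naive whitespace tokenisation (an assumption of this sketch).
        df.update(set(doc.lower().split()))
    return df


def rank_query_terms(dictionary_docs, general_docs, top_k=5):
    """Rank terms that are frequent in dictionaries but rare elsewhere.

    The score is the log-ratio of add-one smoothed document-frequency
    rates; this is an illustrative choice, not the paper's stated method.
    """
    df_dict = document_frequency(dictionary_docs)
    df_gen = document_frequency(general_docs)
    n_dict, n_gen = len(dictionary_docs), len(general_docs)

    scores = {}
    for term, count in df_dict.items():
        p_dict = (count + 1) / (n_dict + 2)
        p_gen = (df_gen.get(term, 0) + 1) / (n_gen + 2)
        scores[term] = math.log(p_dict / p_gen)

    # Highest score first; ties are broken alphabetically for determinism.
    ranked = sorted(scores, key=lambda t: (-scores[t], t))
    return ranked[:top_k]
```

Under these assumptions, the top-ranked terms could be joined into a query string and submitted to an index or search engine, with terms that also appear in general documents scoring lower.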


Translation dictionaries and other multilingual lexical resources are valuable in a myriad of contexts, from language preservation (Thieberger and Berez, 2012) to language learning (Laufer and Hadar, 1997), cross-language information retrieval (Nie, 2010) and machine translation (Munteanu and Marcu, 2005; Soderland et al., 2009).

Related Work

This research seeks to identify documents of a particular type on the web, namely multilingual dictionaries.


Our method is based on a query formulation approach, and querying against a preexisting index of a document collection (e.g.

Experimental Methodology

We evaluate our proposed methodology in two ways:


First, we present results over the synthetic dataset in Table 3.


We have described initial results for a method designed to automatically detect multilingual dictionaries on the web, and attained highly credible results both over a synthetic dataset and in an open-web experiment using a web search engine.


