Index of papers in Proc. ACL 2011 that mention
  • word pairs
Sajjad, Hassan and Fraser, Alexander and Schmid, Helmut
We also apply our method to English/Hindi and English/Arabic parallel corpora and compare the results with manually built gold standards which mark transliterated word pairs.
Extraction of Transliteration Pairs
Initially, we extract a list of word pairs from a word-aligned parallel corpus using GIZA++.
Extraction of Transliteration Pairs
The extracted word pairs are either transliterations, other kinds of translations, or misalignments.
We first align a bilingual corpus at the word level using GIZA++ and create a list of word pairs containing a mix of non-transliterations and transliterations.
... statistical transliterator on the list of word pairs.
We then filter out a few word pairs (those which have the lowest transliteration probabilities according to the trained transliteration system) which are likely to be non-transliterations.
The training data is a list of word pairs (a source word and its presumed transliteration) extracted from a word-aligned parallel corpus.
g2p builds a joint sequence model on the character sequences of the word pairs and infers m-to-n alignments between source and target characters with Expectation Maximization (EM) training.
For training Moses as a transliteration system, we treat each word pair as if it were a parallel sentence, by putting spaces between the characters of each word.
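The character-splitting step described in the excerpt above can be illustrated with a minimal sketch. The function names and the sample word pairs are hypothetical and not taken from the paper; the sketch also assumes that one Unicode code point counts as one "character", which may not hold for scripts with combining marks.

```python
from typing import Iterable, List, Tuple


def to_char_sentence(word: str) -> str:
    """Return the word with a single space between its characters.

    Assumption: splitting is done per Unicode code point.
    """
    return " ".join(word)


def pairs_to_parallel_lines(
    pairs: Iterable[Tuple[str, str]]
) -> Tuple[List[str], List[str]]:
    """Turn (source, target) word pairs into two aligned lists of lines,
    one pseudo-sentence per word, as a phrase-based system would expect."""
    src_lines, tgt_lines = [], []
    for src, tgt in pairs:
        src_lines.append(to_char_sentence(src))
        tgt_lines.append(to_char_sentence(tgt))
    return src_lines, tgt_lines


# Hypothetical romanized examples, for illustration only.
pairs = [("london", "landan"), ("delhi", "dilli")]
src_lines, tgt_lines = pairs_to_parallel_lines(pairs)
```

Each resulting line would then be written to the source or target side of a parallel training file, so that individual characters play the role that words play in ordinary translation training.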
word pairs is mentioned in 37 sentences in this paper.
Wang, Ziqi and Xu, Gu and Li, Hang and Zhang, Ming
Experimental Results
5.1 Word Pair Mining
Experimental Results
Table 1 shows some examples of the mined word pairs.
Experimental Results
Table 1: Examples of Word Pairs
Model for Candidate Generation
Figure 1: Example of rule extraction from word pair
Model for Candidate Generation
If we can apply a set of rules to transform the misspelled word w_m to a correct word w_c in the vocabulary, then we call the rule set a “transformation” for the word pair w_m and w_c.
Model for Candidate Generation
Note that for a given word pair, it is likely that there are multiple possible transformations for it.
word pairs is mentioned in 13 sentences in this paper.
Subotin, Michael
For word pairs whose source-side word is a verb, we add a feature marking the number of its subject, with separate features for noun and pronoun subjects.
For word pairs whose source side is an adjective, we add a feature marking the number of the head of the smallest noun phrase that contains it.
Modeling unobserved target inflections
For greater speed we estimate the probabilities for the other two models using interpolated Kneser-Ney smoothing (Chen and Goodman, 1998), where the surface form of a rule or an aligned word pair plays the role of a trigram, the pairing of the source surface form with the lemmatized target form plays the role of a bigram, and the source surface form alone plays the role of a unigram.
word pairs is mentioned in 3 sentences in this paper.
Zhou, Guangyou and Zhao, Jun and Liu, Kang and Cai, Li
Web page hits for word pairs and trigrams are obtained using a simple heuristic query to the search engine Google. Inflected queries are performed by expanding a bigram or trigram into all its morphological forms.
The idea is very simple: web-scale data have large coverage for word pair acquisition.
Related Work
Several previous studies have exploited the web-scale data for word pair acquisition.
word pairs is mentioned in 3 sentences in this paper.