Experiments | They are manually translated into the other language to produce 7,000 sentence pairs, which are split into two parts: 2,000 pairs as a development set (dev) and the other 5,000 pairs as a test set (web test). |
Experiments | After removing duplicates, we have about 18 million sentence pairs, which contain about 270 million English tokens and 320 million Japanese tokens. |
Experiments | As we do not have access to a gold-standard reordered sentence set, we decide to use the number of crossing alignment links between aligned sentence pairs as the measure of reordering performance. |
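The crossing-link measure described above can be sketched as a simple pairwise count over alignment links; the list-of-(source, target)-index representation is an assumption, since the snippet gives no implementation details:

```python
def crossing_links(alignment):
    """Count crossing link pairs in a word alignment.

    `alignment` is a list of (source_index, target_index) links.
    Two links (i, j) and (i', j') cross when i < i' but j > j'.
    Fewer crossings indicates a more monotone, better pre-reordered pair.
    """
    links = sorted(alignment)
    crossings = 0
    for a in range(len(links)):
        for b in range(a + 1, len(links)):
            (i1, j1), (i2, j2) = links[a], links[b]
            if i1 < i2 and j1 > j2:
                crossings += 1
    return crossings
```

A fully monotone alignment such as `[(0, 0), (1, 1)]` yields zero crossings, while a fully inverted one like `[(0, 2), (1, 1), (2, 0)]` yields the maximum count for its size.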
Ranking Model Training | For a sentence pair (e, f, a) with syntax tree Te on the source side, we need to determine which reordered tree T′e best represents the word order in the target sentence f. For a tree node t in Te, if its children align to disjoint target spans, we can simply arrange them in the order of their corresponding target |
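Under the disjoint-span condition described above, the child order that matches the target sentence can be recovered by sorting children on their aligned target positions. A minimal sketch, where `children_spans` is a hypothetical mapping from child index to its set of aligned target positions:

```python
def order_children_by_target(children_spans):
    """Order child indices by their aligned target spans.

    `children_spans` maps child index -> set of aligned target positions.
    Because the spans are assumed disjoint, sorting children by the
    minimum target position of each span reproduces the target order.
    """
    return sorted(children_spans, key=lambda c: min(children_spans[c]))
```

For example, a node whose three children align to target positions {3, 4}, {0, 1}, and {2} is reordered to place the second child first, then the third, then the first.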
Ranking Model Training | Figure 2: Fragment of a sentence pair. |
Ranking Model Training | Figure 2 shows a fragment of one sentence pair in our training data. |
Word Reordering as Syntax Tree Node Ranking | Figure 1: An English-to-Japanese sentence pair. |
Data and task | A total of 13,410 English-Bulgarian and 8,832 English-Korean sentence pairs were extracted. |
Data and task | Of these, we manually annotated 91 English-Bulgarian and 79 English-Korean sentence pairs with source and target named entities as well as word-alignment links among named entities in the two languages. |
Data and task | Figure 1 illustrates a Bulgarian-English sentence pair with alignment. |
Introduction | Our results show that the semi-CRF model improves on the performance of projection models by more than 10 points in F-measure, and that we can achieve a tagging F-measure of over 91 using a very small number of annotated sentence pairs. |
Experiments and Results | The training data contains 81k sentence pairs, 655k Chinese words, and 806k English words. |
Experiments and Results | The training data contains 354k sentence pairs, 8M Chinese words, and 10M English words. |
Experiments and Results | The re-ranking methods are performed in the same way as for the IWSLT data, but for consensus-based decoding, the data set contains too many sentence pairs for our machine to hold in a single graph. |
Features and Training | For the nodes representing the training sentence pairs, this posterior is fixed. |
Graph Construction | If there are sentence pairs with the same source sentence but different translations, all the translations will be assigned as labels to that source sentence, and the corresponding probabilities are estimated by MLE. |
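The MLE step described above amounts to relative-frequency counting over repeated source sentences. A minimal illustration, not the authors' code:

```python
from collections import Counter, defaultdict

def label_distributions(sentence_pairs):
    """Estimate P(target | source) by MLE over a parallel corpus.

    `sentence_pairs` is a list of (source, target) strings. A source
    sentence that occurs with several distinct translations receives
    all of them as labels, with probabilities proportional to counts.
    """
    counts = defaultdict(Counter)
    for src, tgt in sentence_pairs:
        counts[src][tgt] += 1
    return {
        src: {tgt: n / sum(ctr.values()) for tgt, n in ctr.items()}
        for src, ctr in counts.items()
    }
```

A source sentence seen twice with translation "x" and once with translation "y" would thus carry labels {"x": 2/3, "y": 1/3}.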
Graph Construction | There is no edge between training nodes, since we suppose all the sentences of the training data are correct, and it is pointless to re-estimate the confidence of those sentence pairs. |
Graph Construction | Forced alignment performs phrase segmentation and alignment of each sentence pair of the training data using the full translation system as in decoding (Wuebker et al., 2010). |
Experiments | After tokenization and filtering, this bilingual corpus contained 319,694 sentence pairs (7.9M tokens on |
Extraction of Paraphrase Rules | 3.2 Selecting Paraphrase Sentence Pairs |
Extraction of Paraphrase Rules | If the sentence in T2 has a higher BLEU score than the aligned sentence in T1, the corresponding sentences in S0 and S1 are selected as candidate paraphrase sentence pairs, which are used in the following steps of paraphrase extraction. |
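The selection step above can be sketched as a filter over precomputed sentence-level BLEU scores; how those scores are computed, and against which references, is left abstract here because the snippet does not specify it:

```python
def select_paraphrase_pairs(pairs, t1_scores, t2_scores):
    """Keep candidate paraphrase pairs whose T2 side outscores T1.

    `pairs` is a list of aligned (s0_sentence, s1_sentence) tuples;
    `t1_scores` and `t2_scores` hold the sentence-level BLEU of the
    corresponding translations in T1 and T2. A pair is selected when
    its T2 translation has the strictly higher score.
    """
    return [p for p, b1, b2 in zip(pairs, t1_scores, t2_scores) if b2 > b1]
```

Only the pairs passing this filter feed into the subsequent word-alignment and rule-extraction stages.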
Extraction of Paraphrase Rules | From the word-aligned sentence pairs, we then extract a set of rules that are consistent with the word alignments. |
Forward-Translation vs. Back-Translation | The aligned sentence pairs in (S0, S1) can be considered as paraphrases. |
Evaluation for SS | The two data sets we know of for SS are: 1. the human-rated sentence-pair similarity data set (Li et al., 2006) [LI06]; 2. the Microsoft Research Paraphrase Corpus (Dolan et al., 2004) [MSR04]. |
Evaluation for SS | On the other hand, the MSR04 data set comprises a much larger set of sentence pairs: 4,076 training and 1,725 test pairs. |
Evaluation for SS | This is not a problem per se; however, the issue is that it is very strict in its assignment of a positive label. For example, the following sentence pair, as cited in (Islam and Inkpen, 2008), is rated not semantically similar: Ballmer has been vocal in the past warning that Linux is a threat to Microsoft. |
Experiments and Results | Note that r and ρ are much lower for the 35-pair set, since most of its sentence pairs have very low similarity (the average similarity value is 0.065 in the 35-pair set and 0.367 in the 30-pair set) and SS models need to identify the tiny differences among them, thereby rendering this set much harder to predict. |
Experiments and Results | We use the same parameter setting used for the LI06 evaluation setting, since both sets are human-rated sentence pairs (λ = 20, wm = 0.01, K = 100). |
Experiments | We train our model on a dataset with ~1.5M sentence pairs from the LDC dataset.2 We use the 2002 NIST MT evaluation test data (878 sentence pairs) as the development data, and the 2003, 2004, 2005, and 2006-news NIST MT evaluation test data (919, 1788, 1082, and 616 sentence pairs, respectively) as the test data. |
Head-Driven HPB Translation Model | For rule extraction, we first identify initial phrase pairs on word-aligned sentence pairs by using the same criterion as most phrase-based translation models (Och and Ney, 2004) and Chiang’s HPB model (Chiang, 2005; Chiang, 2007). |
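The consistency criterion of Och and Ney (2004) referenced above can be sketched as follows; this minimal version keeps only the tightest target span per source span and omits the usual extension of phrase boundaries over unaligned target words:

```python
def extract_phrase_pairs(alignment, src_len, max_len=4):
    """Extract initial phrase pairs consistent with a word alignment.

    A source span [i1, i2] and target span [j1, j2] form a phrase pair
    when every alignment link touching either span falls entirely
    inside both spans, and the pair contains at least one link.
    """
    pairs = []
    for i1 in range(src_len):
        for i2 in range(i1, min(i1 + max_len, src_len)):
            # Target positions linked from inside the source span.
            ts = [j for (i, j) in alignment if i1 <= i <= i2]
            if not ts:
                continue
            j1, j2 = min(ts), max(ts)
            if j2 - j1 + 1 > max_len:
                continue
            # Consistency: no link landing in [j1, j2] may originate
            # outside the source span [i1, i2].
            if all(i1 <= i <= i2 for (i, j) in alignment if j1 <= j <= j2):
                pairs.append(((i1, i2), (j1, j2)))
    return pairs
```

For the monotone alignment `[(0, 0), (1, 1)]` over two words per side, this yields the two single-word pairs plus the full two-word pair.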
Introduction | Figure 1: An example word alignment for a Chinese-English sentence pair with the dependency parse tree for the Chinese sentence. |