Conclusion and Future Work | We enrich the contexts of parallel sentence pairs with topic-related monolingual data.
Experiments | These documents are built into an inverted index using Lucene, so that they can be retrieved efficiently with the parallel sentence pairs as queries.
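Experiments | A minimal sketch of the indexing step this describes, with a toy in-memory inverted index standing in for Lucene; the tokenizer and the sample documents are illustrative assumptions, not the paper's actual setup.

```python
from collections import defaultdict

def build_inverted_index(documents):
    """Map each term to the ids of the documents that contain it."""
    index = defaultdict(set)
    for doc_id, text in enumerate(documents):
        for term in text.lower().split():  # naive whitespace tokenizer
            index[term].add(doc_id)
    return index

documents = [
    "monetary policy and inflation targets",
    "neural networks for machine translation",
]
index = build_inverted_index(documents)
print(index["inflation"])  # -> {0}
```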
Experiments | In the fine-tuning phase, for each parallel sentence pair, we randomly select ten other sentence pairs that satisfy the criterion as negative instances.
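Experiments | A minimal sketch of this negative-sampling step, assuming a pool of candidate pairs and a user-supplied filtering predicate; both names are hypothetical placeholders for whatever criterion the paper applies.

```python
import random

def sample_negatives(pair, pool, satisfies_criterion, k=10):
    """Randomly pick k other pairs from the pool that satisfy the criterion.

    Assumes the pool contains at least k qualifying pairs besides `pair`.
    """
    candidates = [p for p in pool if p is not pair and satisfies_criterion(p)]
    return random.sample(candidates, k)
```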
Experiments | This is not simply a coincidence, since we can interpret their approach as a special case of our neural network method: when a parallel sentence pair has
Related Work | In addition, our method directly maximizes the similarity between parallel sentence pairs, which is ideal for SMT decoding. |
Topic Similarity Model with Neural Network | Given a parallel sentence pair (f, e), the first step is to treat f and e as queries and use IR methods to retrieve relevant documents that enrich their contextual information.
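Topic Similarity Model with Neural Network | A minimal sketch of this retrieval step, scoring documents by simple term overlap and keeping the top-k hits per side; the scoring function, k, and the document pools are illustrative assumptions, in place of the paper's Lucene-based IR machinery.

```python
def retrieve(query, documents, k=5):
    """Return the k documents sharing the most terms with the query."""
    terms = set(query.lower().split())
    scored = [(len(terms & set(d.lower().split())), d) for d in documents]
    return [d for score, d in sorted(scored, reverse=True)[:k] if score > 0]

def enrich(f, e, src_docs, tgt_docs, k=5):
    """Enrich both sides of the pair (f, e) with retrieved context."""
    return retrieve(f, src_docs, k), retrieve(e, tgt_docs, k)
```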
Topic Similarity Model with Neural Network | Therefore, in this stage, parallel sentence pairs are used to help connect the vectors from the two languages, because they express the same topic.
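Topic Similarity Model with Neural Network | One simple way to realize this connection, sketched below under the assumption that alignment is driven by cosine similarity between the two topic vectors; the actual training objective may differ.

```python
import numpy as np

def cosine(u, v):
    """Cosine similarity between two dense vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def pair_similarity_loss(vec_f, vec_e):
    """Loss that is minimized when the two languages' topic vectors align."""
    return 1.0 - cosine(vec_f, vec_e)
```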
Approach | The idea is that, given enough parallel data, a shared representation of two parallel sentences would be forced to capture the common elements between these two sentences. |
Approach | What parallel sentences share, of course, is their semantics.
Approach | For every pair of parallel sentences (a, b), we sample a number of additional sentence pairs (·, n) ∈ C, where n is, with high probability, not semantically equivalent to a.
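Approach | A minimal sketch of the noise-contrastive objective this sampling suggests: a hinge loss that pushes the distance between a and its translation b below the distance between a and each noise sentence n by a margin. The squared-Euclidean distance and the margin value are illustrative assumptions.

```python
import numpy as np

def hinge_loss(vec_a, vec_b, noise_vecs, margin=1.0):
    """Sum of margin violations over the sampled noise sentences."""
    def d(u, v):
        return float(np.sum((u - v) ** 2))  # squared Euclidean distance
    return sum(max(0.0, margin + d(vec_a, vec_b) - d(vec_a, vec_n))
               for vec_n in noise_vecs)
```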
Experiments | The ADD+ model uses an additional 500k parallel sentences from the English-French corpus, resulting in one million English sentences, each paired up with either a German or a French sentence, with BI and BI+ trained accordingly. |
Experiments | We train our parsing model with different numbers of parallel sentences to analyze the influence of the amount of parallel data on the parsing performance of our approach. |
Experiments | The parallel data sets contain 500, 1,000, 2,000, 5,000, 10,000, and 20,000 parallel sentences, respectively.
Experiments | We randomly extract parallel sentences from each corpus, and the smaller data sets are subsets of the larger ones.
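Experiments | A minimal sketch of how such nested subsets can be drawn: shuffle the corpus once, then take prefixes of the required sizes, so every smaller set is contained in every larger one. The fixed seed is an illustrative choice.

```python
import random

def nested_subsets(corpus, sizes=(500, 1000, 2000, 5000, 10000, 20000)):
    """Draw nested random subsets: each smaller set is a prefix of the larger."""
    shuffled = corpus[:]
    random.Random(0).shuffle(shuffled)  # one shuffle shared by all subsets
    return {n: shuffled[:n] for n in sorted(sizes)}
```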
Data | (2013), consisting of 1.8M parallel sentences from the NTCIR-7 JP-EN PatentMT subtask (Fujii et al., 2008) and 2k parallel sentences for parameter development from the NTCIR-8 test collection.
Data | For Wikipedia, we trained a DE-EN system on 4.1M parallel sentences from Europarl, Common Crawl, and News-Commentary. |
Data | Parameter tuning was done on 3k parallel sentences from the WMT’11 test set.