SciSurf: Index of 'Refinements to Interactive Translation Prediction Based on Search Graphs'

Refinements to Interactive Translation Prediction Based on Search Graphs

Koehn, Philipp and Tsoukala, Chara and Saint-Amand, Herve

Published in Proc. ACL, 2014

Article Structure

Abstract

We propose a number of refinements to the canonical approach to interactive translation prediction.

Introduction

As machine translation enters the workflow of professional translators, the exact nature of this human-computer interaction is currently an open challenge.

Related Work

The interactive machine translation paradigm was first explored in the TransType and TransType2 projects (Langlais et al., 2000a; Foster et al., 2002; Bender et al., 2005; Barrachina et al., 2009).

Properties of Core Algorithm

Our implementation of the core algorithm follows closely Koehn (2009).

Refinements

We now introduce a number of refinements over the core method.

Word Completion

Besides word prediction, word completion is also a useful feature in an interactive translation tool.

Conclusion and Future Work

We observe most improvements by a focus on the last word of the user prefix and approximate word matching.

Topics

machine translation

Appears in 10 sentences as: machine translation (10) machine translations (1)

In Refinements to Interactive Translation Prediction Based on Search Graphs

As machine translation enters the workflow of professional translators, the exact nature of this human-computer interaction is currently an open challenge.
Page 1, “Introduction”
Instead of tasking translators to post-edit the output of machine translation systems, a more interactive approach may be more fruitful.
Page 1, “Introduction”
The standard approach to this problem uses the search graph of the machine translation system.
Page 1, “Introduction”
The interactive machine translation paradigm was first explored in the TransType and TransType2 projects (Langlais et al., 2000a; Foster et al., 2002; Bender et al., 2005; Barrachina et al., 2009).
Page 1, “Related Work”
We predict translations that were crafted by manual post-editing of machine translation output.
Page 2, “Properties of Core Algorithm”
We also use the search graphs of the system that produced the original machine translation output.
Page 2, “Properties of Core Algorithm”
In the project’s first field trial, professional translators corrected machine translations of news stories from a competitive English-Spanish machine translation system (Koehn and Haddow, 2012).
Page 2, “Properties of Core Algorithm”
Analysis of the data suggests that gains mainly come from large length mismatches between user translation and machine translation, even in the case of first pass searches.
Page 3, “Refinements”
For instance, if the user prefix differs only in casing from the machine translation (say, University instead of university), then we may still want to treat that as a word match in our algorithm.
Page 3, “Refinements”
When the machine translation system decides for college over university, but the user types the letter u, it should change its prediction.
Page 4, “Word Completion”
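
The word completion behaviour described in the excerpts above (switching the prediction from college to university once the user types u, and treating University and university as the same word) can be illustrated with a small sketch. This is not the authors' implementation: the function name `complete_word` and the idea of a pre-ranked candidate list are assumptions made only for illustration.

```python
from typing import Optional, Sequence


def complete_word(typed: str, ranked_candidates: Sequence[str]) -> Optional[str]:
    """Return the best-ranked candidate that starts with the typed letters.

    `ranked_candidates` is a hypothetical list of next-word options,
    ordered from most to least preferred by the translation system.
    Matching ignores case, so "U" and "u" select the same candidates.
    """
    typed_folded = typed.casefold()
    for word in ranked_candidates:
        if word.casefold().startswith(typed_folded):
            return word
    return None  # no candidate is compatible with what the user typed


# The system prefers "college", but the typed letter "u" rules it out.
candidates = ["college", "university", "school"]
suggestion = complete_word("u", candidates)
```

With nothing typed yet, the sketch returns the system's first choice; as soon as the typed letters contradict it, the next compatible candidate is offered instead.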


edit distance

Appears in 8 sentences as: edit distance (8)

In Refinements to Interactive Translation Prediction Based on Search Graphs

If the user prefix cannot be found in the search graph, approximate string matching is used by finding a path with minimal string edit distance, i.e., a path in the graph with the minimal number of insertions, deletions and substitutions to match the user prefix.
Page 1, “Introduction”
Cost is measured primarily in terms of string edit distance (number of deletions, insertions and substitutions), and secondarily in terms of translation model score for the matched path in the graph.
Page 1, “Properties of Core Algorithm”
Figure 1: Average response time of baseline method based on length of the prefix and number of edits: The main bottleneck is the string edit distance between prefix and path.
Page 2, “Properties of Core Algorithm”
the length of the user prefix and the string edit distance between the user prefix and the search graph.
Page 2, “Properties of Core Algorithm”
To guarantee a response in 100ms, the algorithm aborts when this time is exceeded and relies on a prediction based on string edit distance against the best path in the graph.
Page 2, “Properties of Core Algorithm”
We attempt to find the last word in the predicted path either before or after the optimal matching position according to string edit distance.
Page 3, “Refinements”
tion output is a better fallback than computing optimal string edit distance.
Page 3, “Refinements”
Dissimilarity is measured as letter edit distance
Page 4, “Refinements”
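
The excerpts above describe string edit distance as the number of insertions, deletions and substitutions, and mention a fallback that matches the user prefix against the best path only. A minimal sketch of that fallback follows, assuming the best path is available as a plain list of words. The names `edit_distance` and `predict_from_best_path` are hypothetical, and the paper's full search over the graph (with model scores as a secondary cost) is not reproduced here.

```python
from typing import List, Sequence


def edit_distance(a: Sequence, b: Sequence) -> int:
    """Levenshtein distance between two sequences (lists of words or strings)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,             # delete x from a
                curr[j - 1] + 1,         # insert y into a
                prev[j - 1] + (x != y),  # substitute x with y, or match
            ))
        prev = curr
    return prev[-1]


def predict_from_best_path(user_prefix: List[str], best_path: List[str]) -> List[str]:
    """Return the words of best_path that follow the closest-matching prefix.

    Every cut point k is tried; the cut whose path prefix has the smallest
    word-level edit distance to the user prefix wins (ties go to the shorter cut).
    """
    best_cut = min(
        range(len(best_path) + 1),
        key=lambda k: (edit_distance(user_prefix, best_path[:k]), k),
    )
    return best_path[best_cut:]


user_prefix = "the big university".split()
best_path = "the large university is closed".split()
prediction = predict_from_best_path(user_prefix, best_path)
```

Because `edit_distance` accepts any sequence, the same function also gives a letter edit distance when called on two strings, which is how the last excerpt measures dissimilarity between words.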


translation system

Appears in 4 sentences as: translation system (3) translation systems (1)

In Refinements to Interactive Translation Prediction Based on Search Graphs

Instead of tasking translators to post-edit the output of machine translation systems, a more interactive approach may be more fruitful.
Page 1, “Introduction”
The standard approach to this problem uses the search graph of the machine translation system.
Page 1, “Introduction”
In the project’s first field trial, professional translators corrected machine translations of news stories from a competitive English-Spanish machine translation system (Koehn and Haddow, 2012).
Page 2, “Properties of Core Algorithm”
When the machine translation system decides for college over university, but the user types the letter u, it should change its prediction.
Page 4, “Word Completion”
