Models of Translation Competitions
Hopkins, Mark and May, Jonathan

Article Structure

Abstract

What do we want to learn from a translation competition and how do we learn it with confidence?

The WMT Translation Competition

Every year, the Workshop on Machine Translation (WMT) conducts a competition between machine translation systems.

A Ranking Problem

For several years, WMT used the following heuristic for ranking the translation systems:

A Problem with Rankings

Suppose four systems participate in a translation competition.

From Rankings to Relative Ability

Ostensibly the purpose of a translation competition is to determine the relative ability of a set of translation systems.

Baselines

Let’s begin then, and create some simple preference models to serve as baselines.

Simple Bayesian Models 6.1 Independent Pairs

Another simple model is the direct estimation of each relative ability P(7r|31, 32) independently.

Item-Response Theoretic (IRT) Models

Let’s revisit (Lopez, 2012)’s objection to the B0-JAR ranking heuristic: “...couldn’t a system still be penalized simply by being compared to [good systems] more frequently than its competitors?” The official WMT 2012 findings (Callison-Burch et al., 2012) echoes this concern in justifying the exclusion of reference translations from the 2012 competition:

Experiments

We organized the competition data as described at the end of Section 4.

Conclusion

WMT has faced a crisis of confidence lately, with researchers raising (real and conjectured) issues with its analytical methodology.

Topics

machine translation

Appears in 5 sentences as: Machine Translation (2) machine translation (4)
In Models of Translation Competitions
  1. We then use this framework to compare several analytical models on data from the Workshop on Machine Translation (WMT).
    Page 1, “Abstract”
  2. Every year, the Workshop on Machine Translation (WMT) conducts a competition between machine translation systems.
    Page 1, “The WMT Translation Competition”
  3. For each track, the organizers also assemble a panel of judges, typically machine translation specialists.1 The role of a judge is to repeatedly rank five different translations of the same source text.
    Page 1, “The WMT Translation Competition”
  4. 6One could argue that it specifies a space of machine translation specialists, but likely these individuals are thought to be a representative sample of a broader community.
    Page 3, “From Rankings to Relative Ability”
  5. 14I.e., machine translation specialists.
    Page 8, “Experiments”

See all papers in Proc. ACL 2013 that mention machine translation.

See all papers in Proc. ACL that mention machine translation.

Back to top.

translation systems

Appears in 5 sentences as: translation systems (4) translation systems: (1)
In Models of Translation Competitions
  1. Every year, the Workshop on Machine Translation (WMT) conducts a competition between machine translation systems .
    Page 1, “The WMT Translation Competition”
  2. The WMT organizers invite research groups to submit translation systems in eight different tracks: Czech to/from English, French to/from English, German to/from English, and Spanish to/from English.
    Page 1, “The WMT Translation Competition”
  3. For several years, WMT used the following heuristic for ranking the translation systems:
    Page 1, “A Ranking Problem”
  4. Ostensibly the purpose of a translation competition is to determine the relative ability of a set of translation systems .
    Page 2, “From Rankings to Relative Ability”
  5. Let 8 be the space of all translation systems .
    Page 2, “From Rankings to Relative Ability”

See all papers in Proc. ACL 2013 that mention translation systems.

See all papers in Proc. ACL that mention translation systems.

Back to top.