Abstract | Hiero translation models have two limitations compared to phrase-based models: 1) Limited hypothesis space; 2) No lexicalized reordering model. |
Abstract | Phrasal-Hiero still has the same hypothesis space as the original Hiero but incorporates a phrase-based distance cost feature and lexicalized reordering features into the chart decoder. |
Experiment Results | In all experiments we use the phrase-orientation lexicalized reordering model (Galley and Manning, 2008), which models monotone, swap, and discontinuous orientations with respect to both the previous phrase pair and the next phrase pair. |
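Experiment Results | As a concrete illustration, the three orientation classes can be read off the source spans of phrase pairs taken in target order; a minimal sketch in Python (the helper name and the inclusive (start, end) span representation are assumptions):

    def orientation(prev_span, cur_span):
        # Classify cur_span's orientation with respect to prev_span, where
        # spans are inclusive (start, end) source positions of phrase pairs
        # visited in target order.
        if cur_span[0] == prev_span[1] + 1:
            return "monotone"        # current phrase directly follows
        if cur_span[1] == prev_span[0] - 1:
            return "swap"            # current phrase directly precedes
        return "discontinuous"       # any other jump

    # The bidirectional model applies the same test twice: once against the
    # previous phrase pair and once against the next phrase pair.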
Introduction | Most phrase-based systems are equipped with a distance reordering cost feature to tune the system towards the right amount of reordering, but also with a lexicalized reordering model. |
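Introduction | The distance cost itself is simple; a minimal sketch of the standard phrase-based formulation, again assuming inclusive (start, end) source spans:

    def distance_cost(prev_span, cur_span):
        # Zero when the current phrase starts immediately after the previous
        # one; grows with the size of the jump in either direction.
        return abs(cur_span[0] - prev_span[1] - 1)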
Introduction | It lacks the expressive lexicalized reordering model and the distance cost feature of the phrase-based system. |
Introduction | If we look at the leaves of a Hiero derivation tree, the lexical items also form a segmentation of the source and target sentences, and thus also form a discontinuous phrase-based translation path. |
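Introduction | A simplified sketch of reading that path off a derivation, assuming a tree whose nodes expose is_terminal, src, tgt, and children in target order (all of these names are assumptions):

    def leaf_phrases(node):
        # In-order traversal of a Hiero derivation: the lexical chunks at
        # the leaves segment the source and target sentences and, read in
        # target order, form a discontinuous phrase-based translation path.
        if node.is_terminal:
            yield (node.src, node.tgt)
        else:
            for child in node.children:
                yield from leaf_phrases(child)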
Phrasal-Hiero Model | 2.2 Training: Lexicalized Reordering Table |
Phrasal-Hiero Model | Phrasal-Hiero needs a phrase-based lexicalized reordering table to calculate the features. |
Phrasal-Hiero Model | The lexicalized reordering table can be taken from a discontinuous phrase-based system. |
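Phrasal-Hiero Model | For illustration, assuming the Moses-style table format, one entry carries the monotone/swap/discontinuous probabilities with respect to the previous and the next phrase pair (the phrase pair and the numbers below are invented):

    der Mann ||| the man ||| 0.6 0.2 0.2 0.5 0.3 0.2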
Architecture of BRAINSUP | A beam search in the space of all possible lexicalizations of a syntactic pattern promotes the words with the highest likelihood of satisfying the user specification. |
Architecture of BRAINSUP | With the compatible patterns selected, we can initiate a beam search in the space of all possible lexicalizations of the patterns, i.e., the space of all sentences that can be generated by respecting the syntactic constraints encoded by each pattern. |
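Architecture of BRAINSUP | The following schematic Python sketch conveys the idea; the pattern interface (has_empty_slot, next_empty_slot, fill), the candidate generator, and the scoring function are all hypothetical stand-ins, not the actual BRAINSUP components:

    import heapq

    def beam_search(pattern, candidates_for, score, beam_size=10):
        beam = [pattern]                     # start from the bare pattern
        while any(p.has_empty_slot() for p in beam):
            expanded = []
            for p in beam:
                if not p.has_empty_slot():   # already fully lexicalized
                    expanded.append(p)
                    continue
                slot = p.next_empty_slot()
                for word in candidates_for(slot):
                    expanded.append(p.fill(slot, word))
            # keep only the partial lexicalizations most likely to
            # satisfy the user specification
            beam = heapq.nlargest(beam_size, expanded, key=score)
        return max(beam, key=score)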
Architecture of BRAINSUP | Figure 2: A partially lexicalized sentence with a highlighted empty slot marked with X. |
Comparative Study | 4.1 Moses lexicalized reordering model |
Comparative Study | Figure 2: Illustration of the lexicalized reordering model. |
Comparative Study | Our implementation is the same as the default behavior of the Moses lexicalized reordering model. |
Introduction | The classifier can be trained with maximum likelihood, as in the Moses lexicalized reordering model (Koehn et al., 2007) and the hierarchical lexicalized reordering model (Galley and Manning, 2008), or under the maximum entropy framework (Zens and Ney, 2006). |
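Introduction | The maximum-likelihood variant reduces to relative-frequency estimation over extracted (phrase pair, orientation) events; a minimal sketch (function and variable names are assumptions, and real systems additionally smooth the counts):

    from collections import Counter, defaultdict

    def estimate_orientation_probs(events):
        # events: iterable of (phrase_pair, orientation) pairs extracted
        # from word-aligned training data.
        counts = defaultdict(Counter)
        for phrase_pair, orientation in events:
            counts[phrase_pair][orientation] += 1
        return {pp: {o: c / sum(cnt.values()) for o, c in cnt.items()}
                for pp, cnt in counts.items()}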
Abstract | Natural language parsing has typically been done with small sets of discrete categories such as NP and VP, but this representation captures neither the full syntactic nor the full semantic richness of linguistic phrases, and attempts to improve on this by lexicalizing phrases or splitting categories only partly address the problem at the cost of huge feature spaces and sparseness. |
Introduction | Second, lexicalized parsers (Collins, 2003; Charniak, 2000) associate each category with a lexical item. |
Introduction | However, this approach necessitates complex shrinkage estimation schemes to deal with the sparsity of observations of the lexicalized categories. |
Introduction | Another approach is lexicalized parsers (Collins, 2003; Charniak, 2000) that describe each category with a lexical item, usually the head word. |
Generation Systems | This set subdivides into non-lexicalized and lexicalized transformations. |
Generation Systems | Most transformation rules (335 out of 374 on average) are lexicalized for a specific verb lemma and mostly transform nominalizations as in rule (4-b) and particles (see Section 3.2). |
Introduction | Applying a strictly sequential pipeline on our data, we observe incoherent system output that is related to an interaction of generation levels, very similar to the interleaving between sentence planning and lexicalization in Example (1). |
The Data Set | Nominalizations are mapped to their verbal base forms on the basis of lexicalized rules for the nominalized lemmas observed in the corpus. |
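The Data Set | Schematically, such lexicalized rules amount to a lemma-to-verb lookup; a toy sketch (the entries are invented for illustration):

    NOMINALIZATION_RULES = {
        "destruction": "destroy",
        "arrival": "arrive",
    }

    def verbal_base_form(lemma):
        # Fall back to the lemma itself when no lexicalized rule exists.
        return NOMINALIZATION_RULES.get(lemma, lemma)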
Adaptive Online MT | For example, simple indicator features like lexicalized reordering classes are potentially useful yet bloat the feature set and, in the worst case, can negatively impact search. |
Experiments | The baseline “dense” model contains 19 features: the nine Moses baseline features, the hierarchical lexicalized reordering model of Galley and Manning (2008), the (log) count of each rule, and an indicator for unique rules. |
Experiments | Discriminative reordering (LO): indicators for eight lexicalized reordering classes, including the six standard monotone/swap/discontinuous classes plus the two simpler Moses monotone/non-monotone classes. |
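Experiments | A sketch of how such indicators can be generated per phrase pair; the feature-name scheme is an assumption, but the eight classes match the description above (three orientations with respect to the previous phrase, three with respect to the next, and the two coarser monotone/non-monotone classes):

    def reordering_indicators(prev_orient, next_orient):
        # prev_orient, next_orient in {"mono", "swap", "disc"}
        return {
            f"LO:prev={prev_orient}": 1.0,
            f"LO:next={next_orient}": 1.0,
            ("LO:mono" if prev_orient == "mono" else "LO:non-mono"): 1.0,
        }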
DNN for word alignment | For the distortion t_d, we could use a lexicalized distortion model. |
DNN for word alignment | But we found in our initial experiments on small-scale data that the lexicalized distortion model does not produce better alignments than the simple jump-distance based model. |
DNN for word alignment | So we drop the lexicalized distortion model. |
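DNN for word alignment | The contrast between the two distortion choices can be sketched as follows (the parameter tables jump_prob and lex_table are assumptions):

    def jump_distance_distortion(prev_pos, cur_pos, jump_prob):
        # Depends only on the jump width between aligned source positions.
        return jump_prob[cur_pos - prev_pos]

    def lexicalized_distortion(prev_word, prev_pos, cur_pos, lex_table):
        # Additionally conditions on the previously aligned source word.
        return lex_table[prev_word][cur_pos - prev_pos]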