Abstract | First, a sequence of weak translation systems is generated from a baseline system in an iterative manner.
Abstract | We evaluate our method on Chinese-to-English Machine Translation (MT) tasks with three baseline systems: a phrase-based system, a hierarchical phrase-based system, and a syntax-based system.
Abstract | The experimental results on three NIST evaluation test sets show that our method leads to significant improvements in translation accuracy over the baseline systems.
Background | 5.1 Baseline Systems |
Background | In this work, baseline system refers to the system produced by the boosting-based system combination when the number of iterations (i.e. |
Background | To obtain satisfactory baseline performance, we train each SMT system five times using MERT with different initial values of feature weights to generate a group of baseline candidates, and then select the best-performing one from this group as the final baseline system (i.e.
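The selection step described above (multiple MERT runs from different initial weights, keeping the best) can be sketched as follows. Everything here is illustrative: `dev_bleu` is a stand-in for decoding and scoring a development set, and the feature count and random-restart scheme are assumptions, not the authors' actual tuning pipeline.

```python
import random

def dev_bleu(weights):
    # Hypothetical stand-in: a real system would decode the dev set with
    # these feature weights and score the output with BLEU.
    return 30.0 + sum(weights) % 5

def select_baseline(n_runs=5, n_features=8, seed=0):
    """Run tuning n_runs times from different random initial feature
    weights and keep the best-scoring run as the final baseline
    (a sketch of the selection step, not real MERT)."""
    rng = random.Random(seed)
    best_weights, best_score = None, float("-inf")
    for _ in range(n_runs):
        weights = [rng.uniform(-1.0, 1.0) for _ in range(n_features)]
        # In a real pipeline, MERT would iteratively optimize `weights`
        # against dev-set BLEU before this evaluation.
        score = dev_bleu(weights)
        if score > best_score:
            best_weights, best_score = weights, score
    return best_weights, best_score
```

The point of the restarts is that MERT is sensitive to its initial weights, so taking the best of several runs gives a stronger, fairer baseline.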
Introduction | In this method, a sequence of weak translation systems is generated from a baseline system in an iterative manner. |
Introduction | Experimental results show that our method leads to significant improvements in translation accuracy over the baseline systems.
Experiments | Our baseline system is a state-of-the-art forest-based constituency-to-string model (Mi et al., 2008), or forest for short, which translates a source forest into a target string by pattern-matching the
Experiments | The baseline system extracts 31.9M and 77.9M rules from the two rule sets, respectively, and achieves a BLEU score of 34.17 on the test set.
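Since these results are reported in BLEU, a minimal single-reference BLEU sketch may help make the metric concrete. This is the textbook formulation (geometric mean of modified n-gram precisions times a brevity penalty), not the NIST scoring script actually used in these evaluations.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """Multiset of n-grams of a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def bleu(hypothesis, reference, max_n=4):
    """Single-reference sentence BLEU: geometric mean of modified
    n-gram precisions (n = 1..max_n) times a brevity penalty."""
    hyp, ref = hypothesis.split(), reference.split()
    log_prec = 0.0
    for n in range(1, max_n + 1):
        h, r = ngrams(hyp, n), ngrams(ref, n)
        # Clipped counts: each hypothesis n-gram is credited at most as
        # often as it occurs in the reference.
        match = sum(min(count, r[g]) for g, count in h.items())
        total = max(sum(h.values()), 1)
        if match == 0:
            return 0.0  # any zero precision zeroes the geometric mean
        log_prec += math.log(match / total) / max_n
    # Brevity penalty punishes hypotheses shorter than the reference.
    bp = 1.0 if len(hyp) > len(ref) else math.exp(1 - len(ref) / max(len(hyp), 1))
    return bp * math.exp(log_prec)
```

Real evaluations use corpus-level statistics (summing clipped counts over all sentences before taking precisions) and often smoothing, which this sketch omits.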
Experiments | First, we investigate the influence of different rule sets on the performance of the baseline system.
Introduction | We use an open-source CRF software package to implement our CRF models. We use words, POS tags, chunk labels, and the predicate label at the preceding and following nodes as features for our Baseline system.
Introduction | For predicates that never or rarely appear in training, the HMM features increase F1 by 4.2, and they increase the overall F1 of the system by 3.5 to 93.5, which approaches the F1 of 94.7 that the Baseline system achieves on the in-domain WSJ test set.
Introduction | Table 2 shows the performance of our three baseline systems.
Conclusion | The fastest model parsed sentences 1.85 times as fast as the baseline system and was just as accurate.
Data | Both sets of annotations were produced by manually correcting the output of the baseline system . |
Introduction | By increasing the ambiguity level of the adaptive models to match the baseline system, we can also slightly increase supertagging accuracy, which can lead to higher parsing accuracy.
Introduction | Using an adapted supertagger with ambiguity levels tuned to match the baseline system, we were also able to increase F-score on labelled grammatical relations by 0.75%.
Results | As Table 8 shows, in all cases the use of supertagger-annotated data led to poorer performance than the baseline system, while the use of parser-annotated data led to an improvement in F-score.
Results | However, on the corpus of the extra data, the performance of the adapted models is comparable to the baseline model, which means the parser is probably still receiving the same categories that it used from the sets provided by the baseline system.
Cross-event Approach | 5.1 Sentence-level Baseline System |
Cross-event Approach | To use document-level information, we need to collect information based on the sentence-level baseline system . |
Cross-event Approach | To this end, we set different thresholds from 0.1 to 1.0 in the baseline system output, and only evaluate triggers, arguments or roles whose confidence score is above the threshold. |
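The confidence-thresholding step described above can be sketched as follows; the prediction records and their field names are assumed for illustration, not taken from the authors' system.

```python
def filter_by_confidence(predictions, threshold):
    """Keep only baseline outputs (triggers, arguments, or roles)
    whose confidence score meets the threshold."""
    return [p for p in predictions if p["confidence"] >= threshold]

# Hypothetical baseline trigger predictions with confidence scores.
preds = [
    {"trigger": "founded", "confidence": 0.92},
    {"trigger": "elected", "confidence": 0.35},
]

# Sweep thresholds from 0.1 to 1.0, as in the evaluation setup above,
# and evaluate only the predictions that survive each cutoff.
for t in [x / 10 for x in range(1, 11)]:
    surviving = filter_by_confidence(preds, t)
    # A real pipeline would score `surviving` against the gold annotations here.
```

Higher thresholds trade recall for precision, which is why sweeping the cutoff reveals how reliable the baseline's confidence estimates are.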
Motivation | The sentence-level baseline system finds event triggers like “founded” (trigger of Start-Org), “elected” (trigger of Elect), and “appointment” (trigger of Start-Position), which are easier to identify because these triggers have more specific meanings.
Abstract | As compared to baseline systems, we achieve absolute improvements of 2.40 BLEU score on a phrase-based SMT system and 1.76 BLEU score on a parsing-based SMT system.
Conclusion | When we also used phrase collocation probabilities as additional features, phrase-based SMT performance improved by 2.40 BLEU score as compared with the baseline system.
Experiments on Phrase-Based SMT | The results in Table 4 show that the systems using the improved bidirectional alignments achieve higher translation quality than the baseline system.
Experiments on Phrase-Based SMT | Figure 3 shows an example: T1 is generated by the system where the phrase collocation probabilities are used and T2 is generated by the baseline system.
Experiments on Phrase-Based SMT | As compared with the baseline system, an absolute improvement of 2.40 BLEU score is achieved.
Abstract | Experiments compare this with two baseline systems, namely an acoustic hidden Markov model and a dynamic Bayes network augmented with discretized representations of the vocal tract.
Baseline systems | We examine two baseline systems.
Baseline systems | Figure 3: Baseline systems: (a) acoustic hidden Markov model and (b) articulatory dynamic Bayes network.
Experiments | For each of our baseline systems, we calculate the phoneme error rate (PER) and word error rate (WER) after training.
Experiments | Table 1: Phoneme- and word-error rates (PER and WER) for different parameterizations of the baseline systems.
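PER and WER are both token-level edit distances normalized by reference length; only the tokenization differs (phonemes vs. words). A generic sketch of the metric, not the authors' scoring tool:

```python
def error_rate(hyp, ref):
    """Word- (or phoneme-) error rate: Levenshtein distance between
    hypothesis and reference token sequences, divided by reference length."""
    h, r = hyp.split(), ref.split()
    # d[i][j] = edit distance between ref[:i] and hyp[:j].
    d = [[0] * (len(h) + 1) for _ in range(len(r) + 1)]
    for i in range(len(r) + 1):
        d[i][0] = i
    for j in range(len(h) + 1):
        d[0][j] = j
    for i in range(1, len(r) + 1):
        for j in range(1, len(h) + 1):
            cost = 0 if r[i - 1] == h[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,          # deletion
                          d[i][j - 1] + 1,          # insertion
                          d[i - 1][j - 1] + cost)   # substitution
    return d[len(r)][len(h)] / max(len(r), 1)
```

Because insertions are counted, the rate can exceed 1.0 when the hypothesis is much longer than the reference.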
Experimental Setup and Results | 3.2.1 The Baseline Systems |
Experimental Setup and Results | As a baseline system, we built a standard phrase-based system, using the surface forms of the words without any transformations, and with a 3-gram LM in the decoder.
Experimental Setup and Results | We also built a second baseline system with a factored model. |
Abstract | Results show that the system using the phrase-based error model significantly outperforms its baseline systems.
Clickthrough Data and Spelling Correction | One possible reason is that our baseline system, which does not use any error model learned from the clickthrough data, is already able to correct these basic, obvious spelling mistakes.
Introduction | In particular, the speller system incorporating a phrase-based error model significantly outperforms its baseline systems.