Index of papers in Proc. ACL that mention
  • significantly outperforms
Zhang, Hui and Zhang, Min and Li, Haizhou and Aw, Aiti and Tan, Chew Lim
Abstract
Experimental results on the NIST MT-2003 Chinese-English translation task show that our method statistically significantly outperforms the four baseline systems.
Experiment
1) FTS2S significantly outperforms (p<0.05) FT2S.
Experiment
3) Our model statistically significantly outperforms all the baseline systems.
Experiment
4) All the four syntax-based systems show better performance than Moses and three of them significantly outperform (p<0.05) Moses.
Introduction
Experimental results show that our method significantly outperforms the two individual methods and other baseline methods.
significantly outperforms is mentioned in 6 sentences in this paper.
Lin, Ziheng and Ng, Hwee Tou and Kan, Min-Yen
Abstract
The experimental results demonstrate that our model is able to significantly outperform the state-of-the-art coherence model by Barzilay and Lapata (2005), reducing the error rate of the previous approach by an average of 29% over three data sets against human upper bounds.
Analysis and Discussion
From the curves, our model consistently performs better than the baseline with a significant gap, and the combined model also consistently and significantly outperforms the other two.
Conclusion
When applied to distinguish a source text from a sentence-reordered permutation, our model significantly outperforms the previous state-of-the-art,
Experiments
Double (**) and single (*) asterisks indicate that the respective model significantly outperforms the baseline at p < 0.01 and p < 0.05, respectively.
Experiments
Comparing these accuracies to the baseline, our model significantly outperforms the baseline with p < 0.01 in the WSJ and Earthquakes data sets with accuracy increments of 2.35% and 2.91%, respectively.
Experiments
The combined model in all three data sets gives the highest performance in comparison to all single models, and it significantly outperforms the baseline model with p < 0.01.
significantly outperforms is mentioned in 6 sentences in this paper.
Wang, William Yang and Hua, Zhenhao
Abstract
In experiments, we show that our model significantly outperforms strong linear and nonlinear discriminative baselines on three datasets under various settings.
Conclusion
Focusing on the three financial crisis related datasets, the proposed model significantly outperforms the standard linear regression method in statistics and strong discriminative support vector regression baselines.
Introduction
By varying different experimental settings on three datasets concerning different periods of the Great Recession from 2006-2013, we empirically show that our approach significantly outperforms the baselines by a wide margin.
Introduction
• Our results significantly outperform standard linear regression and strong SVM baselines.
significantly outperforms is mentioned in 4 sentences in this paper.
Liu, Changsong and She, Lanbo and Fang, Rui and Chai, Joyce Y.
Abstract
Our empirical results have shown the probabilistic labeling approach significantly outperforms a previous graph-matching approach for referential grounding.
Evaluation and Discussion
significantly outperforms state-space search (S.S.S.
Evaluation and Discussion
Although probabilistic labeling significantly outperforms the state-space search, the grounding performance is still rather poor (less than 50%)
Introduction
Our empirical results have shown that the probabilistic labeling approach significantly outperforms the state-space search approach in both grounding accuracy and efficiency.
significantly outperforms is mentioned in 4 sentences in this paper.
Li, Zhenghua and Zhang, Min and Chen, Wenliang
Abstract
Experimental results on benchmark data show that our method significantly outperforms the baseline supervised parser and other entire-tree based semi-supervised methods, such as self-training, co-training and tri-training.
Experiments and Analysis
Using unlabeled data with the results of ZPar (“Unlabeled ← Z”) significantly outperforms the baseline GParser by 0.30% (93.15-92.85) on English.
Experiments and Analysis
However, we find that although the parser significantly outperforms the supervised GParser on English, it does not gain significant improvement over co-training with ZPar (“Unlabeled ← Z”) on both English and Chinese.
Experiments and Analysis
(2012) and Bohnet and Nivre (2012) use joint models for POS tagging and dependency parsing, significantly outperforming their pipeline counterparts.
significantly outperforms is mentioned in 4 sentences in this paper.
Rieser, Verena and Lemon, Oliver
Abstract
Our results show that RL significantly outperforms Supervised Learning when interacting in simulation as well as for interactions with real users.
Conclusion
Our results show that RL significantly outperforms SL in simulation as well as in interactions with real users.
Simulated Learning Environment
For learning presentation modality, both classifiers significantly outperform the baseline.
Simulated Learning Environment
The results show that simulation-based RL with an environment bootstrapped from WOZ data allows learning of robust strategies which significantly outperform the strategies contained in the initial data set.
significantly outperforms is mentioned in 4 sentences in this paper.
Lazaridou, Angeliki and Marelli, Marco and Zamparelli, Roberto and Baroni, Marco
Experimental setup
fulladd and lexfunc significantly outperform stem also in the HR subset (p<.001).
Experimental setup
However, the stemploitation hypothesis is dispelled by the observation that both models significantly outperform the stem baseline (p<.001), despite the fact that the latter, again, has good performance, significantly outperforming the corpus-derived vectors (p < .001).
Experimental setup
Indeed, if we focus on the third row of Table 5, reporting performance on low stem-derived relatedness (LR) items (annotated as described in Section 4.1), fulladd and wadd still significantly outperform the corpus representations (p< .001), whereas the quality of the stem representations of LR items is not significantly different from that of the corpus-derived ones.
significantly outperforms is mentioned in 4 sentences in this paper.
Pantel, Patrick and Lin, Thomas and Gamon, Michael
Experimental Results
On head queries, the addition of the empty context parameter and the click signal together (Model M1) significantly outperforms both the baseline and the state-of-the-art model Guo’09.
Experimental Results
We observe a different behavior on tail queries where all models significantly outperform the baseline BFB, but are not significantly different from each other.
Introduction
We show that jointly modeling user intent and entity type significantly outperforms the current state of the art on the task of entity type resolution in queries.
significantly outperforms is mentioned in 3 sentences in this paper.
Yang, Bishan and Cardie, Claire
Experiments
that PR significantly outperforms all other baselines in both the CR dataset and the MD dataset (average accuracy across domains is reported).
Experiments
In contrast, both PR1“ and PR significantly outperform CRF, which implies that incorporating lexical and discourse constraints as posterior constraints is much more effective.
Experiments
We can see that both PR and Ple significantly outperform all other baselines in all domains.
significantly outperforms is mentioned in 3 sentences in this paper.
Morin, Emmanuel and Hazem, Amir
Experiments and Results
We can see that the Unbalanced approach significantly outperforms the baseline (Balanced).
Experiments and Results
We can also notice that the prediction model applied to the balanced corpus (Balanced + Prediction) slightly outperforms the baseline, while the Unbalanced + Prediction approach significantly outperforms the three other approaches (moreover, the variations observed with the Unbalanced approach are lower than with the Unbalanced + Prediction approach).
Experiments and Results
As for the previous experiment, we can see that the Unbalanced approach significantly outperforms the Balanced approach.
significantly outperforms is mentioned in 3 sentences in this paper.
Mehdad, Yashar and Carenini, Giuseppe and Ng, Raymond T.
Abstract
Automatic and manual evaluation results over meeting, chat and email conversations show that our approach significantly outperforms baselines and previous extractive models.
Experimental Setup
Results indicate that our system significantly outperforms baselines in overall quality and responsiveness, for both meeting and email datasets.
Introduction
Automatic evaluation on the chat dataset and manual evaluation over the meetings and emails show that our system uniformly and statistically significantly outperforms baseline systems, as well as a state-of-the-art query-based extractive summarization system.
significantly outperforms is mentioned in 3 sentences in this paper.
Li, Qi and Ji, Heng
Abstract
Experiments on Automatic Content Extraction (ACE) corpora demonstrate that our joint model significantly outperforms a strong pipelined baseline, which attains better performance than the best-reported end-to-end system.
Conclusions and Future Work
Experiments demonstrated our approach significantly outperformed pipelined approaches for both tasks and dramatically advanced the state-of-the-art.
Experiments
We can see that our approach significantly outperforms the pipelined approach for both tasks.
significantly outperforms is mentioned in 3 sentences in this paper.
Bollegala, Danushka and Weir, David and Carroll, John
Abstract
In both tasks, our method significantly outperforms competitive baselines and returns results that are statistically comparable to current state-of-the-art methods, while requiring no task-specific customisations.
Experiments and Results
Except for the DE setting in which Proposed method significantly outperforms both SFA and SCL, the performance of the Proposed method is not statistically significantly different to that of SFA or SCL.
Introduction
Without requiring any task specific customisations, systems based on our distribution prediction method significantly outperform competitive baselines in both tasks.
significantly outperforms is mentioned in 3 sentences in this paper.
Zhou, Guangyou and Liu, Fang and Liu, Yang and He, Shizhu and Zhao, Jun
Conclusions and Future Work
Experiments conducted on real CQA data show some promising findings: (1) the proposed method significantly outperforms the previous work for question retrieval; (2) the proposed matrix factorization can significantly improve the performance of question retrieval, no matter whether considering the translation languages or not; (3) considering more languages can further improve the performance but it does not seem to produce significantly better performance; (4) different languages contribute unevenly for question retrieval; (5) our proposed method can be easily adapted to the large-scale information retrieval task.
Experiments
(1) Monolingual translation models significantly outperform the VSM and LM (row 1 and
Experiments
(3) Our proposed method (leveraging statistical machine translation via matrix factorization, SMT + MF) significantly outperforms the bilingual translation model of Zhou et al.
significantly outperforms is mentioned in 3 sentences in this paper.
Wang, Lu and Cardie, Claire
Introduction
Automatic evaluation (using ROUGE (Lin and Hovy, 2003) and BLEU (Papineni et al., 2002)) against manually generated focused summaries shows that our summarizers uniformly and statistically significantly outperform two baseline systems as well as a state-of-the-art supervised extraction-based system.
Introduction
The resulting systems yield results comparable to those from the same system trained on in-domain data, and statistically significantly outperform supervised extractive summarization approaches trained on in-domain data.
Results
In most experiments, it also significantly outperforms the baselines and the extract-based approaches (p < 0.05).
significantly outperforms is mentioned in 3 sentences in this paper.
Kozareva, Zornitsa
Introduction
• We have conducted an in-depth experimental evaluation and showed that the developed methods significantly outperform baseline methods.
Task A: Polarity Classification
As we can see from Figure 5, all classifiers significantly outperform the majority baseline.
Task A: Polarity Classification
The learned lessons from this study are: (1) for n-gram usage, the larger the context of the metaphor, the better the classification accuracy becomes; (2) if present, source and target information can further boost the performance of the classifiers; (3) LIWC is a useful resource for polarity identification in metaphor-rich texts; (4) the usage of tense (past vs. present) and pronouns are important triggers for positive and negative polarity of metaphors; (5) some categories like family and social presence indicate positive polarity, while others like inhibition, anger and swear words are indicative of negative affect; (6) the built models significantly outperform majority baselines.
significantly outperforms is mentioned in 3 sentences in this paper.
Liu, Chang and Ng, Hwee Tou
Abstract
We show empirically that TESLA-CELAB significantly outperforms character-level BLEU in the English-Chinese translation evaluation tasks.
Conclusion
We show empirically that TESLA-CELAB significantly outperforms the strong baseline of character-level BLEU in two well known English-Chinese MT evaluation data sets.
Experiments
The results indicate that TESLA-CELAB significantly outperforms BLEU.
significantly outperforms is mentioned in 3 sentences in this paper.
Li, Junhui and Tu, Zhaopeng and Zhou, Guodong and van Genabith, Josef
Abstract
Experiments on Chinese-English translation on four NIST MT test sets show that the HD-HPB model significantly outperforms Chiang’s model with average gains of 1.91 points absolute in BLEU.
Experiments
Table 3 shows that our HD-HPB model significantly outperforms Chiang’s HPB model with an average improvement of 1.91 in BLEU (and similar improvements over Moses HPB).
Introduction
Experiments on Chinese-English translation using four NIST MT test sets show that our HD-HPB model significantly outperforms Chiang’s HPB as well as a SAMT-style refined version of HPB.
significantly outperforms is mentioned in 3 sentences in this paper.
Branavan, S.R.K. and Kushman, Nate and Lei, Tao and Barzilay, Regina
Abstract
Additionally, we show that a high-level planner utilizing these extracted relations significantly outperforms a strong, text unaware baseline — successfully completing 80% of planning tasks as compared to 69% for the baseline.
Conclusions
We show that building high-level plans in this manner significantly outperforms traditional techniques in terms of task completion.
Introduction
Our results show that our text-driven high-level planner significantly outperforms all baselines in terms of completed planning tasks — it successfully solves 80% as compared to 41% for the Metric-FF planner and 69% for the text unaware variant of our model.
significantly outperforms is mentioned in 3 sentences in this paper.
Cai, Peng and Gao, Wei and Zhou, Aoying and Wong, Kam-Fai
Abstract
Adaptation experiments on LETOR3.0 data set demonstrate that query weighting significantly outperforms document instance weighting methods.
Conclusion
We evaluated our approaches on the LETOR3.0 dataset for ranking adaptation and found that: (1) the first method efficiently estimates query weights, and can outperform the document instance weighting, but some information is lost during the aggregation; (2) the second method consistently and significantly outperforms document instance weighting.
Introduction
wise approach significantly outperformed pointwise approach, which takes each document instance as independent learning object, as well as pairwise approach, which concentrates learning on the order of a pair of documents (Liu, 2009).
significantly outperforms is mentioned in 3 sentences in this paper.
Jiang, Wenbin and Liu, Qun
Abstract
Experiments show that the classifier trained on the projected classification instances significantly outperforms previous projected dependency parsers.
Experiments
The projected classifier significantly outperforms previous works on both test sets, which demonstrates that the word-pair classification model, although falling behind the state-of-the-art on human-annotated treebanks, performs well in projected dependency parsing.
Introduction
Experimental results show that the classifier trained on the projected classification instances significantly outperforms the projected dependency parsers in previous works.
significantly outperforms is mentioned in 3 sentences in this paper.
Xiong, Deyi and Zhang, Min and Aw, Aiti and Li, Haizhou
Conclusion
Experiments show that our model achieves substantial improvements over the baseline and significantly outperforms Marton and Resnik (2008)’s XP+.
Experiments
The binary SDB (BiSDB) model statistically significantly outperforms Marton and Resnik’s XP+ by an absolute improvement of 0.59 (relatively 2%).
Introduction
Our experimental results display that our SDB model achieves a substantial improvement over the baseline and significantly outperforms XP+ according to the BLEU metric (Papineni et al., 2002).
significantly outperforms is mentioned in 3 sentences in this paper.
Song, Young-In and Lee, Jung-Tae and Rim, Hae-Chang
Experiments
As shown in the table, the multi-parameter model improves by approximately 18% and 12% on the TREC-6 and 7 partial query sets, and it also significantly outperforms both the word model and the one-parameter model on the TREC-8 query set.
Experiments
For both opposite cases, the multi-parameter model significantly outperforms the one-parameter model.
Experiments
Note that the multi-parameter model significantly outperforms the one-parameter model and all manually-set λs for the queries ‘declining birth rate’ and ‘Amazon rain forest’, which also has one effective phrase, ‘rain forest’, and one noneffective phrase, ‘Amazon forest’.
significantly outperforms is mentioned in 3 sentences in this paper.
Zhao, Shiqi and Wang, Haifeng and Liu, Ting and Li, Sheng
Abstract
The evaluation results show that: (1) The pivot approach is effective in extracting paraphrase patterns, which significantly outperforms the conventional method DIRT.
Conclusion
In addition, the log-linear model with the proposed feature functions significantly outperforms the conventional models.
Introduction
Our experiments show that the pivot approach significantly outperforms conventional methods.
significantly outperforms is mentioned in 3 sentences in this paper.
Zhang, Min and Jiang, Hongfei and Aw, Aiti and Li, Haizhou and Tan, Chew Lim and Li, Sheng
Abstract
Experimental results on the NIST MT-2005 Chinese-English translation task show that our method statistically significantly outperforms the baseline systems.
Experiments
1) Our tree sequence-based model significantly outperforms (p < 0.01) previous phrase-based and linguistically syntax-based methods.
Introduction
Experimental results on the NIST MT-2005 Chinese-English translation task show that our method significantly outperforms Moses (Koehn et al., 2007), a state-of-the-art phrase-based SMT system, and other linguistically syntax-based methods, such as SCFG-based and STSG-based methods (Zhang et al., 2007).
significantly outperforms is mentioned in 3 sentences in this paper.