Index of papers in Proc. ACL that mention
  • sentence-level
Yang, Bishan and Cardie, Claire
Approach
We formulate the sentence-level sentiment classification task as a sequence labeling problem.
Approach
The inputs to the model are sentence-segmented documents annotated with sentence-level sentiment labels (positive, negative or neutral) along with a set of unlabeled documents.
Introduction
In this paper, we focus on the task of sentence-level sentiment classification in online reviews.
Introduction
Semi-supervised techniques have been proposed for sentence-level sentiment classification (Täckström and McDonald, 2011a; Qu et al., 2012).
Introduction
In this paper, we propose a sentence-level sentiment classification method that can (1) incorporate rich discourse information at both local and global levels; (2) encode discourse knowledge as soft constraints during learning; (3) make use of unlabeled data to enhance learning.
Related Work
In this paper, we focus on the study of sentence-level sentiment classification.
Related Work
Compared to the existing work on semi-supervised learning for sentence-level sentiment classification (Täckström and McDonald, 2011a; Täckström and McDonald, 2011b; Qu et al., 2012), our work does not rely on a large amount of coarse-grained (document-level) labeled data; instead, distant supervision comes mainly from linguistically motivated constraints.
Related Work
We also show that constraints derived from the discourse context can be highly useful for disambiguating sentence-level sentiment.
sentence-level is mentioned in 27 sentences in this paper.
Topics mentioned in this paper:
Echizen-ya, Hiroshi and Araki, Kenji
Abstract
Experimental results show that our method obtained the highest correlations among the methods in both sentence-level adequacy and fluency.
Conclusion
Experimental results demonstrate that our method yields the highest correlation among eight methods in terms of sentence-level adequacy and fluency.
Conclusion
Future studies will improve our method, enabling it to achieve high correlation in sentence-level fluency.
Experiments
We calculated Pearson's correlation coefficient and Spearman's rank correlation coefficient between the scores obtained using our method and the scores assigned by human judgments in terms of sentence-level adequacy and fluency.
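For readers who want to reproduce this kind of metric evaluation, here is a minimal sketch, assuming hypothetical parallel lists of per-sentence metric scores and human judgments (scipy implements both statistics):

```python
from scipy.stats import pearsonr, spearmanr

# Hypothetical parallel scores: one automatic metric score and one
# human judgment (e.g., adequacy on a 1-5 scale) per sentence.
metric_scores = [0.42, 0.71, 0.33, 0.58, 0.90]
human_scores = [2, 4, 2, 3, 5]

# Pearson's r measures linear correlation between the raw values;
# Spearman's rho compares only the rankings.
r, _ = pearsonr(metric_scores, human_scores)
rho, _ = spearmanr(metric_scores, human_scores)
print(f"Pearson r = {r:.3f}, Spearman rho = {rho:.3f}")
```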
Experiments
Tables 2 and 3 respectively show Pearson’s correlation coefficient for sentence-level adequacy and fluency.
Experiments
Tables 4 and 5 respectively show Spearman’s rank correlation coefficient for sentence-level adequacy and fluency.
Introduction
However, sentence-level automatic evaluation is insufficient.
Introduction
As described herein, for use with MT systems, we propose a new automatic evaluation method using noun-phrase chunking to obtain higher sentence-level correlations.
Introduction
Evaluation experiments using MT outputs obtained by 12 machine translation systems in NTCIR-7 (Fujii et al., 2008) demonstrate that the scores obtained using our system yield the highest correlation with the human judgments among the automatic evaluation methods in both sentence-level adequacy and fluency.
sentence-level is mentioned in 11 sentences in this paper.
Topics mentioned in this paper:
Joty, Shafiq and Carenini, Giuseppe and Ng, Raymond and Mehdad, Yashar
Document-level Parsing Approaches
A key finding from several previous studies on sentence-level discourse analysis is that most sentences have a well-formed discourse subtree in the full document-level DT (Joty et al., 2012; Fisher and Roark, 2007).
Introduction
While recent advances in automatic discourse segmentation and sentence-level discourse parsing have attained accuracies close to human performance (Fisher and Roark, 2007; Joty et al., 2012), discourse parsing at the document level still poses significant challenges (Feng and Hirst, 2012), and the performance of the existing document-level parsers (Hernault et al., 2010; Subba and Di Eugenio, 2009) is still considerably inferior to the human gold standard.
Introduction
Since most sentences have a well-formed discourse subtree in the full document-level DT (for example, the second sentence in Figure 1), our first approach constructs a DT for every sentence using our intra-sentential parser, and then runs the multi-sentential parser on the resulting sentence-level DTs.
Introduction
Our second approach, in an attempt to deal with these cases, builds sentence-level subtrees by applying the intra-sentential parser on a sliding window covering two adjacent sentences and by then consolidating the results produced by overlapping windows.
Our Discourse Parsing Framework
Since we already have an accurate sentence-level discourse parser (Joty et al., 2012), a straightforward approach to document-level parsing could be to simply apply this parser to the whole document.
Our Discourse Parsing Framework
For example, syntactic features like dominance sets (Soricut and Marcu, 2003) are extremely useful for sentence-level parsing, but are not even applicable in the multi-sentential case.
Parsing Models and Parsing Algorithm
Recently, we proposed a novel parsing model for sentence-level discourse parsing (Joty et al., 2012) that outperforms previous approaches by effectively modeling sequential dependencies along with structure and labels jointly.
Parsing Models and Parsing Algorithm
The connections between adjacent nodes in a hidden layer encode sequential dependencies between the respective hidden nodes, and can enforce constraints such as the fact that an S_j = 1 must not follow an S_{j-1} = 1. The connections between the two hidden layers model the structure and the relation of a DT (sentence-level) constituent jointly.
Parsing Models and Parsing Algorithm
Figure 5: Our parsing model applied to the sequences at different levels of a sentence-level DT.
Related work
The idea of staging document-level discourse parsing on top of sentence-level discourse parsing was investigated in (Marcu, 2000a; LeThanh et al., 2004).
sentence-level is mentioned in 13 sentences in this paper.
Topics mentioned in this paper:
Hoffmann, Raphael and Zhang, Congle and Ling, Xiao and Zettlemoyer, Luke and Weld, Daniel S.
Abstract
This paper presents a novel approach for multi-instance learning with overlapping relations that combines a sentence-level extraction model with a simple, corpus-level component for aggregating the individual facts.
Inference
It is thus sufficient to independently compute an assignment for each sentence-level extraction variable Z_i, ignoring the deterministic dependencies.
Introduction
MULTIR also produces accurate sentence-level predictions, decoding individual sentences as well as making corpus-level extractions.
Learning
We now present a multi-instance learning algorithm for our weak-supervision model that treats the sentence-level extraction random variables Z_i as latent, and uses facts from a database (e.g., Freebase) as supervision for the aggregate-level variables Y^r.
Modeling Overlapping Relations
We define an undirected graphical model that allows joint reasoning about aggregate (corpus-level) and sentence-level extraction decisions.
Modeling Overlapping Relations
Z_i should be assigned a value r ∈ R only when x_i expresses the ground fact r(e), thereby modeling sentence-level extraction.
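A minimal sketch of the aggregation idea implied by the excerpts above: the corpus-level fact holds whenever at least one sentence-level variable asserts it (a deterministic OR over the Z_i); the relation names and assignments here are hypothetical.

```python
from collections import defaultdict

# Hypothetical sentence-level assignments for one entity pair e:
# each Z_i is the relation expressed by sentence i, or None.
Z = ["born_in", None, "employed_by", "born_in", None]

# Aggregate variables: Y[r] is True iff at least one sentence-level
# variable takes the value r (a deterministic OR over the Z_i).
Y = defaultdict(bool)
for z in Z:
    if z is not None:
        Y[z] = True

print(dict(Y))  # {'born_in': True, 'employed_by': True}
```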
Modeling Overlapping Relations
(2009) sentence-level features in the experiments, as described in Section 7.
Weak Supervision from a Database
In contrast, sentence-level extraction must justify each extraction with every sentence which expresses the fact.
sentence-level is mentioned in 17 sentences in this paper.
Topics mentioned in this paper:
Guzmán, Francisco and Joty, Shafiq and Màrquez, Llu'is and Nakov, Preslav
Conclusions and Future Work
Our results show that discourse-based metrics can improve state-of-the-art MT metrics by increasing correlation with human judgments, even when only sentence-level discourse information is used.
Conclusions and Future Work
First, at the sentence-level, we can use discourse information to re-rank alternative MT hypotheses; this could be applied either for MT parameter tuning, or as a postprocessing step for the MT output.
Experimental Results
We speculate that this might be caused by the fact that the lexical information in DR-LEX is incorporated only in the form of unigram matching at the sentence-level, while the metrics in group IV are already complex combined metrics, which take into account stronger lexical models.
Experimental Results
This is remarkable given that DR has a strong negative Tau as an individual metric at the sentence-level.
Experimental Setup
As in the WMT12 experimental setup, we use these rankings to calculate correlation with human judgments at the sentence-level, i.e.
Introduction
From its foundations, Statistical Machine Translation (SMT) had two defining characteristics: first, translation was modeled as a generative process at the sentence-level.
Introduction
Recently, there have been two promising research directions for improving SMT and its evaluation: (a) by using more structured linguistic information, such as syntax (Galley et al., 2004; Quirk et al., 2005), hierarchical structures (Chiang, 2005), and semantic roles (Wu and Fung, 2009; Lo et al., 2012), and (b) by going beyond the sentence-level, e.g., translating at the document level (Hardmeier et al., 2012).
Introduction
Going beyond the sentence-level is important since sentences rarely stand on their own in a well-written text.
Related Work
Unlike their work, which measures lexical cohesion at the document level, here we are concerned with coherence (rhetorical) structure, primarily at the sentence-level.
sentence-level is mentioned in 12 sentences in this paper.
Topics mentioned in this paper:
Liao, Shasha and Grishman, Ralph
Conclusion and Future Work
Experiments show that document-level information can improve the performance of a sentence-level baseline event extraction system.
Cross-event Approach
In this section we present our approach to using document-level event and role information to improve sentence-level ACE event extraction.
Cross-event Approach
Our event extraction system is a two-pass system where the sentence-level system is first applied to make decisions based on local information.
Cross-event Approach
5.1 Sentence-level Baseline System
Experiments
We use the rest of the ACE training corpus (549 documents) as training data for both the sentence-level baseline event tagger and document-level event tagger.
Experiments
Recall improved sharply, demonstrating that cross-event information could recover information that is difficult for the sentence-level baseline to extract; precision also improved over the baseline, although not as markedly.
Motivation
We analyzed the sentence-level baseline event extraction, and found that many events are missing or spuriously tagged because the local information is not sufficient to make a confident decision.
sentence-level is mentioned in 8 sentences in this paper.
Topics mentioned in this paper:
Feng, Vanessa Wei and Hirst, Graeme
Abstract
We also analyze the difficulty of extending traditional sentence-level discourse parsing to text-level parsing by comparing discourse-parsing performance under different discourse conditions.
Conclusions
We analyzed the difficulty of extending traditional sentence-level discourse parsing to text-level parsing by showing that using exactly the same set of features, the performance of Structure and Relation classification on cross-sentence instances is consistently inferior to that on within-sentence instances.
Introduction
difficulty with extending traditional sentence-level discourse parsing to text-level parsing, by comparing discourse parsing performance under different discourse conditions.
Text-level discourse parsing
Unlike syntactic parsing, where we are almost never interested in parsing above sentence level, sentence-level parsing is not sufficient for discourse parsing.
Text-level discourse parsing
While a sequence of local (sentence-level) grammaticality can be considered to be global grammaticality, a sequence of local discourse coherence does not necessarily form a globally coherent text.
Text-level discourse parsing
Text-level discourse parsing imposes more constraints on the global coherence than sentence-level discourse parsing.
sentence-level is mentioned in 8 sentences in this paper.
Topics mentioned in this paper:
Zarriess, Sina and Kuhn, Jonas
Abstract
We suggest a generation task that integrates discourse-level referring expression generation and sentence-level surface realization.
Conclusion
We have presented a data-driven approach for investigating generation architectures that address discourse-level reference and sentence-level syntax and word order.
Experiments
BLEU, sentence-level geometric mean of 1- to 4-gram precision, as in (Belz et al., 2011)
Experiments
NIST, sentence-level n-gram overlap weighted in favour of less frequent n-grams, as in (Belz et al., 2011)
Experiments
BLEUT, sentence-level BLEU computed on post-processed output where predicted referring expressions for victim and perp are replaced in the sentences (both gold and predicted) by their original role label; this score does not penalize lexical mismatches between corpus and system REs
Introduction
Generating well-formed linguistic utterances from an abstract nonlinguistic input involves making a multitude of conceptual, discourse-level as well as sentence-level, lexical and syntactic decisions.
Introduction
We integrate a discourse-level approach to REG with sentence-level surface realization in a data-driven framework.
sentence-level is mentioned in 7 sentences in this paper.
Topics mentioned in this paper:
Li, Qi and Ji, Heng and Huang, Liang
Abstract
Our approach advances state-of-the-art sentence-level event extraction, and even outperforms previous argument labeling methods which use external knowledge from other sentences and documents.
Experiments
In addition to our baseline, we compare against the sentence-level system reported in Hong et al. (2011).
Experiments
Remarkably, compared to the cross-entity approach reported in (Hong et al., 2011), which attained 68.3% F1 for triggers and 48.3% for arguments, our approach with global features achieves even better performance on argument labeling although we only used sentence-level information.
Experiments
We also show that it outperforms the sentence-level baseline reported in (Ji and Grishman, 2008; Liao and Grishman, 2010), both of which attained 59.7% F1 for triggers and 36.6% for arguments.
Introduction
Different from the traditional pipeline approach, we present a novel framework for sentence-level event extraction, which predicts triggers and their arguments jointly (Section 3).
Related Work
Ji and Grishman (2008): 59.7 trigger F1, 36.6 argument F1 (sentence-level)
sentence-level is mentioned in 7 sentences in this paper.
Topics mentioned in this paper:
Salloum, Wael and Elfardy, Heba and Alamir-Salloum, Linda and Habash, Nizar and Diab, Mona
Abstract
In this paper we study the use of sentence-level dialect identification in optimizing machine translation system selection when translating mixed dialect input.
Conclusion and Future Work
We presented a sentence-level classification approach for MT system selection for diglossic languages.
Discussion and Error Analysis
In 21% of the error cases, our classifier predicted a better translation than the one considered gold by BLEU, due to BLEU bias, e.g., a severe sentence-level length penalty caused by an extra punctuation mark in a short sentence.
Introduction
In this paper we study the use of sentence-level dialect identification together with various linguistic features in optimizing the selection of outputs of four different MT systems on input text that includes a mix of dialects.
MT System Selection
For baseline system selection, we use the classification decision of Elfardy and Diab (2013)’s sentence-level dialect identification system to decide on the target MT system.
MT System Selection
We run the 5,562 sentences of the classification training data through our four MT systems and produce sentence-level BLEU scores (with length penalty).
Related Work
used features from their token-level system to train a classifier that performs sentence-level dialect ID (Elfardy and Diab, 2013).
sentence-level is mentioned in 7 sentences in this paper.
Topics mentioned in this paper:
Elliott, Desmond and Keller, Frank
Conclusions
In this paper we performed a sentence-level correlation analysis of automatic evaluation measures against expert human judgements for the automatic image description task.
Conclusions
We found that sentence-level unigram BLEU is only weakly correlated with human judgements, even though it has been extensively reported in the literature for this task.
Methodology
(2011) to perform a sentence-level analysis, setting n = 1 and no brevity penalty to get the unigram BLEU measure, or n = 4 with the brevity penalty to get the Smoothed BLEU measure.
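A minimal sketch of the two measure configurations described above, assuming whitespace tokenization; the clipped unigram precision implements BLEU with n = 1 and no brevity penalty, and nltk's smoothing method1 stands in for the paper's exact Smoothed BLEU variant.

```python
from collections import Counter
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction

def unigram_bleu(hyp, ref):
    """BLEU with n = 1 and no brevity penalty: clipped unigram precision."""
    hyp_counts, ref_counts = Counter(hyp), Counter(ref)
    clipped = sum(min(c, ref_counts[w]) for w, c in hyp_counts.items())
    return clipped / max(len(hyp), 1)

hyp = "a dog sleeps on the sofa".split()
ref = "a dog is sleeping on the couch".split()

print(unigram_bleu(hyp, ref))
# Smoothed BLEU: n = 4 with the brevity penalty, smoothing zero n-gram counts.
print(sentence_bleu([ref], hyp, smoothing_function=SmoothingFunction().method1))
```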
Methodology
The sentence-level evaluation measures were calculated for each image-description-reference tuple.
Methodology
The evaluation measure scores were then compared with the human judgements using Spearman's correlation estimated at the sentence-level.
Results
[Figure axis label: Sentence-level automated measure score]
sentence-level is mentioned in 7 sentences in this paper.
Topics mentioned in this paper:
Feng, Vanessa Wei and Hirst, Graeme
Bottom-up tree-building
In particular, starting from the constituents on the bottom level (EDUs for intra-sentential parsing and sentence-level discourse trees for multi-sentential parsing), at each step of the tree-building, we greedily merge a pair of adjacent discourse constituents such that the merged constituent has the highest probability as predicted by our structure model.
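A minimal sketch of this greedy bottom-up tree-building, with a hypothetical merge_prob standing in for the trained structure model. Each pass scans the remaining adjacent pairs, so building a tree over m constituents costs O(m^2), consistent with the complexity note below.

```python
def merge_prob(left, right):
    # Hypothetical stand-in for the structure model: score how likely
    # two adjacent constituents are to form one discourse constituent.
    return 1.0 / (1 + abs(len(str(left)) - len(str(right))))

def build_tree(constituents):
    """Greedily merge the best-scoring adjacent pair until one tree remains."""
    nodes = list(constituents)  # bottom level: EDUs or sentence-level trees
    while len(nodes) > 1:
        i = max(range(len(nodes) - 1),
                key=lambda j: merge_prob(nodes[j], nodes[j + 1]))
        nodes[i:i + 2] = [(nodes[i], nodes[i + 1])]  # merge the chosen pair
    return nodes[0]

print(build_tree(["e1", "e2", "e3", "e4"]))
```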
Linear time complexity
The total time to generate sentence-level discourse trees for n sentences is Σ_{i=1}^{n} O(m_i^2), where m_i is the number of EDUs in sentence i.
Overall work flow
(2013), we perform sentence-level parsing for each sentence first, followed by text-level parsing to generate a full discourse tree for the whole document.
Overall work flow
Each sentence S_i, after being segmented into EDUs (not shown in the figure), goes through an intra-sentential bottom-up tree-building model M_intra to form a sentence-level discourse tree T_{S_i}, with the EDUs as leaf nodes.
Overall work flow
We then combine all sentence-level discourse trees T_{S_i} using our multi-sentential bottom-up tree-building model M_multi to generate the text-level discourse tree T_D.
Related work
First, they decomposed the problem of text-level discourse parsing into two stages: intra-sentential parsing to produce a discourse tree for each sentence, followed by multi-sentential parsing to combine the sentence-level discourse trees and produce the text-level discourse tree.
sentence-level is mentioned in 7 sentences in this paper.
Topics mentioned in this paper:
Cui, Lei and Zhang, Dongdong and Liu, Shujie and Chen, Qiming and Li, Mu and Zhou, Ming and Yang, Muyun
Background: Deep Learning
Inspired by previous successful research, we first learn sentence representations using topic-related monolingual texts in the pre-training phase, and then optimize the bilingual similarity by leveraging sentence-level parallel data in the fine-tuning phase.
Introduction
In this case, people understand the meaning because of the IT topical context which goes beyond sentence-level analysis and requires more relevant knowledge.
Introduction
This underlying topic space is learned from sentence-level parallel data in order to share topic information across the source and target languages as much as possible.
Related Work
our method is that it is applicable to both sentence-level and document-level SMT, since we do not place any restrictions on the input.
Related Work
We directly optimized bilingual topic similarity in the deep learning framework with the help of sentence-level parallel data, so that the learned representation could be easily used in the SMT decoding procedure.
Topic Similarity Model with Neural Network
learn topic representations using sentence-level parallel data.
sentence-level is mentioned in 6 sentences in this paper.
Topics mentioned in this paper:
Iyyer, Mohit and Enns, Peter and Boyd-Graber, Jordan and Resnik, Philip
Experiments
Each of these models has the same task: to predict sentence-level ideology labels for sentences in a test set.
Experiments
Table 1: Sentence-level bias detection accuracy.
Experiments
RNNl initializes all parameters randomly and uses only sentence-level labels for training.
Recursive Neural Networks
They have achieved state-of-the-art performance on a variety of sentence-level NLP tasks, including sentiment analysis, paraphrase detection, and parsing (Socher et al., 2011a; Hermann and Blunsom, 2013).
Related Work
Finally, combining sentence-level and document-level models might improve bias detection at both levels.
sentence-level is mentioned in 6 sentences in this paper.
Topics mentioned in this paper:
Chan, Wen and Zhou, Xiangdong and Wang, Wei and Chua, Tat-Seng
Experimental Results
which just utilize the independent sentence-level features, do not perform very well here, and there is no statistically significant performance difference between them.
Experimental Results
To see how much the different textual and non-textual features contribute to community answer summarization, the accumulated weight of each group of sentence-level features is presented in Figure 2.
The Summarization Framework
In this section, we give a detailed description of the different sentence-level cQA features and the contextual modeling between sentences used in our model for answer summarization.
The Summarization Framework
Sentence-level Features
The Summarization Framework
These sentence-level features can be easily utilized in the CRF framework.
sentence-level is mentioned in 6 sentences in this paper.
Topics mentioned in this paper:
duVerle, David and Prendinger, Helmut
Features
As evidenced by a number of discourse-parsing efforts focusing on intra-sentential parsing (Marcu, 2000; Soricut and Marcu, 2003), there is a strong correlation between different organizational levels of textual units and sub-trees of the RST tree both at the sentence-level and the paragraph level.
Features
While not always present, discourse markers (connectives, cue-words, cue-phrases, etc.) have been shown to give good indications of discourse structure and labeling, particularly at the sentence-level (Marcu, 2000).
Features
A promising concept introduced by Soricut and Marcu (2003) in their sentence-level parser is the identification of ‘dominance sets’ in the syntax parse trees associated to each input sentence.
Introduction
Marcu and Soricut focussed on sentence-level parsing and developed two probabilistic models that use syntactic and lexical information (Soricut and Marcu, 2003).
sentence-level is mentioned in 5 sentences in this paper.
Topics mentioned in this paper:
Riezler, Stefan and Simianer, Patrick and Haas, Carolin
Experiments
Method 4, named REBOL, implements REsponse-Based Online Learning by instantiating y+ and y− to the form described in Section 4: in addition to the model score s, it uses a cost function c based on sentence-level BLEU (Nakov et al., 2012) and tests translation hypotheses for task-based feedback using a binary execution function e.
Experiments
This can be attributed to the use of sentence-level BLEU as cost function in RAMPION and REBOL.
Response-based Online Learning
Computation of distance to the reference translation usually involves cost functions based on sentence-level BLEU (Nakov et al., 2012).
Response-based Online Learning
In addition, we can use translation-specific cost functions based on sentence-level BLEU in order to boost similarity of translations to human reference translations.
Response-based Online Learning
Our cost function c(y^(i), y) = 1 − BLEU(y^(i), y) is based on a version of sentence-level BLEU (Nakov et al., 2012).
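A minimal sketch of such a cost function, using nltk's smoothed sentence-level BLEU as a stand-in for the exact Nakov et al. (2012) variant:

```python
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction

def cost(hyp, ref, smooth=SmoothingFunction().method1):
    """c(y_i, y) = 1 - BLEU(y_i, y), smoothed so short or partially
    matching hypotheses do not collapse to a zero BLEU score."""
    return 1.0 - sentence_bleu([ref], hyp, smoothing_function=smooth)

ref = "the house is small".split()
print(cost("the house is tiny".split(), ref))      # low cost: close to reference
print(cost("a completely new text".split(), ref))  # high cost: far from reference
```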
sentence-level is mentioned in 5 sentences in this paper.
Topics mentioned in this paper:
Andreevskaia, Alina and Bergler, Sabine
Experiments
These results highlight a special property of sentence-level annotation: greater sensitivity to sparseness of the model. On texts, classifier error on one particular sentiment marker is often compensated for by a number of correctly identified other sentiment clues.
Experiments
Since sentences usually contain a much smaller number of sentiment clues than texts, sentence-level annotation more readily yields errors when a single sentiment clue is incorrectly identified or missed by the system.
Experiments
training sets are required to overcome this higher n-gram sparseness in sentence-level annotation.
Factors Affecting System Performance
To our knowledge, the only work that describes the application of statistical classifiers (SVM) to sentence-level sentiment classification is (Gamon and Aue, 2005).
sentence-level is mentioned in 5 sentences in this paper.
Topics mentioned in this paper:
Heilman, Michael and Cahill, Aoife and Madnani, Nitin and Lopez, Melissa and Mulholland, Matthew and Tetreault, Joel
Discussion and Conclusions
This is the most realistic evaluation of methods for predicting sentence-level grammaticality to date.
Introduction
While some applications (e.g., grammar checking) rely on such fine-grained predictions, others might be better addressed by sentence-level grammaticality judgments (e.g., machine translation evaluation).
Introduction
Regarding sentence-level grammaticality, there has been much work on rating the grammaticality
Introduction
With this unique data set, which we will release to the research community, it is now possible to conduct realistic evaluations for predicting sentence-level grammaticality.
sentence-level is mentioned in 4 sentences in this paper.
Topics mentioned in this paper:
Pei, Wenzhe and Ge, Tao and Chang, Baobao
Conventional Neural Network
To model the tag dependency, previous neural network models (Collobert et al., 2011; Zheng et al., 2013) introduce a transition score A_ij for jumping from tag i ∈ T to tag j ∈ T. For an input sentence c[1:n] with a tag sequence t[1:n], a sentence-level score is then given by the sum of transition and network scores:
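A minimal numpy sketch of this sentence-level score, s(c[1:n], t[1:n]) = Σ_i (A[t_{i-1}, t_i] + f[i, t_i]); the random matrices stand in for a trained network, the tag inventory is a hypothetical {B, M, E, S} set of size 4, and the handling of the initial transition is simplified.

```python
import numpy as np

def sentence_score(A, f, tags, start_tag=0):
    """Sentence-level score: the sum over positions i of the transition
    score A[t_{i-1}, t_i] plus the network score f[i, t_i]."""
    score, prev = 0.0, start_tag
    for i, t in enumerate(tags):
        score += A[prev, t] + f[i, t]
        prev = t
    return score

rng = np.random.default_rng(0)
n_tags = 4                             # |T|: hypothetical tag set {B, M, E, S}
A = rng.normal(size=(n_tags, n_tags))  # transition scores A_ij
f = rng.normal(size=(6, n_tags))       # network scores, one row per input token
print(sentence_score(A, f, tags=[0, 1, 2, 3, 0, 3]))
```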
Conventional Neural Network
Given the sentence-level score, Zheng et al.
Conventional Neural Network
(2013), their model is a global one where training and inference are performed at the sentence-level.
Max-Margin Tensor Neural Network
(2013), our model is also trained at the sentence-level and carries out inference globally.
sentence-level is mentioned in 4 sentences in this paper.
Topics mentioned in this paper:
Yogatama, Dani and Smith, Noah A.
Experiments
The task is to predict sentence-level sentiment, so each training example is a sentence.
Experiments
It has been shown that syntactic information is helpful for sentence-level predictions (Socher et al., 2013), so the parse tree regularizer is naturally suitable for this task.
Structured Regularizers for Text
This regularizer captures the idea that phrases might be selected as relevant or (in most cases) irrelevant to a task, and is expected to be especially useful in sentence-level prediction tasks.
Structured Regularizers for Text
In sentence-level prediction tasks, such as sentence-level sentiment analysis, it is known that most constituents (especially those that correspond to shorter phrases) in a parse tree are uninformative (neutral sentiment).
sentence-level is mentioned in 4 sentences in this paper.
Topics mentioned in this paper:
Li, Peifeng and Zhu, Qiaoming and Zhou, Guodong
Abstract
Because Chinese is a paratactic language, sentence-level argument extraction in Chinese suffers greatly from the frequent occurrence of ellipsis with regard to inter-sentence arguments.
Experimentation
However, our model can be an effective complement to sentence-level English argument extraction systems, since the performance of argument extraction in English is still low and using discourse-level information is a way to improve it, especially for those event mentions whose arguments are spread across complex sentences.
Related Work
In addition, only very few of them focus on Chinese argument extraction, and almost all aim at feature engineering based on sentence-level information, recasting this task as an SRL-style task.
Related Work
Liao and Grishman (2010) mainly focus on employing the cross-event consistency information to improve sentence-level trigger extraction and they also propose an inference method to infer the arguments following role consistency in a document.
sentence-level is mentioned in 4 sentences in this paper.
Topics mentioned in this paper:
Shindo, Hiroyuki and Miyao, Yusuke and Fujino, Akinori and Nagata, Masaaki
Conclusion
We proposed a novel backoff modeling of an SR-TSG based on the hierarchical Pitman-Yor Process and sentence-level and tree-level blocked MCMC sampling for training our model.
Inference
In each splitting step, we use two types of blocked MCMC algorithm: the sentence-level blocked Metropolis-Hastings (MH) sampler and the tree-level blocked Gibbs sampler, while (Petrov et al., 2006) use a different MLE-based model and the EM algorithm.
Inference
Our sampler iterates sentence-level sampling and tree-level sampling alternately.
Inference
The sentence-level MH sampler is a recently proposed algorithm for grammar induction (Johnson et al., 2007b; Cohn et al., 2010).
sentence-level is mentioned in 4 sentences in this paper.
Topics mentioned in this paper:
Xiao, Tong and Zhu, Jingbo and Zhu, Muhua and Wang, Huizhen
Background
where BLEU(e_ij, r_i) is the smoothed sentence-level BLEU score (Liang et al., 2006) of the translation e_ij with respect to the reference translations r_i, and e*_i is the oracle translation, which is selected from {e_i1, ..., e_in} in terms of BLEU(e_ij, r_i).
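A minimal sketch of this oracle selection over an n-best list, again using nltk's smoothed sentence-level BLEU as a stand-in for the exact smoothing of Liang et al. (2006):

```python
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction

def oracle(nbest, refs, smooth=SmoothingFunction().method1):
    """Pick e*_i: the candidate with the highest smoothed sentence-level
    BLEU against the reference translations r_i."""
    return max(nbest,
               key=lambda e: sentence_bleu(refs, e, smoothing_function=smooth))

refs = ["the cat sat on the mat".split()]
nbest = ["a cat sat on a mat".split(),
         "the cat sat on the mat".split(),
         "cats sit on mats".split()]
print(" ".join(oracle(nbest, refs)))  # the exact match wins
```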
Background
In this work, a sentence-level combination method is used to select the best translation from the pool of the n-best outputs of all the member systems.
Background
In this work, we use a sentence-level system combination method to generate final translations.
Introduction
sentence-level combination (Hildebrand and Vogel, 2008) simply selects one from original translations, while some more sophisticated methods, such as word-level and phrase-level combination (Matusov et al., 2006; Rosti et al., 2007), can generate new translations differing from any of the original translations.
sentence-level is mentioned in 4 sentences in this paper.
Topics mentioned in this paper:
Liu, Shujie and Li, Chi-Ho and Zhou, Ming
The DPDI Framework
Rather than recruiting annotators for marking span pairs, we modify the parsing algorithm in Section 3 so as to produce span pair annotation out of sentence-level annotation.
The DPDI Framework
In the base step, only the word pairs listed in sentence-level annotation are inserted in the hypergraph, and the re-cursive steps are just the same as usual.
The DPDI Framework
If the sentence-level annotation satisfies the alignment constraints of ITG, then each F-span will have only one E-span in the parse tree.
sentence-level is mentioned in 4 sentences in this paper.
Topics mentioned in this paper:
Wu, Hua and Wang, Haifeng
Introduction
And this translation quality is higher than that of the translations produced by the system trained with a real Chinese-Spanish corpus; (3) Our sentence-level translation selection method consistently and significantly improves the translation quality over individual translation outputs in all of our experiments.
Translation Selection
We regard sentence-level translation selection as a machine translation (MT) evaluation problem and formalize this problem with a regression learning model.
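A minimal sketch of this regression formulation, with hypothetical features and scikit-learn's LinearRegression standing in for the paper's regression learner; the targets play the role of the smoothed sentence-level BLEU scores mentioned below.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical training data: one feature vector per system translation
# (e.g., language-model score, length ratio) paired with its smoothed
# sentence-level BLEU, which stands in for a human assessment.
X_train = np.array([[0.2, 1.0], [0.8, 0.9], [0.5, 1.2], [0.9, 1.0]])
y_train = np.array([0.15, 0.62, 0.33, 0.71])

model = LinearRegression().fit(X_train, y_train)

# At test time, score each system's output for one source sentence and
# select the translation whose predicted quality is highest.
X_test = np.array([[0.7, 1.1], [0.4, 0.8], [0.85, 1.0]])
best_system = int(np.argmax(model.predict(X_test)))
print(f"selected system: {best_system}")
```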
Translation Selection
We use the smoothed sentence-level BLEU score to replace the human assessments, applying additive smoothing to avoid zero BLEU scores when we calculate the n-gram precisions.
Translation Selection
can easily retrain the learner under different conditions, therefore enabling our method to be applied to sentence-level translation selection from any set of translation systems without any additional human work.
sentence-level is mentioned in 4 sentences in this paper.
Topics mentioned in this paper:
Pado, Sebastian and Galley, Michel and Jurafsky, Dan and Manning, Christopher D.
EXpt. 1: Predicting Absolute Scores
We first concentrate on the upper half (sentence-level results).
EXpt. 1: Predicting Absolute Scores
This result supports the conclusions we have drawn from the sentence-level analysis.
Experimental Evaluation
System-level predictions are computed in both experiments from sentence-level predictions, as the ratio of sentences for which each system provided the best translation (Callison-Burch et al., 2008).
Experimental Evaluation
BLEUR includes the following 18 sentence-level scores: BLEU-n and n-gram precision scores (1 ≤ n ≤ 4); BLEU brevity penalty (BP); BLEU score divided by BP.
sentence-level is mentioned in 4 sentences in this paper.
Topics mentioned in this paper:
He, Xiaodong and Deng, Li
Abstract
In the updating formula, we need to compute the sentence-level BLEU(E_n, E_s).
Abstract
a non-clipped BP, BP = e^(1 - r/c), for sentence-level BLEU.
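A minimal sketch of the clipped and non-clipped brevity penalty, assuming the standard BLEU definition with candidate length c and reference length r:

```python
import math

def brevity_penalty(c, r, clipped=True):
    """BLEU brevity penalty for candidate length c and reference length r.
    The standard form is capped at 1; the non-clipped form e^(1 - r/c)
    can exceed 1 when the candidate is longer than the reference."""
    bp = math.exp(1.0 - r / c)
    return min(1.0, bp) if clipped else bp

print(brevity_penalty(8, 10))                  # short candidate: BP < 1
print(brevity_penalty(12, 10))                 # capped at 1.0
print(brevity_penalty(12, 10, clipped=False))  # non-clipped: BP > 1
```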
Abstract
sentence-level BLEU (Exp.
sentence-level is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Lu, Bin and Tan, Chenhao and Cardie, Claire and K. Tsou, Benjamin
A Joint Model with Unlabeled Parallel Text
In this study, we focus on sentence-level sentiment classification, i.e.
Introduction
Not surprisingly, most methods for sentiment classification are supervised learning techniques, which require training data annotated with the appropriate sentiment labels (e.g., document-level or sentence-level positive vs. negative polarity).
Introduction
Although our approach should be applicable at the document-level and for additional sentiment tasks, we focus on sentence-level polarity classification in this work.
sentence-level is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Lo, Chi-kiu and Wu, Dekai
Abstract
Table 3: Sentence-level correlation with human adequacy judgments, across the evaluation metrics.
Abstract
Table 5: Sentence-level correlation with human adequacy judgments, for monolinguals vs. bilinguals.
Abstract
Table 8: Sentence-level correlation with human adequacy judgments.
sentence-level is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Chen, Harr and Benson, Edward and Naseem, Tahira and Barzilay, Regina
Experimental Setup
For these reasons, we evaluate on both sentence-level and token-level precision, recall, and F-score.
Experimental Setup
Note that sentence-level scores are always at least as high as token-level scores, since it is possible to select a sentence correctly but none of its true relation tokens while the opposite is not possible.
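A minimal sketch illustrating why sentence-level scores bound token-level scores from above, using hypothetical extractions keyed by (sentence_id, token_id):

```python
def precision_recall(pred, gold):
    """Precision and recall of a predicted set against a gold set."""
    tp = len(pred & gold)
    p = tp / len(pred) if pred else 0.0
    r = tp / len(gold) if gold else 0.0
    return p, r

# Hypothetical extractions: a sentence counts as correct if it truly
# bears the relation, even when the predicted token inside it is wrong.
gold_tokens = {(0, 3), (0, 4), (2, 1)}
pred_tokens = {(0, 3), (1, 5), (2, 2)}  # sentence 2 is right, its token is not

gold_sents = {s for s, _ in gold_tokens}
pred_sents = {s for s, _ in pred_tokens}

print(precision_recall(pred_tokens, gold_tokens))  # token-level: (0.33, 0.33)
print(precision_recall(pred_sents, gold_sents))    # sentence-level: (0.67, 1.0)
```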
Results
In light of our strong sentence-level performance, this suggests a possible human-assisted application: use our model to identify promising relation-bearing sentences in a new domain, then have a human annotate those sentences for use by a supervised approach to achieve optimal token-level extraction.
sentence-level is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Lo, Chi-kiu and Beloucif, Meriem and Saers, Markus and Wu, Dekai
Related Work
(2004) introduced a sentence-level QE system where an arbitrary threshold is used to classify the MT output as good or bad.
Related Work
To address this problem, Quirk (2004) related the sentence-level correctness of the QE model to human judgment and achieved a high correlation with human judgment for a small annotated corpus; however, the proposed model does not scale well to larger data sets.
Results
Table 1: Sentence-level correlation with HAJ
sentence-level is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Reiter, Nils and Frank, Anette
Introduction
In the following, we will structure this feature space along two dimensions, distinguishing NP- and sentence-level factors as well as syntactic and semantic (including lexical semantic) factors.
Introduction
Sentence-level features are extracted from the clause (in which the NP appears), as well as sentential and non-sentential adjuncts of the clause.
Introduction
Using syntactic features on the NP- or sentence-level only, however, leads to a drop in precision as well as recall.
sentence-level is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Parton, Kristen and McKeown, Kathleen R. and Coyne, Bob and Diab, Mona T. and Grishman, Ralph and Hakkani-Tür, Dilek and Harper, Mary and Ji, Heng and Ma, Wei Yun and Meyers, Adam and Stolbach, Sara and Sun, Ang and Tur, Gokhan and Xu, Wei and Yaman, Sibel
Abstract
In this paper, we present an error analysis of a new cross-lingual task: the 5W task, a sentence-level understanding task which seeks to return the English 5W's (Who, What, When, Where and Why) corresponding to a Chinese sentence.
Abstract
The best cross-lingual 5W system was still 19% worse than the best monolingual 5W system, which shows that MT significantly degrades sentence-level understanding.
Conclusions
The best cross-lingual 5W system was still 19% worse than the best monolingual 5W system, which shows that MT significantly degrades sentence-level understanding.
sentence-level is mentioned in 3 sentences in this paper.
Topics mentioned in this paper: