Index of papers in Proc. ACL that mention
  • word-level
Liu, Chang and Ng, Hwee Tou
Abstract
For languages such as Chinese where words usually have meaningful internal structure and word boundaries are often fuzzy, TESLA-CELAB acknowledges the advantage of character-level evaluation over word-level evaluation.
Experiments
Although word-level BLEU has often been found inferior to the new-generation metrics when the target language is English or other European languages, prior research has shown that character-level BLEU is highly competitive when the target language is Chinese (Li et al., 2011).
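As a rough illustration of the character-level idea mentioned above, the sketch below computes BLEU over character tokens rather than word tokens using NLTK; the helper name char_bleu and the smoothing choice are assumptions made for this example, not the setup used by Li et al. (2011).

```python
# Minimal sketch: character-level BLEU for Chinese output, assuming NLTK is installed.
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction

def char_bleu(reference: str, hypothesis: str) -> float:
    """Compute BLEU over character tokens instead of word tokens."""
    ref_chars = list(reference.replace(" ", ""))   # drop any word segmentation
    hyp_chars = list(hypothesis.replace(" ", ""))
    smooth = SmoothingFunction().method1           # avoid zero scores on short strings
    return sentence_bleu([ref_chars], hyp_chars, smoothing_function=smooth)

print(char_bleu("今天 天气 很 好", "今天 天气 不错"))
```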
Experiments
[Table fragment: word-level vs. character-level results]
Motivation
In this work, we attempt to address both of these issues by introducing TESLA-CELAB, a character-level metric that also models word-level linguistic phenomena.
word-level is mentioned in 15 sentences in this paper.
Topics mentioned in this paper:
Zhang, Meishan and Zhang, Yue and Che, Wanxiang and Liu, Ting
Abstract
Character-level information can benefit downstream applications by offering flexible granularities for word segmentation while improving word-level dependency parsing accuracies.
Character-Level Dependency Tree
Inner-word dependencies can also bring benefits to parsing word-level dependencies.
Character-Level Dependency Tree
When the internal structures of words are annotated, character-level dependency parsing can be treated as a special case of word-level dependency parsing, with “words” being “characters”.
Character-Level Dependency Tree
The word-level dependency parsing features are added when the inter-word actions are applied, and the features for joint word segmentation and POS-tagging are added when the actions PW, SHW and SHC are applied.
Introduction
Moreover, manually annotated intra-word dependencies can give better word-level dependency accuracies than pseudo intra-word dependencies.
word-level is mentioned in 16 sentences in this paper.
Topics mentioned in this paper:
Shen, Mo and Liu, Hongxiao and Kawahara, Daisuke and Kurohashi, Sadao
Abstract
We propose a method that performs character-level POS tagging jointly with word segmentation and word-level POS tagging.
Character-level POS Tagset
Some of these tags are directly derived from the commonly accepted word-level part-of-speech, such as noun, verb, adjective and adverb.
Chinese Morphological Analysis with Character-level POS
This hybrid model constructs a lattice that consists of word-level and character-level nodes from a given input sentence.
Chinese Morphological Analysis with Character-level POS
Word-level nodes correspond to words found in the system’s lexicon, which has been compiled from training data.
Chinese Morphological Analysis with Character-level POS
The upper part of the lattice (word-level nodes) represents known words, where each node carries information such as character form, character-level POS, and word-level POS.
Introduction
Table 1. Character-level POS sequence as a more specified version of word-level POS: an example of verb.
Introduction
Another advantage of character-level POS is that the sequence of character-level POS in a word can be seen as a more fine-grained version of word-level POS.
Introduction
The five words in this table are very likely to be tagged with the same word-level POS as verb in any available annotated corpora, while it can be commonly agreed among native speakers of Chinese that the syntactic behaviors of these words are different from each other, due to their distinctions in word constructions.
word-level is mentioned in 17 sentences in this paper.
Topics mentioned in this paper:
Kang, Jun Seok and Feng, Song and Akoglu, Leman and Choi, Yejin
Evaluation II: Human Evaluation on ConnotationWordNet
We collect two separate sets of labels: a set of labels at the word-level, and another set at the sense-level.
Evaluation II: Human Evaluation on ConnotationWordNet
For word-level labels we apply a similar procedure as above.
Evaluation II: Human Evaluation on ConnotationWordNet
Lexicon              Word-level   Sense-level
SentiWordNet         27.22        14.29
OpinionFinder        31.95        -
Feng2013             62.72        -
GWORD+SENSE(95%)     84.91        83.43
GWORD+SENSE(99%)     84.91        83.71
E-GWORD+SENSE(95%)   86.98        86.29
E-GWORD+SENSE(99%)   86.69        85.71
Introduction
For non-polysemous words, which constitute a significant portion of English vocabulary, learning the general connotation at the word-level (rather than at the sense-level) would be a natural operational choice.
Introduction
As a result, researchers often would need to aggregate labels across different senses to derive the word-level label.
Introduction
Therefore, in this work, we present the first unified approach that learns both sense- and word-level connotations simultaneously.
Pairwise Markov Random Fields and Loopy Belief Propagation
We formulate the task of learning sense- and word-level connotation lexicon as a graph-based classification task (Sen et al., 2008).
word-level is mentioned in 15 sentences in this paper.
Topics mentioned in this paper:
Radziszewski, Adam
Evaluation
Degorski (2011) uses concatenation of word-level base forms assigned by the tagger as a baseline.
Introduction
According to the lemmatisation principles accompanying the NCP tagset, adjectives are lemmatised as masculine forms (główny), hence it is not sufficient to take either the word-level lemma or the orthographic form to obtain the phrase lemmatisation.
Introduction
It is worth stressing that even the task of word-level lemmatisation is nontrivial for inflectional languages due to a large number of inflected forms and even larger number of syncretisms.
Phrase lemmatisation as a tagging problem
To show the real setting, this time we give full NCP tags and word-level lemmas assigned as a result of tagging.
Phrase lemmatisation as a tagging problem
The notation cas=nom means that to obtain the desired form (e.g. główne) you need to find an entry in a morphological dictionary that bears the same word-level lemma as the inflected form (główny) and a tag that results from taking the tag of the inflected form (adj:sg:inst:n:pos) and setting the value of the tagset attribute cas (grammatical case) to the value nom (nominative).
Phrase lemmatisation as a tagging problem
Our idea is simple: by expressing phrase lemmatisation in terms of word-level transformations we can reduce the task to a tagging problem and apply well-known Machine Learning techniques that have been devised for solving such problems (e.g. CRF).
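To make the transformation idea concrete, here is a toy sketch under assumed conventions: a tiny in-memory dictionary of (orthographic form, word-level lemma, tag) entries and an assumed ordering of tag attributes; the data and helper names are hypothetical, not the author's implementation.

```python
# Toy sketch of applying a word-level transformation such as "cas=nom".
# The dictionary entries and the TAG_ATTRS ordering are assumptions for
# this illustration only.

DICTIONARY = [
    ("główne",  "główny", "adj:sg:nom:n:pos"),
    ("głównym", "główny", "adj:sg:inst:n:pos"),
]

TAG_ATTRS = ["pos", "num", "cas", "gnd", "deg"]   # assumed attribute order

def apply_transformation(form: str, lemma: str, tag: str, transform: str) -> str:
    """Return the dictionary form with the same lemma and the modified tag."""
    attr, value = transform.split("=")            # e.g. "cas=nom"
    parts = tag.split(":")
    parts[TAG_ATTRS.index(attr)] = value          # set the case to nominative
    target_tag = ":".join(parts)
    for orth, lem, t in DICTIONARY:
        if lem == lemma and t == target_tag:
            return orth
    return form                                   # fall back to the inflected form

# "głównym" (adj:sg:inst:n:pos) + cas=nom -> "główne"
print(apply_transformation("głównym", "główny", "adj:sg:inst:n:pos", "cas=nom"))
```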
Preparation of training data
The development set was enhanced with word-level transformations that were induced automatically in the following manner.
Preparation of training data
The dictionary is stored as a set of (orthographic form, word-level lemma, tag) triples.
Preparation of training data
The task is to find a suitable transformation for the given inflected form from the original phrase, its tag and word-level lemma, but also given the desired form being part of human-assigned lemma.
word-level is mentioned in 12 sentences in this paper.
Topics mentioned in this paper:
Melamud, Oren and Berant, Jonathan and Dagan, Ido and Goldberger, Jacob and Szpektor, Idan
Abstract
We propose a novel two-level model, which computes similarities between word-level vectors that are biased by topic-level context representations.
Abstract
Evaluations on a naturally-distributed dataset show that our model significantly outperforms prior word-level and topic-level models.
Background and Model Setting
However, while DIRT computes sim(v, v') over vectors in the original word-level space, topic-level models compute sim(d, d', w) by measuring similarity of vectors in a reduced-dimensionality latent space.
Background and Model Setting
slots in the original word-level space while biasing the similarity measure through topic-level context models.
Introduction
To address this hypothesized caveat of prior context-sensitive rule scoring methods, we propose a novel generic scheme that integrates word-level and topic-level representations.
Introduction
Rather than computing a single context-insensitive rule score, we compute a distinct word-level similarity score for each topic in an LDA model.
Results
Specifically, topics are leveraged for high-level domain disambiguation, while fine grained word-level distributional similarity is computed for each rule under each such domain.
Results
This result more explicitly shows the advantages of integrating word-level and context-sensitive topic-level similarities for differentiating valid and invalid contexts for rule applications.
Two-level Context-sensitive Inference
Thus, our model computes similarity over word-level (rather than topic-level) argument vectors, while biasing it according to the specific argument words in the given rule application context.
Two-level Context-sensitive Inference
The core of our contribution is thus defining the context-sensitive word-level vector similarity measure sim(v, v', w), as described in the remainder of this section.
Two-level Context-sensitive Inference
This way, rather than replacing altogether the word-level values v(w) by the topic probabilities p(t|dv, w), as done in the topic-level models, we use the latter to only bias the former while preserving fine-grained word-level representations.
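One plausible reading of this biasing step is sketched below: the word-level vector values v(w) are kept and merely reweighted by per-topic probabilities p(t | d_v, w) before a similarity score is computed for that topic. The exact weighting and similarity function in the paper may differ, and all names here are illustrative.

```python
import math

def biased_similarity(v: dict, v_prime: dict, topic_probs: dict, topic: int) -> float:
    """Cosine similarity of word-level vectors reweighted towards one LDA topic.

    v, v_prime: dicts mapping context words w to word-level values v(w).
    topic_probs: dict mapping (w, topic) to a topic probability for w; for
    simplicity the same map biases both vectors here.
    """
    def bias(vec):
        return {w: val * topic_probs.get((w, topic), 0.0) for w, val in vec.items()}

    bv, bw = bias(v), bias(v_prime)
    dot = sum(bv[w] * bw.get(w, 0.0) for w in bv)
    norm = math.sqrt(sum(x * x for x in bv.values())) * \
           math.sqrt(sum(x * x for x in bw.values()))
    return dot / norm if norm else 0.0

# A distinct word-level score per topic, as described above:
# scores = [biased_similarity(v, v_prime, topic_probs, t) for t in range(num_topics)]
```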
word-level is mentioned in 15 sentences in this paper.
Topics mentioned in this paper:
Kruengkrai, Canasai and Uchimoto, Kiyotaka and Kazama, Jun'ichi and Wang, Yiou and Torisawa, Kentaro and Isahara, Hitoshi
Background
In the hybrid model, given an input sentence, a lattice that consists of word-level and character-level nodes is constructed.
Background
Word-level nodes, which correspond to
Background
In other words, we use word-level nodes to identify known words and character-level nodes to identify unknown words.
Experiments
Since we were interested in finding an optimal combination of word-level and character-level nodes for training, we focused on tuning 7“.
Policies for correct path selection
Ideally, we need to build a word-character hybrid model that effectively learns the characteristics of unknown words (with character-level nodes) as well as those of known words (with word-level nodes).
Policies for correct path selection
If we select the correct path yt that corresponds to the annotated sentence, it will only consist of word-level nodes that do not allow learning for unknown words.
Policies for correct path selection
We therefore need to choose character-level nodes as correct nodes instead of word-level nodes for some words.
Training method
[Feature template fragment: W0 <w0> and W1 <p0> for word-level nodes]
Training method
Templates W0-W3 are basic word-level unigram features, where Length(w0) denotes the length of the word w0.
Training method
Templates B0-B9 are basic word-level bigram features.
word-level is mentioned in 12 sentences in this paper.
Topics mentioned in this paper:
Snyder, Benjamin and Naseem, Tahira and Barzilay, Regina
Introduction
Word-level alignments are then drawn based on the tree alignment.
Introduction
Finally, parallel sentences are assembled from these generated part-of-speech sequences and word-level alignments.
Introduction
The model is trained using bilingual data with automatically induced word-level alignments, but is tested on purely monolingual data for each language.
Model
word-level alignments, as observed data.
Model
We obtain these word-level alignments using GIZA++ (Och and Ney, 2003).
Model
Finally, word-level alignments are drawn based on the structure of the alignment tree.
Related Work
Assuming that trees induced over parallel sentences have to exhibit certain structural regularities, Kuhn manually specifies a set of rules for determining when parsing decisions in the two languages are inconsistent with GIZA++ word-level alignments.
word-level is mentioned in 14 sentences in this paper.
Topics mentioned in this paper:
Zhang, Hao and Quirk, Chris and Moore, Robert C. and Gildea, Daniel
Introduction
Most state-of-the-art statistical machine translation systems are based on large phrase tables extracted from parallel text using word-level alignments.
Introduction
These word-level alignments are most often obtained using Expectation Maximization on the conditional generative models of Brown et al.
Introduction
As these word-level alignment models restrict the word alignment complexity by requiring each target word to align to zero or one source words, results are improved by aligning both source-to-target as well as target-to-source,
Phrasal Inversion Transduction Grammar
Combining the two approaches, we have a staged training procedure going from the simplest unconstrained word based model to a constrained Bayesian word-level ITG model, and finally proceeding to a constrained Bayesian phrasal model.
word-level is mentioned in 8 sentences in this paper.
Topics mentioned in this paper:
Echizen-ya, Hiroshi and Araki, Kenji
Automatic Evaluation Method using Noun-Phrase Chunking
Secondly, the system calculates word-level scores based on the correct matched words using the determined correspondences of noun phrases.
Automatic Evaluation Method using Noun-Phrase Chunking
The system calculates the final scores combining word-level scores and phrase-level scores.
Automatic Evaluation Method using Noun-Phrase Chunking
2.2 Word-level Score
word-level is mentioned in 8 sentences in this paper.
Topics mentioned in this paper:
Escalante, Hugo Jair and Solorio, Thamar and Montes-y-Gomez, Manuel
Experiments and Results
One should note that, in general, better performance was obtained when using character-level rather than word-level information.
Experiments and Results
This confirms the results already reported by other researchers that have used character-level and word-level information for AA (Houvardas and Stamatatos, 2006;
Experiments and Results
Also, n-gram information is more dense in documents than word-level information.
Introduction
Also, we empirically show that local histograms at the character-level are more helpful than local histograms at the word-level for AA.
Related Work
Some researchers have gone a step further and have attempted to capture sequential information by using n-grams at the word-level (Peng et al., 2004) or by discovering maximal frequent word sequences (Coyotl-Morales et al., 2006).
word-level is mentioned in 7 sentences in this paper.
Topics mentioned in this paper:
Li, Mu and Duan, Nan and Zhang, Dongdong and Li, Chi-Ho and Zhou, Ming
Collaborative Decoding
• Word-level system combination (Rosti et al., 2007) of member decoders’ n-best outputs
Discussion
Word-level system combination (system combination hereafter) (Rosti et al., 2007; He et al., 2008) has been proven to be an effective way to improve machine translation quality by using outputs from multiple systems.
Experiments
We also implemented the word-level system combination (Rosti et al., 2007) and the hypothesis selection method (Hildebrand and Vogel, 2008).
Experiments
Word-level Comb   40.45/40.85   29.52/30.35
Hypo Selection    40.09/40.50   29.02/29.71
Introduction
Most of the work focused on seeking better word alignment for consensus-based confusion network decoding (Matusov et al., 2006) or word-level system combination (He et al., 2008; Ayan et al., 2008).
Introduction
We also conduct extensive investigations when different settings of co-decoding are applied, and make comparisons with related methods such as word-level system combination or hypothesis selection from multiple n-best lists.
word-level is mentioned in 6 sentences in this paper.
Topics mentioned in this paper:
Trogkanis, Nikolaos and Elkan, Charles
Experimental design
We report both word-level and letter-level error rates.
Experimental design
The word-level error rate is the fraction of words on which a method makes at least one mistake.
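Read literally, that definition can be computed as in the short sketch below; representing hyphenation decisions as sets of break positions per word is an assumption made for this illustration, not the paper's data format.

```python
def word_level_error_rate(predicted, gold):
    """Fraction of words on which the method makes at least one mistake."""
    wrong = sum(1 for p, g in zip(predicted, gold) if p != g)
    return wrong / len(gold)

# Hyphenation decisions encoded as sets of break positions per word
# ("hy-phen-ation" -> {2, 7}); the third word below is mis-hyphenated.
predicted = [{2, 7}, {3}, set()]
gold      = [{2, 7}, {3}, {4}]
print(word_level_error_rate(predicted, gold))   # 0.333...
```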
Experimental design
Specifically, for English our word-level accuracy (“ower”) is 96.33% while their best (“WA”) is 95.65%.
Experimental results
For both languages, PATGEN has higher serious letter-level and word-level error rates than TeX using the existing pattern files.
History of automated hyphenation
The accuracy we achieve is slightly higher: word-level accuracy of 96.33% compared to their
word-level is mentioned in 5 sentences in this paper.
Topics mentioned in this paper:
Farra, Noura and Tomeh, Nadi and Rozovskaya, Alla and Habash, Nizar
Abstract
While operating at the character level, the model makes use of word-level and contextual information.
Conclusions
In the future, we plan to extend the model to use word-level language models to select between top character predictions in the output.
Experiments
The word error rate (WER) metric is computed by summing the total number of word-level substitution errors, insertion errors, and deletion errors in the output, and dividing by the number of words in the reference.
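A generic edit-distance implementation of that WER definition is sketched below; it is a standard dynamic-programming formulation, not the authors' evaluation script.

```python
def word_error_rate(reference: list, hypothesis: list) -> float:
    """WER = (substitutions + insertions + deletions) / reference length."""
    r, h = reference, hypothesis
    # dp[i][j]: minimum edits needed to turn h[:j] into r[:i]
    dp = [[0] * (len(h) + 1) for _ in range(len(r) + 1)]
    for i in range(len(r) + 1):
        dp[i][0] = i                                  # i deletions
    for j in range(len(h) + 1):
        dp[0][j] = j                                  # j insertions
    for i in range(1, len(r) + 1):
        for j in range(1, len(h) + 1):
            sub = dp[i - 1][j - 1] + (r[i - 1] != h[j - 1])
            dp[i][j] = min(sub, dp[i - 1][j] + 1, dp[i][j - 1] + 1)
    return dp[len(r)][len(h)] / max(len(r), 1)

# One inserted word against a 3-word reference gives WER = 1/3.
print(word_error_rate("the cat sat".split(), "the cat sat down".split()))
```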
Related Work
Discriminative models have been proposed at the word-level for error correction (Duan et al., 2012) and for error detection (Habash and Roth, 2011).
The GSEC Approach
We implemented another approach for error correction based on a word-level maximum likelihood model.
word-level is mentioned in 5 sentences in this paper.
Topics mentioned in this paper:
Wu, Xianchao and Sudoh, Katsuhito and Duh, Kevin and Tsukada, Hajime and Nagata, Masaaki
Gaining Dependency Structures
Arrows in red (upper): PASs; orange (bottom): word-level dependencies generated from PASs; blue: newly appended dependencies.
Gaining Dependency Structures
In order to generate word-level dependency trees from the PCFG tree, we use the LTH constituent-to-dependency conversion tool written by Johansson and Nugues (2007).
Gaining Dependency Structures
Table 1 lists the mapping from HPSG’s PAS types to word-level dependency arcs.
word-level is mentioned in 5 sentences in this paper.
Topics mentioned in this paper:
Elsner, Micha and Goldwater, Sharon and Eisenstein, Jacob
Conclusion
It is the first model of lexical-phonetic acquisition to include word-level context and to be tested on an infant-directed corpus with realistic phonetic variability.
Conclusion
Whether trained using gold standard or automatically induced word boundaries, the model recovers lexical items more effectively than a system that assumes no phonetic variability; moreover, the use of word-level context is key to the model’s success.
Introduction
Previous models with similar goals have learned from an artificial corpus with a small vocabulary (Driesen et al., 2009; Rasanen, 2011) or have modeled variability only in vowels (Feldman et al., 2009); to our knowledge, this paper is the first to use a naturalistic infant-directed corpus while modeling variability in all segments, and to incorporate word-level context (a bigram language model).
Related work
In contrast, our model uses a symbolic representation for sounds, but models variability in all segment types and incorporates a bigram word-level language model.
Related work
Here, we use a naturalistic corpus, demonstrating that lexical-phonetic learning is possible in this more general setting and that word-level context information is important for doing so.
word-level is mentioned in 5 sentences in this paper.
Topics mentioned in this paper:
Chen, Ruey-Cheng
Evaluation
Segmentation performance is measured using word-level precision (P), recall (R), and F-measure (F).
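A common way to compute these word-level segmentation scores is to compare predicted and gold word spans, as in the sketch below; this is a generic scorer under that assumption, not necessarily the exact evaluation code used in the paper.

```python
# Word-level P/R/F for segmentation, scoring predicted word spans against gold spans.
def to_spans(words):
    spans, start = set(), 0
    for w in words:
        spans.add((start, start + len(w)))
        start += len(w)
    return spans

def segmentation_prf(pred_words, gold_words):
    pred, gold = to_spans(pred_words), to_spans(gold_words)
    correct = len(pred & gold)
    p = correct / len(pred) if pred else 0.0
    r = correct / len(gold) if gold else 0.0
    f = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f

# One of three predicted words matches a gold word: P=0.33, R=0.50, F=0.40.
print(segmentation_prf(["the", "rest", "room"], ["the", "restroom"]))
```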
Evaluation
The best performance result achieved by G2 in our experiment is 81.7 in word-level F-measure, although this was obtained from search setting (c), using a heuristic p value 0.37.
Evaluation
It would be interesting to confirm this by studying the correlation between description length and word-level F-measure.
word-level is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Hatori, Jun and Matsuzaki, Takuya and Miyao, Yusuke and Tsujii, Jun'ichi
Introduction
Furthermore, the word-level information is often augmented with the POS tags, which, along with segmentation, form the basic foundation of statistical NLP.
Model
We use standard measures of word-level precision, recall, and F1 score, for evaluating each task.
Related Works
To incorporate the word-level features into the character-based decoder, the features are decomposed into substring-level features, which are effective for incomplete words to have comparable scores to complete words in the beam.
word-level is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Hu, Yuening and Zhai, Ke and Eidelman, Vladimir and Boyd-Graber, Jordan
Polylingual Tree-based Topic Models
In this section, we bring existing tree-based topic models (Boyd-Graber et al., 2007, tLDA) and polylingual topic models (Mimno et al., 2009, pLDA) together and create the polylingual tree-based topic model (ptLDA) that incorporates both word-level correlations and document-level alignment information.
Polylingual Tree-based Topic Models
Word-level Correlations Tree-based topic models incorporate the correlations between words by
Polylingual Tree-based Topic Models
Build Prior Tree Structures One remaining question is the source of the word-level connections across languages for the tree prior.
word-level is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Xiong, Deyi and Zhang, Min and Li, Haizhou
Introduction
In Section 2, we review the previous work on word-level confidence estimation which is used for error detection.
Related Work
Ueffing and Ney (2007) exhaustively explore various word-level confidence measures to label each word in a generated translation hypothesis as correct or incorrect.
Related Work
(2009) study several confidence features based on mutual information between words and n-gram and backward n-gram language model for word-level and sentence-level CE.
word-level is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Lei, Tao and Xin, Yu and Zhang, Yuan and Barzilay, Regina and Jaakkola, Tommi
Related Work
Nevertheless, any such word-level representation can be used to offset inherent sparsity problems associated with full lexicalization (Cirik and Sensoy, 2013).
Related Work
Word-level vector space embeddings have so far had limited impact on parsing performance.
Related Work
While this method learns to map word combinations into vectors, it builds on existing word-level vector representations.
word-level is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Salameh, Mohammad and Cherry, Colin and Kondrak, Grzegorz
Methods
In this section, we discuss how a lattice from a multi-stack phrase-based decoder such as Moses (Koehn et al., 2007) can be desegmented to enable word-level features.
Methods
We now have a desegmented lattice, but it has not been annotated with an unsegmented (word-level) language model.
Methods
Indeed, the expanded word-level context is one of the main benefits of incorporating a word-level LM.
word-level is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Sun, Xu and Gao, Jianfeng and Micol, Daniel and Quirk, Chris
A Phrase-Based Error Model
Furthermore, the word-level alignments between Q and C can most often be identified with little ambiguity.
A Phrase-Based Error Model
Thus we restrict our attention to those phrase transformations consistent with a good word-level alignment.
Related Work
(2006) extend the error model by capturing word-level similarities learned from query logs.
word-level is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Ganchev, Kuzman and Gillenwater, Jennifer and Taskar, Ben
Abstract
We consider generative and discriminative models for dependency grammar induction that use word-level alignments and a source language parser (English) to constrain the space of possible target trees.
Approach
A parallel corpus is word-level aligned using an alignment toolkit (Graca et al., 2009) and the source (English) is parsed using a dependency parser (McDonald et al., 2005).
Introduction
For example, several early works (Yarowsky and Ngai, 2001; Yarowsky et al., 2001; Merlo et al., 2002) demonstrate transfer of shallow processing tools such as part-of-speech taggers and noun-phrase chunkers by using word-level alignment models (Brown et al., 1994; Och and Ney, 2000).
word-level is mentioned in 3 sentences in this paper.
Topics mentioned in this paper: