Index of papers in Proc. ACL that mention
  • content words
Johnson, Mark and Christophe, Anne and Dupoux, Emmanuel and Demuth, Katherine
Introduction
Their experiments suggest that function words play a special role in the acquisition process: children learn function words before they learn the vast bulk of the associated content words, and they use function words to help identify context words.
Introduction
Traditional descriptive linguistics distinguishes function words, such as determiners and prepositions, from content words, such as nouns and verbs, corresponding roughly to the distinction between functional categories and lexical categories of modern generative linguistics (Fromkin, 2001).
Introduction
Function words differ from content words in at
Word segmentation results
Thus, the present model, initially aimed at segmenting words from continuous speech, shows three interesting characteristics that are also exhibited by human infants: it distinguishes between function words and content words (Shi and Werker, 2001), it allows learners to acquire at least some of the function words of their language (e.g., Shi et al., 2006); and furthermore, it may also allow them to start grouping together function words according to their category (Cauvet et al., 2014; Shi and Melancon, 2010).
Word segmentation with Adaptor Grammars
This means that “function words” are memoised independently of the “content words” that Word expands to; i.e., the model learns distinct “function word” and “content word” vocabularies.
content words is mentioned in 11 sentences in this paper.
Topics mentioned in this paper:
Abend, Omri and Cohen, Shay B. and Steedman, Mark
Experimental Setup
A Reverb argument is represented as the conjunction of its content words that appear more than 10 times in the corpus.
Experimental Setup
The first, LEFTMOST, selects the leftmost content word for each predicate.
Our Proposal: A Latent LC Approach
In our experiments we attempt to keep the approach maximally general, and define H_p to be the set of all subsets of size 1 or 2 of content words in W_p^l.
Our Proposal: A Latent LC Approach
We use a POS tagger to identify content words.
Our Proposal: A Latent LC Approach
Prepositions are considered content words under this definition.
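A minimal sketch of such a filter (not the authors' code), assuming Penn Treebank tags and illustrative helper names; following the definition above, prepositions (IN) are also kept as content words:

```python
# Minimal content-word filter over POS-tagged tokens; tag set and names are
# illustrative assumptions, with prepositions (IN) treated as content words.
CONTENT_TAG_PREFIXES = ("NN", "VB", "JJ", "RB")   # nouns, verbs, adjectives, adverbs
EXTRA_CONTENT_TAGS = {"IN"}                        # prepositions, per the definition above

def content_words(tagged_tokens):
    """Return tokens whose tag marks them as content words."""
    return [tok for tok, tag in tagged_tokens
            if tag.startswith(CONTENT_TAG_PREFIXES) or tag in EXTRA_CONTENT_TAGS]

print(content_words([("The", "DT"), ("dog", "NN"), ("slept", "VBD"), ("on", "IN")]))
# -> ['dog', 'slept', 'on']
```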
content words is mentioned in 7 sentences in this paper.
Topics mentioned in this paper:
Oh, Jong-Hoon and Torisawa, Kentaro and Hashimoto, Chikara and Sano, Motoki and De Saeger, Stijn and Ohtake, Kiyonori
Causal Relations for Why-QA
A bunsetsu is a syntactic constituent composed of a content word and several function words such as postpositions and case markers.
Causal Relations for Why-QA
Our term matching method judges that a causal relation is a candidate of an appropriate causal relation if its effect part contains at least one content word (nouns, verbs, and adjectives) in the question.
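A rough sketch of this term-matching test, not the authors' implementation; the coarse POS tags and helper names are assumptions:

```python
# Keep a causal relation as a candidate if its effect part shares at least one
# content word (noun, verb, or adjective) with the question.
CONTENT_POS = {"NOUN", "VERB", "ADJ"}

def content_word_set(tagged_tokens):
    return {tok.lower() for tok, pos in tagged_tokens if pos in CONTENT_POS}

def is_candidate(effect_tagged, question_tagged):
    return bool(content_word_set(effect_tagged) & content_word_set(question_tagged))

question = [("Why", "ADV"), ("do", "VERB"), ("tsunamis", "NOUN"), ("occur", "VERB")]
effect = [("tsunamis", "NOUN"), ("are", "VERB"), ("generated", "VERB")]
print(is_candidate(effect, question))  # True: 'tsunamis' is shared
```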
Causal Relations for Why-QA
The n-grams of 75 f1 and tfg are restricted to those containing at least one content word in a question.
System Architecture
We retrieved documents from Japanese web texts using Boolean AND and OR queries generated from the content words in why-questions.
content words is mentioned in 7 sentences in this paper.
Topics mentioned in this paper:
Zou, Bowei and Zhou, Guodong and Zhu, Qiaoming
Baselines
Since such correlation is more from the semantic perspective than the grammatical perspective, only content words are considered in our graph model, ignoring functional words (e.g., the, to).
Baselines
In particular, the content words are limited to those with part-of-
Baselines
While the above word-based graph model can capture the relatedness between content words well, it can only partially model the focus of a negation expression, since negation focus is more directly related to topic than to content.
content words is mentioned in 6 sentences in this paper.
Topics mentioned in this paper:
Pilehvar, Mohammad Taher and Navigli, Roberto
Experiments
For ontologizing WT and OW, the bag of content words W is given by the content words in sense definitions and, if available, additional related words obtained from lexicon relations (see Section 3).
Experiments
tively small in number, are already disambiguated and, therefore, the ontologization was just performed on the definition’s content words.
Lexical Resource Ontologization
We first create the empty undirected graph G_L = (V, E) such that V is the set of concepts in L and E = ∅. For each source concept c ∈ V we create a bag of content words W = {w_1, …
Lexical Resource Ontologization
…, w_n} which includes all the content words in its definition d and, if available, additional related words obtained from lexicon relations (e.g., synonyms in Wiktionary).
Lexical Resource Ontologization
The definition contains two content words: fruit_n and conifer_n.
Resource Alignment
In this component the personalization vector v_i is set by uniformly distributing the probability mass over the nodes corresponding to the senses of all the content words in the extended definition of d_i according to the sense inventory of a semantic network H. We use the same semantic graph H for computing the semantic signatures of both definitions.
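A sketch of this personalization step, assuming a networkx graph H over senses and a sense inventory mapping each content word to its candidate sense nodes; the function names are illustrative, not the authors' code:

```python
import networkx as nx

def personalization_vector(content_words, sense_inventory, H):
    """Spread probability mass uniformly over every sense node of every content word."""
    seeds = {s for w in content_words for s in sense_inventory.get(w, ()) if s in H}
    if not seeds:
        return None                      # fall back to uniform PageRank
    return {node: 1.0 / len(seeds) for node in seeds}

def semantic_signature(content_words, sense_inventory, H):
    """Personalized PageRank over H, seeded with the definition's content words."""
    p = personalization_vector(content_words, sense_inventory, H)
    return nx.pagerank(H, personalization=p)
```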
content words is mentioned in 6 sentences in this paper.
Topics mentioned in this paper:
Wang, Baoxun and Wang, Xiaolong and Sun, Chengjie and Liu, Bingquan and Sun, Lin
Experiments
Because the features for QA pairs are quite sparse and the content words in the questions are usually morphologically different from the ones with the same meaning in the answers, the Cosine Similarity method becomes less powerful.
Learning with Homogenous Data
Figure 4 shows the percentage of the concurrent words in the top-ranked content words with high frequency.
Learning with Homogenous Data
The number k on the horizontal axis in Figure 4 represents the top k content words in the
Learning with Homogenous Data
Percentage of concurrent content words
content words is mentioned in 6 sentences in this paper.
Topics mentioned in this paper:
Tian, Ran and Miyao, Yusuke and Matsuzaki, Takuya
The Idea
Our solution is to redefine DCS trees without the aid of any databases, by considering each node of a DCS tree as a content word in a sentence (which may no longer be a table in a specific database), while each edge represents semantic relations between two words.
The Idea
• Content words: a content word (e.g.
The Idea
A DCS tree T = (N, E) is defined as a rooted tree, where each node σ ∈ N is labeled with a content word w(σ) and each edge (σ, σ′) ∈ E ⊆ N × N is labeled with a pair of semantic roles (r, r′).
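A small data-structure sketch of such a database-independent DCS tree (names are assumptions, not the authors' code): each node holds a content word, and each edge holds a pair of semantic roles:

```python
from dataclasses import dataclass, field
from typing import List, Tuple

@dataclass
class DCSNode:
    word: str                                    # the content word labelling this node
    children: List[Tuple["DCSNode", str, str]] = field(default_factory=list)

    def add_child(self, child, role_here, role_child):
        """Attach a child via an edge labelled with the semantic-role pair (r, r')."""
        self.children.append((child, role_here, role_child))

# "A dog chases a cat": chase --(SUBJ, ARG)--> dog, chase --(OBJ, ARG)--> cat
root = DCSNode("chase")
root.add_child(DCSNode("dog"), "SUBJ", "ARG")
root.add_child(DCSNode("cat"), "OBJ", "ARG")
```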
content words is mentioned in 5 sentences in this paper.
Topics mentioned in this paper:
Wu, Xianchao and Matsuzaki, Takuya and Tsujii, Jun'ichi
Backgrounds
Particles are suffixes or tokens in Japanese grammar that immediately follow modified content words or sentences.
Composed Rule Extraction
A chunk contains roughly one content word (usually the head) and affixed function words, such as case markers (e.g., ga) and verbal morphemes (e.g., sa re ta, which indicate past tense and passive voice).
Introduction
This indicates that the alignments of function words are more easily mistaken than those of content words.
Related Research
Specifically, we observed that most incorrect or ambiguous word alignments are caused by function words rather than content words.
content words is mentioned in 4 sentences in this paper.
Topics mentioned in this paper:
Davidov, Dmitry and Rappoport, Ari
Corpora and Parameters
FC (upper bound for content word frequency in patterns) influences which words are considered as hook and target words.
Corpora and Parameters
Since content words determine the joining of patterns into clusters, the more ambiguous a word is, the noisier the resulting clusters.
Corpora and Parameters
The value we use for FH is lower than that used for FC, in order to allow as HFWs function words of relatively low frequency (e.g., ‘through’), while allowing as content words some frequent words that participate in meaningful relationships (e.g., ‘game’).
Pattern Clustering Algorithm
Following (Davidov and Rappoport, 2006), we classified words into high-frequency words (HFWs) and content words (CWs).
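A minimal sketch of this two-way split under assumed thresholds (per-million frequencies; FH < FC, so a word can be both an HFW and a CW); not the authors' code:

```python
from collections import Counter

def classify_words(tokens, fh=100, fc=1000):
    """Return (HFWs, CWs): words above FH per million are HFWs, below FC are CWs."""
    total = len(tokens) or 1
    per_million = {w: c * 1_000_000 / total for w, c in Counter(tokens).items()}
    hfws = {w for w, f in per_million.items() if f > fh}
    cws = {w for w, f in per_million.items() if f < fc}
    return hfws, cws
```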
content words is mentioned in 4 sentences in this paper.
Topics mentioned in this paper:
Biran, Or and McKeown, Kathleen
Conclusion
With this approach, using a stop list does not have a major effect on results for most relation classes, which suggests most of the word pairs affecting performance are content word pairs which may truly be semantically related to the discourse structure.
Introduction
We show that our formulation outperforms the original one while requiring fewer features, and that using a stop list of functional words does not significantly affect performance, suggesting that these features indeed represent semantically related content word pairs.
Word Pairs
An analysis in (Pitler et al., 2009) also shows that the top word pairs (ranked by information gain) all contain common functional words, and are not at all the semantically-related content words that were imagined.
content words is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Frank, Stella and Feldman, Naomi H. and Goldwater, Sharon
Experiments
We restrict the corpus to content words by retaining only words tagged as adj, n, part and v (adjectives, nouns, particles, and verbs).
Experiments
As well as function words, we also remove the five most frequent content words (be, go, get, want, come).
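A sketch of the corpus restriction described in the two snippets above; the tag names follow the text, everything else is an assumption:

```python
CONTENT_TAGS = {"adj", "n", "part", "v"}
REMOVED_FREQUENT = {"be", "go", "get", "want", "come"}   # five most frequent content words

def restrict_to_content(tagged_utterance):
    """Keep only content-tagged tokens, minus the five most frequent content words."""
    return [w for w, tag in tagged_utterance
            if tag in CONTENT_TAGS and w.lower() not in REMOVED_FREQUENT]
```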
Experiments
On average, situations are only 59 words long, reflecting the relative lack of content words in CDS utterances.
content words is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Cui, Lei and Zhang, Dongdong and Liu, Shujie and Chen, Qiming and Li, Mu and Zhou, Ming and Yang, Muyun
Introduction
Since the information within the sentence is insufficient for topic modeling, we first enrich sentence contexts via Information Retrieval (IR) methods using content words in the sentence as queries, so that topic-related monolingual documents can be collected.
Topic Similarity Model with Neural Network
One problem with the auto-encoder is that it treats all words in the same way, making no distinction between function words and content words.
Topic Similarity Model with Neural Network
For each positive instance (f, e), we select e′ which contains at least 30% different content words from e.
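A sketch of this negative-sampling rule (hypothetical names, not the authors' code): a candidate e′ qualifies only if at least 30% of its content words are absent from e:

```python
def acceptable_negative(content_e, content_e_prime, min_diff=0.3):
    """True if the fraction of e' content words not found in e is at least min_diff."""
    if not content_e_prime:
        return False
    different = sum(1 for w in content_e_prime if w not in content_e)
    return different / len(content_e_prime) >= min_diff

print(acceptable_negative({"bank", "loan", "rate"}, {"river", "bank", "fish"}))  # True
```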
content words is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Wang, Lu and Cardie, Claire
Content Selection
A valid indicator-argument pair should have at least one content word and satisfy one of the following constraints:
Content Selection
For training data construction, we consider a relation instance to be a positive example if it shares any content word with its corresponding abstracts, and a negative example otherwise.
Surface Realization
number of content words in indicator/argument; number of content words that are also in previous DA; indicator/argument only contains stopwords?
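A sketch of the three content-word features listed above, with an assumed stopword list and word-level tokenization; not the authors' feature extractor:

```python
STOPWORDS = {"the", "a", "an", "of", "to", "and", "is", "are", "it"}

def content_word_features(span_tokens, prev_da_tokens):
    """Features for one indicator/argument span relative to the previous dialogue act."""
    content = [w for w in span_tokens if w.lower() not in STOPWORDS]
    prev_content = {w.lower() for w in prev_da_tokens if w.lower() not in STOPWORDS}
    return {
        "num_content_words": len(content),
        "num_content_words_in_prev_da": sum(w.lower() in prev_content for w in content),
        "only_contains_stopwords": int(not content),
    }
```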
content words is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Özbal, Gözde and Pighin, Daniele and Strapparava, Carlo
Architecture of BRAINSUP
(Klein and Manning, 2003) and produce the patterns by stripping away all content words from the parses.
Evaluation
Two or three content words appearing in each slogan were randomly selected as the target words.
Evaluation
Furthermore, we only considered sentences in which all the content words are listed in WordNet (Miller, 1995) with the observed part of speech. The LSA space used for the semantic feature functions was also learned on BNC data, but in this case no filtering was applied.
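A sketch of that sentence filter, assuming NLTK's WordNet interface and a crude Penn Treebank-to-WordNet tag mapping; not the authors' code:

```python
from nltk.corpus import wordnet as wn   # requires the WordNet data to be installed

PTB_TO_WN = {"N": wn.NOUN, "V": wn.VERB, "J": wn.ADJ, "R": wn.ADV}

def all_content_words_in_wordnet(tagged_content_words):
    """True if every (word, PTB tag) pair has a WordNet entry with the observed POS."""
    for word, tag in tagged_content_words:
        wn_pos = PTB_TO_WN.get(tag[:1])
        if wn_pos is None or not wn.synsets(word, pos=wn_pos):
            return False
    return True
```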
content words is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Beigman Klebanov, Beata and Flor, Michael
Application to Essay Scoring
Likewise, a feature that calculates the average PMI for all pairs of content word types in the text failed to produce an improvement over the baseline for sets p1-p6.
Methodology
The second is which pairs of words in a text to consider when building a profile for the text; we opted for all pairs of content word types occurring in a text, irrespective of the distance between them.
Methodology
Thus, the text “The dog barked and wagged its tail” is much tighter than the text “Green ideas sleep furiously”, with all six content word pairs scoring above PMI = 5.5 in the first and below PMI = 2.2 in the second.
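A sketch of this tightness measure: the average PMI over all pairs of content-word types in a text, with PMI values assumed to come from a precomputed reference-corpus table; names are illustrative:

```python
from itertools import combinations

def average_pair_pmi(content_word_types, pmi_table, default=0.0):
    """pmi_table maps frozenset({w1, w2}) -> PMI estimated on a large reference corpus."""
    pairs = list(combinations(sorted(set(content_word_types)), 2))
    if not pairs:
        return default
    return sum(pmi_table.get(frozenset(p), default) for p in pairs) / len(pairs)
```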
content words is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Sun, Weiwei and Uszkoreit, Hans
Capturing Paradigmatic Relations via Word Clustering
Another interesting fact is that almost all of them are content words .
Capturing Syntagmatic Relations via Constituency Parsing
4.1.1 Content Words vs. Function Words
Capturing Syntagmatic Relations via Constituency Parsing
The majority of the words that are better labeled by the tagger are content words, including nouns (NN, NR, NT), numbers (CD, OD), predicates (VA, VC, VE), adverbs (AD), nominal modifiers (JJ), and so on.
content words is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Zhao, Bing and Lee, Young-Suk and Luo, Xiaoqiang and Li, Liu
Elementary Trees to String Grammar
(γ, α) deletes a src content word
Elementary Trees to String Grammar
(γ, α) over-generates a tgt content word (v)
The Projectable Structures
The transformations could be as simple as merging two adjacent nonterminals into one bracket to accommodate non-contiguity on the target side, or lexicalizing those words which have fork-style, many-to-many alignment, or unaligned content words to enable the rest of the span to be generalized into nonterminals.
content words is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Mitchell, Jeff and Lapata, Mirella and Demberg, Vera and Keller, Frank
Method
Finally, because our focus is the influence of semantic context, we selected only content words whose prior sentential context contained at least two further content words .
Models of Processing Difficulty
The model takes into account only content words; function words are of little interest here, as they can be found in any context.
Models of Processing Difficulty
common content words, and each vector component is given by the ratio of the probability of c_i given t to the overall probability of c_i.
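A sketch of that vector construction (symbols follow the reconstructed sentence above; count sources and names are assumptions): component i is p(c_i | t) divided by p(c_i):

```python
def context_vector(cooc_with_t, corpus_counts, total_with_t, corpus_size, context_words):
    """cooc_with_t[c]: count of c near target t; corpus_counts[c]: corpus count of c."""
    vec = []
    for c in context_words:
        p_c_given_t = cooc_with_t.get(c, 0) / total_with_t if total_with_t else 0.0
        p_c = corpus_counts.get(c, 0) / corpus_size if corpus_size else 0.0
        vec.append(p_c_given_t / p_c if p_c else 0.0)
    return vec
```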
content words is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Connor, Michael and Gertner, Yael and Fisher, Cynthia and Roth, Dan
Model
During initial unsupervised parsing we experiment with incorporating knowledge through a combination of statistical priors favoring a skewed distribution of words into classes, and an initial hard clustering of the vocabulary into function and content words .
Unsupervised Parsing
Because the function and content word preclustering preceded parameter estimation, it can be combined with either EM or VB learning.
Unsupervised Parsing
Although this initial split forces sparsity on the emission matrix and allows more uniform sized clusters, Dirichlet priors may still help, if word clusters within the function or content word subsets vary in size and frequency.
content words is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Matsubayashi, Yuichiroh and Okazaki, Naoaki and Tsujii, Jun'ichi
Conclusion
                    or   hr   rl   st   vn
frame                0    4    0    1    0
evoking word         3    4    7    3    0
ew & hw stem         9   34   20    8    0
ew & phrase type    11    7   11    3    1
head word           13   19    8    3    1
hw stem             11   17    8    8    1
content word         7   19   12    3    0
cw stem             11   26   13    5    0
cw POS               4    5   14   15    2
directed path       19   27   24    6    7
undirected path     21   35   17    2    6
partial path        15   18   16   13    5
last word           15   18   12    3    2
first word          11   23   53   26   10
supersense           7    7   35   25    4
position             4    6   30    9    5
others              27   29   33   19    6
total              188  298  313  152   50
Experiment and Discussion
The characteristics of x are: frame, frame evoking word, head word, content word (Surdeanu et al., 2003), first/last word, head word of left/right sister, phrase type, position, voice, syntactic path (directed/undirected/partial), governing category (Gildea and Jurafsky, 2002), WordNet supersense in the phrase, combination features of frame evoking word & head word, combination features of frame evoking word & phrase type, and combination features of voice & phrase type.
Experiment and Discussion
associations with lexical and structural characteristics such as the syntactic path, content word, and head word.
content words is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Huang, Fei
Alignment Link Confidence Measure
F-content and F-function are the F-scores for content words and function words, respectively.
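A sketch of computing these two scores separately, assuming alignment links are (source index, target index) pairs and a user-supplied predicate that decides whether a link attaches to a content word; not the authors' evaluation code:

```python
def f_score(predicted, gold):
    """F1 over sets of alignment links."""
    if not predicted or not gold:
        return 0.0
    correct = len(predicted & gold)
    if not correct:
        return 0.0
    p, r = correct / len(predicted), correct / len(gold)
    return 2 * p * r / (p + r)

def content_function_f_scores(predicted, gold, is_content_link):
    pred_c = {l for l in predicted if is_content_link(l)}
    gold_c = {l for l in gold if is_content_link(l)}
    return f_score(pred_c, gold_c), f_score(predicted - pred_c, gold - gold_c)
```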
Alignment Link Confidence Measure
Overall it improves the F-score by 1.5 points (from 69.3 to 70.8), 1.8 point improvement for content words and 1.0 point for function words.
Related Work
On the other hand, removing incorrect content word links produced cleaner phrase translation tables.
content words is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Feng, Yansong and Lapata, Mirella
BBC News Database
We randomly selected 240 image-caption pairs and manually assessed whether the caption content words (i.e., nouns, verbs, and adjectives) could describe the image.
BBC News Database
We rank the document’s content words (i.e., nouns, verbs, and adjectives) according to their tf * idf weight and select the top k to be the final annotations.
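A sketch of this annotation step, with idf values assumed to be precomputed over the document collection (names are illustrative, not the authors' code):

```python
from collections import Counter

def top_k_annotations(doc_content_words, idf, k=10):
    """Rank the document's content words by tf * idf and keep the top k."""
    tf = Counter(doc_content_words)
    weight = {w: tf[w] * idf.get(w, 0.0) for w in tf}
    return sorted(weight, key=weight.get, reverse=True)[:k]
```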
BBC News Database
Again we only use content words (the average title length in the training set was 4.0 words).
content words is mentioned in 3 sentences in this paper.
Topics mentioned in this paper: