Introduction | Their experiments suggest that function words play a special role in the acquisition process: children learn function words before they learn the vast bulk of the associated content words, and they use function words to help identify content words. |
Introduction | Traditional descriptive linguistics distinguishes function words, such as determiners and prepositions, from content words, such as nouns and verbs, corresponding roughly to the distinction between functional categories and lexical categories of modern generative linguistics (Fromkin, 2001). |
Introduction | Function words differ from content words in at |
Word segmentation results | Thus, the present model, initially aimed at segmenting words from continuous speech, shows three interesting characteristics that are also exhibited by human infants: it distinguishes between function words and content words (Shi and Werker, 2001); it allows learners to acquire at least some of the function words of their language (e.g., Shi et al., 2006); and furthermore, it may also allow them to start grouping together function words according to their category (Cauvet et al., 2014; Shi and Melancon, 2010). |
Word segmentation with Adaptor Grammars | This means that “function words” are memoised independently of the “content words” that Word expands to; i.e., the model learns distinct “function word” and “content word” vocabularies. |
Experimental Setup | A Reverb argument is represented as the conjunction of its content words that appear more than 10 times in the corpus. |
Experimental Setup | The first, LEFTMOST, selects the leftmost content word for each predicate. |
Our Proposal: A Latent LC Approach | In our experiments we attempt to keep the approach maximally general, and define H_p to be the set of all subsets of size 1 or 2 of content words in Wpl. |
Our Proposal: A Latent LC Approach | We use a POS tagger to identify content words. |
Our Proposal: A Latent LC Approach | Prepositions are considered content words under this definition. |
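A POS-based content-word filter of the kind the excerpt describes can be sketched as follows. This is a minimal illustration, not the paper's actual code; it assumes Penn Treebank tags on pre-tagged input and, per the definition above, counts prepositions (IN) as content words.

```python
# Hypothetical sketch: selecting "content words" from POS-tagged tokens.
# Tag prefixes follow the Penn Treebank convention; including "IN"
# mirrors the definition quoted above, under which prepositions count.

CONTENT_TAG_PREFIXES = ("NN", "VB", "JJ", "RB", "IN")  # nouns, verbs, adjectives, adverbs, prepositions

def content_words(tagged_tokens):
    """Return the tokens whose POS tag marks them as content words."""
    return [word for word, tag in tagged_tokens
            if tag.startswith(CONTENT_TAG_PREFIXES)]

tagged = [("the", "DT"), ("dog", "NN"), ("ran", "VBD"),
          ("through", "IN"), ("a", "DT"), ("park", "NN")]
print(content_words(tagged))  # ['dog', 'ran', 'through', 'park']
```

In practice the tag inventory and the treatment of adverbs and prepositions vary by paper; the prefix set above is only one plausible choice.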
Causal Relations for Why-QA | A bunsetsu is a syntactic constituent composed of a content word and several function words such as postpositions and case markers. |
Causal Relations for Why-QA | Our term matching method judges a causal relation to be a candidate for an appropriate causal relation if its effect part contains at least one content word (noun, verb, or adjective) from the question. |
Causal Relations for Why-QA | The n-gram features are restricted to those n-grams containing at least one content word in a question. |
System Architecture | We retrieved documents from Japanese web texts using Boolean AND and OR queries generated from the content words in why-questions. |
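Query construction of the kind described in the excerpt can be sketched as below. The query syntax is an assumption (a generic Boolean retrieval syntax), and the example content words are invented, not taken from the paper.

```python
# Minimal sketch (assumed query syntax): building Boolean retrieval
# queries from the content words of a why-question.

def boolean_queries(content_words):
    """Return an AND query and an OR query over the given content words."""
    and_query = " AND ".join(content_words)
    or_query = " OR ".join(content_words)
    return and_query, or_query

and_q, or_q = boolean_queries(["tsunami", "occur", "earthquake"])
print(and_q)  # tsunami AND occur AND earthquake
print(or_q)   # tsunami OR occur OR earthquake
```

The AND query favors precision while the OR query favors recall; issuing both, as the excerpt suggests, trades the two off at retrieval time.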
Baselines | Since such correlation is more from the semantic perspective than the grammatical perspective, only content words are considered in our graph model, ignoring function words (e.g., the, to). |
Baselines | In particular, the content words are limited to those with part-of- |
Baselines | While the above word-based graph model can well capture the relatedness between content words, it can only partially model the focus of a negation expression, since negation focus is more directly related to topic than to content. |
Experiments | For ontologizing WT and OW, the bag of content words W is given by the content words in sense definitions and, if available, additional related words obtained from lexicon relations (see Section 3). |
Experiments | tively small in number, are already disambiguated and, therefore, the ontologization was just performed on the definition’s content words. |
Lexical Resource Ontologization | We first create the empty undirected graph G_L = (V, E) such that V is the set of concepts in L and E = ∅. |
Lexical Resource Ontologization | For each source concept c ∈ V we create a bag of content words W = {w_1, . . . , w_n} which includes all the content words in its definition d and, if available, additional related words obtained from lexicon relations (e.g., synonyms in Wiktionary). |
Lexical Resource Ontologization | The definition contains two content words: fruit_n and conifer_n. |
Resource Alignment | In this component the personalization vector vi is set by uniformly distributing the probability mass over the nodes corresponding to the senses of all the content words in the extended definition of di according to the sense inventory of a semantic network H. We use the same semantic graph H for computing the semantic signatures of both definitions. |
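The personalization scheme the excerpt describes, mass spread uniformly over the sense nodes of the definition's content words, can be sketched with a small power-iteration Personalized PageRank. The graph, node names, and seed senses below are invented toy data, not the paper's semantic network.

```python
# Sketch of Personalized PageRank by power iteration over a toy semantic
# network. The personalization vector v spreads its mass uniformly over
# the seed sense nodes, as in the excerpt; node names are made up.

def personalized_pagerank(graph, seeds, alpha=0.85, iters=50):
    nodes = sorted(graph)
    v = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    p = dict(v)  # start from the personalization vector
    for _ in range(iters):
        new = {}
        for n in nodes:
            # mass flowing into n from its neighbors (undirected graph)
            rank = sum(p[m] / len(graph[m]) for m in nodes if n in graph[m])
            new[n] = (1 - alpha) * v[n] + alpha * rank
        p = new
    return p

g = {"fruit#n": {"tree#n", "apple#n"},
     "tree#n": {"fruit#n", "conifer#n"},
     "apple#n": {"fruit#n"},
     "conifer#n": {"tree#n"}}
scores = personalized_pagerank(g, seeds={"fruit#n", "conifer#n"})
print(max(scores, key=scores.get))
```

Because every node has at least one neighbor, each iteration preserves total probability mass, so the result remains a distribution over nodes; the resulting vector is the "semantic signature" of the definition.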
Experiments | Because the features for QA pairs are quite sparse and the content words in the questions are usually morphologically different from the words with the same meaning in the answers, the Cosine Similarity method becomes less powerful. |
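The failure mode the excerpt describes can be made concrete with a bag-of-words cosine similarity; the question and answer word lists below are invented to show that morphological variants produce zero exact overlap.

```python
# A minimal sketch of bag-of-words cosine similarity, illustrating why
# morphological variation between question and answer words defeats
# exact-match overlap (the example token lists are invented).

import math
from collections import Counter

def cosine(a, b):
    va, vb = Counter(a), Counter(b)
    dot = sum(va[w] * vb[w] for w in va)
    na = math.sqrt(sum(c * c for c in va.values()))
    nb = math.sqrt(sum(c * c for c in vb.values()))
    return dot / (na * nb) if na and nb else 0.0

q = ["run", "marathon", "training"]
a = ["running", "marathons", "train"]  # same meanings, different forms
print(cosine(q, a))  # 0.0: no exact token overlap despite shared meaning
```

Stemming or lemmatizing both sides before comparison is the usual remedy for exactly this sparsity.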
Learning with Homogenous Data | Figure 4 shows the percentage of the concurrent words in the top-ranked content words with high frequency. |
Learning with Homogenous Data | The number k on the horizontal axis in Figure 4 represents the top k content words in the |
Learning with Homogenous Data | Percentage of concurrent content words |
The Idea | Our solution is to redefine DCS trees without the aid of any databases, by considering each node of a DCS tree as a content word in a sentence (which may no longer be a table in a specific database), while each edge represents a semantic relation between two words. |
The Idea | • Content words: a content word (e.g. |
The Idea | A DCS tree T = (N, E) is defined as a rooted tree, where each node σ ∈ N is labeled with a content word w(σ) and each edge (σ, σ′) ∈ E ⊆ N × N is labeled with a pair of semantic roles (r, r′). |
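The DCS tree definition above maps naturally onto a small recursive data structure. The sketch below is illustrative only; the role names are invented placeholders, not the paper's actual semantic-role inventory.

```python
# A small sketch of the DCS tree definition above: nodes labeled with
# content words, edges labeled with a pair of semantic roles (r, r').

from dataclasses import dataclass, field

@dataclass
class DCSNode:
    word: str                                     # content word w(sigma)
    children: list = field(default_factory=list)  # (r, r', child) triples

    def attach(self, r, r_child, child):
        self.children.append((r, r_child, child))

# "A student reads a book": root 'read' with subject and object edges.
# Role names here are illustrative, not the paper's exact inventory.
root = DCSNode("read")
root.attach("SUBJ", "ARG", DCSNode("student"))
root.attach("OBJ", "ARG", DCSNode("book"))
print([(r, rc, c.word) for r, rc, c in root.children])
# [('SUBJ', 'ARG', 'student'), ('OBJ', 'ARG', 'book')]
```

Each edge stores both roles, one at the parent end and one at the child end, which is what distinguishes a DCS edge label from a single dependency relation.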
Backgrounds | Particles are suffixes or tokens in Japanese grammar that immediately follow modified content words or sentences. |
Composed Rule Extraction | A chunk contains roughly one content word (usually the head) and affixed function words, such as case markers (e.g., ga) and verbal morphemes (e.g., sa re ta, which indicate past tense and passive voice). |
Introduction | This indicates that the alignments of function words are more easily mistaken than those of content words. |
Related Research | Specifically, we observed that most incorrect or ambiguous word alignments are caused by function words rather than content words. |
Corpora and Parameters | FC (the upper bound for content word frequency in patterns) influences which words are considered as hook and target words. |
Corpora and Parameters | Since content words determine the joining of patterns into clusters, the more ambiguous a word is, the noisier the resulting clusters. |
Corpora and Parameters | The value we use for FH is lower than that used for FC, in order to allow as HFWs function words of relatively low frequency (e.g., ‘through’), while allowing as content words some frequent words that participate in meaningful relationships (e.g., ‘game’). |
Pattern Clustering Algorithm | Following (Davidov and Rappoport, 2006), we classified words into high-frequency words (HFWs) and content words (CWs). |
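The frequency-based HFW/CW split can be sketched as below. The thresholds are assumed to be corpus frequencies per million words, and the particular numbers are invented; as the preceding excerpts note, setting the HFW lower bound F_H below the CW upper bound F_C lets some words qualify as both.

```python
# Sketch of the frequency-based word split, assuming per-million
# thresholds f_h (minimum frequency for HFWs) and f_c (maximum
# frequency for CWs). The threshold values below are invented.

def classify(freq_per_million, f_h=100, f_c=1000):
    labels = set()
    if freq_per_million >= f_h:
        labels.add("HFW")
    if freq_per_million <= f_c:
        labels.add("CW")
    return labels

print(classify(50))    # {'CW'}: a rare content word
print(classify(500))   # qualifies as both, e.g. 'game' or 'through'
print(classify(5000))  # {'HFW'}: a typical function word
```

The overlap region between f_h and f_c is exactly what lets low-frequency function words serve as HFWs while frequent meaningful words still act as CWs.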
Conclusion | With this approach, using a stop list does not have a major effect on results for most relation classes, which suggests most of the word pairs affecting performance are content word pairs which may truly be semantically related to the discourse structure. |
Introduction | We show that our formulation outperforms the original one while requiring fewer features, and that using a stop list of functional words does not significantly affect performance, suggesting that these features indeed represent semantically related content word pairs. |
Word Pairs | An analysis in Pitler et al. (2009) also shows that the top word pairs (ranked by information gain) all contain common functional words, and are not at all the semantically related content words that were imagined. |
Experiments | We restrict the corpus to content words by retaining only words tagged as adj, n, part and v (adjectives, nouns, particles, and verbs). |
Experiments | As well as function words, we also remove the five most frequent content words (be, go, get, want, come). |
Experiments | On average, situations are only 59 words long, reflecting the relative lack of content words in CDS utterances. |
Introduction | Since the information within the sentence is insufficient for topic modeling, we first enrich sentence contexts via Information Retrieval (IR) methods using content words in the sentence as queries, so that topic-related monolingual documents can be collected. |
Topic Similarity Model with Neural Network | One problem with the auto-encoder is that it treats all words in the same way, making no distinction between function words and content words. |
Topic Similarity Model with Neural Network | For each positive instance (f, e), we select e′ which contains at least 30% different content words from e. |
Content Selection | A valid indicator-argument pair should have at least one content word and satisfy one of the following constraints: |
Content Selection | For training data construction, we consider a relation instance to be a positive example if it shares any content word with its corresponding abstracts, and a negative example otherwise. |
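The labeling rule in the excerpt, positive iff the instance shares any content word with the corresponding abstract, is a one-line set intersection. The content-word sets below are invented examples, and extraction of content words is assumed to happen beforehand.

```python
# Sketch of the overlap-based labeling rule above: a relation instance
# is positive iff it shares a content word with its abstract.
# Content-word sets are assumed precomputed; the examples are invented.

def label(instance_cws, abstract_cws):
    return "positive" if set(instance_cws) & set(abstract_cws) else "negative"

abstract = {"protein", "binding", "site"}
print(label({"binding", "affinity"}, abstract))  # positive
print(label({"method", "novel"}, abstract))      # negative
```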
Surface Realization | number of content words in indicator/argument; number of content words that are also in the previous DA; whether the indicator/argument contains only stopwords |
Architecture of BRAINSUP | ning, 2003) and produce the patterns by stripping away all content words from the parses. |
Evaluation | Two or three content words appearing in each slogan were randomly selected as the target words. |
Evaluation | Furthermore, we only considered sentences in which all the content words are listed in WordNet (Miller, 1995) with the observed part of speech. The LSA space used for the semantic feature functions was also learned on BNC data, but in this case no filtering was applied. |
Application to Essay Scoring | Likewise, a feature that calculates the average PMI for all pairs of content word types in the text failed to produce an improvement over the baseline for sets p1–p6. |
Methodology | The second is which pairs of words in a text to consider when building a profile for the text; we opted for all pairs of content word types occurring in a text, irrespective of the distance between them. |
Methodology | Thus, the text “The dog barked and wagged its tail” is much tighter than the text “Green ideas sleep furiously”, with all six content word pairs scoring above PMI = 5.5 in the first and below PMI = 2.2 in the second. |
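The PMI-over-content-word-pairs idea in the preceding excerpts can be sketched with document-level co-occurrence counts. The toy collection below is invented, so the resulting scores illustrate the mechanism only, not the paper's corpus-derived values.

```python
# A toy PMI computation over content-word types, sketching the
# "lexical tightness" idea above. The document collection is invented.

import math

docs = [
    {"dog", "bark", "tail"},
    {"dog", "tail", "wag"},
    {"dog", "bark"},
    {"green", "idea"},
]

def pmi(w1, w2, docs):
    """Pointwise mutual information from document co-occurrence counts."""
    n = len(docs)
    p1 = sum(w1 in d for d in docs) / n
    p2 = sum(w2 in d for d in docs) / n
    p12 = sum(w1 in d and w2 in d for d in docs) / n
    return math.log2(p12 / (p1 * p2)) if p12 > 0 else float("-inf")

print(round(pmi("dog", "tail", docs), 3))  # 0.415: a related pair
print(pmi("dog", "green", docs))           # -inf: never co-occur
```

A tight text is then one whose content word pairs mostly have high PMI under such corpus statistics, as in the dog/tail versus green/idea contrast above.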
Capturing Paradigmatic Relations via Word Clustering | Another interesting fact is that almost all of them are content words . |
Capturing Syntagmatic Relations via Constituency Parsing | 4.1.1 Content Words vs. Function Words |
Capturing Syntagmatic Relations via Constituency Parsing | The majority of the words that are better labeled by the tagger are content words, including nouns (NN, NR, NT), numbers (CD, OD), predicates (VA, VC, VE), adverbs (AD), nominal modifiers (JJ), and so on. |
Elementary Trees to String Grammar | (∅, γ′) deletes a src content word |
Elementary Trees to String Grammar | (∅, γ′) over-generates a tgt content word (v) |
The Projectable Structures | The transformations could be as simple as merging two adjacent nonterminals into one bracket to accommodate non-contiguity on the target side, or lexicalizing those words which have fork-style, many-to-many alignment, or unaligned content words, to enable the rest of the span to be generalized into nonterminals. |
Method | Finally, because our focus is the influence of semantic context, we selected only content words whose prior sentential context contained at least two further content words . |
Models of Processing Difficulty | The model takes into account only content words; function words are of little interest here, as they can be found in any context. |
Models of Processing Difficulty | common content words, and each vector component is given by the ratio of the probability of c_i given t to the overall probability of c_i. |
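The vector construction described above, one component per common content word, each the ratio p(c_i | t) / p(c_i), can be sketched directly from counts. The counts below are invented toy numbers for a "bank"-like context, not corpus data.

```python
# Sketch of the ratio vector described above: each component is
# p(c_i | t) / p(c_i) for common content words c_i. Counts are toy values.

def ratio_vector(context_counts, corpus_counts):
    ctx_total = sum(context_counts.values())
    corpus_total = sum(corpus_counts.values())
    vec = {}
    for c, n in corpus_counts.items():
        p_c = n / corpus_total                        # overall probability
        p_c_given_t = context_counts.get(c, 0) / ctx_total  # in-context probability
        vec[c] = p_c_given_t / p_c
    return vec

corpus = {"money": 100, "river": 100, "water": 200}
bank_context = {"money": 8, "water": 2}
v = ratio_vector(bank_context, corpus)
print(v)
```

A component above 1 means the content word is over-represented in the target's contexts relative to its base rate, which is what makes the vector informative about the target's semantics.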
Model | During initial unsupervised parsing we experiment with incorporating knowledge through a combination of statistical priors favoring a skewed distribution of words into classes, and an initial hard clustering of the vocabulary into function and content words . |
Unsupervised Parsing | Because the function and content word preclustering preceded parameter estimation, it can be combined with either EM or VB learning. |
Unsupervised Parsing | Although this initial split forces sparsity on the emission matrix and allows more uniform sized clusters, Dirichlet priors may still help, if word clusters within the function or content word subsets vary in size and frequency. |
Conclusion | Feature counts by column (or, hr, rl, st, vn):

                     or   hr   rl   st   vn
  frame               0    4    0    1    0
  evoking word        3    4    7    3    0
  ew & hw stem        9   34   20    8    0
  ew & phrase type   11    7   11    3    1
  head word          13   19    8    3    1
  hw stem            11   17    8    8    1
  content word        7   19   12    3    0
  cw stem            11   26   13    5    0
  cw POS              4    5   14   15    2
  directed path      19   27   24    6    7
  undirected path    21   35   17    2    6
  partial path       15   18   16   13    5
  last word          15   18   12    3    2
  first word         11   23   53   26   10
  supersense          7    7   35   25    4
  position            4    6   30    9    5
  others             27   29   33   19    6
  total             188  298  313  152   50
|
Experiment and Discussion | The characteristics of x are: frame, frame evoking word, head word, content word (Surdeanu et al., 2003), first/last word, head word of left/right sister, phrase type, position, voice, syntactic path (directed/undirected/partial), governing category (Gildea and Jurafsky, 2002), WordNet supersense in the phrase, combination features of frame evoking word & head word, combination features of frame evoking word & phrase type, and combination features of voice & phrase type. |
Experiment and Discussion | associations with lexical and structural characteristics such as the syntactic path, content word, and head word. |
Alignment Link Confidence Measure | F-content and F-function are the F-scores for content words and function words, respectively. |
Alignment Link Confidence Measure | Overall it improves the F-score by 1.5 points (from 69.3 to 70.8), with a 1.8-point improvement for content words and 1.0 point for function words. |
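Computing separate F-scores for content-word and function-word alignment links, as in the excerpts above, amounts to partitioning the link sets before scoring. The gold/predicted links and the content-word indicator below are invented toy data.

```python
# Sketch of alignment F-scores computed separately for content and
# function words, given gold and predicted link sets plus a content-word
# indicator over source positions. All data below are invented.

def f_score(pred, gold):
    if not pred or not gold:
        return 0.0
    tp = len(pred & gold)
    p = tp / len(pred)
    r = tp / len(gold)
    return 2 * p * r / (p + r) if p + r else 0.0

gold = {(0, 0), (1, 2), (2, 1), (3, 3)}   # (src, tgt) alignment links
pred = {(0, 0), (1, 2), (2, 2), (3, 3)}
is_content = {0: True, 1: True, 2: False, 3: True}  # src position -> content?

def split_links(links):
    content = {l for l in links if is_content[l[0]]}
    return content, links - content

gc, gf = split_links(gold)
pc, pf = split_links(pred)
print(round(f_score(pc, gc), 2), round(f_score(pf, gf), 2))  # 1.0 0.0
```

In this toy case the only alignment error falls on the function word, matching the observation above that function-word links are mistaken more often.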
Related Work | On the other hand, removing incorrect content word links produced cleaner phrase translation tables. |
BBC News Database | We randomly selected 240 image-caption pairs and manually assessed whether the caption content words (i.e., nouns, verbs, and adjectives) could describe the image. |
BBC News Database | We rank the document’s content words (i.e., nouns, verbs, and adjectives) according to their tf * idf weight and select the top k to be the final annotations. |
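The tf*idf ranking step described in the excerpt can be sketched as follows. The corpus and the value of k are toy inputs, and the idf variant (plain log of inverse document frequency) is an assumption; the paper may use a different weighting.

```python
# Sketch of ranking a document's content words by tf*idf and keeping
# the top k as annotations. Corpus, k, and idf variant are assumptions.

import math
from collections import Counter

corpus = [
    ["election", "vote", "minister", "vote"],
    ["storm", "flood", "rain"],
    ["election", "debate", "minister"],
]

def top_k_tfidf(doc, corpus, k=2):
    tf = Counter(doc)                       # term frequency in this document
    n = len(corpus)
    def idf(w):
        df = sum(w in d for d in corpus)    # document frequency
        return math.log(n / df)
    scored = {w: tf[w] * idf(w) for w in tf}
    return sorted(scored, key=scored.get, reverse=True)[:k]

print(top_k_tfidf(corpus[0], corpus, k=2))  # ['vote', 'election']
```

Words frequent in the document but rare across the collection ("vote" here) rank highest, which is why the selected words tend to be good caption-style annotations.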
BBC News Database | Again we only use content words (the average title length in the training set was 4.0 words). |