Index of papers in Proc. ACL that mention
  • news articles
Rokhlenko, Oleg and Szpektor, Idan
Abstract
One motivating example of its application is for increasing user engagement around news articles by suggesting relevant comparable questions, such as “is Beyonce a better singer than Madonna?”, for the user to answer.
Comparable Question Mining
Input: A news article
Output: A sorted list of comparable questions
1: Identify all target named entities (NEs) in the article
2: Infer the distribution of LDA topics for the article
3: For each comparable relation R in the database, compute its relevance score to be the similarity between the topic distributions of R and the article
4: Rank all the relations according to their relevance score and pick the top M as relevant
5: for each relevant relation R in the order of relevance ranking do
6:   Filter out all the target NEs that do not pass the single entity classifier for R
7:   Generate all possible NE pairs from those that passed the single classifier
8:   Filter out all the generated NE pairs that do not pass the entity pair classifier for R
9:   Pick up the top N pairs with positive classification score to be qualified for generation
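The numbered steps above can be sketched in Python as follows. This is only an illustrative sketch, not the authors' implementation: steps 1 and 2 (named entity recognition and LDA inference) are assumed to have been run already, cosine similarity is an assumed choice for comparing topic distributions, and the names `select_entity_pairs`, `single`, and `pair` are hypothetical stand-ins for the per-relation classifiers.

```python
from itertools import combinations
from math import sqrt


def cosine(p, q):
    """Cosine similarity between two topic distributions (an assumed similarity measure)."""
    dot = sum(a * b for a, b in zip(p, q))
    norm = sqrt(sum(a * a for a in p)) * sqrt(sum(b * b for b in q))
    return dot / norm if norm else 0.0


def select_entity_pairs(article_topics, entities, relations, top_m=2, top_n=3):
    """Return (relation, entity_a, entity_b) triples that qualify for question generation.

    relations maps a relation name to a dict with hypothetical keys:
      "topics": topic distribution of the relation
      "single": callable(entity) -> bool, the single entity classifier
      "pair":   callable(a, b) -> float, the entity pair classification score
    """
    # Steps 3-4: rank relations by topic similarity to the article, keep the top M.
    ranked = sorted(
        relations,
        key=lambda name: cosine(article_topics, relations[name]["topics"]),
        reverse=True,
    )[:top_m]

    results = []
    for name in ranked:  # Step 5: iterate in order of relevance.
        rel = relations[name]
        # Step 6: keep only entities accepted by the single entity classifier.
        passed = [e for e in entities if rel["single"](e)]
        # Step 7: generate all pairs from the surviving entities and score them.
        scored = [(rel["pair"](a, b), a, b) for a, b in combinations(passed, 2)]
        # Steps 8-9: keep pairs with a positive score, best N first.
        positive = sorted(
            (s for s in scored if s[0] > 0), key=lambda s: s[0], reverse=True
        )[:top_n]
        results.extend((name, a, b) for _, a, b in positive)
    return results
```

Each returned triple would then be passed to a question template for the relation; template filling is outside the scope of this sketch.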
Introduction
In this paper we propose a new way to increase user engagement around news articles, namely suggesting questions for the user to answer, which are related to the viewed article.
Introduction
Sadly, fun and engaging comparative questions are typically not found within the text of news articles.
Introduction
However, it is highly unlikely that such sources will contain enough relevant questions for any news article due to typical sparseness issues as well as differences in interests between askers in CQA sites and news reporters.
Motivation and Algorithmic Overview
Given a news article, our algorithm generates a set of comparable questions for the article from question templates, e.g.
Online Question Generation
The online part of our automatic generation algorithm takes as input a news article and generates concrete comparable questions for it.
news articles is mentioned in 18 sentences in this paper.
Topics mentioned in this paper:
Guo, Weiwei and Li, Hao and Ji, Heng and Diab, Mona
Introduction
To enable the NLP tools to better understand Twitter feeds, we propose the task of linking a tweet to a news article that is relevant to the tweet, thereby augmenting the context of the tweet.
Introduction
For example, we want to supplement the implicit context of the above tweet with a news article such as the following entitled:
Introduction
To create a gold standard dataset, we download tweets spanning over 18 days, each with a url linking to a news article of CNN or NYTIMES, as well as all the news of CNN and NYTIMES published during the period.
Task and Data
The task is: given the text in a tweet, a system aims to find the most relevant news article.
Task and Data
For gold standard data, we harvest all the tweets that have a single url link to a CNN or NYTIMES news article, dated from the 11th of Jan to the 27th of Jan, 2013.
Task and Data
In evaluation, we consider this url-referred news article as the gold standard — the most relevant document for the tweet, and remove the url from the text of the tweet.
news articles is mentioned in 39 sentences in this paper.
Topics mentioned in this paper:
Park, Souneil and Lee, Kyung Soon and Song, Junehwa
Abstract
We present disputant relation-based method for classifying news articles on contentious issues.
Abstract
It performs unsupervised classification on news articles based on disputant relations, and helps readers intuitively view the articles through the opponent-based frame.
Background and Related Work
The discourse of contentious issues in news articles show different characteristics from that studied in the sentiment classification tasks.
Background and Related Work
For example, a news article can cast a negative light on a government program simply by covering the increase of deficit caused by it.
Background and Related Work
News articles of a contentious issue are more diverse than debate articles conveying explicit argument of a specific side.
Introduction
However, news articles are frequently biased and fail to fairly deliver conflicting arguments of the issue.
Introduction
In this paper, we present disputant relation-based method for classifying news articles on con-
Introduction
The method helps readers intuitively view the news articles through the opponent-based frame.
news articles is mentioned in 21 sentences in this paper.
Topics mentioned in this paper:
Arnold, Andrew and Nallapati, Ramesh and Cohen, William W.
Introduction
Specifically, you are given a corpus of news articles in which all tokens have been labeled as either belonging to personal name mentions or not.
Introduction
Clearly the problems of identifying names in news articles and e-mails are closely related, and learning to do well on one should help your performance on the other.
Introduction
When only the type of data being examined is allowed to vary (from news articles to e-mails, for example), the problem is called domain adaptation (Daumé III and Marcu, 2006).
Investigation
These are: abstracts from biological journals [UT (Bunescu et al., 2004), Yapex (Franzen et al., 2002)]; news articles [MUC6 (Fisher et al., 1995), MUC7 (Borthwick et al., 1998)]; and personal e-mails [CSPACE (Kraut et al., 2004)].
Investigation
• person names in news articles and e-mails
We chose this array of corpora so that we could evaluate our hierarchical prior’s ability to generalize across and incorporate information from a variety of domains, genres and tasks.
Investigation
Figure 3 shows the results of an experiment in learning to recognize person names in MUC6 news articles.
news articles is mentioned in 9 sentences in this paper.
Topics mentioned in this paper:
Feng, Yansong and Lapata, Mirella
Abstract
We create a database of pictures that are naturally embedded into news articles and propose to use their captions as a proxy for annotation keywords.
Abstract
We also demonstrate that the news article associated with the picture can be used to boost image annotation performance.
BBC News Database
Many online news providers supply pictures with news articles, some even classify news into broad topic categories (e.g., business, world, sports, entertainment).
BBC News Database
We downloaded 3,361 news articles from the BBC News website. Each article was accompanied with an image and its caption.
Introduction
News articles associated with images and their captions spring readily to mind (e.g., BBC News, Yahoo News).
Introduction
Importantly, our images are not standalone, they come with news articles whose content is shared with the image.
Related Work
For example, news articles often contain images whose captions can be thought of as annotations.
news articles is mentioned in 7 sentences in this paper.
Topics mentioned in this paper:
Dasgupta, Anirban and Kumar, Ravi and Ravi, Sujith
Experiments
We extracted a set of news articles and corresponding user comments from Yahoo!
Experiments
We then run our summarization algorithm on the instantiated graph to produce a summary for each news article.
Experiments
In addition, each news article and corresponding set of comments were presented to three human annotators.
Framework
Depending on the summarization application, 0 can refer to the set of documents (e.g., newswire) related to a particular topic as in standard summarization; in other scenarios (e.g., user-generated content), it is a collection of comments associated with a news article or a blog post, etc.
Introduction
On the other hand, in the case of user-generated content (say, comments on a news article), even though the text is short, one is faced with a different set of problems: volume (popular articles generate more than 10,000 comments), noise (most comments are vacuous, linguistically deficient, and tangential to the article), and redundancy (similar views are expressed by multiple commenters).
Introduction
We then conduct experiments on two corpora: the DUC 2004 corpus and a corpus of user comments on news articles.
news articles is mentioned in 7 sentences in this paper.
Topics mentioned in this paper:
Hasan, Kazi Saidul and Ng, Vincent
Analysis
To fill this gap, we ran four keyphrase extraction systems on four commonly-used datasets of varying sources, including Inspec abstracts (Hulth, 2003), DUC-2001 news articles (Over, 2001), scientific papers (Kim et al., 2010b), and meeting transcripts (Liu et al., 2009a).
Analysis
To be more concrete, consider the news article on athlete Ben Johnson in Figure 1, where the keyphrases are boldfaced.
Analysis
Figure 1: A news article on Ben Johnson from the DUC-2001 dataset.
Corpora
Consequently, it is harder to extract keyphrases from scientific papers, technical reports, and meeting transcripts than abstracts, emails, and news articles.
Corpora
Topic change An observation commonly exploited in keyphrase extraction from scientific articles and news articles is that keyphrases typically appear not only at the beginning (Witten et al., 1999) but also at the end (Medelyan et al., 2009) of a document.
Corpora
Topic correlation Another observation commonly exploited in keyphrase extraction from scientific articles and news articles is that the keyphrases in a document are typically related to each other (Turney, 2003; Mihalcea and Tarau, 2004).
news articles is mentioned in 6 sentences in this paper.
Topics mentioned in this paper:
Alfonseca, Enrique and Pighin, Daniele and Garrido, Guillermo
Experiment settings
All six are large collections with 50 news articles, so this baseline is significantly different from a random baseline.
Headline generation
Our approach takes as input, for training, a corpus of news articles organized in news collections.
Headline generation
Algorithm 2 EXTRACTPATTERNSΨ(n, E): n is the list of sentences in a news article.
Introduction
For some applications it is important to understand, given a collection of related news articles and re-
Related work
Most headline generation work in the past has focused on the problem of single-document summarization: given the main passage of a single news article, generate a very short summary of the article.
Related work
Filippova (2010) reports a system that is very close to our settings: the input is a collection of related news articles, and the system generates a headline that describes the main event.
news articles is mentioned in 6 sentences in this paper.
Topics mentioned in this paper:
Woodsend, Kristian and Lapata, Mirella
Experimental Setup
Participants were presented with a news article and its corresponding highlights and were asked to rate the latter along three dimensions: informativeness (do the highlights represent the article’s main topics?
Introduction
If our goal is to summarize news articles, then we may be better off selecting the first n sentences of the document.
Introduction
Examples of CNN news articles with human-authored highlights are shown in Table 1.
Modeling
Highlights on a small screen device would presumably be shorter than highlights for news articles on the web.
The Task
Given a document, we aim to produce three or four short sentences covering its main topics, much like the “Story Highlights” accompanying the (online) CNN news articles.
The Task
The majority were news articles, but the set also contained a mixture of editorials, commentary, interviews and reviews.
news articles is mentioned in 6 sentences in this paper.
Topics mentioned in this paper:
Zhou, Deyu and Chen, Liangyu and He, Yulan
Experiments
The other uses the traditional Stanford NER to extract named entities from news articles published in the same period and then perform fuzzy matching to identify named entities from tweets.
Introduction
Previous work in event extraction has focused largely on news articles, as the newswire texts have been the best source of information on current events (Hogenboom et al., 2011).
Methodology
However, it is often observed that events mentioned in tweets are also reported in news articles in the same period (Petrovic et al., 2013).
Methodology
Therefore, named entities mentioned in tweets are likely to appear in news articles as well.
Methodology
First, a traditional NER tool such as the Stanford Named Entity Recognizer is used to identify named entities from the news articles crawled from BBC and CNN during the same period that the tweets were published.
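A minimal sketch of the fuzzy-matching idea, assuming the named entities have already been extracted from the news articles by an NER tool. The excerpts do not say which string similarity measure the authors use; Python's standard `difflib.get_close_matches` is swapped in here for illustration, and the function name `match_tweet_entities` and the `cutoff` value are hypothetical.

```python
from difflib import get_close_matches


def match_tweet_entities(tweet_tokens, news_entities, cutoff=0.8):
    """Map tweet tokens to the closest named entity found in news articles.

    tweet_tokens:  tokens (or candidate phrases) taken from a tweet
    news_entities: entity strings produced by an NER tool on news text
    cutoff:        minimum similarity ratio in [0, 1]; an assumed threshold
    """
    # Compare case-insensitively but return the original news spelling.
    lookup = {name.lower(): name for name in news_entities}
    matches = {}
    for token in tweet_tokens:
        close = get_close_matches(token.lower(), list(lookup), n=1, cutoff=cutoff)
        if close:
            matches[token] = lookup[close[0]]
    return matches
```

For example, a misspelled tweet token such as "Micrsoft" would be linked to the news entity "Microsoft", while unrelated tokens are left unmatched.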
news articles is mentioned in 5 sentences in this paper.
Topics mentioned in this paper:
Pighin, Daniele and Cornolti, Marco and Alfonseca, Enrique and Filippova, Katja
Evaluation
Given the title or first sentence of a news article, run the same pattern extraction method that was used in training and, if possible, obtain a pattern p involving some entities.
Evaluation
Replace the entity placeholders in the top-scored patterns pj with the entities that were actually mentioned in the input news article.
Evaluation
Compared to this model, with heuristics we can obtain patterns for more than twice more news articles.
Introduction
Then, the same extraction method is used to collect patterns from sentences in never-seen-before news articles.
news articles is mentioned in 4 sentences in this paper.
Topics mentioned in this paper:
Wan, Xiaojun and Li, Huiying and Xiao, Jianguo
Machine Translation Quality Prediction
DUC2001 provided 309 English news articles for document summarization tasks, and the articles were grouped into 30 document sets.
Machine Translation Quality Prediction
The news articles were selected from TREC-9.
Machine Translation Quality Prediction
We chose five document sets (d04, d05, d06, d08, d11) with 54 news articles out of the DUC2001 document sets.
Related Work 2.1 Machine Translation Quality Prediction
and Chiorean (2008) propose to produce summaries with the MMR method from Romanian news articles and then automatically translate the summaries into English.
news articles is mentioned in 4 sentences in this paper.
Topics mentioned in this paper:
Deveaud, Romain and SanJuan, Eric and Bellot, Patrice
Conclusions & Future Work
els do not seem to be effective in the context of a news articles search task, they are a good indicator of effectiveness in the context of web search.
Evaluation
Robust04 is composed of 528,155 news articles coming from three newspapers and the FBIS.
Evaluation
We hypothesize that the heterogeneous nature of the web allows to model very different topics covering several aspects of the query, while news articles are contributions focused on a single subject.
Evaluation
Although topics coming from news articles may be limited, they benefit from the rich vocabulary of professional writers who are trained to avoid repetition.
news articles is mentioned in 4 sentences in this paper.
Topics mentioned in this paper:
Joty, Shafiq and Carenini, Giuseppe and Ng, Raymond and Mehdad, Yashar
Document-level Parsing Approaches
For example, this is true for 75% cases in our development set containing 20 news articles from RST-DT and for 79% cases in our development set containing 20 how-to-do manuals from the Instructional corpus.
Introduction
While previous approaches have been tested on only one corpus, we evaluate our approach on texts from two very different genres: news articles and instructional how-to-do manuals.
Related work
They evaluate their approach on the RST-DT corpus (Carlson et al., 2002) of news articles.
news articles is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Nakashole, Ndapandula and Mitchell, Tom M.
Fact Candidates
The first task was titled “Trustworthiness of News Articles”, where annotators were given a link to a news article and
Fact Candidates
The second task was titled “Objectivity of News Articles”.
Fact Candidates
We randomly selected 500 news articles from a corpus of about 300,000 news articles obtained from Google News from the topics of Top News, Business, Entertainment, and SciTech.
news articles is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Speriosu, Michael and Baldridge, Jason
Data
The TR-CONLL corpus (Leidner, 2008) contains 946 REUTERS news articles published in August 1996.
Error Analysis
An instance of California in a baseball-related news article is incorrectly predicted to be the town California, Pennsylvania.
Introduction
ically recorded travel costs on the shaping of empires (Scheidel et al., 2012), and systems that convey the geographic content in news articles (Teitler et al., 2008; Sankaranarayanan et al., 2009) and microblogs (Gelernter and Mushegian, 2011).
news articles is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Sauper, Christina and Barzilay, Regina
Introduction
Then, it produces a new article by selecting content from the Internet for each part of this template.
Method
Our model constructs a new article by following these two steps: Ranking First, we attempt to rank candidate excerpts based on how representative they are of each individual topic.
Rank(e_ij1 ... e_ijr, w_j)
We are continually submitting new articles; however, we report results on those that have at least a 6 month history at time of writing.
news articles is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Nastase, Vivi and Strapparava, Carlo
Cross Language Text Categorization
The data we work with consists of comparable corpora of news articles in English and Italian.
Cross Language Text Categorization
Each news article is annotated with one of the four categories: culture_and_school, tourism, quality_of_life, made_in_Italy.
Introduction
To test the usefulness of etymological information we work with comparable collections of news articles in English and Italian, whose articles are assigned one of four categories: culture_and_school, tourism, quality_of_life, made_in_Italy.
news articles is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Lee, Taesung and Hwang, Seung-won
Experiments
We processed news articles for an entire year in 2008 by Xinhua news who publishes news in both English and Chinese, which were also used by Kim et al.
Experiments
The English corpus consists of 100,746 news articles, and the Chinese corpus consists of 88,031 news articles.
Experiments
• D0: All news articles are used.
news articles is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Feng, Yansong and Lapata, Mirella
Conclusions
We have presented extractive and abstractive models that generate image captions for news articles.
Related Work
Instead of relying on manual annotation or background ontological information we exploit a multimodal database of news articles, images, and their captions.
Results
It is well known that news articles are written so that the lead contains the most important information in a story. This is an encouraging result as it highlights the importance of the visual information for the caption generation task.
news articles is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Zhao, Xin and Chen, Rishan and Fan, Kai and Yan, Hongfei and Li, Xiaoming
Evaluation
Since our major focus is to detect events from news articles, we only keep the web pages with keyword “news” in URL field.
Introduction
One standard way for that is to cluster news articles as events by following a two-step approach (Yang et al., 1998): 1) represent document as vectors and calculate similarities between documents; 2) run the clustering algorithm to obtain document clusters as events. Underlying text representation often plays a critical role in this approach, especially for long text streams.
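The two-step approach described in this excerpt can be illustrated with a small Python sketch. The excerpt does not name a specific representation or clustering algorithm, so the choices here are assumptions: raw bag-of-words counts, cosine similarity, and a simple single-pass clustering with a hypothetical `threshold` parameter.

```python
from collections import Counter
from math import sqrt


def vectorize(text):
    """Step 1a: represent a document as a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(u, v):
    """Step 1b: cosine similarity between two sparse count vectors."""
    dot = sum(u[t] * v[t] for t in u if t in v)
    norm = sqrt(sum(c * c for c in u.values())) * sqrt(sum(c * c for c in v.values()))
    return dot / norm if norm else 0.0


def cluster_events(docs, threshold=0.5):
    """Step 2: single-pass clustering; each cluster of document indices is one event.

    A document joins the first cluster whose seed document is similar enough,
    otherwise it starts a new cluster.
    """
    vectors = [vectorize(d) for d in docs]
    clusters = []
    for i, vec in enumerate(vectors):
        for cluster in clusters:
            if cosine(vec, vectors[cluster[0]]) >= threshold:
                cluster.append(i)
                break
        else:
            clusters.append([i])
    return clusters
```

With plain word counts, two articles about elections in different years share most of their vocabulary and could be merged into one event, which is one way to see why the underlying text representation matters.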
Introduction
D1 and D2 are news articles about U.S. presidential election respectively in years 2004 and 2008.
news articles is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Kessler, Rémy and Tannier, Xavier and Hagège, Caroline and Moriceau, Véronique and Bittar, André
Related Work
Important events are those reported in a large number of news articles and each event is constructed according to one single query and represented by a set of sentences.
Temporal and Linguistic Processing
In news articles, this is the DCT.
Temporal and Linguistic Processing
Figure 4 shows an example of an analyzed excerpt of a news article.
news articles is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Diao, Qiming and Jiang, Jing and Zhu, Feida and Lim, Ee-Peng
Abstract
Although bursty event detection from text streams has been studied before, previous work may not be suitable for microblogs because compared with other text streams such as news articles and scientific publications, microblog posts are particularly diverse and noisy.
Introduction
More importantly, their model was applied to news articles and scientific publications, where most documents follow the global topical trends.
Method
Unlike news articles from traditional media, which are mostly about current affairs, an important property of microblog posts is that many posts are about users’ personal encounters and interests rather than global events.
news articles is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Wang, Dong and Liu, Yang
Introduction
Summarization has been applied to different genres, such as news articles, scientific articles, and speech domains including broadcast news, meetings, conversations and lectures.
Opinion Summarization Methods
To obtain this, we trained a maximum entropy classifier with a bag-of-words model using a combination of data sets from several domains, including movie data (Pang and Lee, 2004), news articles from MPQA corpus (Wilson and Wiebe, 2003), and meeting transcripts from AMI corpus (Wilson, 2008a).
Related Work
Previous studies have used various domains, including news articles, scientific articles, web documents, and reviews.
news articles is mentioned in 3 sentences in this paper.
Topics mentioned in this paper: