You Had Me at Hello: How Phrasing Affects Memorability
Danescu-Niculescu-Mizil, Cristian and Cheng, Justin and Kleinberg, Jon and Lee, Lillian

Article Structure

Abstract

Understanding the ways in which information achieves widespread public awareness is a research question of significant interest.

Hello. My name is Inigo Montoya.

Understanding what items will be retained in the public consciousness, and why, is a question of fundamental interest in many domains, including marketing, politics, entertainment, and social media; as we all know, many items barely register, whereas others catch on and take hold in many people’s minds.

I’m ready for my closeup.

2.1 Data

Never send a human to do a machine’s job.

We now discuss experiments that investigate the hypotheses discussed in §1.

A long time ago, in a galaxy far, far away

How an item’s linguistic form affects the reaction it generates has been studied in several contexts, including evaluations of product reviews [9], political speeches [12], online posts [13], scientific papers [14], and retweeting of Twitter posts [36].

I think this is the beginning of a beautiful friendship.

Motivated by the broad question of what kinds of information achieve widespread public awareness, we studied the effect of phrasing on a quote’s memorability.

Topics

language model

Appears in 13 sentences as: language model (6) language models (4) language models’ (1) language” model (2) language” models (2)
In You Had Me at Hello: How Phrasing Affects Memorability
  1. First, we show a concrete sense in which memorable quotes are indeed distinctive: with respect to lexical language models trained on the newswire portions of the Brown corpus [21], memorable quotes have significantly lower likelihood than their non-memorable counterparts.
    Page 2, “Hello. My name is Inigo Montoya.”
  2. In particular, we analyze a corpus of advertising slogans, and we show that these slogans have significantly greater likelihood at both the word level and the part-of-speech level with respect to a language model trained on memorable movie quotes, compared to a corresponding language model trained on non-memorable movie quotes.
    Page 3, “Hello. My name is Inigo Montoya.”
  3. In order to assess different levels of lexical and syntactic distinctiveness, we employ a total of six Laplace-smoothed language models: 1-gram, 2-gram, and 3-gram word LMs and 1-gram, 2-gram, and 3-gram part-of-speech LMs.
    Page 5, “Never send a human to do a machine’s job.”
  4. As indicated in Table 3, for each of our lexical “common language” models, in about 60% of the quote pairs, the memorable quote is more distinctive.
    Page 5, “Never send a human to do a machine’s job.”
  5. The language models’ vocabulary was that of the entire training corpus.
    Page 5, “Never send a human to do a machine’s job.”
  6. Table 3: Distinctiveness: percentage of quote pairs in which the memorable quote is more distinctive than the non-memorable one according to the respective “common language” model.
    Page 6, “Never send a human to do a machine’s job.”
  7. Specifically, we train one language model on memorable quotes and another on non-memorable quotes
    Page 6, “Never send a human to do a machine’s job.”
  8. (Non)memorable language models    Slogans      Newswire
     lexical    1-gram                 56.15%**     33.77%***
     lexical    2-gram                 51.51%       25.15%***
     lexical    3-gram                 52.44%       28.89%***
     syntactic  1-gram                 73.09%***    68.27%***
     syntactic  2-gram                 64.04%***    50.21%
     syntactic  3-gram                 62.88%***    55.09%***
    Page 7, “Never send a human to do a machine’s job.”
  9. Table 5: Cross-domain concept of “memorable” language: percentage of slogans that have higher likelihood under the memorable language model than under the non-memorable one (for each of the six language models considered).
    Page 7, “Never send a human to do a machine’s job.”
  10. Rightmost column: for reference, the percentage of newswire sentences that have higher likelihood under the memorable language model than under the non-memorable one.
    Page 7, “Never send a human to do a machine’s job.”
  11. We also note that the higher likelihood of slogans under a “memorable language” model is not simply occurring for the trivial reason that this model predicts all other large bodies of text better.
    Page 7, “Never send a human to do a machine’s job.”
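
The Laplace-smoothed n-gram models of item 3 and the pairwise distinctiveness comparison of items 4 and 6 can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: whitespace tokenization, the `<s>` and `</s>` boundary symbols, the extra vocabulary slot reserved for unseen words, and the toy sentences are all assumptions, and the names `LaplaceNgramLM` and `share_more_distinctive` are hypothetical.

```python
import math
from collections import Counter


class LaplaceNgramLM:
    """Add-one (Laplace) smoothed n-gram model over whitespace tokens."""

    def __init__(self, n, sentences):
        self.n = n
        self.ngrams = Counter()    # counts of (history + word) tuples
        self.contexts = Counter()  # counts of history tuples
        self.vocab = set()         # word types seen in training, plus </s>
        for sentence in sentences:
            tokens = self._pad(sentence)
            for i in range(n - 1, len(tokens)):
                history = tuple(tokens[i - n + 1:i])
                self.ngrams[history + (tokens[i],)] += 1
                self.contexts[history] += 1
                self.vocab.add(tokens[i])

    def _pad(self, sentence):
        # Assumption: lowercase whitespace tokens with boundary symbols.
        return ["<s>"] * (self.n - 1) + sentence.lower().split() + ["</s>"]

    def log_likelihood(self, sentence):
        """Sum of log P(word | history) with add-one smoothing."""
        size = len(self.vocab) + 1  # assumption: one extra slot for unseen words
        tokens = self._pad(sentence)
        total = 0.0
        for i in range(self.n - 1, len(tokens)):
            history = tuple(tokens[i - self.n + 1:i])
            count = self.ngrams[history + (tokens[i],)]
            total += math.log((count + 1) / (self.contexts[history] + size))
        return total


def share_more_distinctive(lm, pairs):
    """Fraction of (memorable, non_memorable) pairs in which the memorable
    quote has the lower likelihood under the given model."""
    wins = sum(lm.log_likelihood(m) < lm.log_likelihood(n) for m, n in pairs)
    return wins / len(pairs)


# Toy stand-ins for the "common language" corpus and one quote pair (made up).
COMMON = [
    "the man went to the store",
    "the woman went to the office",
    "he said the market was up",
]
PAIRS = [("never send a human to the store", "the man went to the office")]

bigram_lm = LaplaceNgramLM(2, COMMON)
print(share_more_distinctive(bigram_lm, PAIRS))
```

The same class can also be trained once on memorable and once on non-memorable quotes; comparing `log_likelihood` of a slogan under the two models then gives the kind of cross-domain comparison that Table 5 reports. The sketch sums log-probabilities without length normalization, which is only reasonable when the two quotes in a pair have similar length.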

See all papers in Proc. ACL 2012 that mention language model.

bag-of-words

Appears in 5 sentences as: bag-of-words (5)
In You Had Me at Hello: How Phrasing Affects Memorability
  1. Our first formulation of the prediction task uses a standard bag-of-words model.
    Page 7, “Never send a human to do a machine’s job.”
  2. If there were no information in the textual content of a quote to determine whether it were memorable, then an SVM employing bag-of-words features should perform no better than chance.
    Page 7, “Never send a human to do a machine’s job.”
  3. Even a relatively small number of distinctiveness features, on their own, improve significantly over the much larger bag-of-words model.
    Page 7, “Never send a human to do a machine’s job.”
  4. Thus, the main conclusion from these prediction tasks is that abstracting notions such as distinctiveness and generality can produce relatively streamlined models that outperform much heavier-weight bag-of-words models, and can suggest steps toward approaching the performance of human judges who — very much unlike our system — have the full cultural context in which movies occur at their disposal.
    Page 7, “Never send a human to do a machine’s job.”
  5. Accuracies statistically significantly greater than bag-of-words according to a two-tailed t-test are indicated with *(p<.05) and **(p<.01).
    Page 8, “Never send a human to do a machine’s job.”
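
A bag-of-words representation of a quote pair can be sketched as below. The excerpts do not say how a pair is encoded, so the count-difference encoding, the whitespace tokenization, and the function names `bag_of_words` and `pair_features` are assumptions for illustration; the second quote is a made-up contrast. A sparse vector of this kind is what a linear classifier such as the SVM mentioned in item 2 would consume.

```python
from collections import Counter


def bag_of_words(quote):
    """Map a quote to lowercase token counts, discarding word order."""
    return Counter(quote.lower().split())


def pair_features(first, second):
    """Hypothetical pair encoding: per-word count in `first` minus count
    in `second`, keeping only words whose counts differ."""
    a, b = bag_of_words(first), bag_of_words(second)
    return {w: a[w] - b[w] for w in sorted(set(a) | set(b)) if a[w] != b[w]}


features = pair_features("you had me at hello", "you got me at the door")
print(features)
```

Words shared equally by both quotes cancel out, so the classifier only sees what differs in the wording, and swapping the order of the pair flips the sign of every feature.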

part-of-speech

Appears in 4 sentences as: part-of-speech (5)
In You Had Me at Hello: How Phrasing Affects Memorability
  1. Interestingly, this distinctiveness takes place at the level of words, but not at the level of other syntactic features: the part-of-speech composition of memorable quotes is in fact more likely with respect to newswire.
    Page 2, “Hello. My name is Inigo Montoya.”
  2. Thus, we can think of memorable quotes as consisting, in an aggregate sense, of unusual word choices built on a scaffolding of common part-of-speech patterns.
    Page 2, “Hello. My name is Inigo Montoya.”
  3. In particular, we analyze a corpus of advertising slogans, and we show that these slogans have significantly greater likelihood at both the word level and the part-of-speech level with respect to a language model trained on memorable movie quotes, compared to a corresponding language model trained on non-memorable movie quotes.
    Page 3, “Hello. My name is Inigo Montoya.”
  4. We then develop models using features based on the measures formulated earlier in this section: generality measures (the four listed in Table 4); distinctiveness measures (likelihood according to 1-, 2-, and 3-gram “common language” models at the lexical and part-of-speech level for each quote in the pair, their differences, and pairwise comparisons between them); and similarity-to-slogans measures (likelihood according to 1-, 2-, and 3-gram slogan-language models at the lexical and part-of-speech level for each quote in the pair, their differences, and pairwise comparisons between them).
    Page 7, “Never send a human to do a machine’s job.”
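
The distinctiveness features described in item 4 can be sketched as follows. This is one possible reading, not the authors' feature set: treating a "pairwise comparison" as a binary indicator is an assumption, the model names (`lex-1gram` and so on) and the function `distinctiveness_features` are hypothetical, and the log-likelihood values are invented placeholders rather than results.

```python
def distinctiveness_features(scores_first, scores_second):
    """Build per-model features for a quote pair.

    Each argument maps a model name to that quote's log-likelihood under
    the corresponding "common language" model.
    """
    features = {}
    for name in sorted(scores_first):
        a, b = scores_first[name], scores_second[name]
        features[name + ":first"] = a           # likelihood of first quote
        features[name + ":second"] = b          # likelihood of second quote
        features[name + ":diff"] = a - b        # their difference
        # Assumed form of the pairwise comparison: 1 if first is less likely.
        features[name + ":first_lower"] = 1.0 if a < b else 0.0
    return features


MODELS = ["lex-1gram", "lex-2gram", "lex-3gram",
          "pos-1gram", "pos-2gram", "pos-3gram"]
# Placeholder scores for illustration only.
first = {name: -12.0 for name in MODELS}
second = {name: -10.0 for name in MODELS}

pair_vector = distinctiveness_features(first, second)
print(len(pair_vector))
```

With six models and four features per model, the vector stays small, which matches the excerpts' description of these models as streamlined next to the bag-of-words baseline.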

human judgments

Appears in 3 sentences as: human judges (1) human judgments (2)
In You Had Me at Hello: How Phrasing Affects Memorability
  1. None of these observations, however, serve as definitions, and indeed, we believe it desirable to not pre-commit to an abstract definition, but rather to adopt an operational formulation based on external human judgments.
    Page 2, “Hello. My name is Inigo Montoya.”
  2. In designing our study, we focus on a domain in which (i) there is rich use of language, some of which has achieved deep cultural penetration; (ii) there already exist a large number of external human judgments — perhaps implicit, but in a form we can extract; and (iii) we can control for the setting in which the text was used.
    Page 2, “Hello. My name is Inigo Montoya.”
  3. Thus, the main conclusion from these prediction tasks is that abstracting notions such as distinctiveness and generality can produce relatively streamlined models that outperform much heavier-weight bag-of-words models, and can suggest steps toward approaching the performance of human judges who — very much unlike our system — have the full cultural context in which movies occur at their disposal.
    Page 7, “Never send a human to do a machine’s job.”

model trained

Appears in 3 sentences as: model trained (3) models trained (1)
In You Had Me at Hello: How Phrasing Affects Memorability
  1. First, we show a concrete sense in which memorable quotes are indeed distinctive: with respect to lexical language models trained on the newswire portions of the Brown corpus [21], memorable quotes have significantly lower likelihood than their non-memorable counterparts.
    Page 2, “Hello. My name is Inigo Montoya.”
  2. In particular, we analyze a corpus of advertising slogans, and we show that these slogans have significantly greater likelihood at both the word level and the part-of-speech level with respect to a language model trained on memorable movie quotes, compared to a corresponding language model trained on non-memorable movie quotes.
    Page 3, “Hello. My name is Inigo Montoya.”
  3. In particular, the newswire section of the Brown corpus is predicted better at the lexical level by the language model trained on non-memorable quotes.
    Page 7, “Never send a human to do a machine’s job.”

statistically significant

Appears in 3 sentences as: statistically significant (2) statistically significantly (1)
In You Had Me at Hello: How Phrasing Affects Memorability
  1. For the null hypothesis of random guessing, these results are statistically significant, p < 2^-6 ≈ .016.
    Page 4, “I’m ready for my closeup.”
  2. Table 2 shows that all the subjects performed (sometimes much) better than chance, and against the null hypothesis that all subjects are guessing randomly, the results are statistically significant, p < 2^-6 ≈ .016.
    Page 4, “I’m ready for my closeup.”
  3. Accuracies statistically significantly greater than bag-of-words according to a two-tailed t-test are indicated with *(p<.05) and **(p<.01).
    Page 8, “Never send a human to do a machine’s job.”
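
The bound in items 1 and 2, p < 2^-6 ≈ .016, can be checked with a line of arithmetic. The excerpts do not explain where the exponent comes from; reading it as six independent events that each succeed with probability at most 1/2 under random guessing (for example, six subjects each landing above chance) is an assumption made here for illustration, and `n_events` is a hypothetical name.

```python
# Assumption: six independent outcomes, each with probability at most 1/2
# under the null hypothesis of random guessing.
n_events = 6
p_bound = 0.5 ** n_events

print(p_bound, round(p_bound, 3))
```

The exact value is 1/64, which rounds to the .016 quoted in the excerpts.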
