Detecting Experiences from Weblogs
Park, Keun Chan and Jeong, Yoonjae and Myaeng, Sung Hyon

Article Structure

Abstract

Weblogs are a source of human activity knowledge comprising valuable information such as facts, opinions and personal experiences.

Introduction

In traditional philosophy, human beings are known to acquire knowledge mainly by reasoning and experience.

Lexicon Construction

Since our definition of experience is based on activities and events, it is critical to determine whether a sentence contains a predicate describing an activity or an event.

Experience Detection

As mentioned earlier, experience-revealing sentences tend to have a certain linguistic style.

Related Work

Experience mining in its entirety is a relatively new area where various natural language processing and text mining techniques can play a significant role.

Conclusion and Future Work

We defined experience detection as an essential task for experience mining, which is restated as

Topics

WordNet

Appears in 9 sentences as: WordNet (9)
In Detecting Experiences from Weblogs
  1. We consider all the verbs and verb phrases in WordNet (Fellbaum, 1998), which is the largest electronic lexical database.
    Page 2, “Lexicon Construction”
  2. Based on the query matrix in Table 2, we issued queries for all the verbs and verb phrases from WordNet to a search engine.
    Page 3, “Lexicon Construction”
  3. We finally trained our model with the top 10 features and classified all WordNet verbs and verb phrases.
    Page 5, “Lexicon Construction”
  4. We collected all hyponyms of the words “do” and “act” from WordNet (Fellbaum, 1998).
    Page 7, “Experience Detection”
  5. Lastly, we removed all the verbs that are under the hierarchy of “move” from WordNet.
    Page 7, “Experience Detection”
  6. In addition, the lexicon we constructed for the baseline (i.e., using WordNet) contains more errors than our activity lexicon for activity verbs.
    Page 7, “Experience Detection”
  7. Since our work is specifically geared toward domain-independent experience detection, we attempted to maximize the coverage by using all the verbs in WordNet, as opposed to the verbs appearing in a particular domain-specific corpus (e.g., medicine domain) as done in the previous work.
    Page 8, “Related Work”
  8. For verb classes, in particular, we devised a method for classifying all the verbs and verb phrases in WordNet into the activity and state classes.
    Page 8, “Conclusion and Future Work”
  9. The experimental results show that the verb and verb phrase classification method is reasonably accurate, with 91% precision and 78% recall on a manually constructed gold standard of 80 verbs and 82% accuracy on a random sample of all the WordNet entries.
    Page 8, “Conclusion and Future Work”
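
The baseline lexicon described in sentences 4 and 5 above (collect the hyponyms of “do” and “act”, then drop everything under “move”) can be sketched as a graph traversal. This is a minimal illustration, not the authors' code: the `TOY_HYPONYMS` map is a hypothetical stand-in for the WordNet verb hierarchy, the function names are invented here, and removing “move” itself along with its hyponyms is an assumption about how the exclusion was applied.

```python
from collections import deque

# Hypothetical toy hierarchy: verb -> direct hyponyms.
# It stands in for the WordNet verb hierarchy and does not mirror real WordNet.
TOY_HYPONYMS = {
    "do": ["perform", "practice"],
    "act": ["move", "play"],
    "move": ["run", "walk"],
    "perform": ["sing"],
    "play": ["gamble"],
}


def all_hyponyms(root, hyponyms):
    """Return every verb reachable below `root` (transitive hyponyms)."""
    seen = set()
    queue = deque(hyponyms.get(root, []))
    while queue:
        verb = queue.popleft()
        if verb in seen:
            continue
        seen.add(verb)
        queue.extend(hyponyms.get(verb, []))
    return seen


def baseline_activity_lexicon(hyponyms, roots=("do", "act"), excluded_root="move"):
    """Collect hyponyms of the root verbs, then remove the excluded subtree."""
    collected = set()
    for root in roots:
        collected |= all_hyponyms(root, hyponyms)
    # Assumption: the excluded verb itself is dropped together with its hyponyms.
    excluded = all_hyponyms(excluded_root, hyponyms) | {excluded_root}
    return collected - excluded


lexicon = baseline_activity_lexicon(TOY_HYPONYMS)
print(sorted(lexicon))
```

With a real lexical database, only the `hyponyms` lookup would change; the traversal and the set difference stay the same.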

See all papers in Proc. ACL 2010 that mention WordNet.


precision and recall

Appears in 6 sentences as: precision and recall (6)
In Detecting Experiences from Weblogs
  1. Note that the precision and recall are macro-averaged values across the two classes, activity and state.
    Page 5, “Lexicon Construction”
  2. We not only compared our results with the baseline in terms of precision and recall but also
    Page 7, “Experience Detection”
  3. The performance for the best case with all the features included is very promising, close to 92% precision and recall.
    Page 7, “Experience Detection”
  4. In order to see the effect of including individual features in the feature set, precision and recall were measured after eliminating a particular feature from the full set.
    Page 7, “Experience Detection”
  5. Although the absence of the lexicon feature hurt the performance the most, the performance was still reasonably high (roughly 84% in precision and recall for the Logistic Regression case).
    Page 7, “Experience Detection”
  6. For experience detection, the performance was very promising, close to 92% in precision and recall when all the features were used.
    Page 8, “Conclusion and Future Work”
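
Sentence 1 above notes that precision and recall are macro-averaged across the two classes, activity and state. As a rough sketch of what that means: compute precision and recall separately for each class, then take the unweighted mean. The function name and the four-item label lists below are made up for illustration and are not data from the paper.

```python
def macro_precision_recall(gold, predicted, classes=("activity", "state")):
    """Average per-class precision and recall with equal weight per class."""
    precisions, recalls = [], []
    pairs = list(zip(gold, predicted))
    for cls in classes:
        tp = sum(1 for g, p in pairs if g == cls and p == cls)
        fp = sum(1 for g, p in pairs if g != cls and p == cls)
        fn = sum(1 for g, p in pairs if g == cls and p != cls)
        # Guard against classes that are never predicted or never occur.
        precisions.append(tp / (tp + fp) if tp + fp else 0.0)
        recalls.append(tp / (tp + fn) if tp + fn else 0.0)
    return sum(precisions) / len(classes), sum(recalls) / len(classes)


# Invented toy labels, only to exercise the function.
gold = ["activity", "activity", "activity", "state"]
predicted = ["activity", "activity", "state", "state"]
macro_p, macro_r = macro_precision_recall(gold, predicted)
print(f"macro precision={macro_p:.3f}, macro recall={macro_r:.3f}")
```

Because each class counts equally, the smaller class influences the macro average as much as the larger one, unlike a micro average that pools all decisions.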


SVM

Appears in 5 sentences as: SVM (6)
In Detecting Experiences from Weblogs
  1. ME SVM Prec.
    Page 5, “Lexicon Construction”
  2. While we tested several classifiers, we chose to use two different classifiers based on SVM and Logistic Regression for the final experimental results because they showed the best performance.
    Page 6, “Experience Detection”
  3. Feature | Logistic Regression | SVM
    Page 7, “Experience Detection”
  4. Feature | Logistic Regression | SVM
    Page 7, “Experience Detection”
  5. There is almost no difference between the Logistic Regression and SVM classifiers for our methods, although SVM was inferior for the baseline.
    Page 7, “Experience Detection”


classification task

Appears in 4 sentences as: classification task (3) classification tasks (1)
In Detecting Experiences from Weblogs
  1. Based on an observation that experience-revealing sentences have a certain linguistic style, we formulate the problem of detecting experience as a classification task using various features including tense, mood, aspect, modality, experiencer, and verb classes.
    Page 1, “Abstract”
  2. the problem as a classification task using various linguistic features including tense, mood, aspect, modality, experiencer, and verb classes.
    Page 2, “Introduction”
  3. The other one is based on Support Vector Machine (Chang and Lin, 2001), which is the state-of-the-art algorithm for many classification tasks.
    Page 5, “Lexicon Construction”
  4. Having converted the problem of experience detection for sentences to a classification task, we focus on the extent to which various linguistic features contribute to the performance of the binary classifier for sentences.
    Page 6, “Experience Detection”


dependency parser

Appears in 4 sentences as: dependency parser (4)
In Detecting Experiences from Weblogs
  1. For a POS and grammatical check of a candidate sentence, we used the Stanford POS tagger (Toutanova et al., 2003) and the Stanford dependency parser (Klein and Manning, 2003).
    Page 4, “Lexicon Construction”
  2. The dependency parser is used to ensure a modal marker is indeed associated with the main predicate.
    Page 6, “Experience Detection”
  3. In order to make a distinction, we use the dependency parser and a named-entity recognizer (Finkel et al., 2005) that can recognize person pronouns and person names.
    Page 6, “Experience Detection”
  4. We used the dependency parser for extracting objective cases using the direct object relation.
    Page 7, “Experience Detection”
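
Sentences 2 and 4 above use dependency relations for two checks: that a modal marker attaches to the main predicate, and which word is the direct object. The sketch below illustrates both checks on hand-written dependency triples. It does not call any parser: the `(relation, head, dependent)` tuple layout, the `MODALS` set, and the function names are assumptions made for this example, with relation labels in the style of Stanford typed dependencies (`root`, `aux`, `dobj`). Words are plain strings without token indices, which is a simplification.

```python
# Assumed closed set of English modal auxiliaries for this illustration.
MODALS = {"can", "could", "may", "might", "must", "shall", "should", "will", "would"}


def main_predicate(deps):
    """Return the dependent of the `root` relation, or None if absent."""
    for rel, _head, dep in deps:
        if rel == "root":
            return dep
    return None


def modal_on_main_predicate(deps):
    """True if some modal auxiliary is attached to the main predicate."""
    pred = main_predicate(deps)
    return any(
        rel == "aux" and head == pred and dep.lower() in MODALS
        for rel, head, dep in deps
    )


def direct_objects(deps):
    """Return the direct objects of the main predicate."""
    pred = main_predicate(deps)
    return [dep for rel, head, dep in deps if rel == "dobj" and head == pred]


# "I might visit Seoul." (hand-written triples, not parser output)
modal_sentence = [
    ("root", "ROOT", "visit"),
    ("nsubj", "visit", "I"),
    ("aux", "visit", "might"),
    ("dobj", "visit", "Seoul"),
]

# "I visited the museum that you should see." The modal sits in the
# relative clause, so it is not attached to the main predicate.
embedded_modal_sentence = [
    ("root", "ROOT", "visited"),
    ("nsubj", "visited", "I"),
    ("dobj", "visited", "museum"),
    ("rcmod", "museum", "see"),
    ("aux", "see", "should"),
]

print(modal_on_main_predicate(modal_sentence), direct_objects(modal_sentence))
print(modal_on_main_predicate(embedded_modal_sentence))
```

The second sentence shows why the attachment check matters: a keyword match on “should” alone would flag the sentence even though the main predicate is a plain past-tense verb.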


data sparseness

Appears in 3 sentences as: Data sparseness (1) data sparseness (2)
In Detecting Experiences from Weblogs
  1. Other thematic roles did not perform well because of the data sparseness.
    Page 5, “Lexicon Construction”
  2. Data sparseness affected the linguistic schemata as well.
    Page 5, “Lexicon Construction”
  3. In order to increase the coverage even further and reduce the errors in lexicon construction, i.e., verb classification, caused by data sparseness, we need to devise a different method, perhaps using domain-specific resources.
    Page 8, “Conclusion and Future Work”


randomly sampled

Appears in 3 sentences as: random sample (1) randomly sampled (2)
In Detecting Experiences from Weblogs
  1. We randomly sampled 200 items and examined how accurately the classification was done.
    Page 5, “Lexicon Construction”
  2. We randomly sampled 1,000 sentences and asked three annotators to judge whether or not individual sentences contain an experience based on our definition.
    Page 6, “Experience Detection”
  3. The experimental results show that the verb and verb phrase classification method is reasonably accurate, with 91% precision and 78% recall on a manually constructed gold standard of 80 verbs and 82% accuracy on a random sample of all the WordNet entries.
    Page 8, “Conclusion and Future Work”
