Unsupervised Solution Post Identification from Discussion Forums
P, Deepak and Visweswariah, Karthik

Article Structure

Abstract

Discussion forums have evolved into a dependable source of knowledge to solve common problems.

Introduction

Discussion forums have become a popular knowledge source for finding solutions to common problems.

Related Work

In this section, we provide a brief overview of previous work related to our problem.

Problem Definition

Let a thread T from a discussion forum be made up of n posts.

Our Approach

4.1 The Correlation Assumption

Experimental Evaluation

We use a crawl of 140k threads from Apple Discussion forums.

Conclusions and Future Work

We considered the problem of unsupervised solution post identification from discussion forum threads.

Topics

translation models

Appears in 24 sentences as: translation model (9) Translation models (1) translation models (17)
In Unsupervised Solution Post Identification from Discussion Forums
  1. We use translation models and language models to exploit lexical correlations and solution post character respectively.
    Page 1, “Abstract”
  2. We model the lexical correlation and solution post character using regularized translation models and unigram language models respectively.
    Page 2, “Introduction”
  3. Usage of translation models for modeling the correlation between textual problems and solutions has been explored earlier, starting from the answer retrieval work in (Xue et al., 2008), where new queries were conceptually expanded using the translation model to improve retrieval.
    Page 3, “Related Work”
  4. Translation models were also seen to be useful in segmenting incident reports into the problem and solution parts (Deepak et al., 2012); we will use an adaptation of the generative model presented therein, for our solution extraction formulation.
    Page 3, “Related Work”
  5. Entity-level translation models
    Page 3, “Related Work”
  6. The usage of translation models in QA retrieval (Xue et al., 2008; Singh, 2012) and segmentation (Deepak et al., 2012) were also motivated by the correlation assumption.
    Page 4, “Our Approach”
  7. We use an IBM Model 1 translation model (Brown et al., 1990) in our technique; simplistically, such a model m may be thought of as a 2-d associative array where the value m[w1][w2] is directly related to the probability of w1 occurring in the problem when w2 occurs in the solution.
    Page 4, “Our Approach”
  8. Consider a unigram language model S that models the lexical characteristics of solution posts, and a translation model T that models the lexical correlation between problems and solutions.
    Page 4, “Our Approach”
  9. In short, each solution word is assumed to be generated from the language model or the translation model (conditioned on the problem words) with a probability of λ and 1 − λ respectively, thus accounting for the correlation assumption.
    Page 4, “Our Approach”
  10. Of the solution words above, generic words such as try and should could probably be explained by (i.e., sampled from) the solution language model, whereas disconnect and rejoin could be correlated well with surf and wifi and hence are more likely to be supported better by the translation model .
    Page 4, “Our Approach”
  11. F((p, r), S, T) indicates the conformance of the (p, r) pair (details in Section 4.3.1) with the generative model that uses the S and T models as the language and translation models respectively.
    Page 4, “Our Approach”
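The λ-mixture described in the sentences above (each solution word drawn from the solution language model with probability λ, or from an IBM Model 1 style translation model conditioned on the problem words with probability 1 − λ) can be sketched as follows. This is an illustrative reconstruction, not the authors' code; the function name and the toy `lm`/`tm` tables are assumptions.

```python
# Sketch of the lambda-mixture likelihood: each reply word is generated by
# the solution language model S with probability lam, or by an IBM Model 1
# style translation model T (averaged over problem words) with probability
# 1 - lam. All names and probability values are illustrative.

def reply_likelihood(problem_words, reply_words, lm, tm, lam=0.5):
    """P(reply | problem) under the lambda-mixture of S and T."""
    score = 1.0
    for w in reply_words:
        # S component: unigram probability of w among solution posts.
        p_lm = lm.get(w, 1e-9)
        # T component: average translation probability of w given each
        # problem word, as in IBM Model 1.
        p_tm = sum(tm.get((p, w), 1e-9) for p in problem_words) / len(problem_words)
        score *= lam * p_lm + (1 - lam) * p_tm
    return score

# Toy models for the wifi example from the text: generic words carry LM
# mass, correlated words carry TM mass.
lm = {"try": 0.3, "should": 0.2, "disconnect": 0.01, "rejoin": 0.01}
tm = {("wifi", "disconnect"): 0.4, ("wifi", "rejoin"): 0.3,
      ("surf", "disconnect"): 0.2, ("surf", "rejoin"): 0.2}

s = reply_likelihood(["surf", "wifi"], ["try", "disconnect", "rejoin"], lm, tm)
```

Setting `lam=1` recovers the pure language-model score, and `lam=0` the pure translation-model score, matching the pure-LM and pure-TM settings discussed in the evaluation.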

See all papers in Proc. ACL 2014 that mention translation models.


language models

Appears in 14 sentences as: Language Model (1) language model (5) language models (8)
  1. We use translation models and language models to exploit lexical correlations and solution post character respectively.
    Page 1, “Abstract”
  2. The cornerstone of our technique is the usage of a hitherto unexplored textual feature, lexical correlations between problems and solutions, that is exploited along with language model based characterization of solution posts.
    Page 2, “Introduction”
  3. We model the lexical correlation and solution post character using regularized translation models and unigram language models respectively.
    Page 2, “Introduction”
  4. We will use translation and language models in our method for solution identification.
    Page 3, “Related Work”
  5. Consider a unigram language model S that models the lexical characteristics of solution posts, and a translation model T that models the lexical correlation between problems and solutions.
    Page 4, “Our Approach”
  6. In short, each solution word is assumed to be generated from the language model or the translation model (conditioned on the problem words) with a probability of λ and 1 − λ respectively, thus accounting for the correlation assumption.
    Page 4, “Our Approach”
  7. Of the solution words above, generic words such as try and should could probably be explained by (i.e., sampled from) the solution language model, whereas disconnect and rejoin could be correlated well with surf and wifi and hence are more likely to be supported better by the translation model.
    Page 4, “Our Approach”
  8. In our example, if the word disconnect is assigned a source probability of 0.9 and 0.1 for the translation and language models respectively, the virtual document-pair from (p, r) that goes into the training of the respective T model would assume that disconnect occurs in r with a frequency of 0.9; similarly, the respective S would account for disconnect with a frequency of 0.1.
    Page 6, “Our Approach”
  9. The language models are learnt only over the r parts of the (p, r) pairs since they are meant to characterize reply behavior; on the other hand, translation models learn over both p and r parts to model correlation.
    Page 6, “Our Approach”
  10. Consider the post and reply vocabularies to be of sizes A and B respectively; then, the translation model would have A × B variables, whereas the unigram language model has only B variables.
    Page 6, “Our Approach”
  11. This gives the translation model an implicit edge due to having more parameters to tune to the data, putting the language models at a disadvantage.
    Page 6, “Our Approach”
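The fractional-count idea in sentence 8 above (splitting each reply word's unit count between the translation model and the language model according to its source probability) can be sketched as below. The function name, smoothing constants, and toy tables are assumptions for illustration, not details from the paper.

```python
# Sketch of the source-probability split: a reply word's unit count is
# divided between the translation model T and the language model S in
# proportion to how well each (weighted by lam) explains the word. The
# resulting weights become fractional frequencies for re-training.

def source_split(word, problem_words, lm, tm, lam):
    """Return (weight_tm, weight_lm): the fraction of this word's unit
    count credited to the translation model vs. the language model."""
    p_lm = lam * lm.get(word, 1e-9)
    p_tm = (1 - lam) * sum(tm.get((p, word), 1e-9)
                           for p in problem_words) / len(problem_words)
    z = p_lm + p_tm
    return p_tm / z, p_lm / z

# Toy tables: "disconnect" is rare in generic replies but strongly
# correlated with the problem words "surf" and "wifi".
lm = {"disconnect": 0.01, "try": 0.3}
tm = {("wifi", "disconnect"): 0.5, ("surf", "disconnect"): 0.3}

w_tm, w_lm = source_split("disconnect", ["surf", "wifi"], lm, tm, lam=0.5)
```

With these toy numbers, most of the count for "disconnect" goes to the translation model, mirroring the 0.9/0.1 split used as the running example in the text.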


F-measure

Appears in 8 sentences as: F-Measure (2) F-measure (8)
  1. leading to an F-measure of 53% for our initialization heuristic.
    Page 7, “Experimental Evaluation”
  2. on various quality metrics, of which F-Measure is typically considered most important.
    Page 8, “Experimental Evaluation”
  3. Our pure-LM setting (i.e., λ = 1) was seen to perform up to 6 F-Measure points better than the pure-TM setting (i.e., λ = 0), whereas the uniform mix is seen to be able to harness both to give a 1.4 point (i.e., 2.2%) improvement over the pure-LM case.
    Page 8, “Experimental Evaluation”
  4. The comparison with the approach from (Cong et al., 2008) illustrates that our method is very clearly the superior method for solution identification, outperforming the former by large margins on all the evaluation measures, with the improvement on F-measure being more than 25 points.
    Page 8, “Experimental Evaluation”
  5. Our technique is seen to outperform ANS CT by a respectable margin (8.6 F-measure points) while trailing behind the enhanced ANS-ACK PCT method with a reasonably narrow 3.8 F-measure point margin.
    Page 8, “Experimental Evaluation”
  6. Figure 1 plots the F-measure across iterations for the run with the λ = 0.5, τ = 0.4 setting, where the F-measure is seen to stabilize in as few as 4-5 iterations.
    Page 8, “Experimental Evaluation”
  7. As may be seen from Figure 2, the quality of the results as measured by the F-measure is seen to peak around the middle (i.e., λ = 0.5), and decline slowly towards either extreme, with a sharp decline at λ = 0 (i.e., pure-TM setting).
    Page 9, “Experimental Evaluation”
  8. Beyond 800 threads, the F-measure was seen to flatten out rapidly and stabilize at ~64%.
    Page 9, “Experimental Evaluation”
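For reference, the F-measure quoted throughout these sentences is the harmonic mean of precision and recall over identified solution posts. A minimal computation sketch, with hypothetical counts chosen only to show the arithmetic:

```python
# F-measure (F1): harmonic mean of precision and recall.

def f_measure(true_pos, false_pos, false_neg):
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    return 2 * precision * recall / (precision + recall)

# Hypothetical counts, not from the paper's experiments:
# 64 correctly identified solutions, 36 false positives, 36 misses
# give precision = recall = F-measure = 0.64.
score = f_measure(64, 36, 36)
```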


structural features

Appears in 7 sentences as: structural feature (2) structural features (6)
  1. Our technique is designed to not rely much on structural features such as post metadata since such features are often not uniformly available across forums.
    Page 1, “Abstract”
  2. Though such assumptions on structural features, if generic enough, may be built into unsupervised techniques to aid solution identification, the variation in availability of such features across forums limits the usage of models that rely heavily on structural features.
    Page 2, “Introduction”
  3. In particular, we show that by using post position as the only non-textual feature, we are able to achieve accuracies comparable to supervision-based approaches that use many structural features (Catherine et al., 2013).
    Page 2, “Introduction”
  4. Towards this, we make use of a structural feature; in particular, adapting the hypothesis that solutions occur in the first N posts (Ref.
    Page 6, “Our Approach”
  5. We will show that we are able to effectively perform solution identification using our approach by exploiting just one structural feature, the post position, as above.
    Page 7, “Our Approach”
  6. Thus, our technique is able to exploit any extra solution-identifying structural features that are available.
    Page 9, “Experimental Evaluation”
  7. We show that our technique is able to effectively identify solutions using just one non-content based feature, the post position, whereas previous techniques in literature have depended heavily on structural features (that are not always available in many forums) and supervised information.
    Page 9, “Conclusions and Future Work”
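The single structural feature the sentences above rely on, post position, reduces to a simple candidate-construction rule: treat the first post as the problem and only the first N replies as candidate solutions. A sketch under that reading; the function name and the choice N = 3 are assumptions, not the paper's values.

```python
# Post-position heuristic: the first post states the problem, and
# candidate solutions are drawn only from the first N replies.

def candidate_solutions(thread_posts, first_n=3):
    """Return (problem, candidate_reply) pairs using only post position."""
    problem, replies = thread_posts[0], thread_posts[1:]
    return [(problem, r) for r in replies[:first_n]]

# Toy thread: one problem post followed by four replies.
pairs = candidate_solutions(["p", "r1", "r2", "r3", "r4"], first_n=3)
```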


generative model

Appears in 6 sentences as: Generative model (1) generative model (4) generative model: (1)
  1. Translation models were also seen to be useful in segmenting incident reports into the problem and solution parts (Deepak et al., 2012); we will use an adaptation of the generative model presented therein, for our solution extraction formulation.
    Page 3, “Related Work”
  2. 4.2 Generative model for Solution Posts
    Page 4, “Our Approach”
  3. Our generative model models the reply part of a (p, r) pair (in which r is a solution) as being generated from the statistical models in {S, T} as follows.
    Page 4, “Our Approach”
  4. The generative model above is similar to the proposal in (Deepak et al., 2012), adapted suitably for our scenario.
    Page 4, “Our Approach”
  5. F((p, r), S, T) indicates the conformance of the (p, r) pair (details in Section 4.3.1) with the generative model that uses the S and T models as the language and translation models respectively.
    Page 4, “Our Approach”
  6. falls out of the generative model:
    Page 5, “Our Approach”


unigram

Appears in 4 sentences as: unigram (4)
  1. We model the lexical correlation and solution post character using regularized translation models and unigram language models respectively.
    Page 2, “Introduction”
  2. Consider a unigram language model S that models the lexical characteristics of solution posts, and a translation model T that models the lexical correlation between problems and solutions.
    Page 4, “Our Approach”
  3. Consider the post and reply vocabularies to be of sizes A and B respectively; then, the translation model would have A × B variables, whereas the unigram language model has only B variables.
    Page 6, “Our Approach”
  4. We model and harness lexical correlations using translation models, in the company of unigram language models that are used to characterize reply posts, and formulate a clustering-based EM approach for solution identification.
    Page 9, “Conclusions and Future Work”
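A minimal sketch of the unigram language model S referred to above, estimated over reply posts. The add-one smoothing with a single catch-all unseen type is an assumption for illustration, not a detail from the paper.

```python
# Unigram language model over reply posts: maximum-likelihood counts with
# add-one smoothing, reserving one smoothed count for unseen words.
from collections import Counter

def unigram_lm(reply_posts):
    counts = Counter(w for post in reply_posts for w in post.split())
    total = sum(counts.values())
    denom = total + len(counts) + 1  # +1 reserves mass for unseen words
    probs = {w: (c + 1) / denom for w, c in counts.items()}
    return probs, 1.0 / denom  # (seen-word probabilities, unseen-word mass)

# Toy reply posts echoing the running example.
lm, p_unseen = unigram_lm(["try to disconnect and rejoin the network",
                           "you should reboot the router"])
```

In the B-variable accounting of sentence 3 above, this model stores one probability per reply-vocabulary word, against A × B for the translation model.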
