Using Conditional Random Fields to Extract Contexts and Answers of Questions from Online Forums
Ding, Shilin and Cong, Gao and Lin, Chin-Yew and Zhu, Xiaoyan

Article Structure

Abstract

Online forum discussions often contain vast amounts of questions that are the focus of the discussions.

Introduction

Forums are virtual web spaces where people can ask questions, answer questions, and participate in discussions.

Related Work

There is some research on summarizing discussion threads and emails.

Context and Answer Detection

A question is a linguistic expression used by a questioner to request information in the form of an answer.

Experiments

4.1 Experimental setup

Discussions and Conclusions

We presented a new approach to detecting contexts and answers for questions in forums with good performance.

Topics

CRFs

Appears in 45 sentences as: CRFs (63)
In Using Conditional Random Fields to Extract Contexts and Answers of Questions from Online Forums
  1. In this paper, we propose a general framework based on Conditional Random Fields (CRFs) to detect the contexts and answers of questions from forum threads.
    Page 1, “Abstract”
  2. We improve the basic framework by Skip-chain CRFs and 2D CRFs to better accommodate the features of forums for better performance.
    Page 1, “Abstract”
  3. First, we employ Linear Conditional Random Fields (CRFs) to identify contexts and answers, which can capture the relationships between contiguous sentences.
    Page 2, “Introduction”
  4. We also extend the basic model to 2D CRFs to model dependency between contiguous questions in a forum thread for context and answer identification.
    Page 2, “Introduction”
  5. Experimental results show that 1) Linear CRFs outperform SVM and decision tree in both context and answer detection; 2) Skip-chain CRFs outperform Linear CRFs for answer finding, which demonstrates that context improves answer finding; 3) 2D CRF model improves the performance of Linear CRFs and the combination of 2D CRFs and Skip-chain CRFs achieves better performance for context detection.
    Page 2, “Introduction”
  6. We first discuss using Linear CRFs for context and answer detection, and then extend the basic framework to Skip-chain CRFs and 2D CRFs to better model our problem.
    Page 3, “Context and Answer Detection”
  7. 3.1 Using Linear CRFs
    Page 3, “Context and Answer Detection”
  8. For ease of presentation, we focus on detecting contexts using Linear CRFs.
    Page 3, “Context and Answer Detection” (a minimal labeling sketch follows this list)
  9. To this end, we proposed a general framework to detect contexts and answers based on Conditional Random Fields (CRFs) (Lafferty et al., 2001), which are able to model the sequential dependencies between contiguous nodes.
    Page 4, “Context and Answer Detection”
  10. (See Section 3.4 for more about CRFs.)
    Page 4, “Context and Answer Detection”
  11. However, our problem cannot be modeled with Linear CRFs in the same way as other NLP tasks, where one node has a unique label.
    Page 4, “Context and Answer Detection”
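
Items 3 and 8 above frame context detection as sequence labeling over the sentences of a thread. Below is a minimal sketch of that setup using the sklearn-crfsuite library; the feature names, label set, and toy thread are illustrative assumptions, not the paper's actual features or data.

```python
# Minimal linear-chain CRF over thread sentences, via sklearn-crfsuite.
# Features, labels, and the toy thread are illustrative assumptions.
import sklearn_crfsuite

def sent_features(sent, question, position):
    """Per-sentence feature dict; the CRF also learns label-transition weights."""
    q_words = set(question.lower().split())
    s_words = set(sent.lower().split())
    return {
        "overlap_with_question": len(q_words & s_words) / max(len(q_words), 1),
        "position_in_thread": float(position),
        "ends_with_question_mark": sent.strip().endswith("?"),
    }

question = "How much is the taxi from the airport?"
thread = [
    "I will arrive in Beijing next Friday.",   # candidate context
    "How much is the taxi from the airport?",  # the focus question
    "It is about 100 yuan to downtown.",       # candidate answer
]
X = [[sent_features(s, question, i) for i, s in enumerate(thread)]]
y = [["C", "Q", "A"]]  # gold labels for this single toy training thread

crf = sklearn_crfsuite.CRF(algorithm="lbfgs", c1=0.1, c2=0.1, max_iterations=50)
crf.fit(X, y)
print(crf.predict(X))
```

In the paper's per-question formulation (item 8), the thread is passed over once per question and each other sentence is labeled as context of that question or not; the toy example collapses this into a single pass with a three-way label set.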


CRF

Appears in 22 sentences as: CRF (25)
In Using Conditional Random Fields to Extract Contexts and Answers of Questions from Online Forums
  1. To capture the dependency between contexts and answers, we introduce the Skip-chain CRF model for answer detection.
    Page 2, “Introduction”
  2. Experimental results show that 1) Linear CRFs outperform SVM and decision tree in both context and answer detection; 2) Skip-chain CRFs outperform Linear CRFs for answer finding, which demonstrates that context improves answer finding; 3) 2D CRF model improves the performance of Linear CRFs and the combination of 2D CRFs and Skip-chain CRFs achieves better performance for context detection.
    Page 2, “Introduction”
  3. Finally, we will briefly introduce CRF models and the features that we used for the CRF model.
    Page 3, “Context and Answer Detection”
  4. A CRF is an undirected graphical model G of the conditional distribution P(Y|X).
    Page 4, “Context and Answer Detection” (a formula sketch follows this list)
  5. The Linear CRF model has been successfully applied to NLP and text mining tasks (McCallum and Li, 2003; Sha and Pereira, 2003).
    Page 4, “Context and Answer Detection”
  6. In each pass, one question Qi is selected as the focus, and each other sentence in the thread will be labeled as context of Qi or not using the Linear CRF model.
    Page 4, “Context and Answer Detection”
  7. The Linear CRF model can capture the dependency between contiguous sentences.
    Page 4, “Context and Answer Detection”
  8. To model the long-distance dependency between contexts and answers, we will use the Skip-chain CRF model to detect contexts and answers together.
    Page 4, “Context and Answer Detection”
  9. The Skip-chain CRF model has been applied to entity extraction and meeting summarization (Sutton and McCallum, 2006; Galley, 2006).
    Page 4, “Context and Answer Detection”
  10. The graphical representation of a Skip-chain CRF given in Figure 2(b) consists of two types of edges: linear-chain edges (y_{t-1} to y_t) and skip-chain edges (y_i to y_j).
    Page 4, “Context and Answer Detection”
  11. 3.3 Using 2D CRF Model
    Page 5, “Context and Answer Detection”
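
Item 4 above gives the defining property; concretely, a linear-chain CRF factorizes the conditional distribution over edges between consecutive labels, and the Skip-chain variant of item 10 adds factors over skip edges. A hedged LaTeX sketch in the standard notation of Lafferty et al. (2001) and Sutton and McCallum (2006), not the paper's own:

```latex
% Linear-chain CRF: potentials over consecutive labels y_{t-1}, y_t.
P(\mathbf{y} \mid \mathbf{x}) = \frac{1}{Z(\mathbf{x})}
  \exp\Bigl( \sum_{t=1}^{T} \sum_{k} \lambda_k f_k(y_{t-1}, y_t, \mathbf{x}, t) \Bigr)

% Skip-chain CRF: the same, plus potentials over skip edges (i, j),
% here linking a candidate context to a candidate answer.
P(\mathbf{y} \mid \mathbf{x}) = \frac{1}{Z(\mathbf{x})}
  \exp\Bigl( \sum_{t=1}^{T} \sum_{k} \lambda_k f_k(y_{t-1}, y_t, \mathbf{x}, t)
           + \sum_{(i,j) \in \mathcal{S}} \sum_{l} \mu_l g_l(y_i, y_j, \mathbf{x}) \Bigr)
```

Z(x) normalizes over all label sequences, which is what makes inference over the skip edges more expensive than in the purely linear chain.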


SVM

Appears in 9 sentences as: SVM (10)
In Using Conditional Random Fields to Extract Contexts and Answers of Questions from Online Forums
  1. Experimental results show that 1) Linear CRFs outperform SVM and decision tree in both context and answer detection; 2) Skip-chain CRFs outperform Linear CRFs for answer finding, which demonstrates that context improves answer finding; 3) 2D CRF model improves the performance of Linear CRFs and the combination of 2D CRFs and Skip-chain CRFs achieves better performance for context detection.
    Page 2, “Introduction”
  2. Huang et al. (2007) used SVM to extract input-reply pairs from forums for chatbot knowledge.
    Page 3, “Related Work”
  3. … SVM, can be employed, where each pair of question and candidate context will be treated as an instance.
    Page 4, “Context and Answer Detection” (a pairwise-classification sketch follows this list)
  4. Table 4 (reconstructed):

         Model    Prec(%)   Rec(%)   F1(%)
       Context Detection
         SVM      75.27     68.80    71.32
         C4.5     70.16     64.30    67.21
         L-CRF    75.75     72.84    74.45
       Answer Detection
         SVM      73.31     47.35    57.52
         C4.5     65.36     46.55    54.37
         L-CRF    63.92     58.74    61.22
    Page 7, “Experiments”
  5. This experiment is to evaluate the Linear CRF model (Section 3.1) for context and answer detection by comparing with SVM and C4.5 (Quinlan, 1993).
    Page 7, “Experiments”
  6. For SVM, we use SVMlight (Joachims, 1999).
    Page 7, “Experiments”
  7. SVM and C4.5 use the same set of features as Linear CRFs.
    Page 7, “Experiments”
  8. As shown in Table 4, the Linear CRF model outperforms SVM and C4.5 for both context and answer detection.
    Page 7, “Experiments”
  9. As a summary, 1) our CRF model outperforms SVM and C4.5 for both context and answer detection; 2) context is very useful in answer detection; 3) the Skip-chain CRF method is effective in leveraging context for answer detection; and 4) the 2D CRF model improves the performance of Linear CRFs for both context and answer detection.
    Page 8, “Experiments”
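
Item 3 above describes the baseline framing: each (question, candidate sentence) pair becomes one independent classification instance. A minimal sketch with scikit-learn's LinearSVC standing in for SVMlight; the two features and the toy data are illustrative assumptions, not the paper's feature set.

```python
# Pairwise classification baseline: each (question, candidate) pair is one
# instance, labeled 1 if the candidate is context/answer, else 0.
# LinearSVC stands in for SVMlight; features are illustrative assumptions.
from sklearn.svm import LinearSVC

def pair_features(question, sentence, distance):
    q = set(question.lower().split())
    s = set(sentence.lower().split())
    return [
        len(q & s) / max(len(q | s), 1),  # Jaccard word overlap
        float(distance),                  # sentence distance from the question
    ]

question = "How much is the taxi from the airport?"
candidates = [  # (sentence, distance to the question, gold label)
    ("I will arrive in Beijing next Friday.", 1, 1),
    ("It is about 100 yuan to downtown.", 1, 1),
    ("Have a nice trip!", 2, 0),
]
X = [pair_features(question, s, d) for s, d, _ in candidates]
y = [label for _, _, label in candidates]

clf = LinearSVC().fit(X, y)
print(clf.predict(X))
```

Because each pair is scored independently, this baseline cannot exploit the dependency between contiguous sentences, which is exactly what items 8 and 9 attribute the CRF's improvement to.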


contextual information

Appears in 6 sentences as: Contextual Information (1) contextual information (5)
In Using Conditional Random Fields to Extract Contexts and Answers of Questions from Online Forums
  1. As shown in the example, a forum question usually requires contextual information to provide background or constraints.
    Page 1, “Introduction”
  2. Moreover, it sometimes needs contextual information to provide explicit link to its answers.
    Page 2, “Introduction”
  3. We call contextual information the context of a question in this paper.
    Page 2, “Introduction”
  4. Table 6: Contextual Information for Answer Detection.
    Page 7, “Experiments”
  5. Linear CRFs with contextual information perform better than those without context.
    Page 7, “Experiments”
  6. The results clearly show that contextual information greatly improves the performance of answer detection.
    Page 7, “Experiments”


Cosine similarity

Appears in 6 sentences as: Cosine similarity (3) cosine similarity (3)
In Using Conditional Random Fields to Extract Contexts and Answers of Questions from Online Forums
  1. Feng et al. (2006a) used cosine similarity to match students’ queries with reply posts for a discussion-bot.
    Page 3, “Related Work”
  2. The word similarity is based on the cosine similarity of TF/IDF weighted vectors.
    Page 6, “Context and Answer Detection” (a similarity sketch follows this list)
  3. - Cosine similarity with the question
    Page 6, “Context and Answer Detection”
  4. - Cosine similarity between contiguous sentences
    Page 6, “Context and Answer Detection”
  5. - Similarity between contiguous sentences using WordNet
     - Cosine similarity with the expanded question using the lexical matching words
    Page 6, “Context and Answer Detection”
  6. … forums), and then use them to expand the question and compute cosine similarity.
    Page 6, “Context and Answer Detection”
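
Item 2 above pins down the word-similarity feature: cosine similarity of TF/IDF weighted vectors, computed both against the question (item 3) and between contiguous sentences (item 4). A minimal sketch with scikit-learn; the sentences are illustrative.

```python
# Cosine similarity of TF-IDF weighted sentence vectors.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

question = "How much is the taxi from the airport?"
thread = [
    "I will arrive in Beijing next Friday.",
    "It is about 100 yuan to downtown.",
]

vec = TfidfVectorizer()
m = vec.fit_transform([question] + thread)  # row 0 is the question

print(cosine_similarity(m[0], m[1:]))  # each candidate vs. the question
print(cosine_similarity(m[1], m[2]))   # contiguous sentences vs. each other
```

Note the caveat behind item 6: a short question like this one shares few content words with its answer, which is why the paper also expands the question with frequently co-occurring words before computing cosine similarity.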


dependency relationship

Appears in 4 sentences as: dependency relation (1) dependency relationship (3)
In Using Conditional Random Fields to Extract Contexts and Answers of Questions from Online Forums
  1. One is the dependency relationship between contexts and answers, which should be leveraged especially when questions alone do not provide sufficient information to find answers; the other is the dependency between answer candidates (similar to sentence dependency described above).
    Page 2, “Introduction”
  2. However, they cannot capture the dependency relationship between sentences.
    Page 4, “Context and Answer Detection”
  3. To label s10, we need to consider the dependency relation between Q2 and Q3.
    Page 5, “Context and Answer Detection”
  4. The labels of the same sentence for two contiguous questions in a thread would be conditioned on the dependency relationship between the questions.
    Page 5, “Context and Answer Detection” (a 2D CRF formula sketch follows this list)
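
Item 4 above is the 2D intuition: the label a sentence receives in the pass for question Q_i and the label it receives in the pass for the contiguous question Q_{i+1} are coupled. A hedged sketch in the style of 2D CRFs (Zhu et al., 2005), with y_{i,t} the label of sentence t in the pass for question i; this is the standard 2D factorization, not notation taken from the paper:

```latex
% 2D CRF sketch: chain edges within each question's pass, plus vertical
% edges tying the labels the same sentence receives for contiguous questions.
P(\mathbf{y} \mid \mathbf{x}) = \frac{1}{Z(\mathbf{x})}
  \exp\Bigl( \sum_{i} \sum_{t} \sum_{k} \lambda_k f_k(y_{i,t-1}, y_{i,t}, \mathbf{x})
           + \sum_{i} \sum_{t} \sum_{l} \mu_l g_l(y_{i,t}, y_{i+1,t}, \mathbf{x}) \Bigr)
```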


semantic similarity

Appears in 4 sentences as: semantic similarity (4)
In Using Conditional Random Fields to Extract Contexts and Answers of Questions from Online Forums
  1. Here, we use the product of sim(x_u, Q_i) and sim(x_v, {x_u, Q_i}) to estimate the possibility of being a context-answer pair for (u, v), where sim(·, ·) is the semantic similarity calculated on WordNet as described in Section 3.5.
    Page 5, “Context and Answer Detection”
  2. The similarity feature is to capture the word similarity and semantic similarity between candidate contexts and answers.
    Page 6, “Context and Answer Detection”
  3. The semantic similarity between words is computed based on Wu and Palmer’s measure (Wu and Palmer, 1994) using WordNet (Fellbaum, 1998). The similarity between contiguous sentences will be used to capture the dependency for CRFs.
    Page 6, “Context and Answer Detection” (a Wu-Palmer sketch follows this list)
  4. The semantic similarity between sentences is calculated as in (Yang et al., 2006).
    Page 6, “Context and Answer Detection”
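
Item 3 above names the word-level measure: Wu and Palmer's (1994) similarity over WordNet. A minimal sketch with NLTK; the sentence-level best-match averaging below is an assumption standing in for the aggregation of Yang et al. (2006) cited in item 4.

```python
# Wu-Palmer word similarity over WordNet, via NLTK.
# Requires: nltk.download("wordnet")
# The sentence-level aggregation is an assumption standing in for
# the method of Yang et al. (2006) cited in the paper.
from nltk.corpus import wordnet as wn

def word_sim(w1, w2):
    """Max Wu-Palmer similarity over all synset pairs of the two words."""
    scores = [s1.wup_similarity(s2) or 0.0
              for s1 in wn.synsets(w1)
              for s2 in wn.synsets(w2)]
    return max(scores, default=0.0)

def sent_sim(s1, s2):
    """Average best-match word similarity from s1 into s2."""
    w1, w2 = s1.lower().split(), s2.lower().split()
    return sum(max(word_sim(a, b) for b in w2) for a in w1) / len(w1)

print(sent_sim("taxi fare airport", "cab price terminal"))
```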


WordNet

Appears in 4 sentences as: WordNet (4)
In Using Conditional Random Fields to Extract Contexts and Answers of Questions from Online Forums
  1. Here, we use the product of sim(x_u, Q_i) and sim(x_v, {x_u, Q_i}) to estimate the possibility of being a context-answer pair for (u, v), where sim(·, ·) is the semantic similarity calculated on WordNet as described in Section 3.5.
    Page 5, “Context and Answer Detection”
  2. The semantic similarity between words is computed based on Wu and Palmer’s measure (Wu and Palmer, 1994) using WordNet (Fellbaum, 1998). The similarity between contiguous sentences will be used to capture the dependency for CRFs.
    Page 6, “Context and Answer Detection”
  3. - Similarity with the question using WordNet
    Page 6, “Context and Answer Detection”
  4. - Similarity between contiguous sentences using WordNet
     - Cosine similarity with the expanded question using the lexical matching words
    Page 6, “Context and Answer Detection”


knowledge base

Appears in 3 sentences as: knowledge base (3)
In Using Conditional Random Fields to Extract Contexts and Answers of Questions from Online Forums
  1. Extracting contexts and answers together with the questions will yield not only a coherent forum summary but also a valuable QA knowledge base.
    Page 1, “Abstract”
  2. Another motivation of detecting contexts and answers of the questions in forum threads is that it could be used to enrich the knowledge base of community-based question and answering (CQA) services such as Live QnA and Yahoo! Answers.
    Page 2, “Introduction”
  3. To enrich the knowledge base, not only the answers but also the contexts are critical; otherwise the answer to a question such as “How much is the taxi” would be useless without context in the database.
    Page 2, “Introduction”


Structural features:

Appears in 3 sentences as: structural feature (1) structural features (1) Structural features: (1)
In Using Conditional Random Fields to Extract Contexts and Answers of Questions from Online Forums
  1. Structural features:
    Page 6, “Context and Answer Detection”
  2. The structural features of forums provide strong clues for contexts.
    Page 6, “Context and Answer Detection” (a feature sketch follows this list)
  3. We found that similarity features are the most important, and structural features the next.
    Page 8, “Experiments”
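
The indexed sentences do not enumerate the structural features themselves; the sketch below shows the kind of forum-structure signals item 2 alludes to, and every specific feature here is an illustrative assumption rather than the paper's list.

```python
# Hypothetical forum-structure features of the kind item 2 alludes to;
# each specific choice is an illustrative assumption, not the paper's list.
def structural_features(post_idx, sent_idx, author, question_post_idx, questioner):
    return {
        "same_post_as_question": post_idx == question_post_idx,
        "post_distance_to_question": post_idx - question_post_idx,
        "is_first_sentence_of_post": sent_idx == 0,
        "author_is_questioner": author == questioner,  # e.g. the asker clarifying
    }

print(structural_features(post_idx=2, sent_idx=0, author="bob",
                          question_post_idx=0, questioner="alice"))
```

Dicts like this plug directly into the per-sentence feature functions of the CRF sketch given earlier under the CRFs topic.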
