Statistical Machine Translation Improves Question Retrieval in Community Question Answering via Matrix Factorization
Zhou, Guangyou and Liu, Fang and Liu, Yang and He, Shizhu and Zhao, Jun

Article Structure

Abstract

Community question answering (CQA) has become an increasingly popular research topic.

Introduction

With the development of Web 2.0, community question answering (CQA) services like Yahoo!

Our Approach

2.1 Problem Statement

Experiments

3.1 Data Set and Evaluation Metrics

Conclusions and Future Work

In this paper, we propose to employ statistical machine translation to improve question retrieval and

Topics

machine translation

Appears in 21 sentences as: Machine Translation (2) machine translation (18) machine translator (2)
In Statistical Machine Translation Improves Question Retrieval in Community Question Answering via Matrix Factorization
  1. Our proposed method employs statistical machine translation to improve question retrieval and enriches the question representation with the translated words from other languages via matrix factorization.
    Page 1, “Abstract”
  2. The idea of improving question retrieval with statistical machine translation is based on the following two observations:
    Page 2, “Introduction”
  3. However, there are two problems with this enrichment: (1) enriching the original questions with the translated words from other languages increases the dimensionality and makes the question representation even more sparse; (2) statistical machine translation may introduce noise, which can harm the performance of question retrieval.
    Page 2, “Introduction”
  4. To solve these two problems, we propose to leverage statistical machine translation to improve question retrieval via matrix factorization.
    Page 2, “Introduction”
  5. Section 2 describes the proposed method by leveraging statistical machine translation to improve question retrieval via matrix factorization.
    Page 2, “Introduction”
  6. This paper aims to leverage statistical machine translation to enrich the question representation.
    Page 3, “Our Approach”
  7. Statistical machine translation (e.g., Google Translate) can utilize contextual information during the question translation, so it can solve the word ambiguity and word mismatch problems to some extent.
    Page 3, “Our Approach”
  8. However, there are two problems with this enrichment: (1) enriching the original questions with the translated words from other languages makes the question representation even more sparse; (2) statistical machine translation may introduce noise.5 To solve these two problems, we propose to leverage statistical machine translation to improve question retrieval via matrix factorization.
    Page 3, “Our Approach”
  9. 5 Statistical machine translation quality is far from satisfactory in real applications.
    Page 3, “Our Approach”
  10. Machine Translation
    Page 3, “Our Approach”
  11. Machine Translation
    Page 3, “Our Approach”

See all papers in Proc. ACL 2013 that mention machine translation.

statistical machine translation

Appears in 16 sentences as: Statistical machine translation (2) statistical machine translation (15)
In Statistical Machine Translation Improves Question Retrieval in Community Question Answering via Matrix Factorization
  1. Our proposed method employs statistical machine translation to improve question retrieval and enriches the question representation with the translated words from other languages via matrix factorization.
    Page 1, “Abstract”
  2. The idea of improving question retrieval with statistical machine translation is based on the following two observations:
    Page 2, “Introduction”
  3. However, there are two problems with this enrichment: (1) enriching the original questions with the translated words from other languages increases the dimensionality and makes the question representation even more sparse; (2) statistical machine translation may introduce noise, which can harm the performance of question retrieval.
    Page 2, “Introduction”
  4. To solve these two problems, we propose to leverage statistical machine translation to improve question retrieval via matrix factorization.
    Page 2, “Introduction”
  5. Section 2 describes the proposed method by leveraging statistical machine translation to improve question retrieval via matrix factorization.
    Page 2, “Introduction”
  6. This paper aims to leverage statistical machine translation to enrich the question representation.
    Page 3, “Our Approach”
  7. Statistical machine translation (e.g., Google Translate) can utilize contextual information during the question translation, so it can solve the word ambiguity and word mismatch problems to some extent.
    Page 3, “Our Approach”
  8. However, there are two problems with this enrichment: (1) enriching the original questions with the translated words from other languages makes the question representation even more sparse; (2) statistical machine translation may introduce noise.5 To solve these two problems, we propose to leverage statistical machine translation to improve question retrieval via matrix factorization.
    Page 3, “Our Approach”
  9. 5 Statistical machine translation quality is far from satisfactory in real applications.
    Page 3, “Our Approach”
  10. If we set a small value for $\lambda_p$, the objective function behaves like the traditional NMF and the importance of data sparseness is emphasized, while a large value of $\lambda_p$ indicates that $V_p$ should be very close to $V_1$, and equation (3) aims to remove the noise introduced by statistical machine translation.
    Page 4, “Our Approach”
  11. Row 8 and row 9 are our proposed method, which leverages statistical machine translation to improve question retrieval via matrix factorization.
    Page 7, “Experiments”

translation model

Appears in 12 sentences as: translation model (7) translation models (7)
In Statistical Machine Translation Improves Question Retrieval in Community Question Answering via Matrix Factorization
  1. State-of-the-art approaches address these issues by implicitly expanding the queried questions with additional words or phrases using monolingual translation models.
    Page 1, “Abstract”
  2. Researchers have proposed the use of word-based translation models (Berger et al., 2000; Jeon et al., 2005; Xue et al., 2008; Lee et al., 2008; Bernhard and Gurevych, 2009) to solve the word mismatch problem.
    Page 2, “Introduction”
  3. As a principled approach to capture semantic word relations, word-based translation models are built by using the IBM model 1 (Brown et al., 1993) and have been shown to outperform traditional models (e.g., VSM, BM25, LM) for question retrieval.
    Page 2, “Introduction”
  4. (2011) proposed the phrase-based translation models for question and answer retrieval.
    Page 2, “Introduction”
  5. (2012) exploited bilingual translation for question retrieval and obtained better performance than traditional monolingual translation models.
    Page 2, “Introduction”
  6. Row 3 and row 6 are monolingual translation models to address the word mismatch problem and obtain the state-of-the-art performance in previous work.
    Page 7, “Experiments”
  7. Row 3 is the word-based translation model (Jeon et al., 2005), and row 4 is the word-based translation language model, which linearly combines the word-based translation model and language model into a unified framework (Xue et al., 2008).
    Page 7, “Experiments”
  8. Row 5 is the phrase-based translation model, which translates a sequence of words as a whole (Zhou et al., 2011).
    Page 7, “Experiments”
  9. Row 6 is the entity-based translation model, which extends the word-based translation model and explores strategies to learn the translation probabilities between words and the concepts using the CQA archives and a popular entity catalog (Singh, 2012).
    Page 7, “Experiments”
  10. Row 7 is the bilingual translation model, which translates the English questions from Yahoo!
    Page 7, “Experiments”
  11. (1) Monolingual translation models significantly outperform the VSM and LM (row 1 and
    Page 7, “Experiments”

contextual information

Appears in 10 sentences as: Contextual Information (1) Contextual information (1) contextual information (8)
In Statistical Machine Translation Improves Question Retrieval in Community Question Answering via Matrix Factorization
  1. The basic idea is to capture the contextual information in modeling the translation of phrases as a whole, thus the word ambiguity problem is somewhat alleviated.
    Page 2, “Introduction”
  2. (1) Contextual information is exploited during the translation from one language to another.
    Page 2, “Introduction”
  3. For example in Table 1, English words “interest” and “bank” that have multiple meanings under different contexts are correctly addressed by using the state-of-the-art translation tool Google Translate. Thus, word ambiguity based on contextual information is naturally involved when questions are translated.
    Page 2, “Introduction”
  4. Statistical machine translation (e.g., Google Translate) can utilize contextual information during the question translation, so it can solve the word ambiguity and word mismatch problems to some extent.
    Page 3, “Our Approach”
  5. Table 6: Impact of the contextual information.
    Page 8, “Experiments”
  6. 3.5 Impact of the Contextual Information
    Page 8, “Experiments”
  7. In this paper, we translate the English questions into four other languages using Google Translate (GTrans), which takes into account contextual information during translation.
    Page 8, “Experiments”
  8. If we translate a question word by word, it discards the contextual information.
    Page 8, “Experiments”
  9. To investigate the impact of contextual information for question retrieval, we only consider two languages and translate English questions into Chinese using an English to Chinese lexicon (Dict) in StarDict.
    Page 8, “Experiments”
  10. Table 6 shows the experimental results; we can see that the performance is degraded when the contextual information is not considered for the translation of questions.
    Page 8, “Experiments”
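
To make the word-by-word setting in the excerpts above concrete, here is a minimal sketch of a lexicon lookup that translates each token in isolation. The lexicon, its entries, and the function name are hypothetical and chosen only for illustration; they are not taken from StarDict or from the paper. The point is simply that a context-free lookup has no way to pick the financial sense of "interest" even when "bank" appears next to it.

```python
# Hypothetical toy lexicon: each English word maps to candidate translations.
TOY_LEXICON = {
    "bank": ["银行", "河岸"],       # financial institution, river bank
    "interest": ["兴趣", "利息"],   # curiosity, money paid on a loan
    "rate": ["率"],
}


def translate_word_by_word(question, lexicon):
    """Translate each token independently using its first lexicon entry.

    Tokens missing from the lexicon are kept unchanged. No neighbouring
    words are consulted, so contextual information is discarded.
    """
    return [lexicon.get(tok, [tok])[0] for tok in question.lower().split()]
```

With this toy lexicon, "Bank interest rate" yields the "curiosity" sense of "interest", which is the kind of ambiguity a context-aware translator could resolve.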

objective function

Appears in 8 sentences as: objective function (7) objective function: (1)
In Statistical Machine Translation Improves Question Retrieval in Community Question Answering via Matrix Factorization
  1. When ignoring the coupling between $V_p$, it can be solved by minimizing the objective function as follows:
    Page 3, “Our Approach”
  2. Combining equations (1) and (2), we get the following objective function:
    Page 4, “Our Approach”
  3. If we set a small value for Ap, the objective function behaves like the traditional NMF and the importance of data sparseness is emphasized; while a big value of Ap indicates Vp should be very closed to V1, and equation (3) aims to remove the noise introduced by statistical machine translation.
    Page 4, “Our Approach”
  4. The objective function $O$ defined in equation (4) addresses data sparseness and noise removal simultaneously.
    Page 4, “Our Approach”
  5. In our optimization framework, we optimize the objective function in equation (4) by alternately minimizing each component when the remaining 2P - 1 components are fixed.
    Page 4, “Our Approach”
  6. if $p \in [2, P]$, the objective function can be written as:
    Page 4, “Our Approach”
  7. if p = 1, the objective function can be written as:
    Page 4, “Our Approach”
  8. Now we rewrite the objective function in equation (12) as
    Page 5, “Our Approach”
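
As a rough illustration of how the pieces described in these excerpts fit together, the sketch below evaluates an objective of the form $\sum_p \|D_p - U_p V_p\|_F^2 + \sum_{p \ge 2} \lambda_p \|V_p - V_1\|_F^2$. Equation (4) itself is not reproduced in the excerpts, so this form is an assumption pieced together from the surrounding sentences, and the function names and nested-list matrix representation are illustrative only.

```python
def matmul(a, b):
    """Multiply two matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def frob_sq(a, b):
    """Squared Frobenius norm of (a - b)."""
    return sum((x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def objective(D, U, V, lam):
    """Assumed objective: reconstruction error for every language plus
    lam[p] * ||V[p] - V[0]||_F^2 for each translated language.

    D[p]: M_p x N term-question matrix, U[p]: M_p x K, V[p]: K x N.
    Index 0 plays the role of the original language (p = 1 in the text);
    lam[0] is unused.
    """
    recon = sum(frob_sq(D[p], matmul(U[p], V[p])) for p in range(len(D)))
    coupling = sum(lam[p] * frob_sq(V[p], V[0]) for p in range(1, len(D)))
    return recon + coupling
```

Setting every `lam[p]` to zero leaves only the per-language reconstruction terms, while large values penalize any difference between a translated language's question factors and those of the original language.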

data sparseness

Appears in 6 sentences as: data sparseness (6)
In Statistical Machine Translation Improves Question Retrieval in Community Question Answering via Matrix Factorization
  1. To tackle the data sparseness of question representation with the translated words, we hope to find two or more lower dimensional matrices whose product provides a good approximation to the original one via matrix factorization.
    Page 3, “Our Approach”
  2. If we set a small value for $\lambda_p$, the objective function behaves like the traditional NMF and the importance of data sparseness is emphasized, while a large value of $\lambda_p$ indicates that $V_p$ should be very close to $V_1$, and equation (3) aims to remove the noise introduced by statistical machine translation.
    Page 4, “Our Approach”
  3. The objective function $O$ defined in equation (4) addresses data sparseness and noise removal simultaneously.
    Page 4, “Our Approach”
  4. The reason is that matrix factorization used in the paper can effectively solve the data sparseness and noise introduced by the machine translator simultaneously.
    Page 7, “Experiments”
  5. Our proposed method (SMT + MF) can effectively solve the data sparseness and noise via matrix factorization.
    Page 7, “Experiments”
  6. To further investigate the impact of the matrix factorization, one intuitive way is to expand the original questions with the translated words from the four other languages, without considering the data sparseness and noise introduced by the machine translator.
    Page 7, “Experiments”

optimization problem

Appears in 5 sentences as: optimization problem (2) optimization problem: (1) optimization problems (2)
In Statistical Machine Translation Improves Question Retrieval in Community Question Answering via Matrix Factorization
  1. By solving the optimization problem in equation (4), we can get the reduced representation of terms and questions.
    Page 4, “Our Approach”
  2. ,U p fixed, the update of Up amounts to the following optimization problem:
    Page 4, “Our Approach”
  3. Thus, the optimization of equation (5) can be decomposed into $M_p$ optimization problems that can be solved independently, with each corresponding to one row of $U_p$:
    Page 4, “Our Approach”
  4. optimization problem divided into two categories.
    Page 4, “Our Approach”
  5. $\min_{\{v_j^{(p)} \ge 0\}} \sum_{j=1}^{N} \|d_j^{(p)} - U_p v_j^{(p)}\|_2^2 + \sum_{j=1}^{N} \lambda_p \|v_j^{(p)} - v_j^{(1)}\|_2^2$ (10) which can be decomposed into N optimization problems that can be solved independently, with
    Page 5, “Our Approach”
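
The per-column subproblems mentioned above can be sketched as follows, assuming each one has the form $\min_{v \ge 0} \|d - U v\|_2^2 + \lambda \|v - v^{(1)}\|_2^2$. The solver below uses plain cyclic coordinate descent with clipping at zero. This is a swapped-in technique chosen for brevity, not the algorithm the paper uses, and `solve_column` and its parameters are hypothetical names.

```python
def solve_column(U, d, v_ref, lam, sweeps=100):
    """Minimise ||d - U v||^2 + lam * ||v - v_ref||^2 subject to v >= 0.

    U: M x K nested list, d: length-M list, v_ref: length-K list.
    Cyclic coordinate descent; an illustrative substitute for the
    paper's own update algorithm.
    """
    M, K = len(U), len(U[0])
    v = [max(0.0, x) for x in v_ref]  # feasible starting point
    resid = [d[i] - sum(U[i][k] * v[k] for k in range(K)) for i in range(M)]
    for _ in range(sweeps):
        for k in range(K):
            col = [U[i][k] for i in range(M)]
            denom = sum(c * c for c in col) + lam
            if denom == 0.0:
                continue  # this coordinate does not affect the objective
            # Exact minimiser along coordinate k, then clip at zero.
            numer = sum(c * (r + c * v[k]) for c, r in zip(col, resid))
            numer += lam * v_ref[k]
            new = max(0.0, numer / denom)
            delta = new - v[k]
            if delta != 0.0:
                resid = [r - c * delta for c, r in zip(col, resid)]
                v[k] = new
    return v
```

Because each column is solved independently, the N subproblems could be dispatched in parallel. With a very large `lam`, the solution stays close to `v_ref`, which matches the role the excerpts ascribe to a large $\lambda_p$.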

significantly improve

Appears in 5 sentences as: significantly improve (3) significantly improved (2)
In Statistical Machine Translation Improves Question Retrieval in Community Question Answering via Matrix Factorization
  1. (2) Taking advantage of potentially rich semantic information drawn from other languages via statistical machine translation, question retrieval performance can be significantly improved (row 3, row 4, row 5 and row 6 vs. row 7, row 8 and row 9, all these comparisons are statistically significant at p < 0.05).
    Page 7, “Experiments”
  2. (1) Our proposed matrix factorization can significantly improve the performance of question retrieval (row 1 vs. row 2; row 3 vs. row 4; the improvements are statistically significant at p < 0.05).
    Page 8, “Experiments”
  3. (3) Compared to VSM, the performance of SMT + IBM is significantly improved (row 1 vs. row 3), which supports the motivation that the word ambiguity and word mismatch problems could be partially addressed by Google Translate.
    Page 8, “Experiments”
  4. (1) Taking advantage of potentially rich semantic information drawn from other languages can significantly improve the performance of question retrieval (row 1 vs. row 2, row 3, row 4 and row 5, the improvements relative to DT + MF are statistically significant at p < 0.05).
    Page 8, “Experiments”
  5. Experiments conducted on real CQA data show some promising findings: (1) the proposed method significantly outperforms the previous work for question retrieval; (2) the proposed matrix factorization can significantly improve the performance of question retrieval, no matter whether considering the translation languages or not; (3) considering more languages can further improve the performance but it does not seem to produce significantly better performance; (4) different languages contribute unevenly to question retrieval; (5) our proposed method can be easily adapted to the large-scale information retrieval task.
    Page 9, “Conclusions and Future Work”

time complexity

Appears in 5 sentences as: Time Complexity (1) time complexity (4)
In Statistical Machine Translation Improves Question Retrieval in Community Question Answering via Matrix Factorization
  1. 2.4 Time Complexity Analysis
    Page 5, “Our Approach”
  2. In this subsection, we discuss the time complexity of our proposed method.
    Page 5, “Our Approach”
  3. Similarly, the time complexity of optimizing $V_i$ using Algorithm 3 is $O(M_p K^2 + M_p N K)$.
    Page 5, “Our Approach”
  4. Another time complexity is the iteration times T used in Algorithm 1 and the total number of
    Page 5, “Our Approach”
  5. languages P, the overall time complexity of our proposed method is:
    Page 5, “Our Approach”

LM

Appears in 4 sentences as: LM (4)
In Statistical Machine Translation Improves Question Retrieval in Community Question Answering via Matrix Factorization
  1. As a principled approach to capture semantic word relations, word-based translation models are built by using the IBM model 1 (Brown et al., 1993) and have been shown to outperform traditional models (e.g., VSM, BM25, LM) for question retrieval.
    Page 2, “Introduction”
  2. # | Methods | MAP | P@10
     1 | VSM | 0.242 | 0.226
     2 | LM | 0.385 | 0.242
     3 | Jeon et al.
    Page 7, “Experiments”
  3. Row 1 and row 2 are two baseline systems, which model the relevance score using VSM (Cao et al., 2010) and language model (LM) (Zhai and Lafferty, 2001; Cao et al., 2010) in the term space.
    Page 7, “Experiments”
  4. (1) Monolingual translation models significantly outperform the VSM and LM (row 1 and
    Page 7, “Experiments”

statistically significant

Appears in 4 sentences as: statistically significant (4)
In Statistical Machine Translation Improves Question Retrieval in Community Question Answering via Matrix Factorization
  1. (2) Taking advantage of potentially rich semantic information drawn from other languages via statistical machine translation, question retrieval performance can be significantly improved (row 3, row 4, row 5 and row 6 vs. row 7, row 8 and row 9, all these comparisons are statistically significant at p < 0.05).
    Page 7, “Experiments”
  2. (2012) (row 7 vs. row 8, the comparison is statistically significant at p < 0.05).
    Page 7, “Experiments”
  3. (1) Our proposed matrix factorization can significantly improve the performance of question retrieval (row 1 vs. row 2; row 3 vs. row 4; the improvements are statistically significant at p < 0.05).
    Page 8, “Experiments”
  4. (1) Taking advantage of potentially rich semantic information drawn from other languages can significantly improve the performance of question retrieval (row 1 vs. row 2, row 3, row 4 and row 5, the improvements relative to DT + MF are statistically significant at p < 0.05).
    Page 8, “Experiments”

development set

Appears in 3 sentences as: development set (3)
In Statistical Machine Translation Improves Question Retrieval in Community Question Answering via Matrix Factorization
  1. We tune the parameters on a small development set of 50 questions.
    Page 6, “Experiments”
  2. This development set is also extracted from Yahoo!
    Page 6, “Experiments”
  3. For parameter K, we do an experiment on the development set to determine the optimal values among 50, 100, 150, ..., 300 in terms of MAP.
    Page 6, “Experiments”

Evaluation Metrics

Appears in 3 sentences as: Evaluation Metrics (2) evaluation metrics (1)
In Statistical Machine Translation Improves Question Retrieval in Community Question Answering via Matrix Factorization
  1. where feature vector $\Phi(q, d) = (s_{VSM}(q, d), s(q_1, d_1), s(q_2, d_2), \ldots, s(q_P, d_P))$, and $\theta$ is the corresponding weight vector; we optimize this parameter for our evaluation metrics directly using the Powell Search algorithm (Paul et al., 1992) via cross-validation.
    Page 6, “Our Approach”
  2. 3.1 Data Set and Evaluation Metrics
    Page 6, “Experiments”
  3. Evaluation Metrics: We evaluate the performance of question retrieval using the following metrics: Mean Average Precision (MAP) and Precision@N (P@N).
    Page 6, “Experiments”
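
For readers unfamiliar with the two metrics named above, the following is a minimal sketch of how MAP and P@N are commonly computed from binary relevance judgments listed in ranked order. The function names are illustrative, and average precision is normalized here by the number of relevant items that appear in the ranked list, which is an assumption; the excerpts do not say which normalization the paper uses.

```python
def precision_at_n(relevance, n):
    """Fraction of the top-n ranked items that are relevant.

    relevance: list of 0/1 flags in ranked order.
    """
    return sum(relevance[:n]) / n


def average_precision(relevance):
    """Mean of precision@k over the ranks k that hold a relevant item.

    Returns 0.0 when no relevant item was retrieved.
    """
    hits, total = 0, 0.0
    for k, rel in enumerate(relevance, start=1):
        if rel:
            hits += 1
            total += hits / k
    return total / hits if hits else 0.0


def mean_average_precision(runs):
    """Average of average_precision over all queried questions."""
    return sum(average_precision(r) for r in runs) / len(runs)
```

For a ranked list with relevant items at ranks 1 and 3, the average precision is (1/1 + 2/3) / 2, and P@2 is 0.5.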

language model

Appears in 3 sentences as: language model (4)
In Statistical Machine Translation Improves Question Retrieval in Community Question Answering via Matrix Factorization
  1. Row 1 and row 2 are two baseline systems, which model the relevance score using VSM (Cao et al., 2010) and language model (LM) (Zhai and Lafferty, 2001; Cao et al., 2010) in the term space.
    Page 7, “Experiments”
  2. Row 3 is the word-based translation model (Jeon et al., 2005), and row 4 is the word-based translation language model, which linearly combines the word-based translation model and language model into a unified framework (Xue et al., 2008).
    Page 7, “Experiments”
  3. (2009) in Table 3 because previous work (Ming et al., 2010) demonstrated that word-based translation language model (Xue et al., 2008) obtained superior performance to the syntactic tree matching (Wang et al., 2009).
    Page 7, “Experiments”

question answering

Appears in 3 sentences as: question answering (2) “question answers” (1)
In Statistical Machine Translation Improves Question Retrieval in Community Question Answering via Matrix Factorization
  1. Community question answering (CQA) has become an increasingly popular research topic.
    Page 1, “Abstract”
  2. With the development of Web 2.0, community question answering (CQA) services like Yahoo!
    Page 1, “Introduction”
  3. tion consists of four parts: “question title”, “question description”, “question answers” and “question category”.
    Page 6, “Experiments”

significantly outperforms

Appears in 3 sentences as: significantly outperform (1) significantly outperforms (2)
In Statistical Machine Translation Improves Question Retrieval in Community Question Answering via Matrix Factorization
  1. (1) Monolingual translation models significantly outperform the VSM and LM (row 1 and
    Page 7, “Experiments”
  2. (3) Our proposed method (leveraging statistical machine translation via matrix factorization, SMT + MF) significantly outperforms the bilingual translation model of Zhou et al.
    Page 7, “Experiments”
  3. Experiments conducted on real CQA data show some promising findings: (1) the proposed method significantly outperforms the previous work for question retrieval; (2) the proposed matrix factorization can significantly improve the performance of question retrieval, no matter whether considering the translation languages or not; (3) considering more languages can further improve the performance but it does not seem to produce significantly better performance; (4) different languages contribute unevenly to question retrieval; (5) our proposed method can be easily adapted to the large-scale information retrieval task.
    Page 9, “Conclusions and Future Work”
