Exploiting Heterogeneous Treebanks for Parsing
Niu, Zheng-Yu and Wang, Haifeng and Wu, Hua

Article Structure

Abstract

We address the issue of using heterogeneous treebanks for parsing by breaking it down into two sub-problems, converting grammar formalisms of the treebanks to the same one, and parsing on these homogeneous treebanks.
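
Read as a pipeline, the two sub-problems compose naturally: first make the treebanks homogeneous, then train on the merged data. The Python sketch below is purely illustrative; convert_ds_to_ps, train_parser, and the treebank arguments are hypothetical placeholders rather than the authors' code or data formats.

    # Illustrative sketch of the two-step solution; every name here is a
    # hypothetical placeholder rather than the authors' implementation.
    def exploit_heterogeneous_treebanks(source_ds_treebank, target_ps_treebank,
                                        convert_ds_to_ps, train_parser,
                                        target_weight=1):
        # Step 1: convert the source (dependency-structure) treebank into the
        # target (phrase-structure) formalism so all data is homogeneous.
        converted = [convert_ds_to_ps(tree) for tree in source_ds_treebank]
        # Step 2: train a target-grammar parser on the combined data,
        # replicating the target treebank to give it more weight (corpus
        # weighting; the weight is tuned on a development set).
        training_data = target_ps_treebank * target_weight + converted
        return train_parser(training_data)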

Introduction

The last few decades have seen the emergence of multiple treebanks annotated with different grammar formalisms, motivated by the diversity of languages and linguistic theories, which is crucial to the success of statistical parsing (Abeille et al., 2000; Brants et al., 1999; Bohmova et al., 2003; Han et al., 2002; Kurohashi and Nagao, 1998; Marcus et al., 1993; Moreno et al., 2003; Xue et al., 2005).

Our Two-Step Solution

2.1 Grammar Formalism Conversion

Experiments of Grammar Formalism Conversion

3.1 Evaluation on WSJ section 22

Experiments of Parsing

We investigated our two-step solution on two existing treebanks, CDT and CTB, and we used CDT as the source treebank and CTB as the target treebank.

Related Work

Recently there have been some studies addressing how to use treebanks with the same grammar formalism for domain adaptation of parsers.

Conclusion

We have proposed a two-step solution to deal with the issue of using heterogeneous treebanks for parsing.

Topics

treebank

Appears in 46 sentences as: Treebank (12) treebank (33) treebanks (28)
In Exploiting Heterogeneous Treebanks for Parsing
  1. We address the issue of using heterogeneous treebanks for parsing by breaking it down into two sub-problems, converting grammar formalisms of the treebanks to the same one, and parsing on these homogeneous treebanks.
    Page 1, “Abstract”
  2. Then we provide two strategies to refine conversion results, and adopt a corpus weighting technique for parsing on homogeneous treebanks.
    Page 1, “Abstract”
  3. Results on the Penn Treebank show that our conversion method achieves 42% error reduction over the previous best result.
    Page 1, “Abstract”
  4. Evaluation on the Penn Chinese Treebank indicates that a converted dependency treebank helps constituency parsing and the use of unlabeled data by self-training further increases parsing f-score to 85.2%, resulting in 6% error reduction over the previous best result.
    Page 1, “Abstract”
  5. The last few decades have seen the emergence of multiple treebanks annotated with different grammar formalisms, motivated by the diversity of languages and linguistic theories, which is crucial to the success of statistical parsing (Abeille et al., 2000; Brants et al., 1999; Bohmova et al., 2003; Han et al., 2002; Kurohashi and Nagao, 1998; Marcus et al., 1993; Moreno et al., 2003; Xue et al., 2005).
    Page 1, “Introduction”
  6. Availability of multiple treebanks creates a scenario where we have a treebank annotated with one grammar formalism, and another treebank annotated with another grammar formalism that we are interested in.
    Page 1, “Introduction”
  7. We call the first a source treebank, and the second a target treebank.
    Page 1, “Introduction”
  8. We thus encounter a problem of how to use these heterogeneous treebanks for target grammar parsing.
    Page 1, “Introduction”
  9. Here heterogeneous treebanks refer to two or more treebanks with different grammar formalisms, e.g., one treebank annotated with dependency structure (DS) and the other annotated with phrase structure (PS).
    Page 1, “Introduction”
  10. It is important to acquire additional labeled data for the target grammar parsing through exploitation of existing source treebanks since there is often a shortage of labeled data.
    Page 1, “Introduction”
  11. Recently there has been some work on using multiple treebanks for domain adaptation of parsers, where these treebanks have the same grammar formalism (McClosky et al., 2006b; Roark and Bacchiani, 2003).
    Page 1, “Introduction”

reranking

Appears in 14 sentences as: reranker (4) reranking (11)
In Exploiting Heterogeneous Treebanks for Parsing
  1. When coupled with a self-training technique, a reranking parser with CTB and converted CDT as labeled data achieves an 85.2% f-score on the CTB test set, an absolute 1.0% improvement (6% error reduction) over the previous best result for Chinese parsing.
    Page 2, “Introduction”
  2. We used Charniak’s maximum entropy inspired parser and their reranker (Charniak and Johnson, 2005) for target grammar parsing, called a generative parser (GP) and a reranking parser (RP) respectively.
    Page 5, “Experiments of Parsing”
  3. Table 5: Results of the generative parser (GP) and the reranking parser (RP) on the test set, when trained on only CTB training set or an optimal combination of CTB training set and CDTPS.
    Page 6, “Experiments of Parsing”
  4. Finally we evaluated two parsing models, the generative parser and the reranking parser, on the test set, with results shown in Table 5.
    Page 6, “Experiments of Parsing”
  5. When trained on CTB only, the generative parser and the reranking parser achieved f-scores of 81.0% and 83.3%.
    Page 6, “Experiments of Parsing”
  6. Table 6: Results of the generative parser and the reranking parser on the test set, when trained on an optimal combination of CTB training set and converted CDT.
    Page 6, “Experiments of Parsing”
  7. Table 6 provides f-scores of the generative parser and the reranker on the test set, when trained on CTB and CDTfs.
    Page 6, “Experiments of Parsing”
  8. We see that the performance of the reranking parser increased to 84.2% f-score.
    Page 6, “Experiments of Parsing”
  9. Table 7: Results of the self-trained generative parser and updated reranking parser on the test set.
    Page 7, “Experiments of Parsing”
  10. 84.2% f-score, better than the result of the reranking parser with CTB and CDTPS as training data (shown in Table 5).
    Page 7, “Experiments of Parsing”
  11. Then we ran the reranking parser in Section 4.2.2 on PDC and used the parses on PDC as additional training data for the generative parser.
    Page 7, “Experiments of Parsing”
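
The entries above describe a two-stage setup: a generative parser proposes candidate parses and a discriminative reranker picks among them, and the reranked output on unlabeled text can then feed back into training. The sketch below only illustrates the parse-then-rerank control flow; the parser and reranker objects and their methods are hypothetical stand-ins, not the actual interface of the Charniak and Johnson tools.

    # Control-flow sketch of "generative parser + reranker"; the parser and
    # reranker objects are hypothetical stand-ins, not the real tools' API.
    def parse_with_reranker(sentence, generative_parser, reranker, n_best=50):
        # Stage 1: the generative parser returns its n best candidate trees.
        candidates = generative_parser.parse_n_best(sentence, n_best)
        # Stage 2: the reranker rescores the candidates; keep the top one.
        return max(candidates, key=reranker.score)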

development set

Appears in 13 sentences as: development set (12) development sets (1)
In Exploiting Heterogeneous Treebanks for Parsing
  1. The number of removed trees will be determined by cross validation on the development set.
    Page 4, “Our Two-Step Solution”
  2. The value of λ will be tuned by cross validation on the development set.
    Page 4, “Our Two-Step Solution”
  3. Corpus weighting is exactly such an approach, with the weight tuned on the development set, and it is the technique used for parsing on homogeneous treebanks in this paper.
    Page 4, “Our Two-Step Solution”
  4. Table 4: Results of the generative parser on the development set, when trained with various weighting of CTB training set and CDTPS.
    Page 5, “Experiments of Grammar Formalism Conversion”
  5. We used a standard split of CTB for performance evaluation, articles 1-270 and 400-1151 as training set, articles 301-325 as development set, and articles 271-300 as test set.
    Page 5, “Experiments of Parsing”
  6. We tried the corpus weighting method when combining CDTPS with CTB training set (abbreviated as CTB for simplicity) as training data, by gradually increasing the weight (including 1, 2, 5, 10, 20, 50) of CTB to optimize parsing performance on the development set (a small weight-search sketch follows this list).
    Page 6, “Experiments of Parsing”
  7. Table 4 presents the results of the generative parser with various weights of CTB on the development set.
    Page 6, “Experiments of Parsing”
  8. Considering the performance on the development set, we decided to give CTB a relative weight of 10.
    Page 6, “Experiments of Parsing”
  9. In addition, we used the CTB training set as CPS,train, and the CTB development set as CPS,dev.
    Page 6, “Experiments of Parsing”
  10. Then we tuned the value of M by optimizing the parser’s performance on the development set with 10 × CTB + CDTfiS as training data.
    Page 6, “Experiments of Parsing”
  11. The values of λ (varying from 0.0 to 1.0 with 0.1 as the interval) and the CTB weight (including 1, 2, 5, 10, 20, 50) were simultaneously tuned on the development sets.
    Page 6, “Experiments of Parsing”
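
Items 6 to 8 above describe the corpus weighting procedure: replicate the CTB training set with weights 1, 2, 5, 10, 20, and 50, and keep the weight that maximizes parsing performance on the development set. A minimal grid-search sketch of that procedure follows; train and fscore_on are hypothetical helpers standing in for parser training and development-set evaluation.

    # Hypothetical sketch of corpus-weight tuning on the development set.
    def tune_ctb_weight(ctb_train, cdt_ps, dev_set, train, fscore_on,
                        weights=(1, 2, 5, 10, 20, 50)):
        best_weight, best_f = None, float("-inf")
        for w in weights:
            # Corpus weighting by replication: w copies of CTB plus the
            # converted treebank form the training data.
            parser = train(ctb_train * w + cdt_ps)
            f = fscore_on(parser, dev_set)
            if f > best_f:
                best_weight, best_f = w, f
        return best_weight, best_f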

f-score

Appears in 10 sentences as: f-score (10)
In Exploiting Heterogeneous Treebanks for Parsing
  1. Evaluation on the Penn Chinese Treebank indicates that a converted dependency treebank helps constituency parsing and the use of unlabeled data by self-training further increases parsing f-score to 85.2%, resulting in 6% error reduction over the previous best result.
    Page 1, “Abstract”
  2. Our conversion method achieves 93.8% f-score on dependency trees produced from WSJ section 22, resulting in 42% error reduction over the previous best result for DS to PS conversion.
    Page 2, “Introduction”
  3. When coupled with a self-training technique, a reranking parser with CTB and converted CDT as labeled data achieves an 85.2% f-score on the CTB test set, an absolute 1.0% improvement (6% error reduction) over the previous best result for Chinese parsing.
    Page 2, “Introduction”
  4. Therefore we modified the selection metric in Section 2.1 by interpolating two scores, the probability of a conversion candidate from the parser and its unlabeled dependency f-score (an illustrative sketch of this interpolation appears after this list).
    Page 4, “Our Two-Step Solution”
  5. Finally Q-10-method achieved an f-score of 93.8% on WSJ section 22, an absolute 4.4% improvement (42% error reduction) over the best result of Xia et al.
    Page 5, “Experiments of Grammar Formalism Conversion”
  6. Finally Q-10-method achieved an f-score of 93.6% on WSJ section 2~18 and 20~22, better than that of Q-0-method and comparable with that of Q-10-method in Section 3.1.
    Page 5, “Experiments of Grammar Formalism Conversion”
  7. Finally we decided that the optimal value of λ was 0.4 and the optimal weight of CTB was 1, which brought the best performance on the development set (an f-score of 86.1%).
    Page 6, “Experiments of Parsing”
  8. In comparison with the results in Section 4.1, the average index of converted trees in 200-best list increased to 2, and their average unlabeled dependency f-score dropped to 65.4%.
    Page 6, “Experiments of Parsing”
  9. 84.2% f-score, better than the result of the reranking parser with CTB and CDTPS as training data (shown in Table 5).
    Page 7, “Experiments of Parsing”
  10. Comparing our result in Table 6 with that of Petrov and Klein (2007), we see that CDTfs helps parsing on CTB, which brought a 0.9% f-score improvement.
    Page 7, “Experiments of Parsing”
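
Item 4 above refers to a selection metric that interpolates the parser probability of a conversion candidate with its unlabeled dependency f-score, and item 7 reports 0.4 as the dev-set optimum for λ. The formula itself is not reproduced in this index, so the convex combination below is an assumption about its form, including which term λ multiplies; the function and argument names are likewise hypothetical.

    # Assumed form of the interpolated selection metric (a convex combination);
    # the original formula is not shown in this index.
    def interpolated_score(parser_prob, unlabeled_dep_fscore, lam=0.4):
        return lam * parser_prob + (1.0 - lam) * unlabeled_dep_fscore

    def select_candidate(candidates, lam=0.4):
        # candidates: iterable of (ps_tree, parser_prob, unlabeled_dep_fscore).
        return max(candidates,
                   key=lambda c: interpolated_score(c[1], c[2], lam))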

best result

Appears in 7 sentences as: best result (7)
In Exploiting Heterogeneous Treebanks for Parsing
  1. Results on the Penn Treebank show that our conversion method achieves 42% error reduction over the previous best result.
    Page 1, “Abstract”
  2. Evaluation on the Penn Chinese Treebank indicates that a converted dependency treebank helps constituency parsing and the use of unlabeled data by self-training further increases parsing f-score to 85.2%, resulting in 6% error reduction over the previous best result.
    Page 1, “Abstract”
  3. Our conversion method achieves 93.8% f-score on dependency trees produced from WSJ section 22, resulting in 42% error reduction over the previous best result for DS to PS conversion.
    Page 2, “Introduction”
  4. When coupled with a self-training technique, a reranking parser with CTB and converted CDT as labeled data achieves an 85.2% f-score on the CTB test set, an absolute 1.0% improvement (6% error reduction) over the previous best result for Chinese parsing.
    Page 2, “Introduction”
  5. The best result of Xia et al.
    Page 5, “Experiments of Grammar Formalism Conversion”
  6. Finally Q-10-method achieved an f-score of 93.8% on WSJ section 22, an absolute 4.4% improvement (42% error reduction) over the best result of Xia et al.
    Page 5, “Experiments of Grammar Formalism Conversion”
  7. Moreover, the use of unlabeled data further boosted the parsing performance to 85.2%, an absolute 1.0% improvement over the previous best result presented in Burkett and Klein (2008).
    Page 7, “Experiments of Parsing”

Penn Treebank

Appears in 6 sentences as: Penn Treebank (6)
In Exploiting Heterogeneous Treebanks for Parsing
  1. Results on the Penn Treebank show that our conversion method achieves 42% error reduction over the previous best result.
    Page 1, “Abstract”
  2. We have evaluated our conversion algorithm on a dependency structure treebank (produced from the Penn Treebank) for comparison with previous work (Xia et al., 2008).
    Page 2, “Introduction”
  3. Section 3 provides experimental results of grammar formalism conversion on a dependency treebank produced from the Penn Treebank.
    Page 2, “Introduction”
  4. (2008) used WSJ section 19 from the Penn Treebank to extract DS to PS conversion rules and then produced dependency trees from WSJ section 22 for evaluation of their DS to PS conversion algorithm.
    Page 4, “Experiments of Grammar Formalism Conversion”
  5. We used the tool “Penn2Malt” to produce dependency structures from the Penn Treebank, which was also used for PS to DS conversion in our conversion algorithm.
    Page 4, “Experiments of Grammar Formalism Conversion”
  6. Future work includes further investigation of our conversion method for other pairs of grammar formalisms, e.g., from the grammar formalism of the Penn Treebank to deeper linguistic formalisms such as CCG, HPSG, or LFG.
    Page 8, “Conclusion”

POS tag

Appears in 6 sentences as: POS tag (5) POS tagged (1) POS tags (2)
In Exploiting Heterogeneous Treebanks for Parsing
  1. … “把” (a preposition, with “BA” as its POS tag in CTB), and the head of IP-OBJ is “…”.
    Page 4, “Our Two-Step Solution”
  2. (2008) used POS tag information, dependency structures and dependency tags in test set for conversion.
    Page 5, “Experiments of Grammar Formalism Conversion”
  3. Similarly, we used POS tag information in the test set to restrict search space of the parser for generation of better N-best parses.
    Page 5, “Experiments of Grammar Formalism Conversion”
  4. CDT consists of 60k Chinese sentences, annotated with POS tag information and dependency structure information (including 28 POS tags, and 24 dependency tags) (Liu et al., 2006).
    Page 5, “Experiments of Parsing”
  5. We did not use POS tag information as inputs to the parser in our conversion method due to the difficulty of conversion from CDT POS tags to CTB POS tags.
    Page 5, “Experiments of Parsing”
  6. We used the POS tagged People Daily corpus (Jan. 1998~Jun. 2000) (PDC) as unlabeled data for parsing.
    Page 7, “Experiments of Parsing”

unlabeled data

Appears in 6 sentences as: Unlabeled Data (1) unlabeled data (5)
In Exploiting Heterogeneous Treebanks for Parsing
  1. Evaluation on the Penn Chinese Treebank indicates that a converted dependency treebank helps constituency parsing and the use of unlabeled data by self-training further increases parsing f-score to 85.2%, resulting in 6% error reduction over the previous best result.
    Page 1, “Abstract”
  2. 4.3 Using Unlabeled Data for Parsing
    Page 7, “Experiments of Parsing”
  3. Recent studies on parsing indicate that the use of unlabeled data by self-training can help parsing on the WSJ data, even when labeled data is relatively large (McClosky et al., 2006a; Reichart and Rappoport, 2007).
    Page 7, “Experiments of Parsing”
  4. We used the POS tagged People Daily corpus (Jan. 1998~Jun. 2000) (PDC) as unlabeled data for parsing.
    Page 7, “Experiments of Parsing”
  5. We see that the use of unlabeled data by self-training further increased the reranking parser’s performance from 84.2% to 85.2%.
    Page 7, “Experiments of Parsing”
  6. Moreover, the use of unlabeled data further boosted the parsing performance to 85.2%, an absolute 1.0% improvement over the previous best result presented in Burkett and Klein (2008).
    Page 7, “Experiments of Parsing”
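
The self-training step described above parses the unlabeled People Daily corpus (PDC) with the reranking parser and adds those automatic parses to the training data of the generative parser, with CTB and the converted CDT up-weighted (a relative weight of 10, chosen by cross validation according to the entry under "cross validation" below). The sketch is hypothetical in all of its names and in the exact way the corpora are combined.

    # Hypothetical sketch of the self-training step; reranking_parse and the
    # generative parser's train method are placeholders, not a real API.
    def self_train(generative_parser, reranking_parse,
                   ctb, cdt_fs, pdc_sentences, labeled_weight=10):
        # Parse the unlabeled PDC sentences with the current reranking parser.
        parsed_pdc = [reranking_parse(s) for s in pdc_sentences]
        # Combine: replicated copies of CTB and converted CDT plus parsed PDC.
        training_data = (ctb + cdt_fs) * labeled_weight + parsed_pdc
        generative_parser.train(training_data)
        return generative_parser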

dependency trees

Appears in 5 sentences as: dependency tree (2) dependency trees (3)
In Exploiting Heterogeneous Treebanks for Parsing
  1. Our conversion method achieves 93.8% f-score on dependency trees produced from WSJ section 22, resulting in 42% error reduction over the previous best result for DS to PS conversion.
    Page 2, “Introduction”
  2. Previous DS to PS conversion methods built a converted tree by iteratively attaching nodes and edges to the tree with the help of conversion rules and heuristic rules, based on current head-dependent pair from a source dependency tree and the structure of the built tree (Collins et al., 1999; Covington, 1994; Xia and Palmer, 2001; Xia et al., 2008).
    Page 2, “Our Two-Step Solution”
  3. (2008) used WSJ section 19 from the Penn Treebank to extract DS to PS conversion rules and then produced dependency trees from WSJ section 22 for evaluation of their DS to PS conversion algorithm.
    Page 4, “Experiments of Grammar Formalism Conversion”
  4. For comparison with their work, we conducted experiments in the same setting as theirs: using WSJ section 19 (1844 sentences) as CPS, producing dependency trees from WSJ section 22 (1700 sentences) as CDS, and using labeled bracketing f-scores from the tool
    Page 4, “Experiments of Grammar Formalism Conversion”
  5. Moreover, they presented two strategies to solve the problem that there might be multiple conversion rules matching the same input dependency tree pattern: (1) choosing the most frequent rules, (2) preferring rules that add a smaller number of nodes and attach the subtree lower.
    Page 8, “Related Work”
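
These entries suggest picturing the conversion as candidate selection over an n-best list: parse the source sentence with the target-grammar parser, map each candidate phrase-structure tree back to a dependency tree (for example with Penn2Malt-style head rules), and keep the candidate whose dependencies best match the source tree. The sketch below is an illustration under that reading; ps_parser, ps_to_ds, and unlabeled_dep_fscore are hypothetical helpers, and the 200-best size simply echoes the figure mentioned under "f-score" above.

    # Hypothetical sketch of picking a phrase-structure conversion for one
    # source dependency tree from a parser's n-best list.
    def convert_sentence(words, source_dep_tree, ps_parser, ps_to_ds,
                         unlabeled_dep_fscore, n_best=200):
        candidates = ps_parser.parse_n_best(words, n_best)
        # Keep the candidate whose induced dependencies best match the source.
        return max(candidates,
                   key=lambda ps_tree: unlabeled_dep_fscore(ps_to_ds(ps_tree),
                                                            source_dep_tree))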

labeled data

Appears in 5 sentences as: labeled data (6)
In Exploiting Heterogeneous Treebanks for Parsing
  1. It is important to acquire additional labeled data for the target grammar parsing through exploitation of existing source treebanks since there is often a shortage of labeled data.
    Page 1, “Introduction”
  2. When coupled with a self-training technique, a reranking parser with CTB and converted CDT as labeled data achieves an 85.2% f-score on the CTB test set, an absolute 1.0% improvement (6% error reduction) over the previous best result for Chinese parsing.
    Page 2, “Introduction”
  3. Recent studies on parsing indicate that the use of unlabeled data by self-training can help parsing on the WSJ data, even when labeled data is relatively large (McClosky et al., 2006a; Reichart and Rappoport, 2007).
    Page 7, “Experiments of Parsing”
  4. Table 7 shows the performance of self-trained generative parser and updated reranker on the test set, with CTB and CDTfs as labeled data.
    Page 7, “Experiments of Parsing”
  5. All the works in Table 8 used CTB articles 1-270 as labeled data.
    Page 7, “Experiments of Parsing”

constituency parsing

Appears in 4 sentences as: constituency parser (1) constituency parsing (3)
In Exploiting Heterogeneous Treebanks for Parsing
  1. Evaluation on the Penn Chinese Treebank indicates that a converted dependency treebank helps constituency parsing and the use of unlabeled data by self-training further increases parsing f-score to 85.2%, resulting in 6% error reduction over the previous best result.
    Page 1, “Abstract”
  2. We first train a constituency parser on CPS
    Page 2, “Our Two-Step Solution”
  3. (1999) performed statistical constituency parsing of Czech on a treebank that was converted from the Prague Dependency Treebank under the guidance of conversion rules and heuristic rules, e.g., one level of projection for any category, minimal projection for any dependents, and fixed position of attachment.
    Page 7, “Related Work”
  4. Moreover, experimental results on the Penn Chinese Treebank indicate that a converted dependency treebank helps constituency parsing, and it is better to exploit probability information produced by the parser through score interpolation than to prune low quality trees for the use of the converted treebank.
    Page 8, “Conclusion”

domain adaptation

Appears in 4 sentences as: domain adaptation (4)
In Exploiting Heterogeneous Treebanks for Parsing
  1. Recently there has been some work on using multiple treebanks for domain adaptation of parsers, where these treebanks have the same grammar formalism (McClosky et al., 2006b; Roark and Bacchiani, 2003).
    Page 1, “Introduction”
  2. Recently there have been some studies addressing how to use treebanks with the same grammar formalism for domain adaptation of parsers.
    Page 7, “Related Work”
  3. Roark and Bacchiani (2003) presented count merging and model interpolation techniques for domain adaptation of parsers.
    Page 7, “Related Work”
  4. Their results indicated that both unlabeled in-domain data and labeled out-of-domain data can help domain adaptation.
    Page 7, “Related Work”

cross validation

Appears in 3 sentences as: cross validation (3)
In Exploiting Heterogeneous Treebanks for Parsing
  1. The number of removed trees will be determined by cross validation on the development set.
    Page 4, “Our Two-Step Solution”
  2. The value of λ will be tuned by cross validation on the development set.
    Page 4, “Our Two-Step Solution”
  3. Here we tried the corpus weighting technique for an optimal combination of CTB, CDTfs and parsed PDC, and chose the relative weight of both CTB and CDTfs as 10 by cross validation on the development set.
    Page 7, “Experiments of Parsing”

iteratively

Appears in 3 sentences as: iteratively (3)
In Exploiting Heterogeneous Treebanks for Parsing
  1. First we propose to employ an iteratively trained target grammar parser to perform grammar formalism conversion, eliminating predefined heuristic rules as required in previous methods.
    Page 1, “Abstract”
  2. The procedure of tree conversion and parser retraining will be run iteratively until a stopping condition is satisfied.
    Page 2, “Introduction”
  3. Previous DS to PS conversion methods built a converted tree by iteratively attaching nodes and edges to the tree with the help of conversion rules and heuristic rules, based on current head-dependent pair from a source dependency tree and the structure of the built tree (Collins et al., 1999; Covington, 1994; Xia and Palmer, 2001; Xia et al., 2008).
    Page 2, “Our Two-Step Solution”
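
Taken together, the entries above describe a loop: train a target-grammar parser, use it to convert the source treebank, retrain on the union, and repeat until a stopping condition is satisfied. The condition itself is not spelled out in this index, so the development-set check in the sketch below is an assumption, and all helper functions are hypothetical.

    # Hypothetical sketch of the iterative convert-and-retrain procedure.
    def iterative_conversion(ps_treebank, ds_treebank, train, convert,
                             dev_fscore, max_rounds=10):
        parser = train(ps_treebank)
        best_score = dev_fscore(parser)
        converted = []
        for _ in range(max_rounds):
            # Convert every source sentence/dependency-tree pair with the
            # current parser (e.g. via n-best selection as sketched earlier).
            converted = [convert(parser, words, dep) for words, dep in ds_treebank]
            parser = train(ps_treebank + converted)
            score = dev_fscore(parser)
            if score <= best_score:   # assumed stopping condition
                break
            best_score = score
        return parser, converted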

Models Training

Appears in 3 sentences as: Models Training (3)
In Exploiting Heterogeneous Treebanks for Parsing
  1. Models | Training data | LR (%) | LP (%) | F (%)
     GP | CTB | 79.9 | 82.2 | 81.0
     RP | CTB | 82.0 | 84.6 | 83.3
    Page 6, “Experiments of Parsing”
  2. Models | Training data | All the sentences: LR (%), LP (%), F (%)
    Page 6, “Experiments of Parsing”
  3. Models | Training data | LR (%), LP (%), F (%)
    Page 7, “Experiments of Parsing”

parsing models

Appears in 3 sentences as: parsing models (3)
In Exploiting Heterogeneous Treebanks for Parsing
  1. After grammar formalism conversion, the problem we face is reduced to how to build parsing models on multiple homogeneous treebanks.
    Page 4, “Our Two-Step Solution”
  2. Finally we evaluated two parsing models, the generative parser and the reranking parser, on the test set, with results shown in Table 5.
    Page 6, “Experiments of Parsing”
  3. A possible reason is that most of the non-perfect parses can provide useful syntactic structure information for building parsing models.
    Page 6, “Experiments of Parsing”
