A Feature-Enriched Tree Kernel for Relation Extraction
Sun, Le and Han, Xianpei

Article Structure

Abstract

The tree kernel is an effective technique for relation extraction.

Introduction

Relation Extraction (RE) aims to identify a set of predefined relations between pairs of entities in text.

Topics

tree kernel

Appears in 28 sentences as: Tree Kernel (2) Tree kernel (1) tree kernel (32) Tree Kernels (1)
In A Feature-Enriched Tree Kernel for Relation Extraction
  1. The tree kernel is an effective technique for relation extraction.
    Page 1, “Abstract”
  2. In this paper, we propose a new tree kernel, called the feature-enriched tree kernel (FTK), which can enhance the traditional tree kernel by: 1) refining the syntactic tree representation by annotating each tree node with a set of discriminant features; and 2) proposing a new tree kernel which can better measure the syntactic tree similarity by taking all features into consideration.
    Page 1, “Abstract”
  3. Experimental results show that our method can achieve a 5.4% F-measure improvement over the traditional convolution tree kernel.
    Page 1, “Abstract”
  4. An effective technique is the tree kernel (Zelenko et al., 2003; Zhou et al., 2007; Zhang et al., 2006; Qian et al., 2008), which can exploit syntactic parse tree information for relation extraction.
    Page 1, “Introduction”
  5. Then the similarity between two trees is computed using a tree kernel, e.g., the convolution tree kernel proposed by Collins and Duffy (2001).
    Page 1, “Introduction”
  6. Unfortunately, one main shortcoming of the traditional tree kernel is that the syntactic tree representation usually cannot accurately capture the semantic relation information between two entities.
    Page 1, “Introduction”
  7. This paper proposes a new tree kernel, referred to as the feature-enriched tree kernel (FTK), which can effectively resolve the above problems by enhancing the traditional tree kernel in the following ways:
    Page 1, “Introduction”
  8. 2) Based on the refined syntactic tree representation, we propose a new tree kernel, the feature-enriched tree kernel, which can better measure the similarity between two trees by also taking all features into consideration.
    Page 2, “Introduction”
  9. Experimental results show that our method can achieve a 5.4% F-measure improvement over the traditional convolution tree kernel based method.
    Page 2, “Introduction”
  10. Section 2 describes the feature-enriched tree kernel.
    Page 2, “Introduction”
  11. 2 The Feature-Enriched Tree Kernel
    Page 2, “Introduction”
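
As a reference point for the snippets above, here is a minimal sketch of the convolution tree kernel of Collins and Duffy (2001) that FTK builds on; the Node class and parameter names are illustrative assumptions on our part, not the authors' code.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

@dataclass
class Node:
    label: str
    children: List["Node"] = field(default_factory=list)

def production(n: Node) -> Tuple:
    # A node's CFG production, e.g. NP -> (DT, NN).
    return (n.label, tuple(c.label for c in n.children))

def delta(n1: Node, n2: Node, lam: float = 0.4) -> float:
    # Decayed count of common subtrees rooted at n1 and n2.
    if production(n1) != production(n2):
        return 0.0
    if all(not c.children for c in n1.children):
        return lam  # pre-terminals with the same word below
    out = lam
    # Equal productions guarantee the children line up one-to-one.
    for c1, c2 in zip(n1.children, n2.children):
        out *= 1.0 + delta(c1, c2, lam)
    return out

def ctk(t1: Node, t2: Node, lam: float = 0.4) -> float:
    # K(t1, t2) = sum of delta over all pairs of internal nodes.
    def internal(n):
        if n.children:
            yield n
            for c in n.children:
                yield from internal(c)
    return sum(delta(a, b, lam) for a in internal(t1) for b in internal(t2))
```

The kernel value grows with the number of shared subtrees; FTK (see the "feature vector" topic below) adds a feature-similarity term on top of this structural match.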

context information

Appears in 10 sentences as: Context Information (1) context information (8) context information: (1)
In A Feature-Enriched Tree Kernel for Relation Extraction
  1. The features we use include characteristics of the relation instance, phrase properties, and context information (see Section 3 for details).
    Page 2, “Introduction”
  2. 3.3 Context Information Feature
    Page 4, “Introduction”
  3. The context information of a phrase node is critical for identifying the role and the importance of a subtree in the whole relation instance.
    Page 4, “Introduction”
  4. This paper captures the following context information:
    Page 4, “Introduction”
  5. (2007), the context path from root to the phrase node is an effective context information feature.
    Page 4, “Introduction”
  6. Using these five relative positions, we capture the context information using the following features:
    Page 4, “Introduction”
  7. We experiment with our method under four different feature settings: 1) FTK with only instance features, FTK(instance); 2) FTK with only phrase features, FTK(phrase); 3) FTK with only context information features, FTK(context); and 4) FTK with all features, FTK.
    Page 4, “Introduction”
  8. 2) All types of features can improve the performance of relation extraction: FTK obtains 2.6%, 2.2%, and 4.9% F-measure improvements using instance features, phrase features, and context information features, respectively.
    Page 5, “Introduction”
  9. 3) Of the three types of features, the context information features achieve the highest F-measure improvement.
    Page 5, “Introduction”
  10. We believe this may be because: 1) the context information is useful in providing clues for identifying the role and the importance of a subtree; and 2) the context-free assumption of CTK is too strong, so some critical information is lost in the CTK computation.
    Page 5, “Introduction”
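
The five relative positions mentioned in sentence 6 above (match, cover, within, overlap, other) can be computed directly from token spans; the following is a minimal sketch under that assumption, with helper names of our own choosing.

```python
def relative_position(phrase, arg):
    """Relative position of a phrase span w.r.t. an argument span.
    Spans are (start, end) token offsets, end exclusive."""
    ps, pe = phrase
    a_start, a_end = arg
    if (ps, pe) == (a_start, a_end):
        return "match"
    if ps <= a_start and a_end <= pe:
        return "cover"    # the phrase contains the argument
    if a_start <= ps and pe <= a_end:
        return "within"   # the phrase lies inside the argument
    if ps < a_end and a_start < pe:
        return "overlap"  # partial overlap
    return "other"
```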

relation extraction

Appears in 9 sentences as: Relation Extraction (2) relation extraction (7)
In A Feature-Enriched Tree Kernel for Relation Extraction
  1. The tree kernel is an effective technique for relation extraction.
    Page 1, “Abstract”
  2. Relation Extraction (RE) aims to identify a set of predefined relations between pairs of entities in text.
    Page 1, “Introduction”
  3. In recent years, relation extraction has received considerable research attention.
    Page 1, “Introduction”
  4. An effective technique is the tree kernel (Zelenko et al., 2003; Zhou et al., 2007; Zhang et al., 2006; Qian et al., 2008), which can exploit syntactic parse tree information for relation extraction.
    Page 1, “Introduction”
  5. In this section, we describe the proposed feature-enriched tree kernel (FTK) for relation extraction.
    Page 2, “Introduction”
  6. 3 Features for Relation Extraction
    Page 3, “Introduction”
  7. 1) By refining the syntactic tree with discriminant features and incorporating these features into the final tree similarity, FTK can significantly improve the relation extraction performance: compared with the convolution tree kernel baseline CTK, our method can achieve a 5.4% F-measure improvement.
    Page 4, “Introduction”
  8. 2) All types of features can improve the performance of relation extraction: FTK obtains 2.6%, 2.2%, and 4.9% F-measure improvements using instance features, phrase features, and context information features, respectively.
    Page 5, “Introduction”
  9. A classical technique for relation extraction is to model the task as a feature-based classification problem (Kambhatla, 2004; Zhou et al., 2005; Jiang & Zhai, 2007; Chan & Roth, 2010; Chan & Roth, 2011), and feature engineering is obviously the key for performance improvement.
    Page 5, “Introduction”

relation instance

Appears in 7 sentences as: relation instance (4) Relation instances (1) relation instances (2)
In A Feature-Enriched Tree Kernel for Relation Extraction
  1. Finally, new relation instances are extracted using kernel-based classifiers, e.g., the SVM classifier.
    Page 1, “Introduction”
  2. The features we use include characteristics of the relation instance, phrase properties, and context information (see Section 3 for details).
    Page 2, “Introduction”
  3. Relation instances of the same type often share some common characteristics.
    Page 3, “Introduction”
  4. A feature indicates whether a relation instance has one of the following five syntactico-semantic structures from (Chan & Roth, 2011): Premodifiers, Possessive, Preposition, Formulaic, and Verbal.
    Page 3, “Introduction”
  5. The context information of a phrase node is critical for identifying the role and the importance of a subtree in the whole relation instance.
    Page 4, “Introduction”
  6. We observed that a phrase's position relative to the relation's arguments is useful for identifying the role of the phrase node in the whole relation instance.
    Page 4, “Introduction”
  7. That is, we parse all sentences using Charniak's parser (Charniak, 2001), and relation instances are generated by iterating over all pairs of entity mentions occurring in the same sentence.
    Page 4, “Introduction”
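
Sentence 7 above describes the candidate-generation step; a minimal sketch of it follows, assuming simple (parse_tree, mentions) inputs (the data structures are illustrative, not the authors').

```python
from itertools import combinations

def candidate_instances(sentences):
    """sentences: iterable of (parse_tree, mentions) pairs, where mentions
    are the entity mentions found in that sentence."""
    for tree, mentions in sentences:
        # One candidate relation instance per pair of mentions in the
        # sentence; each candidate is then classified with the kernel SVM.
        for m1, m2 in combinations(mentions, 2):
            yield (tree, m1, m2)
```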

F-measure

Appears in 6 sentences as: F-measure (8)
In A Feature-Enriched Tree Kernel for Relation Extraction
  1. Experimental results show that our method can achieve a 5.4% F-measure improvement over the traditional convolution tree kernel based method.
    Page 2, “Introduction”
  2. The overall performance of CTK and FTK is shown in Table 1; the F-measure improvements over CTK are also shown in parentheses.
    Page 4, “Introduction”
  3. FTK on the 7 major relation types and their F-measure improvement over CTK
    Page 4, “Introduction”
  4. 2) All types of features can improve the performance of relation extraction: FTK obtains 2.6%, 2.2%, and 4.9% F-measure improvements using instance features, phrase features, and context information features, respectively.
    Page 5, “Introduction”
  5. 3) Of the three types of features, the context information features achieve the highest F-measure improvement.
    Page 5, “Introduction”
  6. From Table 3, we can see that FTK can achieve competitive performance: 1) it achieves a 0.8% F-measure improvement over the feature-based system of Jiang & Zhai (2007); 2) it achieves a 0.5% F-measure improvement over a state-of-the-art tree kernel, the context-sensitive CTK with CSPT of Zhou et al. (2007); 3) the F-measure of our system is slightly lower than the current best performance on ACE 2004 (Qian et al., 2008): 73.
    Page 5, “Introduction”

semantic relation

Appears in 5 sentences as: semantic relation (4) semantic relations (1)
In A Feature-Enriched Tree Kernel for Relation Extraction
  1. However, the traditional syntactic tree representation is often too coarse or ambiguous to accurately capture the semantic relation information between two entities.
    Page 1, “Abstract”
  2. 1) The syntactic tree focuses on representing syntactic relation/structure, which is often too coarse or ambiguous to capture the semantic relation information.
    Page 1, “Introduction”
  3. For example, all the three trees in Figure 1 share the same possessive syntactic structure, but express quite different semantic relations: “Mary’s brothers” expresses a PER-SOC Family relation, “Mary’s toys” expresses a Possession relation, and “New York’s airports” expresses a PHYS-Located relation.
    Page 1, “Introduction”
  4. better capture the semantic relation information between two entities.
    Page 2, “Introduction”
  5. As described above, the syntactic tree is often too coarse or too ambiguous to represent the semantic relation information between two entities.
    Page 2, “Introduction”

Treebank

Appears in 4 sentences as: Treebank (4)
In A Feature-Enriched Tree Kernel for Relation Extraction
  1. In a syntactic tree, each node indicates a clause/phrase/word and is only labeled with a Treebank tag (Marcus et al., 1993).
    Page 1, “Introduction”
  2. The Treebank tag, unfortunately, is usually too coarse or too general to capture semantic information.
    Page 1, “Introduction”
  3. where L_n is its phrase label (i.e., its Treebank tag), and F_n is a feature vector indicating the characteristics of node n, represented as:
    Page 2, “Introduction”
  4. As discussed above, the Treebank tag is too coarse to capture the property of a phrase node.
    Page 3, “Introduction”

entity type

Appears in 3 sentences as: entity type (2) entity types (2)
In A Feature-Enriched Tree Kernel for Relation Extraction
  1. Features about the entity information of arguments, including: a) #TP1-#TP2: the concatenation of the major entity types of the arguments; b) #ST1-#ST2: the concatenation of the sub entity types of the arguments; c) #MT1-#MT2: the concatenation of the mention types of the arguments.
    Page 3, “Introduction”
  2. We capture the property of a node's content using the following features: a) MB_#Num: the number of mentions contained in the phrase; b) MB_C_#Type: a feature indicating that the phrase contains a mention with major entity type #Type; c) MW_#Num: the number of words within the phrase.
    Page 3, “Introduction”
  3. a) #RP_Arg1Head_#Arg1Type: a feature indicating the relative position of a phrase node with respect to argument 1's head phrase, where #RP is the relative position (one of match, cover, within, overlap, other), and #Arg1Type is the major entity type of argument 1.
    Page 4, “Introduction”
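
A minimal sketch of how the concatenation features in sentence 1 above might be emitted; the mention field names are assumptions for illustration.

```python
def argument_pair_features(arg1, arg2):
    """arg1, arg2: mention objects with major_type, subtype and mention_type
    fields (illustrative names). Returns the #TP/#ST/#MT pair features."""
    return [
        f"TP_{arg1.major_type}-{arg2.major_type}",      # #TP1-#TP2
        f"ST_{arg1.subtype}-{arg2.subtype}",            # #ST1-#ST2
        f"MT_{arg1.mention_type}-{arg2.mention_type}",  # #MT1-#MT2
    ]
```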

feature vector

Appears in 3 sentences as: Feature Vector (1) feature vector (2) feature vectors (1)
In A Feature-Enriched Tree Kernel for Relation Extraction
  1. Feature Vector
    Page 2, “Introduction”
  2. where L_n is its phrase label (i.e., its Treebank tag), and F_n is a feature vector indicating the characteristics of node n, represented as:
    Page 2, “Introduction”
  3. where δ(t1, t2) is the same indicator function as in CTK; (n_i, n_j) is a pair of aligned nodes between t1 and t2, where n_i and n_j occupy the same position in trees t1 and t2; E(t1, t2) is the set of all aligned node pairs; sim(n_i, n_j) is the feature vector similarity between nodes n_i and n_j, computed as the dot product between their feature vectors F_{n_i} and F_{n_j}.
    Page 3, “Introduction”
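
Sentence 3 above defines the extra term FTK adds on top of CTK; here is a minimal sketch of that term, with sparse feature vectors represented as dicts (an assumption on our part).

```python
def dot(f1: dict, f2: dict) -> float:
    # Dot product of two sparse feature vectors; iterate the smaller one.
    if len(f2) < len(f1):
        f1, f2 = f2, f1
    return sum(v * f2.get(k, 0.0) for k, v in f1.items())

def feature_similarity_term(aligned_pairs):
    """aligned_pairs: the set E(t1, t2) as (F_ni, F_nj) feature-vector pairs
    for nodes occupying the same position in two matched subtrees."""
    return sum(dot(f_i, f_j) for f_i, f_j in aligned_pairs)
```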

Feature weighting

Appears in 3 sentences as: feature weight (1) Feature weighting (1) feature weighting (1)
In A Feature-Enriched Tree Kernel for Relation Extraction
  1. Feature weighting.
    Page 4, “Introduction”
  2. Currently, we set all features to a uniform weight w ∈ (0, 1), which is used to control the relative importance of the features in the final tree similarity: the larger the feature weight, the more important the feature.
    Page 4, “Introduction”
  3. feature weighting algorithm which can accurately
    Page 5, “Introduction”
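
A sketch of the uniform weighting in sentence 2: with binary features, scaling every feature value by the same w ∈ (0, 1) before the dot product is all that is needed (a learned per-feature weighting, as hinted at in sentence 3, would replace the constant).

```python
def weight_features(F: dict, w: float = 0.5) -> dict:
    """Scale every feature in a node's sparse feature vector by a uniform
    weight w in (0, 1); here w is a tuned constant, not a learned value."""
    assert 0.0 < w < 1.0
    return {feat: w * val for feat, val in F.items()}
```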

Lexical Semantics

Appears in 3 sentences as: lexical semantic (1) Lexical Semantics (1) lexical semantics: (1)
In A Feature-Enriched Tree Kernel for Relation Extraction
  1. Therefore, we enrich each phrase node with features about its lexical pattern, its content information, and its lexical semantics:
    Page 3, “Introduction”
  2. 3) Lexical Semantics.
    Page 3, “Introduction”
  3. If the node is a preterminal node, we capture its lexical semantics by adding features indicating its WordNet sense information.
    Page 3, “Introduction”

SVM

Appears in 3 sentences as: SVM (3)
In A Feature-Enriched Tree Kernel for Relation Extraction
  1. Finally, new relation instances are extracted using kernel-based classifiers, e.g., the SVM classifier.
    Page 1, “Introduction”
  2. We apply the one-vs-others strategy for multiclass classification using SVM.
    Page 4, “Introduction”
  3. For SVM training, the parameter C is set to 2.4 for all experiments, and the tree kernel parameter λ is tuned to 0.2 for FTK and 0.4 (the optimal parameter setting used in Qian et al.
    Page 4, “Introduction”
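
A minimal sketch of the one-vs-others setup in sentences 2-3, using scikit-learn with a precomputed kernel matrix (the library choice is our assumption; C=2.4 follows sentence 3).

```python
import numpy as np
from sklearn.multiclass import OneVsRestClassifier
from sklearn.svm import SVC

def train_one_vs_others(trees, labels, kernel_fn):
    """trees: training parse trees; kernel_fn: e.g. the FTK similarity."""
    # Precompute the Gram matrix of pairwise tree-kernel values.
    gram = np.array([[kernel_fn(a, b) for b in trees] for a in trees])
    clf = OneVsRestClassifier(SVC(kernel="precomputed", C=2.4))
    clf.fit(gram, labels)
    return clf
```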

WordNet

Appears in 3 sentences as: WordNet (3)
In A Feature-Enriched Tree Kernel for Relation Extraction
  1. If the node is a preterminal node, we capture its lexical semantics by adding features indicating its WordNet sense information.
    Page 3, “Introduction”
  2. Specifically, the first WordNet sense of the terminal word, and all of this sense's hypernym senses, will be added as features.
    Page 3, “Introduction”
  3. For example, WordNet senses {New York#1, city#1, district#1,
    Page 3, “Introduction”
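
A sketch of the sense features in sentence 2, using NLTK's WordNet interface (an assumption; the paper does not say which WordNet toolkit it uses).

```python
from nltk.corpus import wordnet as wn

def sense_features(word: str):
    """First WordNet sense of the word plus that sense's hypernym chain."""
    synsets = wn.synsets(word)
    if not synsets:
        return []
    first = synsets[0]  # WordNet lists the most frequent sense first
    hypernyms = first.closure(lambda s: s.hypernyms())
    return [s.name() for s in [first, *hypernyms]]

# e.g. sense_features("city") -> ['city.n.01', 'municipality.n.01', ...]
```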
