Conclusion | WOE can run in two modes: a CRF extractor (WOEPOS) trained with shallow features like POS tags, and a pattern classifier (WOEparse) learned from dependency path patterns. |
Introduction | We show that abstract dependency paths are a highly informative feature when performing unlexicalized extraction. |
Related Work | Snow et al. (2005) utilize WordNet to learn dependency path patterns for extracting the hypernym relation from text. |
Related Work | However, our results imply that abstracted dependency path features are highly informative for open IE. |
Wikipedia-based Open IE | WOEparse uses a pattern learner to classify whether the shortest dependency path between two noun phrases indicates a semantic relation. |
Wikipedia-based Open IE | Despite some evidence that parser-based features have limited utility in IE (Jiang and Zhai, 2007), we hoped dependency paths would improve precision on long sentences. |
Wikipedia-based Open IE | Shortest Dependency Path as Relation: Unless otherwise noted, WOE uses the Stanford Parser to create dependencies in the “collapsedDependency” format. |
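The shortest dependency path between two noun phrases can be illustrated with a small sketch. It treats the parse as an undirected graph and runs a breadth-first search. The function name `shortest_dependency_path`, the triple layout, and the hand-written toy parse are assumptions made for illustration; they are not the Stanford Parser's API or output, and this is not necessarily how WOE implements the step.

```python
from collections import deque


def shortest_dependency_path(edges, source, target):
    """Return the labelled steps of the shortest path from source to target.

    edges: list of (head, relation, dependent) triples (hypothetical layout).
    Each step is (relation, direction), where "up" means dependent -> head
    and "down" means head -> dependent. Returns None if no path exists.
    """
    # Build an undirected adjacency list that remembers edge direction.
    graph = {}
    for head, rel, dep in edges:
        graph.setdefault(head, []).append((dep, rel, "down"))
        graph.setdefault(dep, []).append((head, rel, "up"))

    # Breadth-first search guarantees the first path found is a shortest one.
    parent = {source: None}  # node -> (previous node, relation, direction)
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            break
        for nxt, rel, direction in graph.get(node, []):
            if nxt not in parent:
                parent[nxt] = (node, rel, direction)
                queue.append(nxt)

    if target not in parent:
        return None

    # Walk back from the target to the source, then reverse the steps.
    steps = []
    node = target
    while parent[node] is not None:
        prev, rel, direction = parent[node]
        steps.append((rel, direction))
        node = prev
    steps.reverse()
    return steps


# Hand-written toy parse of "Dan was not born in Berkeley" (illustrative only).
TOY_EDGES = [
    ("born", "nsubjpass", "Dan"),
    ("born", "auxpass", "was"),
    ("born", "neg", "not"),
    ("born", "prep_in", "Berkeley"),
]

print(shortest_dependency_path(TOY_EDGES, "Dan", "Berkeley"))
```

Keeping only the relation labels and directions, rather than the words on the path, is one simple way to obtain an abstract, unlexicalized path of the kind a pattern learner could classify.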
Experiments | We therefore mapped the DIRT Inference rules (Lin and Pantel, 2001), which consist of pairs of dependency paths, to TEXTRUNNER relations as follows. |
Experiments | From the parses we extracted all dependency paths between nouns that contain only words present in the TEXTRUNNER relation string. |
Experiments | These dependency paths were then matched against each pair in the DIRT database, and all pairs of associated relations were collected, producing about 26,000 inference rules. |
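The mapping procedure described in the Experiments excerpts can be sketched roughly as follows. This is a minimal illustration under stated assumptions: the function names, the string notation for paths, and the dictionary layout are all hypothetical, and the actual DIRT database format and the authors' matching code are not shown in the source.

```python
def path_matches_relation(path_words, relation_string):
    """True if every word on the path also occurs in the relation string."""
    relation_words = set(relation_string.lower().split())
    return all(word.lower() in relation_words for word in path_words)


def collect_inference_rules(path_to_relations, dirt_pairs):
    """Turn DIRT path pairs into pairs of associated relations.

    path_to_relations: dict mapping a dependency path (any hashable key)
        to the set of relation strings it was extracted for (assumed layout).
    dirt_pairs: iterable of (path_a, path_b) tuples (assumed layout).
    """
    rules = set()
    for path_a, path_b in dirt_pairs:
        # A DIRT pair contributes rules only if both paths were observed.
        for rel_a in path_to_relations.get(path_a, ()):
            for rel_b in path_to_relations.get(path_b, ()):
                rules.add((rel_a, rel_b))
    return rules


# Hypothetical toy data, not taken from DIRT or TEXTRUNNER.
toy_paths = {
    "X <-nsubj- acquire -dobj-> Y": {"acquired"},
    "X <-nsubj- buy -dobj-> Y": {"bought"},
}
toy_dirt = [
    ("X <-nsubj- acquire -dobj-> Y", "X <-nsubj- buy -dobj-> Y"),
    ("X <-nsubj- acquire -dobj-> Y", "unseen path"),
]
print(collect_inference_rules(toy_paths, toy_dirt))
```

In this sketch a DIRT pair with an unobserved path is simply skipped, which is one plausible reading of "matched against each pair"; the source does not say how unmatched pairs were handled.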