Introduction | by (Chi et al., 2004) for discovering frequently occurring subtrees in a database of labelled unordered trees. |
Introduction | Section 3 shows how to adapt this algorithm to mine the SR dependency trees for subtrees with high suspicion rate. |
Mining Dependency Trees | Since we work with subtrees of arbitrary length, we also need to check whether constructing a longer subtree is useful, that is, whether its suspicion rate is equal to or higher than the suspicion rate of any of the subtrees it contains.
Mining Dependency Trees | In that way, we avoid computing all subtrees (thus saving time and space). |
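The usefulness check described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the excerpt does not define the suspicion rate, so it is taken here to be the fraction of occurrences that fall in failed cases, and "any of the subtrees it contains" is read as "every contained subtree". The names `suspicion_rate` and `worth_extending` are hypothetical.

```python
from typing import Iterable


def suspicion_rate(fail_count: int, total_count: int) -> float:
    """ASSUMED definition: share of a subtree's occurrences that are in failed cases."""
    return fail_count / total_count if total_count else 0.0


def worth_extending(candidate_rate: float, contained_rates: Iterable[float]) -> bool:
    """Keep a longer subtree only if its rate is >= the rate of every subtree it contains.

    The comparison is non-strict, matching the "equal or higher" condition in the text.
    """
    return all(candidate_rate >= rate for rate in contained_rates)


# Hypothetical counts: the candidate fails in 4 of 5 occurrences.
candidate = suspicion_rate(4, 5)
keep = worth_extending(candidate, [suspicion_rate(1, 2), suspicion_rate(4, 5)])
drop = worth_extending(suspicion_rate(3, 5), [suspicion_rate(4, 5)])
```

Because the comparison is non-strict, a larger subtree whose rate merely equals that of a contained subtree is still kept.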
Mining Dependency Trees | Because we use a milder condition, however (we accept bigger trees whose suspicion rate is equal to the suspicion rate of any of their subtrees), some amount of
Mining Trees | Mining for frequent subtrees is an important problem that has many applications such as XML data mining, web usage analysis and RNA classification. |
Mining Trees | The HybridTreeMiner (HTM) algorithm presented in (Chi et al., 2004) provides a complete and computationally efficient method for discovering frequently occurring subtrees in a database of labelled unordered trees and counting them.
Mining Trees | Second, the subtrees of the BFCF trees are enumerated in increasing size order using two tree operations called join and extension, and their support (the number of trees in the database that contain each subtree) is recorded.
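To make the notion of support concrete, here is a naive sketch that counts how many database trees contain a given pattern. It is not the HTM algorithm: it skips canonical forms, join, and extension entirely and simply tests each tree by backtracking. Trees are assumed to be nested tuples `(label, (child, ...))`, and containment is assumed to mean a parent-child preserving embedding in which sibling order is ignored. All function names are hypothetical.

```python
from typing import Sequence, Tuple

Tree = Tuple[str, tuple]  # (label, (child, child, ...)); sibling order is ignored


def matches_at(pattern: Tree, node: Tree) -> bool:
    """True if the pattern can be embedded with its root mapped onto `node`."""
    if pattern[0] != node[0]:
        return False
    return _assign(list(pattern[1]), list(node[1]))


def _assign(p_children: list, n_children: list) -> bool:
    """Backtracking search mapping each pattern child to a distinct node child."""
    if not p_children:
        return True
    first, rest = p_children[0], p_children[1:]
    for i, cand in enumerate(n_children):
        if matches_at(first, cand) and _assign(rest, n_children[:i] + n_children[i + 1:]):
            return True
    return False


def contains(tree: Tree, pattern: Tree) -> bool:
    """True if the pattern occurs at the root or anywhere below it."""
    return matches_at(pattern, tree) or any(contains(c, pattern) for c in tree[1])


def support(database: Sequence[Tree], pattern: Tree) -> int:
    """Number of trees in the database that contain the pattern at least once."""
    return sum(1 for tree in database if contains(tree, pattern))


db = [
    ("a", (("b", ()), ("c", ()))),
    ("a", (("c", ()), ("b", ()))),
    ("a", (("b", ()),)),
]
pattern = ("a", (("c", ()), ("b", ())))
```

With this toy database, `support(db, pattern)` counts the first two trees (child order does not matter) but not the third, which lacks a `c` child. The backtracking is exponential in the worst case, which is the kind of cost a dedicated miner is designed to avoid.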
Hierarchical Coreference | More specifically, for each MH step, we first randomly select two subtrees headed by node-
Hierarchical Coreference | Otherwise, r_i and r_j are subtrees in the same entity tree, and the following proposals are used instead: • Split Right - Make the subtree r_j the root of a new entity by detaching it from its parent • Collapse - If r_i has a parent, then move r_i's children to r_i's parent and then delete r_i.
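The two same-tree proposals can be sketched as plain tree edits. This is only an illustration of the structural moves, assuming a simple parent/children node representation; the `Node` class and function names are hypothetical, and the Metropolis-Hastings acceptance step and model scoring are omitted.

```python
class Node:
    """Hypothetical entity-tree node with a parent pointer and a child list."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = None
        self.children = []
        if parent is not None:
            parent.add_child(self)

    def add_child(self, child):
        child.parent = self
        self.children.append(child)


def split_right(r_j):
    """Make r_j the root of a new entity by detaching it from its parent."""
    if r_j.parent is not None:
        r_j.parent.children.remove(r_j)
        r_j.parent = None
    return r_j


def collapse(r_i):
    """If r_i has a parent, move r_i's children to that parent and delete r_i."""
    parent = r_i.parent
    if parent is None:
        return False  # a root cannot be collapsed
    for child in list(r_i.children):
        parent.add_child(child)
    r_i.children = []
    parent.children.remove(r_i)
    r_i.parent = None
    return True


root = Node("root")
a = Node("a", root)
b = Node("b", a)
c = Node("c", a)
collapsed = collapse(a)       # b and c now hang directly under root
new_entity = split_right(b)   # b becomes the root of its own entity
```

Both moves touch only the node being moved and its immediate neighbours, so each proposal changes the tree locally.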
Introduction | each step of inference is computationally efficient because evaluating the cost of attaching (or detaching) subtrees requires computing just a single compatibility function (as seen in Figure 1). |
Introduction | Finally, if memory is limited, redundant mentions can be pruned by replacing subtrees with their roots. |
Background | The PTB tree is constructed from the CCG bottom-up, creating leaves with lexical schemas, then merging/adding subtrees using rule schemas at each step.
Our Approach | (8* f {a}) or default to X f, Place subtrees (PP f0 (S f1” k a)) |
Our Approach | The subscripts indicate which subtrees to place where. |