Medical Relation Extraction with Manifold Models

In this paper, we present a manifold model for medical relation extraction.

There exist a vast number of knowledge sources and ontologies in the medical domain.

2.1 Medical Ontologies and Sources

3.1 Super Relations in Medical Domain

4.1 Motivations

5.1 Cross-Validation Test

In this paper, we identify a list of key relations that can facilitate clinical decision making.

Appears in 22 sentences as: Relation Extraction (2), Relation extraction (1), relation extraction (19), relation extractors (1)

In *Medical Relation Extraction with Manifold Models*

- In this paper, we present a manifold model for medical relation extraction. (Page 1, “Abstract”)
- Relation extraction plays a key role in information extraction. (Page 1, “Introduction”)
- To construct a medical relation extraction system, several challenges have to be addressed: (Page 1, “Introduction”)
- The medical corpus underlying our relation extraction system contains 80M sentences (11 gigabytes of pure text). (Page 1, “Introduction”)
- The contributions of this paper on medical relation extraction are threefold: (Page 2, “Introduction”)
- In the i2b2 relation extraction task, entity mentions are manually labeled, and each mention has 1 of 3 concepts: ‘treatment’, ‘problem’, and ‘test’. (Page 2, “Introduction”)
- To resemble real-world medical relation extraction challenges, where perfect entity mentions do not exist, our new setup requires the entity mentions to be automatically detected. (Page 2, “Introduction”)
- From the perspective of relation extraction applications, we identify “super relations”: the key relations that can facilitate clinical decision making (Table 1). (Page 2, “Introduction”)
- From the perspective of relation extraction methodologies, we present a manifold model for relation extraction utilizing both labeled and unlabeled data. (Page 2, “Introduction”)
- The experimental results show that our relation detectors are fast and outperform the state-of-the-art approaches on medical relation extraction by a large margin. (Page 2, “Introduction”)
- 2.2 Relation Extraction (Page 2, “Background”)

See all papers in *Proc. ACL 2014* that mention relation extraction.

See all papers in *Proc. ACL* that mention relation extraction.

Back to top.

Appears in 12 sentences as: unlabeled data (12)

- From the perspective of relation extraction methodologies, we present a manifold model for relation extraction utilizing both labeled and unlabeled data. (Page 2, “Introduction”)
- Part of the resulting data was manually vetted by our annotators, and the remainder was held out as unlabeled data for further experiments. (Page 4, “Identifying Key Medical Relations”)
- Given a few labeled examples and many unlabeled examples for a relation, we want to build a relation detector leveraging both labeled and unlabeled data. (Page 5, “Relation Extraction with Manifold Models”)
- Integration of the unlabeled data can help solve overfitting problems when the labeled data is not sufficient. (Page 5, “Relation Extraction with Manifold Models”)
- When μ = 0, the model disregards the unlabeled data, and the data manifold topology is not respected. (Page 6, “Relation Extraction with Manifold Models”)
- The algorithm exploits unlabeled data, which helps prevent “overfitting” from happening. (Page 7, “Relation Extraction with Manifold Models”)
- By integrating unlabeled data, the manifold model under setting (1) made a 15% improvement over the linear regression model on F1 score, where the improvement was significant across all relations. (Page 8, “Experiments”)
- This tells us that estimating the labels of unlabeled examples based upon the sampling result is one way to utilize unlabeled data and may help improve the relation extraction results. (Page 8, “Experiments”)
- On one hand, this result shows that using more unlabeled data can further improve the result. (Page 8, “Experiments”)
- On the other hand, the insignificant improvement over (1) and (2) strongly indicates that how to utilize more unlabeled data to achieve a significant improvement is nontrivial and deserves more attention. (Page 8, “Experiments”)
- To what extent the unlabeled data can help the learning process is an open problem. (Page 8, “Experiments”)
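The excerpts above sketch the core idea of the manifold approach: fit labeled examples to their labels while a similarity graph over labeled and unlabeled examples keeps similar examples' scores close, with μ controlling how strongly the graph is respected. A minimal illustration of that idea as Laplacian-regularized least squares follows; the cosine kNN graph, parameter names, and closed-form solve are my assumptions, not the paper's exact formulation:

```python
import numpy as np

def laplacian_rls(X_lab, y, X_unlab, lam=0.1, mu=0.1, k=3):
    """Fit a linear scorer to labeled data while keeping scores smooth
    over a kNN similarity graph built from labeled + unlabeled examples."""
    X = np.vstack([X_lab, X_unlab])            # all examples
    n = X.shape[0]
    # cosine-similarity kNN adjacency, symmetrized
    Xn = X / (np.linalg.norm(X, axis=1, keepdims=True) + 1e-12)
    S = Xn @ Xn.T
    np.fill_diagonal(S, 0.0)
    W = np.zeros((n, n))
    for i in range(n):
        for j in np.argsort(S[i])[-k:]:
            W[i, j] = W[j, i] = max(S[i, j], 0.0)
    L = np.diag(W.sum(axis=1)) - W             # graph Laplacian
    d = X.shape[1]
    # closed form of: ||X_lab w - y||^2 + lam ||w||^2 + mu (Xw)^T L (Xw)
    A = X_lab.T @ X_lab + lam * np.eye(d) + mu * (X.T @ L @ X)
    return np.linalg.solve(A, X_lab.T @ y)
```

With mu = 0 the graph term vanishes and this reduces to ridge regression on the labeled data alone, matching the excerpt's observation that the data manifold topology is then not respected.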

Appears in 8 sentences as: Knowledge Base (2), knowledge base (6), knowledge bases (1)

- In candidate answer generation, relations enable the background knowledge base to be used for potential candidate… (Page 1, “Introduction”)
- We also apply our model to build a new medical relation knowledge base as a complement to the existing knowledge bases. (Page 2, “Introduction”)
- To achieve this, we parsed all 80M sentences in our medical corpus, looking for the sentences containing the terms that are associated with the CUI pairs in the knowledge base. (Page 4, “Identifying Key Medical Relations”)
- For example, we know from the knowledge base that “antibiotic drug” may treat “Lyme disease”. (Page 4, “Identifying Key Medical Relations”)
- 5.2 Knowledge Base (KB) Construction (Page 9, “Experiments”)
- Further, medical knowledge is changing extremely quickly, making it hard for people to understand it and update the knowledge base in a timely manner. (Page 9, “Experiments”)
- Table 3 (Knowledge Base Comparison): Our KB achieves Recall@20 = 135/742, Recall@50 = 182/742, Recall@3000 = 301/742. (Page 9, “Experiments”)
- We apply the new model to construct a relation knowledge base (KB), and use it as a complement to the existing manually created KBs. (Page 9, “Conclusions”)

Appears in 8 sentences as: labeled data (8)

- When we build a naive model to detect relations, the model tends to overfit the labeled data. (Page 1, “Introduction”)
- Recently, “distant supervision” has emerged as a popular choice for training relation extractors without using manually labeled data (Mintz et al., 2009; Jiang, 2009; Chan and Roth, 2010; Wang et al., 2011; Riedel et al., 2010; Ji et al., 2011; Hoffmann et al., 2011; Surdeanu et al., 2012; Takamatsu et al., 2012; Min et al., 2013). (Page 2, “Background”)
- Our current strategy is to integrate all associated types, and rely on the relation detector trained with the labeled data to decide how to weight different types based upon the context. (Page 5, “Identifying Key Medical Relations”)
- Integration of the unlabeled data can help solve overfitting problems when the labeled data is not sufficient. (Page 5, “Relation Extraction with Manifold Models”)
- (1) Manifold Unlabeled: We combined the labeled data and unlabeled set 1 in training. (Page 8, “Experiments”)
- (2) Manifold Predicted Labels: We combined the labeled data and unlabeled set 2 in training. (Page 8, “Experiments”)
- …beled data and the data from unlabeled set 2 was used as labeled data (With Weights). (Page 8, “Experiments”)
- …the approaches that completely depend on the labeled data are likely to run into overfitting. (Page 8, “Experiments”)

Appears in 6 sentences as: tree kernel (3), tree kernels (3)

- Many of them focus on using tree kernels to learn parse tree structure related features (Collins and Duffy, 2001; Culotta and Sorensen, 2004; Bunescu and Mooney, 2005). (Page 2, “Background”)
- For example, by combining tree kernels and convolution string kernels, (Zhang et al., 2006) achieved the state-of-the-art performance on ACE data (ACE, 2004). (Page 2, “Background”)
- We compare our approaches to three state-of-the-art approaches: SVM with convolution tree kernels (Collins and Duffy, 2001), linear regression, and SVM with linear kernels (Scholkopf and Smola, 2002). (Page 7, “Experiments”)
- To adapt the tree kernel to the medical domain, we followed the approach in (Nguyen et al., 2009) to take the syntactic structures into consideration. (Page 7, “Experiments”)
- We also added the argument types as features to the tree kernel. (Page 7, “Experiments”)
- In the tree kernel implementation, we assigned the tree structure and the vector corresponding to the argument types… (Page 7, “Experiments”)

Appears in 6 sentences as: entity mentions (7)

- In the i2b2 relation extraction task, entity mentions are manually labeled, and each mention has 1 of 3 concepts: ‘treatment’, ‘problem’, and ‘test’. (Page 2, “Introduction”)
- To resemble real-world medical relation extraction challenges, where perfect entity mentions do not exist, our new setup requires the entity mentions to be automatically detected. (Page 2, “Introduction”)
- The most well-known tool to detect medical entity mentions is MetaMap (Aronson, 2001), which considers all terms as entities and automatically associates each term with a number of concepts from the UMLS CUI dictionary (Lindberg et al., 1993), which contains more than 2.7 million distinct concepts (compared to 3 in i2b2). (Page 2, “Introduction”)
- The huge number of entity mentions, concepts, and noisy concept assignments provides a tough situation that people have to face in real-world applications. (Page 2, “Introduction”)
- The most well-known tool to detect medical entity mentions is MetaMap (Aronson, 2001), which considers all terms as entities and automatically associates each term with a number of concepts from the UMLS CUI dictionary (Lindberg et al., 1993) with 2.7 million distinct concepts. (Page 4, “Identifying Key Medical Relations”)
- (Footnote 3) If we take the perfect entity mentions and the associated concepts provided by i2b2 (Uzuner et al., 2011) as the input, our system can be directly applied to i2b2 relation extraction data. (Page 7, “Experiments”)

Appears in 6 sentences as: overfit (1), overfitting (4), “overfitting” (1)

- When we build a naive model to detect relations, the model tends to overfit the labeled data. (Page 1, “Introduction”)
- Integration of the unlabeled data can help solve overfitting problems when the labeled data is not sufficient. (Page 5, “Relation Extraction with Manifold Models”)
- The second term is useful to bound the mapping function f and prevent overfitting from happening. (Page 6, “Relation Extraction with Manifold Models”)
- The algorithm exploits unlabeled data, which helps prevent “overfitting” from happening. (Page 7, “Relation Extraction with Manifold Models”)
- …the approaches that completely depend on the labeled data are likely to run into overfitting. (Page 8, “Experiments”)
- Linear SVM performed better than the other two, since the large-margin constraint together with the linear model constraint can alleviate overfitting. (Page 8, “Experiments”)

Appears in 5 sentences as: linear regression (4), linear regressions (1)

- The algorithm is computationally efficient at apply time (as fast as linear regression). (Page 7, “Relation Extraction with Manifold Models”)
- We compare our approaches to three state-of-the-art approaches: SVM with convolution tree kernels (Collins and Duffy, 2001), linear regression, and SVM with linear kernels (Scholkopf and Smola, 2002). (Page 7, “Experiments”)
- The SVM with linear kernels and the linear regression model used the same features as the manifold models. (Page 8, “Experiments”)
- The tree kernel-based approach and linear regression achieved similar F1 scores, while linear SVM made a 5% improvement over them. (Page 8, “Experiments”)
- By integrating unlabeled data, the manifold model under setting (1) made a 15% improvement over the linear regression model on F1 score, where the improvement was significant across all relations. (Page 8, “Experiments”)

Appears in 5 sentences as: SVM (6)

- In the SVM implementations, the tradeoff parameter between training error and margin was set to 1 for all experiments. (Page 7, “Experiments”)
- We compare our approaches to three state-of-the-art approaches: SVM with convolution tree kernels (Collins and Duffy, 2001), linear regression, and SVM with linear kernels (Scholkopf and Smola, 2002). (Page 7, “Experiments”)
- The SVM with linear kernels and the linear regression model used the same features as the manifold models. (Page 8, “Experiments”)
- The tree kernel-based approach and linear regression achieved similar F1 scores, while linear SVM made a 5% improvement over them. (Page 8, “Experiments”)
- Linear SVM performed better than the other two, since the large-margin constraint together with the linear model constraint can alleviate overfitting. (Page 8, “Experiments”)

Appears in 5 sentences as: F1 score (3), F1 scores (2)

- The F1 scores reported here are the average of all 5 rounds. (Page 7, “Experiments”)
- The tree kernel-based approach and linear regression achieved similar F1 scores, while linear SVM made a 5% improvement over them. (Page 8, “Experiments”)
- By integrating unlabeled data, the manifold model under setting (1) made a 15% improvement over the linear regression model on F1 score, where the improvement was significant across all relations. (Page 8, “Experiments”)
- Under setting (2), the With Weights strategy achieved a slightly worse F1 score than the previous setting but a much better result than the baseline approaches. (Page 8, “Experiments”)
- That resulted in very poor performance: the average F1 score was below 30%. (Page 8, “Experiments”)
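The first excerpt reports F1 averaged over 5 cross-validation rounds. A small sketch of that computation (the count-based interface is my assumption): per-round F1 from true-positive, false-positive, and false-negative counts, averaged across rounds.

```python
def f1(tp, fp, fn):
    """F1 from counts; returns 0 when precision + recall is undefined."""
    p = tp / (tp + fp) if tp + fp else 0.0   # precision
    r = tp / (tp + fn) if tp + fn else 0.0   # recall
    return 2 * p * r / (p + r) if p + r else 0.0

def average_f1(rounds):
    """Average per-round F1 over cross-validation rounds,
    each round given as a (tp, fp, fn) tuple."""
    return sum(f1(*counts) for counts in rounds) / len(rounds)
```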

Appears in 5 sentences as: dependency path (4), dependency paths (1)

- The similarity of two sentences is defined as the bag-of-words similarity of the dependency paths connecting arguments. (Page 4, “Identifying Key Medical Relations”)
- (3) Syntactic features representing the dependency path between two arguments, such as “subj”, “pred”, “modJiprep” and “objprep” (between arguments “antibiotic” and “lyme disease”) in Figure 2. (Page 5, “Relation Extraction with Manifold Models”)
- (5) Topic features modeling the words in the dependency path. (Page 5, “Relation Extraction with Manifold Models”)
- In the example given in Figure 2, the dependency path contains the following words: “be”, “standard therapy” and “for”. (Page 5, “Relation Extraction with Manifold Models”)
- (7) Bag-of-words features modeling the dependency path. (Page 5, “Relation Extraction with Manifold Models”)
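The first excerpt defines sentence similarity as the bag-of-words similarity of the dependency paths connecting the arguments. A minimal sketch of that measure, assuming cosine similarity over token counts (the paper does not specify the exact similarity function):

```python
from collections import Counter
import math

def bow_similarity(path_a, path_b):
    """Cosine similarity of bag-of-words count vectors built from the
    tokens on two dependency paths (given as token lists)."""
    a, b = Counter(path_a), Counter(path_b)
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0
```

For the Figure 2 example the path tokens would be `["be", "standard therapy", "for"]`, and two sentences whose argument-connecting paths share those tokens score close to 1.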

Appears in 4 sentences as: regression model (3), regression models (1)

- Our model goes beyond regular regression models in that it applies constraints to those coefficients, such that the topology of the given data manifold will be respected. (Page 1, “Introduction”)
- Computing the optimal weights in a regression model and preserving the manifold topology are conflicting objectives; we… (Page 1, “Introduction”)
- The SVM with linear kernels and the linear regression model used the same features as the manifold models. (Page 8, “Experiments”)
- By integrating unlabeled data, the manifold model under setting (1) made a 15% improvement over the linear regression model on F1 score, where the improvement was significant across all relations. (Page 8, “Experiments”)

Appears in 3 sentences as: Bag-of-words (2), bag-of-words (1)

- The similarity of two sentences is defined as the bag-of-words similarity of the dependency paths connecting arguments. (Page 4, “Identifying Key Medical Relations”)
- (7) Bag-of-words features modeling the dependency path. (Page 5, “Relation Extraction with Manifold Models”)
- (8) Bag-of-words features modeling the whole sentence. (Page 6, “Relation Extraction with Manifold Models”)

Appears in 3 sentences as: Parse Tree (1), parse tree (2)

- Many of them focus on using tree kernels to learn parse tree structure related features (Collins and Duffy, 2001; Culotta and Sorensen, 2004; Bunescu and Mooney, 2005). (Page 2, “Background”)
- Figure 2: A Parse Tree Example (Page 5, “Identifying Key Medical Relations”)
- Consider the sentence “Antibiotics are the standard therapy for Lyme disease”: MedicalESG first generates a dependency parse tree (Figure 2) to represent grammatical relations between the words in the sentence, and then associates the words with CUIs. (Page 5, “Identifying Key Medical Relations”)

Appears in 3 sentences as: extraction system (3)

- To construct a medical relation extraction system, several challenges have to be addressed: (Page 1, “Introduction”)
- The medical corpus underlying our relation extraction system contains 80M sentences (11 gigabytes of pure text). (Page 1, “Introduction”)
- The first step in building a relation extraction system for the medical domain is to identify the relations that are important for clinical decision making. (Page 3, “Identifying Key Medical Relations”)

Appears in 3 sentences as: semantic relations (3)

- Using question answering as an example (Wang et al., 2012): in question analysis, the semantic relations between the question focus and each term in the clue can be used to identify the weight of each term so that better search queries can be generated. (Page 1, “Introduction”)
- In candidate answer scoring, relation-based matching algorithms can go beyond explicit lexical and syntactic information to detect implicit semantic relations shared across the question and passages. (Page 1, “Introduction”)
- To extract semantic relations from text, three types of approaches have been applied. (Page 2, “Background”)

Appears in 3 sentences as: similar scores (3)

- To address this issue, we develop a manifold model (Belkin et al., 2006) that encourages examples (including both labeled and unlabeled examples) with similar contents to be assigned similar scores. (Page 1, “Introduction”)
- Scores are fit so that examples (both labeled and unlabeled) with similar content get similar scores, and scores of labeled examples are close to their labels. (Page 5, “Relation Extraction with Manifold Models”)
- In addition, we also want f to preserve the manifold topology of the dataset, such that similar examples (both labeled and unlabeled) get similar scores. (Page 6, “Relation Extraction with Manifold Models”)

Appears in 3 sentences as: distant supervision (1), “distant supervision” (2)

- Recently, “distant supervision” has emerged as a popular choice for training relation extractors without using manually labeled data (Mintz et al., 2009; Jiang, 2009; Chan and Roth, 2010; Wang et al., 2011; Riedel et al., 2010; Ji et al., 2011; Hoffmann et al., 2011; Surdeanu et al., 2012; Takamatsu et al., 2012; Min et al., 2013). (Page 2, “Background”)
- This (distant supervision) approach resulted in a huge number of sentences that contain the desired relations, but also brought in a lot of noise in the form of false positives. (Page 4, “Identifying Key Medical Relations”)
- This feature is useful when the training data comes from “crowdsourcing” or “distant supervision”. (Page 7, “Relation Extraction with Manifold Models”)

Appears in 3 sentences as: confidence score (1), confidence scores (2)

- The new KB covers all super relations and stores the knowledge in the format (relation_name, argument_1, argument_2, confidence), where the confidence is computed based on the relation detector's confidence score and the relation's popularity in the corpus. (Page 9, “Experiments”)
- If we detect multiple relations in the question, and the same answer is generated from more than one relation, we sum up all those confidence scores to make such answers more preferable. (Page 9, “Experiments”)
- In this scenario, we sort the answers based upon the confidence scores and only consider up to p answers for each question. (Page 9, “Experiments”)
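The second and third excerpts describe the answer-ranking step: confidences of the same answer produced by different detected relations are summed, and only the top-p answers are kept. A minimal sketch of that aggregation (the function name and the (answer, confidence) input format are assumptions for illustration):

```python
from collections import defaultdict

def rank_answers(detections, p=20):
    """Sum confidences for answers produced by multiple detected
    relations, then keep the top-p answers by total confidence.
    `detections` is a list of (answer, confidence) pairs."""
    total = defaultdict(float)
    for answer, conf in detections:
        total[answer] += conf          # same answer, multiple relations
    ranked = sorted(total.items(), key=lambda kv: kv[1], reverse=True)
    return ranked[:p]
```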
