Index of papers in Proc. ACL 2008 that mention
  • CRF
Arnold, Andrew and Nallapati, Ramesh and Cohen, William W.
Introduction
§2 introduces the maximum entropy (maxent) and conditional random field (CRF) learning techniques employed, along with specifications for the design and training of our hierarchical prior.
Investigation
Specifically, we compared our approximate hierarchical prior model (HIER), implemented as a CRF, against three baselines:
  • GAUSS: CRF model tuned on a single domain's data, using a standard N(0, 1) prior
  • CAT: CRF model tuned on a concatenation of multiple domains' data, using a N(0, 1) prior
  • CHELBA: CRF model tuned on one domain's data, using a prior trained on a different, related domain's data (cf.
Investigation
Line (a) shows the F1 performance of a CRF model tuned only on the target MUC6 domain (GAUSS) across a range of tuning data sizes.
Investigation
Line (b) shows the same experiment, but this time the CRF model has been tuned on a dataset comprising a simple concatenation of the training MUC6 data from (a), along with a different training set from MUC7 (CAT).
Models considered 2.1 Basic Conditional Random Fields
The parametric form of the CRF for a sentence of length n is given as follows:
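The equation itself did not survive extraction. The standard linear-chain CRF parametric form, consistent with the weights and features described in the next excerpt, is:

\[
p_\Lambda(Y \mid X) = \frac{1}{Z(X)} \exp\Big( \sum_{i=1}^{n} \sum_{k=1}^{F} \lambda_k f_k(y_{i-1}, y_i, X, i) \Big)
\]

where Z(X) is the partition function that normalizes over all possible label sequences.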
Models considered 2.1 Basic Conditional Random Fields
The CRF learns a model consisting of a set of weights Λ = {λ_1, ..., λ_F} over the features so as to maximize the conditional likelihood of the training data, p(Y_train | X_train), given the model p_Λ.
Models considered 2.1 Basic Conditional Random Fields
2.2 CRF with Gaussian priors
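This heading refers to the standard Gaussian (L2) regularization of the CRF objective, i.e. the N(0, 1) prior mentioned in the baselines above. In its usual form (a standard formulation, not necessarily the paper's exact notation), training maximizes the penalized log-likelihood:

\[
\mathcal{L}(\Lambda) = \log p_\Lambda(Y_{\text{train}} \mid X_{\text{train}}) - \sum_{k=1}^{F} \frac{\lambda_k^2}{2\sigma_k^2}
\]

where σ_k is the standard deviation of the Gaussian prior on weight λ_k (σ_k = 1 for an N(0, 1) prior).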
CRF is mentioned in 11 sentences in this paper.
Ding, Shilin and Cong, Gao and Lin, Chin-Yew and Zhu, Xiaoyan
Context and Answer Detection
Finally, we will briefly introduce CRF models and the features that we used for the CRF model.
Context and Answer Detection
A CRF is an undirected graphical model G of the conditional distribution P(Y|X).
Context and Answer Detection
The linear CRF model has been successfully applied in NLP and text mining tasks (McCallum and Li, 2003; Sha and Pereira, 2003).
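As a concrete illustration of the linear-chain case, here is a minimal training sketch using the sklearn-crfsuite package; the package choice, feature set, and toy data are illustrative assumptions, not anything used by the papers indexed here.

# Minimal linear-chain CRF sketch; sklearn-crfsuite and the toy data are
# illustrative assumptions, not the paper's implementation.
import sklearn_crfsuite

def token_features(tokens, i):
    # Simple orthographic/context features for token i.
    return {
        "word": tokens[i].lower(),
        "is_title": tokens[i].istitle(),
        "prev_word": tokens[i - 1].lower() if i > 0 else "<BOS>",
        "next_word": tokens[i + 1].lower() if i < len(tokens) - 1 else "<EOS>",
    }

sentences = [["Kafka", "lived", "in", "Prague"]]
labels = [["B-PER", "O", "O", "B-LOC"]]
X_train = [[token_features(s, i) for i in range(len(s))] for s in sentences]

crf = sklearn_crfsuite.CRF(algorithm="lbfgs", c1=0.1, c2=0.1, max_iterations=50)
crf.fit(X_train, labels)
print(crf.predict(X_train))  # e.g. [['B-PER', 'O', 'O', 'B-LOC']]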
Introduction
To capture the dependency between contexts and answers, we introduce Skip-chain CRF model for answer detection.
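For reference, a skip-chain CRF (Sutton and McCallum, 2004) augments the linear chain with potentials over pairs of distant positions; sketched here in its standard general form rather than the paper's specific instantiation:

\[
p(\mathbf{y} \mid \mathbf{x}) = \frac{1}{Z(\mathbf{x})} \prod_{t=1}^{n} \Psi_t(y_{t-1}, y_t, \mathbf{x}) \prod_{(u,v) \in \mathcal{S}} \Phi_{uv}(y_u, y_v, \mathbf{x})
\]

where S is the set of skip edges, e.g. linking a question sentence to candidate answer sentences.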
Introduction
Experimental results show that: 1) Linear CRFs outperform SVMs and decision trees in both context and answer detection; 2) Skip-chain CRFs outperform Linear CRFs for answer finding, which demonstrates that context improves answer finding; and 3) the 2D CRF model improves the performance of Linear CRFs, and the combination of 2D CRFs and Skip-chain CRFs achieves better performance for context detection.
CRF is mentioned in 22 sentences in this paper.
Banko, Michele and Etzioni, Oren
Hybrid Relation Extraction
Due to the sequential nature of our RE task, H-CRF employs a CRF as the meta-learner, as opposed to a decision tree or regression-based classifier.
Hybrid Relation Extraction
To obtain the probability at each position of a linear-chain CRF , the constrained forward-backward technique described in (Culotta and McCallum, 2004) is used.
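Below is a minimal numpy sketch of the standard (unconstrained) forward-backward computation of per-position marginals; the constrained variant of Culotta and McCallum (2004) runs the same recursions with selected positions clamped to fixed labels. The array shapes and score conventions are assumptions for illustration.

# Standard forward-backward marginals for a linear-chain model (sketch).
import numpy as np
from scipy.special import logsumexp

def position_marginals(node, trans):
    """node: (T, K) per-position log-scores; trans: (K, K) log transition scores.
    Returns a (T, K) array of marginal probabilities p(y_t = k)."""
    T, K = node.shape
    alpha = np.empty((T, K))
    alpha[0] = node[0]
    for t in range(1, T):
        # alpha[t, y] = node[t, y] + logsumexp_y'( alpha[t-1, y'] + trans[y', y] )
        alpha[t] = node[t] + logsumexp(alpha[t - 1][:, None] + trans, axis=0)
    beta = np.zeros((T, K))
    for t in range(T - 2, -1, -1):
        # beta[t, y] = logsumexp_y''( trans[y, y''] + node[t+1, y''] + beta[t+1, y''] )
        beta[t] = logsumexp(trans + node[t + 1] + beta[t + 1], axis=1)
    log_Z = logsumexp(alpha[-1])
    return np.exp(alpha + beta - log_Z)

Each row of the result sums to 1; clamping a position t to label k amounts to setting node[t, j] = -inf for all j != k before running the same recursions.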
Related Work
(2006) used a CRF for RE, yet their task differs greatly from open extraction.
Relation Extraction
Figure 1: Relation Extraction as Sequence Labeling: A CRF is used to identify the relationship, born in, between Kafka and Prague
Relation Extraction
The resulting set of labeled examples is described using features that can be extracted without syntactic or semantic analysis, and is used to train a CRF, a sequence model that learns to identify spans of tokens believed to indicate explicit mentions of relationships between entities.
Relation Extraction
The entity pair serves to anchor each end of a linear-chain CRF , and both entities in the pair are assigned a fixed label of ENT.
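Concretely, the scheme described above applied to the Figure 1 sentence might look like the following; only the fixed ENT label comes from the excerpt, and the relation tag names (B-REL/I-REL) are illustrative assumptions, not the paper's exact tag set.

# Illustrative only: entities clamped to ENT, relation span tagged with B/I.
tokens = ["Kafka", "was", "born",  "in",    "Prague"]
labels = ["ENT",   "O",   "B-REL", "I-REL", "ENT"]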
CRF is mentioned in 9 sentences in this paper.
Finkel, Jenny Rose and Kleeman, Alex and Manning, Christopher D.
Introduction
For example, in (Lafferty et al., 2001), when switching from a generatively trained hidden Markov model (HMM) to a discriminatively trained, linear-chain conditional random field (CRF) for part-of-speech tagging, their error drops from 5.7% to 5.6%.
Introduction
When they add in only a small set of orthographic features, their CRF error rate drops considerably more to 4.3%, and their out-of-vocabulary error rate drops by more than half.
The Model
We then define a conditional probability distribution over entire trees, using the standard CRF distribution, shown in (1).
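Equation (1) is not reproduced in this excerpt; the "standard CRF distribution" it refers to has the general clique-factored form (the general form, not necessarily the paper's exact notation):

\[
p_\lambda(y \mid x) = \frac{1}{Z(x)} \exp\Big( \sum_{c \in C(y, x)} \sum_{k} \lambda_k f_k(c, x) \Big)
\]

where the cliques c range over local configurations of the tree rather than the positions of a chain.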
CRF is mentioned in 3 sentences in this paper.
Kazama, Jun'ichi and Torisawa, Kentaro
Experiments
We generated the node and edge features of a CRF model as described in Table 3 using these atomic features.
Experiments
To train CRF models, we used Taku Kudo’s CRF++ (ver.
Using Gazetteers as Features of NER
These annotated IOB tags can be used in the same way as other features in a CRF tagger.
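A minimal sketch of how gazetteer matches can be turned into IOB tags and fed to a tagger alongside other features; the longest-match strategy and tag names are assumptions for illustration, not the paper's exact design.

# Tag each token with B/I if it starts/continues a gazetteer entry, else O.
def gazetteer_iob_tags(tokens, gazetteer):
    tags = ["O"] * len(tokens)
    i = 0
    while i < len(tokens):
        # Try the longest gazetteer match starting at position i.
        for j in range(len(tokens), i, -1):
            if " ".join(tokens[i:j]) in gazetteer:
                tags[i] = "B-GAZ"
                for k in range(i + 1, j):
                    tags[k] = "I-GAZ"
                i = j
                break
        else:
            i += 1
    return tags

gaz = {"New York", "Prague"}
print(gazetteer_iob_tags(["He", "visited", "New", "York"], gaz))
# -> ['O', 'O', 'B-GAZ', 'I-GAZ']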
CRF is mentioned in 3 sentences in this paper.