Index of papers in Proc. ACL 2012 that mention
  • CRF
Chan, Wen and Zhou, Xiangdong and Wang, Wei and Chua, Tat-Seng
Abstract
In order to automatically generate a novel and non-redundant community answer summary, we segment the complex original multi-sentence question into several sub-questions and then propose a general Conditional Random Field (CRF) based answer summary method with group L1 regularization.
Introduction
We tackle the answer summary task as a sequential labeling process under the general Conditional Random Fields (CRF) framework: every answer sentence in the question thread is labeled as a summary sentence or non-summary sentence, and we concatenate the sentences with summary label to form the final summarized answer.
Introduction
First, we present a general CRF-based framework
Introduction
Second, we propose a group L1-regularization approach in the CRF model for automatic optimal feature learning to unleash the potential of the features and enhance the performance of answer summarization.
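To make the group L1 (group-lasso) regularizer concrete, here is a minimal sketch of the penalty term under assumed names; it is an illustration, not the authors' implementation. In training, this penalty would be added to the CRF's negative log-likelihood.

```python
import numpy as np

def group_l1_penalty(w, groups, lam):
    """Group-lasso penalty: lam * sum over groups g of ||w_g||_2.

    groups is an assumed list of index lists, one per feature group.
    The L2 norm inside each group lets training drive whole groups
    to zero, which is what enables automatic feature selection.
    """
    return lam * sum(np.linalg.norm(w[np.asarray(g)]) for g in groups)
```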
The Summarization Framework
Then under CRF (Lafferty et al., 2001), the conditional probability of y given x obeys the following distribution: $p(\mathbf{y} \mid \mathbf{x}) = \frac{1}{Z(\mathbf{x})} \exp\Big( \sum_{e \in E,\, k} \lambda_k f_k(e, \mathbf{y}|_e, \mathbf{x}) + \sum_{v \in V,\, l} \mu_l g_l(v, \mathbf{y}|_v, \mathbf{x}) \Big)$, where $Z(\mathbf{x})$ normalizes over all label sequences.
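As an illustration of how this distribution can be evaluated for a linear-chain model, here is a minimal numpy sketch assuming log-space emission and transition potentials; the names and shapes are illustrative, not taken from the paper:

```python
import numpy as np
from scipy.special import logsumexp

def crf_log_prob(emit, trans, y):
    """log p(y|x) for a linear-chain CRF in log space.

    emit:  (T, K) array, emit[t, k] = weighted vertex features at
           position t for label k (depends on x)
    trans: (K, K) array, trans[j, k] = weighted edge features for
           the adjacent label pair (j, k)
    y:     length-T label sequence
    """
    T, K = emit.shape
    # Unnormalized log score of the given sequence.
    score = emit[0, y[0]]
    for t in range(1, T):
        score += trans[y[t - 1], y[t]] + emit[t, y[t]]
    # Forward algorithm computes the log partition function log Z(x).
    alpha = emit[0].copy()
    for t in range(1, T):
        alpha = emit[t] + logsumexp(alpha[:, None] + trans, axis=0)
    return score - logsumexp(alpha)
```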
The Summarization Framework
Therefore, to explore the optimal combination of these features, we propose a group L1 regularization term in the general CRF model (Section 3.3) for feature learning.
The Summarization Framework
These sentence-level features can be easily utilized in the CRF framework.
CRF is mentioned in 21 sentences in this paper.
Qu, Zhonghua and Liu, Yang
Abstract
We use linear-chain conditional random fields (CRF) for sentence type tagging, and a 2D CRF to label the dependency relation between sentences.
Introduction
We use linear-chain conditional random fields (CRF) to take advantage of many long-distance and nonlocal features.
Introduction
First, each sentence is considered as a source, and we run a linear-chain CRF to label whether each of the other sentences is its target.
Introduction
Because multiple runs of separate linear-chain CRFs ignore the dependency between source sentences, the second approach we propose is to use a 2D CRF that models all pair relationships jointly.
Related Work
In (Ding et al., 2008), a two-pass approach was used to find relevant solutions for a given question, and a skip-chain CRF was adopted to model long-range dependencies.
Thread Structure Tagging
A linear-chain CRF is a special case of general CRFs.
Thread Structure Tagging
In a linear-chain CRF, cliques only involve two adjacent variables in the sequence.
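Because the cliques pair only adjacent labels, exact decoding reduces to Viterbi over those pairwise potentials; a minimal sketch under assumed names and shapes:

```python
import numpy as np

def viterbi(emit, trans):
    """Most likely label sequence under a linear-chain CRF.

    emit:  (T, K) per-position log potentials
    trans: (K, K) log potentials over the adjacent-pair cliques
    """
    T, K = emit.shape
    score = emit[0].copy()
    back = np.zeros((T, K), dtype=int)
    for t in range(1, T):
        cand = score[:, None] + trans      # scores of all adjacent pairs
        back[t] = cand.argmax(axis=0)      # best previous label per label
        score = cand.max(axis=0) + emit[t]
    # Trace back-pointers from the best final label.
    path = [int(score.argmax())]
    for t in range(T - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    return path[::-1]
```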
Thread Structure Tagging
Figure 3 shows the graphical structure of a linear-chain CRF.
CRF is mentioned in 19 sentences in this paper.
Green, Spence and DeNero, John
A Class-based Model of Agreement
We treat segmentation as a character-level sequence modeling problem and train a linear-chain conditional random field (CRF) model (Lafferty et al., 2001).
A Class-based Model of Agreement
Class-based agreement model notation:
  t ∈ T: set of morpho-syntactic classes
  s ∈ S: set of all word segments
  θ_seg: learned weights for the CRF-based segmenter
  θ_tag: learned weights for the CRF-based tagger
  φ_o, φ_t: CRF potential functions (emission and transition)
A Class-based Model of Agreement
For this task we also train a standard CRF model on full sentences with gold classes and segmentation.
Conclusion and Outlook
The model can be implemented with a standard CRF package, trained on existing treebanks for many languages, and integrated easily with many MT feature APIs.
CRF is mentioned in 9 sentences in this paper.
Tang, Hao and Keshet, Joseph and Livescu, Karen
Algorithm
(2) under the log-loss results in a probabilistic model commonly known as a conditional random field (CRF) (Lafferty et al., 2001).
Discussion
Large-margin learning, using the Passive-Aggressive and Pegasos algorithms, has benefits over CRF learning for our task: It produces sparser models, is faster, and produces better lexical access results.
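For context, a single Pegasos step for a binary linear classifier looks like the sketch below (hypothetical names; the paper uses structured, multiclass variants):

```python
import numpy as np

def pegasos_step(w, x, y, lam, t):
    """One Pegasos update for the hinge loss; y is -1 or +1.

    Starting from w = 0, a feature's weight stays zero until it
    appears in a margin-violating example, which is one source of
    the sparsity contrasted with dense CRF gradients above.
    """
    eta = 1.0 / (lam * t)              # standard Pegasos learning rate
    w = (1.0 - eta * lam) * w          # L2 shrinkage
    if y * np.dot(w, x) < 1.0:         # margin violated
        w = w + eta * y * x
    return w
```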
Experiments
We use the term “CRF” since the learning algorithm corresponds to CRF learning, although the task is multiclass classification rather than a sequence or structure prediction task.
Experiments
CRF learning with the same features performs about 6% worse than the corresponding PA and Pegasos models.
Experiments
The single-threaded running time for PNDP+ and Pegasos/DP+ is about 40 minutes per epoch, measured on a dual-core AMD 2.4GHz CPU with 8GB of memory; for CRF, it takes about 100 minutes for each epoch, which is almost entirely because the weight vector θ is less sparse with CRF learning.
CRF is mentioned in 7 sentences in this paper.
Sun, Xu and Wang, Houfeng and Li, Wenjie
System Architecture
Based on our CRF word segmentation system, we can compute a probability for each segment.
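One common way to obtain such per-position probabilities from a trained segmenter is through marginals; the sketch below uses sklearn-crfsuite purely for illustration and is not the authors' system:

```python
import sklearn_crfsuite

# Toy character-level data; the feature templates are hypothetical.
X = [[{"char": c} for c in "abcd"], [{"char": c} for c in "efg"]]
Y = [["B", "E", "B", "E"], ["B", "M", "E"]]

crf = sklearn_crfsuite.CRF(algorithm="lbfgs", c2=1.0, max_iterations=50)
crf.fit(X, Y)

# Per-position marginal label probabilities; multiplying the marginals
# of the tags inside a segment gives a rough segment-level confidence.
marginals = crf.predict_marginals([X[0]])[0]
print(marginals[0])  # e.g. {'B': 0.9, 'M': 0.02, 'E': 0.08}
```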
System Architecture
Note that, although our model is a Markov CRF model, we can still use word features to learn word information in the training data.
System Architecture
For traditional implementations of CRF systems (e.g., the HCRF package), the edge features usually contain only the information of y_{i-1} and y_i, without the information of
CRF is mentioned in 6 sentences in this paper.
Constant, Matthieu and Sigogne, Anthony and Watrin, Patrick
Evaluation
We first tested a standalone MWE recognizer based on CRF.
Evaluation
The CRF recognizer relies on the software Wapiti (Lavergne et al., 2010) to train and apply the model, and on the software Unitex (Paumier, 2011) to apply lexical resources.
Evaluation
Table 3: MWE identification with CRF: base are the features corresponding to token properties and word n-grams.
MWE-dedicated Features
In order to deal with unknown words and special tokens, we incorporate standard tagging features in the CRF: lowercase forms of the words, word prefixes of length 1 to 4, word suffixes of length 1 to 4, whether the word is capitalized, whether the token contains a digit, and whether it is a hyphen.
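A minimal sketch of such a feature extractor, with assumed function and key names:

```python
def word_features(word):
    """The standard tagging features listed above, as a CRF feature dict."""
    feats = {
        "lower": word.lower(),
        "is_capitalized": word[:1].isupper(),
        "has_digit": any(ch.isdigit() for ch in word),
        "is_hyphen": word == "-",
    }
    for n in range(1, 5):   # prefixes and suffixes of length 1 to 4
        feats[f"prefix_{n}"] = word[:n]
        feats[f"suffix_{n}"] = word[-n:]
    return feats
```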
Two strategies, two discriminative models
For such a task, we used linear-chain Conditional Random Fields (CRF), which are discriminative probabilistic models.
CRF is mentioned in 5 sentences in this paper.
Li, Fangtao and Pan, Sinno Jialin and Jin, Ou and Yang, Qiang and Zhu, Xiaoyan
Introduction
Our work is similar to Jakob and Gurevych (2010), which proposed a Conditional Random Field (CRF) for cross-domain topic word extraction.
Introduction
We denote by iSVM and iCRF the in-domain SVM and CRF classifiers in the experiments, and compare our proposed methods,
Introduction
Cross-Domain CRF (Cross-CRF): we implement a cross-domain CRF algorithm proposed by Jakob and Gurevych (2010).
CRF is mentioned in 4 sentences in this paper.
Liu, Xiaohua and Zhou, Ming and Zhou, Xiangyang and Fu, Zhongyang and Wei, Furu
Introduction
(2011) develop a system that exploits a CRF model to segment named entities.
Related Work
A linear CRF model
Related Work
(2010) use Amazon's Mechanical Turk service and CrowdFlower to annotate named entities in tweets and train a CRF model to evaluate the effectiveness of human labeling.
Related Work
(2011) rebuild the NLP pipeline for tweets beginning with POS tagging, through chunking, to NER, which first exploits a CRF model to segment named entities and then uses a distantly supervised approach based on LabeledLDA to classify named entities.
CRF is mentioned in 4 sentences in this paper.
Wick, Michael and Singh, Sameer and McCallum, Andrew
Background: Pairwise Coreference
For higher accuracy, a graphical model such as a conditional random field (CRF) is constructed from the compatibility functions to jointly reason about the pairwise decisions (McCallum and Wellner, 2004).
Background: Pairwise Coreference
We now describe the pairwise CRF for coreference as a factor graph.
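To illustrate the factor-graph view, the sketch below scores a candidate clustering by summing one assumed pairwise log-factor per mention pair; it is illustrative, not the paper's model:

```python
import itertools

def clustering_log_score(mentions, clusters, log_factor):
    """Unnormalized log score of a coreference clustering under a
    pairwise model: one factor per mention pair, whose value depends
    on whether the two mentions share a cluster.

    log_factor(m1, m2, coreferent) is an assumed compatibility
    function, e.g. a learned pairwise scorer.
    """
    cluster_of = {m: i for i, c in enumerate(clusters) for m in c}
    total = 0.0
    for m1, m2 in itertools.combinations(mentions, 2):
        total += log_factor(m1, m2, cluster_of[m1] == cluster_of[m2])
    return total

# Toy usage with an assumed scorer that favors same-initial mentions:
lf = lambda a, b, same: 1.0 if (a[0] == b[0]) == same else -1.0
print(clustering_log_score(["Ann", "Anna", "Bob"],
                           [{"Ann", "Anna"}, {"Bob"}], lf))
```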
Background: Pairwise Coreference
Given the pairwise CRF, the problem of coreference is then solved by searching for the setting of the coreference decision variables that has the highest probability according to Equation 1 subject to the
Conclusion
Indeed, inference in the hierarchy is orders of magnitude faster than a pairwise CRF, allowing us to infer accurate coreference on
CRF is mentioned in 4 sentences in this paper.
Whitney, Max and Sarkar, Anoop
Existing algorithms 3.1 Yarowsky
Note that this is a linear model similar to a conditional random field (CRF) (Lafferty et al., 2001) for unstructured multiclass problems.
Existing algorithms 3.1 Yarowsky
It uses a CRF (Lafferty et al., 2001) as the underlying supervised learner.
Existing algorithms 3.1 Yarowsky
It differs significantly from Yarowsky in two other ways: first, instead of only training a CRF, it also uses a step of graph propagation between distributions over the n-grams in the data.
CRF is mentioned in 3 sentences in this paper.