Index of papers in Proc. ACL 2010 that mention
  • feature vector
Huang, Ruihong and Riloff, Ellen
Related Work
However, when we create feature vectors for the classifier, the seeds themselves are hidden and only contextual features are used to represent each training instance.
Related Work
We use an in-house sentence segmenter and NP chunker to identify the base NPs in each sentence and create feature vectors that represent each constituent in the sentence as either an NP or an individual word.
Related Work
Two training instances would be created, with feature vectors that look like this, where M represents a modifier inside the target NP:
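The excerpt cuts off before the example vectors, so the following is a minimal sketch (not the authors' code) of how a contextual feature vector for a target NP might be built while keeping the seed word itself out of the features; the feature names and windowing choices are illustrative assumptions.

```python
# Minimal sketch (not the authors' code): build a contextual feature vector
# for a target NP while keeping the seed word itself out of the features.
# Feature names and windowing choices are illustrative assumptions.
def contextual_features(tokens, target_span, modifiers):
    start, end = target_span          # target NP spans tokens[start:end]
    feats = {}
    if start > 0:                     # word immediately left of the NP
        feats["left_word=" + tokens[start - 1]] = 1
    if end < len(tokens):             # word immediately right of the NP
        feats["right_word=" + tokens[end]] = 1
    for m in modifiers:               # modifiers inside the target NP
        feats["M=" + m] = 1
    return feats

# e.g. target NP "angry crowd" with modifier "angry"
print(contextual_features(["the", "angry", "crowd", "marched"], (1, 3), ["angry"]))
```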
feature vector is mentioned in 6 sentences in this paper.
Topics mentioned in this paper:
Xiong, Deyi and Zhang, Min and Li, Haizhou
Error Detection with a Maximum Entropy Model
To formalize this task, we use a feature vector w to represent a word w in question, and a binary variable to indicate whether this word is correct or not.
Error Detection with a Maximum Entropy Model
In the feature vector, we look at 2 words before and 2 words after the current word position (w_{-2}, w_{-1}, w, w_{1}, w_{2}).
Error Detection with a Maximum Entropy Model
We collect features {wd, pos, link, dwpp} for each word among these words and combine them into the feature vector w for the word in question.
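A hypothetical sketch of how such a window-based feature vector could be assembled is shown below; only the word (wd) and POS features from the excerpt are reproduced, and the link and dwpp features are omitted.

```python
# Hypothetical sketch of a +/-2 word-window feature vector; only the word
# (wd) and POS features are shown, the link and dwpp features are omitted.
def window_features(words, pos_tags, i):
    feats = {}
    for offset in range(-2, 3):
        j = i + offset
        if 0 <= j < len(words):
            feats[f"wd[{offset}]={words[j]}"] = 1
            feats[f"pos[{offset}]={pos_tags[j]}"] = 1
    return feats

words = ["this", "translation", "are", "good", "."]
tags = ["DT", "NN", "VBP", "JJ", "."]
print(window_features(words, tags, 2))   # feature vector for the word "are"
```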
feature vector is mentioned in 6 sentences in this paper.
Topics mentioned in this paper:
Berant, Jonathan and Dagan, Ido and Goldberger, Jacob
Background
Quite a few methods have been suggested (Lin and Pantel, 2001; Bhagat et al., 2007; Yates and Etzioni, 2009), which differ in terms of the specifics of the ways in which predicates are represented, the features that are extracted, and the function used to compute feature vector similarity.
Experimental Evaluation
When computing distributional similarity scores, a template is represented as a feature vector of the CUIs that instantiate its arguments.
Learning Entailment Graph Edges
Next, we represent each pair of propositional templates with a feature vector of various distributional similarity scores.
Learning Entailment Graph Edges
A template pair is represented by a feature vector where each coordinate is a different distributional similarity score.
Learning Entailment Graph Edges
Another variant occurs when using binary templates: a template may be represented by a pair of feature vectors, one for each variable (Lin and Pantel, 2001), or by a single vector, where features represent pairs of instantiations (Szpektor et al., 2004; Yates and Etzioni, 2009).
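To make the representation concrete, here is a hedged sketch of a feature vector whose coordinates are different distributional similarity scores for a template pair; the two measures (cosine and a Lin-style count ratio) and the toy argument-instantiation counts are placeholders, not the paper's exact setup.

```python
# Hedged sketch: a template pair is mapped to a feature vector whose
# coordinates are different distributional similarity scores. The measures
# and the toy argument-instantiation counts below are placeholders.
import math

def cosine(u, v):
    dot = sum(u.get(k, 0) * v.get(k, 0) for k in set(u) | set(v))
    nu = math.sqrt(sum(x * x for x in u.values()))
    nv = math.sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def lin_style(u, v):
    shared = set(u) & set(v)
    num = sum(u[k] + v[k] for k in shared)
    den = sum(u.values()) + sum(v.values())
    return num / den if den else 0.0

# counts of argument instantiations (e.g. CUIs) for two templates
t1 = {"aspirin": 3, "ibuprofen": 2, "fever": 5}
t2 = {"aspirin": 1, "fever": 4, "headache": 2}

feature_vector = [cosine(t1, t2), lin_style(t1, t2)]  # one coordinate per measure
print(feature_vector)
```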
feature vector is mentioned in 5 sentences in this paper.
Topics mentioned in this paper:
Wang, Baoxun and Wang, Xiaolong and Sun, Chengjie and Liu, Bingquan and Sun, Lin
Learning with Homogenous Data
High-dimensional feature vectors with only a few nonzero dimensions impose a large computational cost on our model.
Learning with Homogenous Data
Thus it is necessary to reduce the dimensionality of the feature vectors.
The Deep Belief Network for QA pairs
In the bottom layer, the binary feature vectors based on the statistics of the word occurrence in the answers are used to compute the “hidden features” in the
The Deep Belief Network for QA pairs
where σ(x) = 1/(1 + e^{-x}), s denotes the visible feature vector of the answer, q_i is the i-th element of the question vector, and h stands for the hidden feature vector for reconstructing the questions.
The Deep Belief Network for QA pairs
To detect the best answer to a given question, we just have to send the vectors of the question and its candidate answers into the input units of the network and perform a level-by-level calculation to obtain the corresponding feature vectors.
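The level-by-level calculation can be sketched as follows, using the logistic function from the equation above; the layer sizes and the random weights are placeholders, not the trained network.

```python
# Minimal numpy sketch of the level-by-level calculation, using
# sigma(x) = 1 / (1 + exp(-x)) from the equation above. Layer sizes and
# (random) weights are placeholders, not the trained network.
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)
layer_sizes = [(2000, 500), (500, 100)]
weights = [(rng.normal(0.0, 0.01, size=shape), np.zeros(shape[1]))
           for shape in layer_sizes]

v = (rng.random(2000) < 0.01).astype(float)   # sparse binary answer vector
h = v
for W, b in weights:                          # propagate level by level
    h = sigmoid(h @ W + b)                    # hidden feature vector
print(h.shape)                                # final feature vector, here 100-d
```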
feature vector is mentioned in 5 sentences in this paper.
Topics mentioned in this paper:
Sun, Jun and Zhang, Min and Tan, Chew Lim
Bilingual Tree Kernels
In order to compute the dot product of the feature vectors in the exponentially high-dimensional feature space, we introduce the tree kernel functions as follows:
Bilingual Tree Kernels
It is infeasible to explicitly compute the kernel function by expressing the sub-trees as feature vectors.
Introduction
In addition, explicitly utilizing syntactic tree fragments results in exponentially high-dimensional feature vectors, which are hard to compute.
Substructure Spaces for BTKs
The feature vector of the classifier is computed using a composite kernel:
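Since the excerpt omits the formula, the following is a hedged illustration of the general idea rather than the paper's bilingual tree kernels: a Collins-Duffy style subtree kernel counts shared tree fragments implicitly, avoiding the exponential feature vector, and is mixed with a linear kernel over flat features into a composite kernel. The tree encoding and the weight alpha are assumptions.

```python
# Hedged illustration, not the paper's bilingual tree kernels: a Collins-Duffy
# style subtree kernel counts shared fragments implicitly and is mixed with a
# linear kernel into a composite kernel. Tree encoding and alpha are assumptions.
def production(node):
    label, children = node
    return (label, tuple(c[0] for c in children))

def C(n1, n2, lam=0.5):
    """Shared-fragment count rooted at n1 and n2, decayed by lam."""
    if production(n1) != production(n2):
        return 0.0
    if not n1[1]:                       # both nodes are leaves
        return lam
    score = lam
    for c1, c2 in zip(n1[1], n2[1]):
        score *= 1.0 + C(c1, c2, lam)
    return score

def collect(t):
    nodes = [t]
    for c in t[1]:
        nodes.extend(collect(c))
    return nodes

def tree_kernel(t1, t2):
    return sum(C(a, b) for a in collect(t1) for b in collect(t2))

def composite_kernel(t1, t2, x1, x2, alpha=0.6):
    linear = sum(a * b for a, b in zip(x1, x2))
    return alpha * tree_kernel(t1, t2) + (1.0 - alpha) * linear

t1 = ("NP", (("DT", (("the", ()),)), ("NN", (("dog", ()),))))
t2 = ("NP", (("DT", (("the", ()),)), ("NN", (("cat", ()),))))
print(composite_kernel(t1, t2, [1.0, 0.0, 2.0], [0.5, 1.0, 1.0]))
```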
feature vector is mentioned in 4 sentences in this paper.
Topics mentioned in this paper:
Vogel, Adam and Jurafsky, Daniel
Approximate Dynamic Programming
Furthermore, as the size K of the feature vector increases, the space becomes even more difficult to search.
Reinforcement Learning Formulation
Thus, we represent state/action pairs with a feature vector φ(s, a) ∈ R^K.
Reinforcement Learning Formulation
Learning exactly which words influence decision making is difficult; reinforcement learning algorithms have problems with the large, sparse feature vectors common in natural language processing.
Reinforcement Learning Formulation
For a given state s = (u, l, c) and action a = (l', c'), our feature vector φ(s, a) is composed of the following:
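A sketch of the linear function approximation this implies is given below; the feature map is a random placeholder rather than the paper's φ(s, a), and the SARSA-style update is one common choice, not necessarily the authors' learning rule.

```python
# Sketch under assumptions: a linear action-value function Q(s, a) =
# theta . phi(s, a) over a K-dimensional feature vector, with a SARSA-style
# update. phi_features is a random placeholder, not the paper's feature map.
import numpy as np

K = 8
theta = np.zeros(K)

def phi_features(state, action):
    # placeholder sparse binary features derived deterministically from (s, a)
    rng = np.random.default_rng(abs(hash((state, action))) % (2**32))
    return (rng.random(K) < 0.3).astype(float)

def q_value(state, action):
    return float(theta @ phi_features(state, action))

def sarsa_update(s, a, reward, s_next, a_next, alpha=0.1, gamma=0.95):
    global theta
    td_error = reward + gamma * q_value(s_next, a_next) - q_value(s, a)
    theta = theta + alpha * td_error * phi_features(s, a)

sarsa_update("at-corner", "go-left", reward=1.0, s_next="at-lake", a_next="stop")
print(q_value("at-corner", "go-left"))
```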
feature vector is mentioned in 4 sentences in this paper.
Topics mentioned in this paper:
Cheung, Jackie Chi Kit and Penn, Gerald
Introduction
We tabulate the transitions of entities between different syntactic positions (or their nonoccurrence) in sentences, and convert the frequencies of transitions into a feature vector representation of transition probabilities in the document.
Introduction
We solve this problem in a supervised machine learning setting, where the input is the feature vector representations of the two versions of the document, and the output is a binary value indicating the document with the original sentence ordering.
Introduction
Transition length — the maximum length of the transitions used in the feature vector representation of a document.
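As a hedged sketch of this representation, the snippet below converts an entity grid into a vector of length-2 transition probabilities; the role inventory and the toy grid are assumptions.

```python
# Hedged sketch: an entity grid (rows = sentences, columns = entities,
# cells = syntactic role or '-' for non-occurrence) is converted into a
# feature vector of transition probabilities (transition length 2 here).
from collections import Counter
from itertools import product

ROLES = ["S", "O", "X", "-"]           # subject, object, other, absent

def transition_features(grid, length=2):
    counts, total = Counter(), 0
    for column in zip(*grid):          # one column of roles per entity
        for i in range(len(column) - length + 1):
            counts[tuple(column[i:i + length])] += 1
            total += 1
    return [counts[t] / total if total else 0.0
            for t in product(ROLES, repeat=length)]

grid = [["S", "-", "O"],               # 3 sentences x 3 entities
        ["O", "X", "-"],
        ["-", "X", "S"]]
print(transition_features(grid))       # 16-dimensional feature vector
```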
feature vector is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Nishikawa, Hitoshi and Hasegawa, Takaaki and Matsuo, Yoshihiro and Kikui, Genichiro
Experiments
We used a CRF-based Japanese dependency parser (Imamura et al., 2007) for sentiment extraction and a named entity recognizer (Suzuki et al., 2006) for constructing feature vectors for the readability score, respectively.
Optimizing Sentence Sequence
where, given two adjacent sentences s_i and s_{i+1}, w · φ(s_i, s_{i+1}), which measures the connectivity of the two sentences, is the inner product of w and φ(s_i, s_{i+1}); w is a parameter vector and φ(s_i, s_{i+1}) is a feature vector of the two sentences.
Optimizing Sentence Sequence
We also define a feature vector Φ(S) of the entire sequence S = (s_0, s_1, …).
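A minimal sketch of this scoring scheme follows: the score of a sequence is the sum of connectivity scores w · φ(s_i, s_{i+1}) over adjacent sentence pairs. The toy featurizer phi here is an assumption; the real pairwise features come from the linguistic analysis described in the Experiments excerpt.

```python
# Minimal sketch: score a candidate ordering as the sum of connectivity
# scores w . phi(s_i, s_{i+1}) over adjacent pairs. The toy featurizer phi
# is an assumption, not the paper's feature set.
import numpy as np

def phi(s_i, s_j):
    overlap = float(len(set(s_i.split()) & set(s_j.split())))
    connective = 1.0 if s_j.split()[0].lower() in {"however", "also", "then"} else 0.0
    return np.array([overlap, connective])

def sequence_score(sentences, w):
    return sum(float(w @ phi(a, b)) for a, b in zip(sentences, sentences[1:]))

w = np.array([0.5, 1.0])
print(sequence_score(["the service was slow", "however the food was great"], w))
```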
feature vector is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Prettenhofer, Peter and Stein, Benno
Cross-Language Text Classification
In standard text classification, a document d is represented under the bag-of-words model as a |V|-dimensional feature vector x ∈ X, where V, the vocabulary, denotes an ordered set of words, x_i ∈ x denotes the normalized frequency of word i in d, and X is an inner product space.
Cross-Language Text Classification
D_S denotes the training set and comprises tuples of the form (x, y), which associate a feature vector x ∈ X with a class label y ∈ Y.
Experiments
A document d is described as a normalized feature vector x under a unigram bag-of-words document representation.
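A minimal sketch of this bag-of-words representation, assuming a small fixed vocabulary V and simple whitespace tokenization:

```python
# Minimal sketch: map a document to a |V|-dimensional vector of normalized
# unigram frequencies, assuming a fixed vocabulary and whitespace tokens.
from collections import Counter

def bag_of_words(document, vocabulary):
    tokens = document.lower().split()
    counts = Counter(t for t in tokens if t in vocabulary)
    total = sum(counts.values())
    return [counts[w] / total if total else 0.0 for w in vocabulary]

V = ["good", "bad", "book", "excellent"]
print(bag_of_words("An excellent book , a really good book .", V))
# -> [0.25, 0.0, 0.5, 0.25]
```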
feature vector is mentioned in 3 sentences in this paper.
Topics mentioned in this paper:
Wang, WenTing and Su, Jian and Tan, Chew Lim
Incorporating Structural Syntactic Information
Thus, it is computationally infeasible to directly use the feature vector φ(T).
The Recognition Framework
Suppose the training set S consists of labeled vectors {(x_i, y_i)}, where x_i is the feature vector
The Recognition Framework
where α_i is the learned parameter for a feature vector x_i, and b is another parameter which can be derived from α_i.
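For reference, here is a hedged sketch of the generic kernel SVM decision function these parameters enter into, f(x) = Σ_i α_i y_i K(x_i, x) + b; the polynomial kernel and the placeholder values are assumptions, since the paper's own kernel operates over the structural syntactic representations discussed above.

```python
# Hedged sketch of the generic kernel SVM decision function
# f(x) = sum_i alpha_i * y_i * K(x_i, x) + b. The polynomial kernel and the
# placeholder values are assumptions, not the paper's setup.
def poly_kernel(u, v, degree=2, c=1.0):
    return (sum(a * b for a, b in zip(u, v)) + c) ** degree

def decision(x, support_vectors, labels, alphas, b):
    return sum(a * y * poly_kernel(sv, x)
               for sv, y, a in zip(support_vectors, labels, alphas)) + b

svs = [[1.0, 0.0], [0.0, 1.0]]
labels = [+1, -1]
alphas = [0.7, 0.4]
print(decision([0.5, 0.5], svs, labels, alphas, b=0.1))
```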
feature vector is mentioned in 3 sentences in this paper.
Topics mentioned in this paper: