Index of papers in Proc. ACL 2013 that mention
  • feature vector
Yu, Haonan and Siskind, Jeffrey Mark
Detailed Problem Formulation
We use discrete features, namely natural numbers, in our feature vectors, quantized by a binning process.
Detailed Problem Formulation
The length of the feature vector may vary across parts of speech.
Detailed Problem Formulation
Let Nc denote the length of the feature vector for part of speech c, and … denote the time-series …
Introduction
We associate a feature vector with each frame (detection) of each such track.
Introduction
This feature vector can encode image features (including the identity of the particular detector that produced that detection) that correlate with object class; region color, shape, and size features that correlate with object properties; and motion features, such as linear and angular object position, velocity, and acceleration, that correlate with event properties.
Introduction
involves computing the associated feature vector for that HMM over the detections in the tracks chosen to fill its arguments.
The Sentence Tracker
(q1, …, qT) denote the sequence of states qt that leads to an observed track, B(Dt, jt, qt, λ) denote the conditional log probability of observing the feature vector associated with the detection selected by jt among the detections Dt in frame t, given that the HMM is in state qt, and A(qt−1, qt, λ) denote the log transition probability of the HMM.
The Sentence Tracker
We further need to generalize F so that it computes the joint score of a sequence of detections, one for each track, G so that it computes the joint measure of coherence between a sequence of pairs of detections in two adjacent frames, and B so that it computes the joint conditional log probability of observing the feature vectors associated with the sequence of detections selected by jt.
The Sentence Tracker
We further need to generalize B so that it computes the joint conditional log probability of observing the feature vectors for the detections in the tracks that are assigned to the arguments of the HMM for each word in the sentence and A so that it computes the joint log transition probability for the HMMs for all words in the sentence.
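As a toy illustration of the emission term B described above (a minimal sketch, not the authors' implementation; the distributions and bin values are hypothetical), one state's log probability of a quantized feature vector can be computed as follows:

    import math

    # Minimal sketch: emission log-probability of a quantized feature vector
    # under one HMM state, assuming each feature is an independent categorical
    # distribution over bin indices (toy numbers, not the paper's model).
    def emission_log_prob(state_emissions, feature_vector):
        # state_emissions[k][v] = P(feature k takes bin value v | state)
        return sum(math.log(state_emissions[k][v])
                   for k, v in enumerate(feature_vector))

    # B sums such per-word emission log-probabilities over the feature vectors
    # of the detections chosen to fill each word's arguments.
    hmm_state = [{0: 0.7, 1: 0.3}, {0: 0.1, 1: 0.9}]   # two binned features
    print(emission_log_prob(hmm_state, [0, 1]))          # log 0.7 + log 0.9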
feature vector is mentioned in 13 sentences in this paper.
liu, lemao and Watanabe, Taro and Sumita, Eiichiro and Zhao, Tiejun
Abstract
In addition, word embedding is employed as the input to the neural network, which encodes each word as a feature vector.
Introduction
We also integrate word embedding into the model by representing each word as a feature vector (Collobert and Weston, 2008).
Introduction
h = (h1(f, e, d), …, hK(f, e, d))ᵀ is a K-dimensional feature vector defined on the tuple (f, e, d); W = (w1, w2, …, wK)ᵀ is a K-dimensional weight vector of h, i.e., the parameters of the model, which can be tuned with the MERT toolkit (Och, 2003).
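As a minimal sketch of how such a linear model scores a derivation (the feature names and weights below are hypothetical, and MERT tuning is not shown), the score is simply the dot product of W and h:

    # Minimal sketch: score of a translation derivation as the dot product of
    # a K-dimensional feature vector h(f, e, d) with the weight vector W.
    def score(weights, features):
        return sum(weights[name] * value for name, value in features.items())

    W = {"lm": 0.5, "tm": 1.2, "word_penalty": -0.3}   # toy weights
    h = {"lm": -42.7, "tm": -18.1, "word_penalty": 24.0}
    print(score(W, h))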
Introduction
(3) as a function of a feature vector h, i.e.
feature vector is mentioned in 9 sentences in this paper.
Ravi, Sujith
Bayesian MT Decipherment via Hash Sampling
One possible strategy is to compute similarity scores s(wfi, we′) between the current source word feature vector wfi and the feature vectors we′ ∈ Ve for all possible candidates in the target vocabulary.
Bayesian MT Decipherment via Hash Sampling
This makes the complexity far worse (in practice) since the dimensionality of the feature vectors d is a much higher value than … Computing similarity scores alone (naïvely) would incur O(|Ve| · d) time, which is prohibitively huge since we have to do this for every token in the source language corpus.
Feature-based representation for Source and Target
But unlike documents, here each word w is associated with a feature vector w1 … wd (where wi represents the weight for the feature indexed by i), which is constructed from monolingual corpora.
Feature-based representation for Source and Target
Unlike the target word feature vectors (which can be precomputed from the monolingual target corpus), the feature vector for every source word fj is dynamically constructed from the target translation sampled in each training iteration.
Feature-based representation for Source and Target
…), it results in the feature representation becoming more sparse (especially for source feature vectors), which can cause problems in efficiency as well as robustness when computing similarity against other vectors.
Training Algorithm
(a) Generate a proposal distribution by computing the Hamming distance between the feature vectors for the source word and each target translation candidate.
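A hedged sketch of that proposal step, assuming the feature vectors have already been hashed to short binary signatures (the signatures and candidate words below are made up):

    # Minimal sketch: Hamming distance between binary hash signatures of
    # feature vectors, used to rank target translation candidates.
    def hamming(sig_a, sig_b):
        return bin(sig_a ^ sig_b).count("1")

    source_sig = 0b10110010                       # hypothetical 8-bit signature
    candidates = {"haus": 0b10110110, "auto": 0b01001100}
    ranked = sorted(candidates, key=lambda e: hamming(source_sig, candidates[e]))
    print(ranked)    # candidates ordered by increasing Hamming distance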
feature vector is mentioned in 6 sentences in this paper.
Moreno, Jose G. and Dias, Gaël and Cleuziou, Guillaume
Evaluation
Before the clustering process takes place, Web snippets are represented as word feature vectors.
Evaluation
In particular, p is the size of the word feature vectors representing both Web snippets and centroids (p = 2.5), K is the number of clusters to be found (K = 2..10), and S(Wik, Wjl) is the collocation measure integrated in the InfoSimba similarity measure.
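For illustration only (this is not the paper's InfoSimba computation), a size-p word feature vector for a snippet might be built roughly as follows, with plain word frequency standing in for the weighting used in the paper:

    from collections import Counter

    # Minimal sketch: represent a Web snippet by its p most salient words;
    # salience here is raw frequency, a stand-in for the paper's weighting.
    def snippet_vector(snippet, p, stopwords=("the", "a", "of", "is", "in")):
        words = [w.lower() for w in snippet.split() if w.lower() not in stopwords]
        return [w for w, _ in Counter(words).most_common(p)]

    print(snippet_vector("The jaguar is a large cat of the Americas", p=2))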
Introduction
On the other hand, the polythetic approach, whose main idea is to represent Web snippets as word feature vectors, has received less attention, the only relevant work being (Osinski and Weiss, 2005).
Introduction
(1) feature vectors are hard to define in small collections of short text fragments (Timonen, 2013), (2) existing second-order similarity measures such as the cosine are unadapted to capture the semantic similarity between small texts, (3) Latent Semantic Analysis has evidenced inconclusive results (Osinski and Weiss, 2005) and (4) the labeling process is a surprisingly hard extra task (Carpineto et al., 2009).
feature vector is mentioned in 4 sentences in this paper.
Perez-Rosas, Veronica and Mihalcea, Rada and Morency, Louis-Philippe
Discussion
This may be due to the smaller number of feature vectors used in the experiments (only 80, as compared to the 412 used in the previous setup).
Discussion
Another possible reason is the fact that the acoustic and visual modalities are significantly weaker than the linguistic modality, most likely due to the fact that the feature vectors are now speaker-independent, which makes it harder to improve over the linguistic modality alone.
Experiments and Results
In this approach, the features collected from all the multimodal streams are combined into a single feature vector, thus resulting in one vector for each utterance in the dataset, which is used to make a decision about the sentiment orientation of the utterance.
Multimodal Sentiment Analysis
The features are averaged over all the frames in an utterance, to obtain one feature vector for each utterance.
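A minimal sketch of this feature-level fusion, assuming per-frame acoustic and visual features plus one linguistic vector per utterance (the dimensions below are hypothetical, not the authors' pipeline):

    import numpy as np

    # Minimal sketch: average frame-level features within an utterance, then
    # concatenate the modalities into a single utterance-level feature vector.
    def utterance_vector(acoustic_frames, visual_frames, linguistic_vec):
        acoustic = np.mean(acoustic_frames, axis=0)    # one vector per modality
        visual = np.mean(visual_frames, axis=0)
        return np.concatenate([linguistic_vec, acoustic, visual])

    # Toy dimensions: 3 linguistic, 5 acoustic, 4 visual features, 20 frames.
    vec = utterance_vector(np.random.rand(20, 5), np.random.rand(20, 4), np.random.rand(3))
    print(vec.shape)    # (12,)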
feature vector is mentioned in 4 sentences in this paper.
Rokhlenko, Oleg and Szpektor, Idan
Online Question Generation
We next describe the various features we extract for every entity and the supervised models that, given this feature vector representation, assess the correctness of an instantiation.
Online Question Generation
The feature vector of each named entity was induced as described in Section 4.2.1.
Online Question Generation
To generate features for a candidate pair, we take the two feature vectors of the two entities and induce families of pair features by comparing the two vectors.
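A hedged sketch of such pair-feature induction; the specific comparison operations below are illustrative assumptions, not the paper's feature families:

    # Minimal sketch: induce pair features by comparing two entity feature
    # vectors element-wise (the chosen comparisons are illustrative only).
    def pair_features(vec_a, vec_b):
        feats = {}
        for name in vec_a.keys() & vec_b.keys():
            feats["diff_" + name] = abs(vec_a[name] - vec_b[name])
            feats["min_" + name] = min(vec_a[name], vec_b[name])
        return feats

    e1 = {"freq": 0.8, "caps": 1.0}   # toy entity feature vectors
    e2 = {"freq": 0.3, "caps": 1.0}
    print(pair_features(e1, e2))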
feature vector is mentioned in 4 sentences in this paper.
Wang, Zhigang and Li, Zhixing and Li, Juanzi and Tang, Jie and Z. Pan, Jeff
Our Approach
Then, we use a uniform automatic method, which primarily consists of word labeling and feature vector generation, to generate the training data set TD = {…} from these collected articles.
Our Approach
(1) After the word labeling, each instance (word/token) is represented as a feature vector.
Our Approach
First, we turn the text into a word sequence and compute the feature vector for each word based on the feature definition in Section 3.1.
Preliminaries
… can be represented as a feature vector according to its context.
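A minimal sketch of such context-based word feature vectors; the feature templates below are illustrative and not the paper's Section 3.1 definition:

    # Minimal sketch: represent each word by features of its local context.
    def word_features(words, i):
        return {
            "word": words[i],
            "prev": words[i - 1] if i > 0 else "<s>",
            "next": words[i + 1] if i < len(words) - 1 else "</s>",
            "is_digit": words[i].isdigit(),
        }

    sentence = "Barack Obama was born in 1961".split()
    print([word_features(sentence, i) for i in range(len(sentence))][0])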
feature vector is mentioned in 4 sentences in this paper.
Yih, Wen-tau and Chang, Ming-Wei and Meek, Christopher and Pastusiak, Andrzej
Learning QA Matching Models
Given a word pair (wq, ws), where wq ∈ Vq and ws ∈ Vs, feature functions φ1, …, φd map it to a d-dimensional real-valued feature vector.
Learning QA Matching Models
We consider two aggregate functions for defining the feature vectors of the whole question/answer pair: average and max.
Learning QA Matching Models
Φmax,j(q, s) = … max_{ws ∈ Vs} φj(wq, ws) (2). Together, each question/sentence pair is represented by a 2d-dimensional feature vector.
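A simplified sketch of avg/max aggregation over word-pair feature vectors (the paper's exact aggregation may differ in its normalization; the data below is random):

    import numpy as np

    # Minimal sketch: aggregate per-word-pair feature vectors phi_j(wq, ws)
    # into a 2d-dimensional question/sentence representation via average and max.
    def aggregate(pair_features):
        # pair_features: array of shape (num_pairs, d) holding phi values
        return np.concatenate([pair_features.mean(axis=0),
                               pair_features.max(axis=0)])   # shape (2d,)

    pairs = np.random.rand(6, 4)       # hypothetical: 6 word pairs, d = 4
    print(aggregate(pairs).shape)      # (8,)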
feature vector is mentioned in 4 sentences in this paper.
Zhu, Muhua and Zhang, Yue and Chen, Wenliang and Zhang, Min and Zhu, Jingbo
Baseline parser
Here Φ(ai) represents the feature vector for the i-th action ai in state item α.
Improved hypotheses comparison
The significant variance in the number of actions N can have an impact on the linear separability of state items, for which the feature vectors are Σ_{i=1}^{N} Φ(ai).
Improved hypotheses comparison
A feature vector is extracted for the IDLE action according to the final state context, in the same way as other actions.
Improved hypotheses comparison
corresponding feature vectors have about the same sizes, and are more linearly separable.
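A hedged sketch of this idea (not the parser's code): sum per-action feature vectors for a state item, padding shorter action sequences with IDLE so the summed vectors have comparable sizes:

    from collections import Counter

    # Minimal sketch: a state item's feature vector is the sum of per-action
    # feature vectors Phi(a_i); shorter action sequences are padded with IDLE.
    def state_feature_vector(actions, max_len, extract):
        padded = actions + ["IDLE"] * (max_len - len(actions))
        total = Counter()
        for a in padded:
            total.update(extract(a))          # sparse feature counts
        return total

    def action_features(a):                   # hypothetical feature template
        return {"action=" + a: 1}

    print(state_feature_vector(["SHIFT", "REDUCE"], 4, action_features))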
feature vector is mentioned in 4 sentences in this paper.
Choi, Jinho D. and McCallum, Andrew
Selectional branching
For each parsing state sij, a prediction is made by generating a feature vector xij ∈ X, feeding it into a classifier C1 that uses a feature map Φ(x, y) and a weight vector w to measure a score for each label y ∈ Y, and choosing the label with the highest score.
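A minimal sketch of that prediction step (not the paper's implementation; the features, labels, and weights below are toy values):

    # Minimal sketch: score each candidate label with a weight vector over a
    # joint feature map Phi(x, y) and pick the highest-scoring label.
    def predict(weights, feature_vector, labels):
        def score(y):
            return sum(weights.get((f, y), 0.0) * v for f, v in feature_vector.items())
        return max(labels, key=score)

    w = {("stack=NN", "SHIFT"): 0.4, ("stack=NN", "LEFT-ARC"): 0.1}  # toy weights
    x = {"stack=NN": 1.0}
    print(predict(w, x, ["SHIFT", "LEFT-ARC", "RIGHT-ARC"]))          # SHIFT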
Selectional branching
During training, a training instance is generated for each parsing state sij by taking a feature vector xij and its true label yij.
Selectional branching
Then, a subgradient is measured by taking all feature vectors together weighted by Q (line 6).
feature vector is mentioned in 3 sentences in this paper.
Cohn, Trevor and Specia, Lucia
Gaussian Process Regression
In our regression task, the data consists of n pairs D = {(xi, yi)}, where xi ∈ R^F is an F-dimensional feature vector and yi ∈ R is the response variable.
Gaussian Process Regression
Each instance is a translation and the feature vector encodes its linguistic features; the response variable is a numerical quality judgement: post-editing time or Likert score.
Gaussian Process Regression
GP regression assumes the presence of a latent function, f : R^F → R, which maps from the input space of feature vectors x to a scalar.
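One way to fit such a model with off-the-shelf tools, as a hedged sketch rather than the paper's setup (the feature vectors and quality scores below are random placeholders):

    import numpy as np
    from sklearn.gaussian_process import GaussianProcessRegressor
    from sklearn.gaussian_process.kernels import RBF

    # Minimal sketch: GP regression from F-dimensional feature vectors x to a
    # scalar quality judgement y, with an RBF kernel.
    X = np.random.rand(20, 5)                 # 20 translations, F = 5 features
    y = np.random.rand(20)                    # hypothetical post-editing times
    gp = GaussianProcessRegressor(kernel=RBF(length_scale=1.0)).fit(X, y)
    mean, std = gp.predict(np.random.rand(3, 5), return_std=True)
    print(mean, std)                          # predictive mean and uncertainty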
feature vector is mentioned in 3 sentences in this paper.
Guinaudeau, Camille and Strube, Michael
Method
A fundamental assumption underlying our model is that this bipartite graph contains the entity transition information needed for local coherence computation, rendering feature vectors and a learning phase unnecessary.
The Entity Grid Model
To make this representation accessible to machine learning algorithms, Barzilay and Lapata (2008) compute for each document the probability of each transition and generate feature vectors representing the sentences.
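A minimal sketch of that transition-probability computation over a toy entity grid (rows are sentences, columns are entities, with roles S, O, X, and "-" for absent); this is an illustration, not the authors' code:

    from collections import Counter
    from itertools import product

    # Minimal sketch: probability of each length-2 entity-role transition in an
    # entity grid, giving one feature vector per document.
    def transition_features(grid, roles=("S", "O", "X", "-")):
        counts = Counter()
        for col in zip(*grid):                      # one column per entity
            for a, b in zip(col, col[1:]):
                counts[(a, b)] += 1
        total = sum(counts.values()) or 1
        return {t: counts[t] / total for t in product(roles, repeat=2)}

    grid = [["S", "-"], ["O", "X"], ["-", "X"]]      # toy 3-sentence grid
    print(transition_features(grid)[("S", "O")])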
The Entity Grid Model
(2011) use discourse relations to transform the entity grid representation into a discourse role matrix that is used to generate feature vectors for machine learning algorithms similarly to Barzilay and Lapata (2008).
feature vector is mentioned in 3 sentences in this paper.