Inference | For each of the remaining feature vectors in …
Inference | Mixture ID (m_t) For each feature vector in a segment, given the cluster label c and the hidden state index s_t, the derivation of the conditional posterior probability of its mixture ID is straightforward:
Inference | where m_{c,s} is the set of mixture IDs of feature vectors that belong to state s of HMM c. The m-th entry of β′ is β + Σ_{m_t ∈ m_{c,s}} δ(m_t, m), where we use δ(·,·) to denote the Kronecker delta function.
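Concretely, this describes a standard Gibbs step: the conditional posterior over mixture IDs is proportional to a Dirichlet-smoothed count of how often each mixture is currently used in state s of HMM c, times the Gaussian likelihood of x_t under that mixture. Below is a minimal numpy/scipy sketch of such a step; the function name, data layout, and exact prior/likelihood forms are assumptions for illustration, not the paper's precise update.

```python
import numpy as np
from scipy.stats import multivariate_normal

def sample_mixture_id(x_t, mu, cov, counts, beta, rng):
    """Sample a mixture ID for feature vector x_t given the current
    assignments in its (cluster, state) cell.

    counts[m] is the number of other feature vectors in state s of HMM c
    currently assigned to mixture m; beta is the symmetric Dirichlet
    hyperparameter on the mixture weights.
    """
    # Prior term: Dirichlet-smoothed usage counts of each mixture component.
    prior = counts + beta
    # Likelihood term: Gaussian density of x_t under each mixture component.
    lik = np.array([multivariate_normal.pdf(x_t, mean=mu[m], cov=cov[m])
                    for m in range(len(counts))])
    post = prior * lik
    post /= post.sum()
    return rng.choice(len(counts), p=post)

# Toy usage with 3 mixture components in a 2-D feature space.
rng = np.random.default_rng(0)
mu = [np.zeros(2), np.ones(2), -np.ones(2)]
cov = [np.eye(2)] * 3
counts = np.array([4.0, 1.0, 0.0])
m_t = sample_mixture_id(np.array([0.9, 1.1]), mu, cov, counts, beta=0.5, rng=rng)
```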
Model | Given the cluster label, choose a hidden state for each feature vector x_t in the segment.
Model | Use the chosen Gaussian mixture to generate the observed feature vector x_t.
Model | Fig. 2, where the shaded circle denotes the observed feature vectors, and the squares denote the hyperparameters of the priors used in our model.
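The generative steps listed above (pick a hidden state per frame from the cluster's HMM, pick a Gaussian mixture within that state, then emit the feature vector) can be sketched as follows; the parameterization and the 13-dimensional features are placeholders, not the paper's actual priors or acoustic front end.

```python
import numpy as np

def generate_segment(length, pi, trans, weights, mu, cov, rng):
    """Sketch of the generative story for one segment, given a cluster label:
    walk the cluster's HMM to pick a hidden state for each frame, pick a
    Gaussian mixture component within that state, then emit the feature
    vector from that Gaussian."""
    feats, states, mixtures = [], [], []
    s = rng.choice(len(pi), p=pi)                        # initial hidden state
    for _ in range(length):
        m = rng.choice(weights.shape[1], p=weights[s])   # mixture ID within state s
        x = rng.multivariate_normal(mu[s][m], cov[s][m]) # observed feature vector
        states.append(s); mixtures.append(m); feats.append(x)
        s = rng.choice(len(pi), p=trans[s])              # transition to the next state
    return np.array(feats), states, mixtures

# Toy 3-state HMM with 2 Gaussian mixtures per state and 13-D features.
rng = np.random.default_rng(0)
pi = np.array([1.0, 0.0, 0.0])
trans = np.array([[0.6, 0.4, 0.0], [0.0, 0.6, 0.4], [0.0, 0.0, 1.0]])
weights = np.full((3, 2), 0.5)
mu = np.zeros((3, 2, 13))
cov = np.tile(np.eye(13), (3, 2, 1, 1))
X, s_seq, m_seq = generate_segment(20, pi, trans, weights, mu, cov, rng)
```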
Problem Formulation | Fig. 1 illustrates how the speech signal of a single-word utterance banana is converted to a sequence of feature vectors x_1 to x_n.
Problem Formulation | Segment (p_{j,k}) We define a segment to be composed of feature vectors between two boundary frames.
Problem Formulation | Hidden State (s_t) Since we assume the observed data are generated by HMMs, each feature vector, x_t, has an associated hidden state index.
Distributional semantic models | From every image in a dataset, relevant areas are identified and a low-level feature vector (called a “descriptor”) is built to represent each area. |
Distributional semantic models | Now, given a new image, the nearest visual word is identified for each descriptor extracted from it, such that the image can be represented as a BoVW feature vector, by counting the instances of each visual word in the image (note that an occurrence of a low-level descriptor vector in an image, after mapping to the nearest cluster, will increment the count of a single dimension of the higher-level BoVW vector).
Distributional semantic models | We extract descriptor features of two types. First, the standard Scale-Invariant Feature Transform (SIFT) feature vectors (Lowe, 1999; Lowe, 2004), good at characterizing parts of objects.
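The BoVW construction described here is easy to make concrete: cluster the training descriptors into K visual words, then represent each image by counting, per descriptor, its nearest word. A small scikit-learn sketch follows; random vectors stand in for real SIFT descriptors, and K = 50 is an arbitrary choice.

```python
import numpy as np
from sklearn.cluster import KMeans

# Build the visual-word codebook: cluster all low-level descriptors from the
# training images into K "visual words" (random 128-D vectors stand in for
# real SIFT descriptors, which would come from an image library).
rng = np.random.default_rng(0)
train_descriptors = rng.normal(size=(5000, 128))
K = 50
codebook = KMeans(n_clusters=K, n_init=10, random_state=0).fit(train_descriptors)

def bovw_vector(image_descriptors, codebook, K):
    """Map each descriptor to its nearest visual word and count occurrences,
    yielding the image's bag-of-visual-words feature vector."""
    words = codebook.predict(image_descriptors)
    return np.bincount(words, minlength=K)

# Represent a new image by its K-dimensional BoVW count vector.
new_image_descriptors = rng.normal(size=(300, 128))
bovw = bovw_vector(new_image_descriptors, codebook, K)
```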
Inferring a learning curve from mostly monolingual data | The feature vector φ consists of the following features:
Inferring a learning curve from mostly monolingual data | We construct the design matrix Φ with one column for each feature vector φ_{c,t} corresponding to each combination of training configuration c and test set t.
Inferring a learning curve from mostly monolingual data | For a new unseen configuration with feature vector φ_u, we determine the parameters θ_u of the corresponding learning curve as:
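One way to realize this step is a regularized multi-output regression from configuration features to fitted curve parameters. The sketch below uses ridge regression and random placeholder data; it is one plausible instantiation, not necessarily the estimator used in the paper.

```python
import numpy as np
from sklearn.linear_model import Ridge

# Each observed training configuration c (paired with a test set t) has a
# feature vector phi_ct and a fitted learning-curve parameter vector
# theta_ct (e.g. the parameters of a curve fitted to its BLEU scores).
Phi = np.random.default_rng(0).normal(size=(40, 6))    # 40 configurations, 6 features each
Theta = np.random.default_rng(1).normal(size=(40, 3))  # 3 curve parameters per configuration

# Learn a mapping from configuration features to curve parameters.
model = Ridge(alpha=1.0).fit(Phi, Theta)

# For a new, unseen configuration with feature vector phi_u, predict the
# parameters theta_u of its learning curve.
phi_u = np.random.default_rng(2).normal(size=(1, 6))
theta_u = model.predict(phi_u)[0]
```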
Experiments | Algorithms 2 and 3 were infeasible to run on Europarl data beyond one epoch because feature vectors grew too large to be kept in memory.
Introduction | The simple but effective idea is to randomly divide training data into evenly sized shards, use stochastic learning on each shard in parallel, while performing ℓ1/ℓ2 regularization for joint feature selection on the shards after each epoch, before starting a new epoch with a reduced feature vector averaged across shards.
Joint Feature Selection in Distributed Stochastic Learning | Let each translation candidate be represented by a feature vector x ∈ R^D, where preference pairs for training are prepared by sorting translations according to smoothed sentence-wise BLEU score (Liang et al., 2006a) against the reference.
Joint Feature Selection in Distributed Stochastic Learning | Parameter mixing by averaging will help to ease the feature sparsity problem; however, keeping feature vectors on the scale of several million features in memory can be prohibitive.
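A compact sketch of the epoch-level step these passages describe, with the ℓ1/ℓ2 penalty realized as hard top-k selection on per-feature ℓ2 norms across shards; the function name and the top-k variant are illustrative choices, and other thresholding schemes fit the same template.

```python
import numpy as np

def l1l2_select_and_mix(shard_weights, k):
    """After an epoch of parallel stochastic updates, perform joint feature
    selection across shards and return the reduced, averaged weight vector.

    shard_weights: array of shape (num_shards, num_features), one weight
    vector per shard. Features are ranked by the l2 norm of their weights
    across shards (the l1/l2 group criterion); all but the top-k are zeroed,
    which shrinks the feature vector carried into the next epoch.
    """
    W = np.asarray(shard_weights)
    group_norms = np.linalg.norm(W, axis=0)      # l2 norm per feature across shards
    keep = np.argsort(group_norms)[::-1][:k]     # indices of the k strongest features
    mask = np.zeros(W.shape[1], dtype=bool)
    mask[keep] = True
    mixed = W.mean(axis=0)                       # parameter mixing by averaging
    mixed[~mask] = 0.0                           # joint feature selection
    return mixed, mask

# Toy usage: 4 shards, 10 features, keep the 3 jointly strongest features.
rng = np.random.default_rng(0)
mixed_w, selected = l1l2_select_and_mix(rng.normal(size=(4, 10)), k=3)
```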
Estimating the Tensor Model | We assume a function ψ that maps outside trees o to feature vectors ψ(o) ∈ R^{d′}.
Estimating the Tensor Model | For example, the feature vector might track the rule directly above the node in question, the word following the node in question, and so on. |
Estimating the Tensor Model | We also assume a function φ that maps inside trees t to feature vectors φ(t) ∈ R^d.
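As an illustration of such feature functions, the sketch below builds sparse indicator vectors for inside and outside trees via feature hashing; the tree encodings, dimensions, and the hashing trick are assumptions made for the example, not the construction used in the paper.

```python
import numpy as np

D_IN, D_OUT = 1024, 1024  # dimensions d and d' for the two feature spaces

def phi(inside_tree):
    """Indicator features for an inside tree t, e.g. the rule at its root.
    inside_tree: tuple with the root rule first, e.g. ('NP -> DT NN', ...)."""
    v = np.zeros(D_IN)
    v[hash(('root_rule', inside_tree[0])) % D_IN] = 1.0
    return v

def psi(outside_tree):
    """Indicator features for an outside tree o: the rule directly above the
    node in question and the word following it (as suggested above).
    outside_tree: dict like {'rule_above': ..., 'next_word': ...}."""
    v = np.zeros(D_OUT)
    v[hash(('rule_above', outside_tree['rule_above'])) % D_OUT] = 1.0
    v[hash(('next_word', outside_tree['next_word'])) % D_OUT] = 1.0
    return v

x = phi(('NP -> DT NN',))
y = psi({'rule_above': 'S -> NP VP', 'next_word': 'saw'})
```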