Index of papers in Proc. ACL 2014 that mention
  • feature space
Bollegala, Danushka and Weir, David and Carroll, John
Distribution Prediction
To reduce the dimensionality of the feature space and create dense representations for words, we perform SVD on F. We use the left singular vectors corresponding to the k largest singular values to compute a rank-k approximation of F. We perform truncated SVD using SVDLIBC.
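A minimal sketch of this rank-k SVD step, assuming F is a sparse word-by-context co-occurrence matrix; the excerpt says the paper uses SVDLIBC, while scipy is used here purely for illustration and the function name is hypothetical:

import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.linalg import svds

def latent_word_space(F, k=100):
    """Dense k-dimensional word representations from a word-by-context matrix F."""
    F = csr_matrix(F, dtype=np.float64)
    U, S, Vt = svds(F, k=k)          # k largest singular triplets (ascending order)
    order = np.argsort(-S)           # reorder so the largest singular value comes first
    U, S = U[:, order], S[order]
    return U * S                     # row i: word i in the k-dimensional latent space
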
Distribution Prediction
Each row in the resulting matrix is considered as representing a word in a lower k-dimensional (k << n_c) feature space corresponding to a particular domain.
Distribution Prediction
Distribution prediction in this lower-dimensional feature space is preferable to prediction over the original feature space because there are reductions in overfitting, feature sparseness, and learning time.
Domain Adaptation
increased the training time due to the larger feature space.
Experiments and Results
Therefore, when the overlap between the vocabularies used in the source and the target domains is small, fired cannot reduce the mismatch between the feature spaces.
Experiments and Results
All methods are evaluated under the same settings, including train/test split, feature spaces, pivots, and classification algorithms, so that any differences in performance can be directly attributable to their domain adaptability.
Introduction
latent feature spaces separately for the source and the target domains using Singular Value Decomposition (SVD).
Introduction
Second, we learn a mapping from the source domain latent feature space to the target domain latent feature space using Partial Least Square Regression (PLSR).
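A hedged sketch of this second step, assuming the two latent spaces come from the SVD step above and that scikit-learn's PLSRegression stands in for the paper's PLSR implementation; the alignment of rows on a shared vocabulary and all names here are illustrative assumptions:

from sklearn.cross_decomposition import PLSRegression

def learn_distribution_predictor(X_src, X_tgt, n_components=50):
    """Fit a PLSR mapping from source-domain to target-domain latent vectors.

    X_src, X_tgt: (n_words, k) latent representations of words that occur in both
    domains, aligned row by row (an assumption about how training pairs are built).
    """
    pls = PLSRegression(n_components=n_components)
    pls.fit(X_src, X_tgt)
    return pls

# Predicting the target-domain representation of one source-domain word vector:
# x_tgt_hat = predictor.predict(x_src.reshape(1, -1))
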
Introduction
The SVD smoothing in the first step reduces both the data sparseness in distributional representations of individual words and the dimensionality of the feature space, thereby enabling us to efficiently and accurately learn a prediction model using PLSR in the second step.
Because the dimensionality of the source and target domain feature spaces is equal to k, the complexity of the least squares regression problem increases with k. Therefore, larger k values result in overfitting to the train data and classification accuracy is reduced on the target test data.
feature space is mentioned in 11 sentences in this paper.
Chen, Yanping and Zheng, Qinghua and Zhang, Wei
Conclusion
The size of the employed lexicon determines the dimension of the feature space.
Discussion
Combining two head nouns may increase the feature space
Discussion
Such a large feature space makes the occurrence of features close to a random distribution, leading to worse data sparseness.
Feature Construction
Because the number of lexicon entries determines the dimension of the feature space, the performance of the Omni-word feature is influenced by the lexicon being employed.
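A minimal illustration of this point, with a hypothetical three-entry lexicon and simple substring matching; the dimension of the feature vector equals the number of lexicon entries:

def lexicon_features(text, lexicon):
    """Binary feature vector over the lexicon: 1 if an entry occurs in the text."""
    return [1 if entry in text else 0 for entry in lexicon]

lexicon = ["北京", "大学", "北京大学"]                   # 3 entries -> 3-dimensional feature space
print(lexicon_features("北京大学位于北京", lexicon))     # [1, 1, 1]
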
Related Work
(2010) proposed a model handling the high-dimensional feature space.
feature space is mentioned in 5 sentences in this paper.
Wang, William Yang and Hua, Zhenhao
Copula Models for Text Regression
By doing this, we are essentially performing the probability integral transform, an important statistical technique that moves beyond the count-based bag-of-words feature space to the space of marginal cumulative density functions.
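A rough sketch of such a probability integral transform, replacing each raw count feature with its empirical marginal CDF value so that all features live on a comparable [0, 1] scale; the empirical-CDF estimator here is an illustrative assumption, not necessarily the paper's exact choice:

import numpy as np

def probability_integral_transform(X):
    """Map each column of a (documents x features) count matrix to empirical CDF values."""
    n, d = X.shape
    U = np.empty((n, d), dtype=np.float64)
    for j in range(d):
        col = np.asarray(X[:, j], dtype=np.float64)
        # empirical CDF: fraction of documents with a value <= this document's value
        U[:, j] = np.searchsorted(np.sort(col), col, side="right") / n
    return U
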
Discussions
By applying the Probability Integral Transform to raw features in the copula model, we essentially avoid comparing apples and oranges in the feature space, which is a common problem in bag-of-features models in NLP.
Experiments
model over the squared-loss linear regression model increase when working with larger feature spaces.
Related Work
For example, when bag-of-word unigrams are present in the feature space, it is easier if one does not explicitly model the stochastic dependencies among the words, even though doing so might hurt the predictive power, while the variance from the correlations among the random variables is not explained.
feature space is mentioned in 4 sentences in this paper.
Kim, Seokhwan and Banchs, Rafael E. and Li, Haizhou
Wikipedia-based Composite Kernel for Dialog Topic Tracking
Since our hypothesis is that the more similar the dialog histories of the two inputs are, the more similar aspects of topic transitions occur for them, we propose a subsequence kernel (Lodhi et al., 2002) to map the data into a new feature space defined based on the similarity of each pair of history sequences as follows:
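A generic gap-weighted subsequence kernel in the style of Lodhi et al. (2002), sketched here over sequences of topic labels; the decay parameter, subsequence length, and example inputs are illustrative assumptions rather than the paper's exact formulation:

def subsequence_kernel(s, t, p=2, lam=0.5):
    """Gap-weighted count of common (non-contiguous) subsequences of length p."""
    n, m = len(s), len(t)
    # Kprime[i][j] holds K'_l(s[:i], t[:j]); K'_0 = 1 everywhere.
    Kprime = [[1.0] * (m + 1) for _ in range(n + 1)]
    for _ in range(1, p):                          # build K'_1 .. K'_{p-1}
        Kpp = [[0.0] * (m + 1) for _ in range(n + 1)]
        Knew = [[0.0] * (m + 1) for _ in range(n + 1)]
        for i in range(1, n + 1):
            for j in range(1, m + 1):
                Kpp[i][j] = lam * Kpp[i][j - 1]
                if s[i - 1] == t[j - 1]:
                    Kpp[i][j] += lam * lam * Kprime[i - 1][j - 1]
                Knew[i][j] = lam * Knew[i - 1][j] + Kpp[i][j]
        Kprime = Knew
    # final accumulation over matching positions
    return sum(lam * lam * Kprime[i - 1][j - 1]
               for i in range(1, n + 1) for j in range(1, m + 1)
               if s[i - 1] == t[j - 1])

# Hypothetical dialog histories as topic-label sequences:
# subsequence_kernel(["OPENING", "FOOD", "PAYMENT"], ["OPENING", "TRANSPORT", "PAYMENT"])
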
Wikipedia-based Composite Kernel for Dialog Topic Tracking
The other kernel incorporates various types of domain knowledge obtained from Wikipedia into the feature space.
Wikipedia-based Composite Kernel for Dialog Topic Tracking
Since this constructed tree structure represents semantic, discourse, and structural information extracted from the Wikipedia paragraphs similar to each given instance, we can explore these enriched features to build the topic tracking model using a subset tree kernel (Collins and Duffy, 2002), which computes the similarity between each pair of trees in the feature space as follows:
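A hedged sketch of a subset tree kernel in the style of Collins and Duffy (2002), over trees encoded as nested (label, child, ...) tuples; the encoding, decay value, and example fragments are assumptions, not the paper's exact trees built from Wikipedia:

def _nodes(tree):
    """Yield every internal node of a tree encoded as (label, child, child, ...)."""
    if isinstance(tree, tuple):
        yield tree
        for child in tree[1:]:
            yield from _nodes(child)

def _production(node):
    return (node[0], tuple(c[0] if isinstance(c, tuple) else c for c in node[1:]))

def _common_subtrees(n1, n2, lam):
    if _production(n1) != _production(n2):
        return 0.0
    if all(not isinstance(c, tuple) for c in n1[1:]):   # pre-terminal node
        return lam
    result = lam
    for c1, c2 in zip(n1[1:], n2[1:]):
        if isinstance(c1, tuple) and isinstance(c2, tuple):
            result *= 1.0 + _common_subtrees(c1, c2, lam)
    return result

def subset_tree_kernel(t1, t2, lam=0.4):
    """Sum the weighted count of common subset trees over all node pairs."""
    return sum(_common_subtrees(a, b, lam) for a in _nodes(t1) for b in _nodes(t2))

# Hypothetical tree fragments:
# t1 = ("NP", ("D", "the"), ("N", "singer"))
# t2 = ("NP", ("D", "the"), ("N", "album"))
# subset_tree_kernel(t1, t2)
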
feature space is mentioned in 3 sentences in this paper.
Li, Jiwei and Ritter, Alan and Hovy, Eduard
Conclusion and Future Work
Another direction involves incorporating a richer feature space for better inference performance, such as multimedia sources (i.e.
Experiments
We evaluate the settings described in Section 4.2, i.e., the GLOBAL setting, where user-level attributes are predicted directly from the joint feature space, and the LOCAL setting, where user-level predictions are made based on tweet-level predictions, along with the different inference approaches described in Section 4.4, i.e.
Experiments
This can be explained by the fact that LOCAL(U) sets the user-level label to 1 once one posting is identified as attribute related, while GLOBAL tends to be more meticulous by considering the conjunctive feature space from all postings.
feature space is mentioned in 3 sentences in this paper.