Abstract | In effect, our approach finds an optimal feature space (derived from a base feature set and an indicator set) for discriminating coreferential mention pairs.
Abstract | Although our approach explores a very large space of possible feature spaces, it remains tractable by exploiting the structure of the hierarchies built from the indicators.
Introduction | It is worth noting that, from a machine learning point of view, this is related to feature extraction in that both approaches in effect recast the pairwise classification problem in higher-dimensional feature spaces.
Introduction | We will see that this is also equivalent to using the data to select a single, suitably large feature space.
Modeling pairs | Given a document, the number of mentions is fixed and each pair of mentions follows a certain distribution (that we partly observe in a feature space).
Modeling pairs | 2.2 Feature spaces
Modeling pairs | 2.2.1 Definitions
Modeling pairs | that casts pairs into a feature space F through which we observe them. |
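Modeling pairs | To make this concrete, below is a minimal sketch of a hypothetical pairwise feature function casting a mention pair into a sparse binary feature space F; the mention attributes (head, gender, index) are illustrative assumptions, not the paper's actual feature set.

```python
def pair_features(m1, m2):
    """Hypothetical feature map phi(m1, m2): casts a mention pair into a
    sparse binary feature space (feature name -> 0/1 indicator)."""
    return {
        f"head_match={m1['head'] == m2['head']}": 1,
        f"gender_agree={m1['gender'] == m2['gender']}": 1,
        # bucket the mention distance so the feature space stays finite
        f"dist_bucket={min((m2['index'] - m1['index']) // 5, 4)}": 1,
    }

# example: two mentions from a toy document
m1 = {"head": "company", "gender": "neuter", "index": 3}
m2 = {"head": "company", "gender": "neuter", "index": 11}
print(pair_features(m1, m2))
```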
Conclusions and Future Work | We have introduced RM, a novel online margin-based algorithm designed for optimizing high-dimensional feature spaces, which adds constraints to a large-margin optimizer that bound the spread of the data's projection while maximizing the margin.
Conclusions and Future Work | Experimentation in statistical MT yielded significant improvements over several other state-of-the-art optimizers, especially in a high-dimensional feature space (up to 2 BLEU and 4.3 TER on average). |
Introduction | However, as the dimension of the feature space increases, generalization becomes increasingly difficult. |
Introduction | This criterion performs well in practice at finding a linear separator in high-dimensional feature spaces (Tsochantaridis et al., 2004; Crammer et al., 2006). |
Introduction | Chinese-English translation experiments show that our algorithm, RM, significantly outperforms strong state-of-the-art optimizers in both a basic feature setting and a high-dimensional (sparse) feature space (§4).
Learning in SMT | Online large-margin algorithms, such as MIRA, have also gained prominence in SMT, thanks to their ability to learn models in high-dimensional feature spaces (Watanabe et al., 2007; Chiang et al., 2009). |
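Learning in SMT | For reference, a minimal sketch of the MIRA-style passive-aggressive update these optimizers build on; this is the generic PA-I step of Crammer et al. (2006), not the RM algorithm itself, and `phi_gold`, `phi_pred`, and `loss` are assumed to come from the decoder.

```python
import numpy as np

def mira_update(w, phi_gold, phi_pred, loss, C=0.01):
    """One passive-aggressive (MIRA-style) step: move w just enough that
    the gold hypothesis outscores the prediction by a margin of `loss`,
    with the step size clipped at the aggressiveness parameter C."""
    diff = phi_gold - phi_pred      # feature-vector difference (numpy array)
    norm_sq = diff @ diff
    if norm_sq == 0.0:
        return w                    # identical feature vectors: nothing to do
    margin = w @ diff               # current score gap, gold minus prediction
    tau = min(C, max(0.0, (loss - margin) / norm_sq))
    return w + tau * diff
```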
Cue Discovery for Content Selection | Our feature space X = {x1, x2, ...}
Cue Discovery for Content Selection | We search only top candidates for efficiency, following the fixed-width search methodology for feature selection in very high-dimensional feature spaces (Gütlein et al., 2009).
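Cue Discovery for Content Selection | A sketch of the fixed-width idea under an assumed subset scorer `evaluate` (the procedure of Gütlein et al. differs in its details): the beam caps how many candidate subsets survive each expansion step.

```python
def beam_feature_search(features, evaluate, beam_width=10, max_size=20):
    """Fixed-width forward selection: extend every subset in the beam by
    one feature, keep only the beam_width best-scoring candidates, and
    track the best subset seen overall."""
    beam = [frozenset()]
    best, best_score = frozenset(), float("-inf")
    for _ in range(max_size):
        candidates = {s | {f} for s in beam for f in features if f not in s}
        if not candidates:
            break
        scored = sorted(((evaluate(s), s) for s in candidates),
                        key=lambda p: p[0], reverse=True)
        beam = [s for _, s in scored[:beam_width]]
        if scored[0][0] > best_score:
            best_score, best = scored[0]
    return best
```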
Experimental Results | We use a binary unigram feature space, and we perform 7-fold cross-validation.
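Experimental Results | A minimal scikit-learn sketch of this setup; the toy corpus and the choice of a linear SVM are assumptions for illustration.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.model_selection import cross_val_score
from sklearn.svm import LinearSVC

# toy stand-ins for the real corpus (7 examples per class so that
# stratified 7-fold cross-validation is well defined)
texts = ["great movie", "terrible plot", "loved it", "awful acting",
         "superb cast", "boring film", "wonderful story", "dreadful pacing",
         "brilliant script", "weak dialogue", "moving performance",
         "dull ending", "charming lead", "messy editing"]
labels = [1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0]

# binary=True records unigram presence/absence rather than counts
X = CountVectorizer(binary=True).fit_transform(texts)

# 7-fold cross-validation with a linear classifier
scores = cross_val_score(LinearSVC(), X, labels, cv=7)
print(scores.mean())
```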
Prediction | One challenge of this approach is our underlying unigram feature space: tree-based algorithms are generally poor classifiers for the high-dimensional, low-information features in a lexical feature space (Han et al., 2001).
Prediction | We exhaustively sweep this feature space and report the most successful stump rules for each annotation task.
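Prediction | A sketch of such a sweep over a dense 0/1 unigram matrix (call .toarray() on sparse vectorizer output first); the accuracy scoring and flipped-rule handling are illustrative assumptions.

```python
import numpy as np

def best_stumps(X, y, feature_names, top_k=10):
    """Exhaustively score one-feature decision stumps over a dense 0/1
    unigram matrix X (n_samples x n_features); return the top_k rules."""
    y = np.asarray(y, dtype=bool)
    results = []
    for j in range(X.shape[1]):
        pred = X[:, j] > 0               # stump: fire iff the word is present
        acc = (pred == y).mean()
        acc = max(acc, 1.0 - acc)        # the flipped rule is also a stump
        results.append((acc, feature_names[j]))
    return sorted(results, key=lambda p: p[0], reverse=True)[:top_k]
```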
Experiments | Experiments evaluate the FWD and SemTree feature spaces compared to two baselines: bag-of-words (BOW) and supervised latent Dirichlet allocation (sLDA) (Blei and McAuliffe, 2007). |
Experiments | SVM-light with tree kernels (Joachims, 2006; Moschitti, 2006) is used for both the FWD and SemTree feature spaces.
Methods | 4.2 SemTree Feature Space and Kernels |
Methods | We propose SemTree as another feature space to encode semantic information in trees. |
Related Work | We explore a rich feature space that relies on frame semantic parsing. |
Introduction | In addition, more advanced regularisation functions enable multitask learning schemes that can exploit shared structure in the feature space . |
Methods | Group LASSO exploits a predefined group structure on the feature space and tries to achieve sparsity at the group level, i.e.
Methods | In this optimisation process, we aim to enforce sparsity in the feature space but in a structured manner. |
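Methods | Structured sparsity of this kind is typically enforced with a block soft-thresholding proximal step; the sketch below assumes the common sqrt(group size) weighting, which may differ from the exact formulation used here.

```python
import numpy as np

def group_lasso_prox(w, groups, lam, step):
    """Proximal step for the group-LASSO penalty
    lam * sum_g sqrt(|g|) * ||w_g||_2: shrinks each group's weights and
    zeroes out entire groups whose norm falls below the threshold."""
    w = w.copy()
    for g in groups:                        # g: index array for one group
        norm = np.linalg.norm(w[g])
        thresh = step * lam * np.sqrt(len(g))
        if norm <= thresh:
            w[g] = 0.0                      # whole feature group eliminated
        else:
            w[g] *= 1.0 - thresh / norm     # shrink the group toward zero
    return w
```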