Dependency Parsing | Given the training set $\{(x_i, y_i)\}_{i=1}^{N}$, parameter estimation for log-linear models generally revolves around optimization of a regularized conditional likelihood.
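For reference, the regularized conditional likelihood objective such estimation optimizes typically takes the following standard form (the regularizer weight $\lambda$ and feature function $f$ are generic notation, not taken from the source):

\[ \mathcal{L}(\theta) = \sum_{i=1}^{N} \log p(y_i \mid x_i; \theta) - \frac{\lambda}{2} \lVert \theta \rVert^2, \qquad p(y \mid x; \theta) = \frac{\exp\{\theta^{\top} f(x, y)\}}{\sum_{y'} \exp\{\theta^{\top} f(x, y')\}} \]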
Dependency Parsing | In this paper we use dual exponentiated gradient (EG) descent, which is a particularly effective optimization algorithm for log-linear models (Collins et al., 2008).
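For intuition, the core exponentiated-gradient step is a multiplicative update that keeps its iterate on the probability simplex. The sketch below shows the generic EG step on a toy linear objective; it is illustrative only and does not reproduce the dual algorithm of Collins et al. (2008):

import numpy as np

def eg_step(u, grad, eta=0.1):
    """One exponentiated-gradient step on the probability simplex.

    u    : current distribution (non-negative, sums to 1)
    grad : gradient of the objective w.r.t. u
    eta  : learning rate
    """
    # Multiplicative update followed by renormalization keeps u on the simplex.
    v = u * np.exp(-eta * grad)
    return v / v.sum()

# Toy usage: minimize the linear loss <c, u> over the simplex.
c = np.array([0.5, 0.2, 0.9])
u = np.ones(3) / 3.0
for _ in range(100):
    u = eg_step(u, c)
print(u)  # mass concentrates on the coordinate with the smallest cost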
Experiments | Some previous studies also found a log-linear relationship between the amount of unlabeled data and task performance (Suzuki and Isozaki, 2008; Suzuki et al., 2009; Bergsma et al., 2010; Pitler et al., 2010).
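Concretely, a log-linear relationship of this kind means performance improves roughly linearly in the logarithm of the unlabeled-data size $n$; with illustrative constants $a$ and $b$:

\[ \text{performance}(n) \approx a + b \log n \]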
Web-Derived Selectional Preference Features | A log-linear dependency parsing model is sensitive to inappropriately scaled features.
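One common remedy is to standardize real-valued features to a comparable scale before training; the snippet below is a generic preprocessing sketch, not the procedure used in the source:

import numpy as np

def standardize(X):
    """Z-score each real-valued feature column so all features
    share a comparable scale before log-linear training."""
    mu = X.mean(axis=0)
    sigma = X.std(axis=0)
    sigma[sigma == 0.0] = 1.0  # leave constant columns unchanged
    return (X - mu) / sigma

X = np.array([[1.0, 1000.0], [2.0, 3000.0], [3.0, 2000.0]])
print(standardize(X))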
POS Induction | The feature-based model replaces the emission distribution with a log-linear model, such that:
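The equation following the colon is missing from the extract; a standard form for such a feature-based emission, with weights $w$ and feature function $f$ over an observation $x$ and hidden state $z$ (notation assumed here, not taken from the source), is:

\[ p(x \mid z; w) = \frac{\exp\{w^{\top} f(x, z)\}}{\sum_{x'} \exp\{w^{\top} f(x', z)\}} \]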
POS Induction | This locally normalized log-linear model can look at various aspects of the observation $x$, incorporating overlapping features of the observation.
POS Induction | We adopted this state-of-the-art model because it makes it easy to experiment with various ways of incorporating our novel constraint feature into the log-linear emission model.
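As a concrete illustration, the sketch below computes such a locally normalized emission for one hidden state over a toy vocabulary, using two hypothetical overlapping binary features (all names and values here are ours, for illustration only):

import numpy as np

# Hypothetical setup: 4 word types, 2 overlapping binary features each
# (ends-in-"ing", is-capitalized); several features can fire on one word.
vocab = ["running", "Running", "dog", "walked"]

def features(word):
    return np.array([
        float(word.endswith("ing")),
        float(word[0].isupper()),
    ])

F = np.stack([features(w) for w in vocab])        # (V, d) feature matrix
w_z = np.array([1.5, -0.3])                       # weights for one hidden state z

scores = F @ w_z                                  # unnormalized log-scores
emission = np.exp(scores) / np.exp(scores).sum()  # p(x | z): normalize over vocab
print(dict(zip(vocab, emission.round(3))))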
Conclusions | Future work includes investigating the impact of hierarchical phrases on our models, as well as possible gains from additional features in the log-linear decoding model.
Experiments | The induced joint translation model can be used to recover $\arg\max_e p(e \mid f)$, as it is equal to $\arg\max_e p(e, f)$ for any fixed source sentence $f$. We employ the induced probabilistic HR-SCFG $G$ as the backbone of a log-linear, feature-based translation model, with the derivation probability $p(D)$ under the grammar estimate being:
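The equation itself is missing from the extract; under a probabilistic SCFG the derivation probability is standardly the product of the probabilities of the rules $r$ used in the derivation, and it enters the log-linear model as one feature among weighted feature functions $h_k$ (the notation $r$, $h_k$, $\lambda_k$ is assumed here, not taken from the source):

\[ p(D) = \prod_{r \in D} p(r), \qquad \hat{e} = \arg\max_{e, D} \sum_{k} \lambda_k\, h_k(e, f, D) \]

The weights $\lambda_k$ are exactly what the MERT tuning described next estimates.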
Experiments | We tune the feature weights with minimum error rate training (MERT) and decode with the resulting log-linear model.