Abstract | Although the log-linear model achieves success in SMT, it still suffers from some limitations: (1) the features are required to be linear with respect to the model itself; (2) features cannot be further interpreted to reach their potential. |
Introduction | Recently, great progress has been achieved in SMT, especially since Och and Ney (2002) proposed the log-linear model: almost all the state-of-the-art SMT systems are based on the log-linear model . |
Introduction | Regardless of how successful the log-linear model is in SMT, it still has some shortcomings. |
Introduction | Compared with the log-linear model , it has more powerful expressive abilities and can deeply interpret and represent features with hidden units in neural networks. |
A Joint Model for Two Formalisms | Instead, we assume that the distribution over yCFG is a log-linear model with parameters 601:0 (i.e., a sub-vector of 6) , namely: |
Evaluation Setup | In this setup, the model reduces to a normal log-linear model for the target formalism. |
Experiment and Analysis | It’s not surprising that Cahill’s model outperforms our log-linear model because it relies heavily on handcrafted rules optimized for the dataset. |
Features | Feature functions in log-linear models are designed to capture the characteristics of each derivation in the tree. |
Markov Topic Regression - MTR | log-linear models with parameters, AiméRM , is |
Markov Topic Regression - MTR | labeled data, 712?, based on the log-linear model in Eq. |
Semi-Supervised Semantic Labeling | The a: is used as the input matrix of the kth log-linear model (corresponding to kth semantic tag (topic)) to infer the [3 hyper-parameter of MTR in Eq. |
Inference | We then report the corresponding chains 0(a) as the system output.3 For learning, the gradient takes the standard form of the gradient of a log-linear model , a difference of expected feature counts under the gold annotation and under no annotation. |
Introduction | We use a log-linear model that can be expressed as a factor graph. |
Models | The final log-linear model is given by the following formula: |
Building Dialog Trees from Instructions | Given a single instruction 2' with category au, we use a log-linear model to represent the distri- |
Understanding Initial Queries | We employ a log-linear model and try to maximize initial dialog state distribution over the space of all nodes in a dialog network: |
Understanding Query Refinements | Dialog State Update Model We use a log-linear model to maximize a dialog state distribution over the space of all nodes in a dialog network: |
Experimental Setup | But instead of using just the PMI scores of bilingual NE pairs, as in our work, they employed a feature-rich log-linear model to capture bilingual correlations. |
Experimental Setup | Parameters in their log-linear model require training with bilingually annotated data, which is not readily available. |
Related Work | (2010a) presented a supervised learning method for performing joint parsing and word alignment using log-linear models over parse trees and an ITG model over alignment. |