Experimental Results | Two representative methods were used as baselines: the generative model proposed by Brill and Moore (2000), referred to as generative, and the logistic regression model proposed by Okazaki et al. (2008) |
Experimental Results | When using their method for ranking, we used outputs of the logistic regression model as rank scores. |
Introduction | (2008) proposed using a logistic regression model for approximate dictionary matching. |
Related Work | (2008) utilized substring substitution rules and incorporated the rules into an L1-regularized logistic regression model. |
Answer Grading System | We train the isotonic regression model on each type of system output (i.e., alignment scores, SVM output, BOW scores). |
Discussion and Conclusions | This is likely due to the different objective functions in the corresponding optimization formulations: while the ranking model attempts to ensure a correct ordering between the grades, the regression model seeks to minimize an error objective that is closer to the RMSE. |
Results | For each fold, one additional fold is held out for later use in the development of an isotonic regression model (see Figure 3). |
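
The isotonic regression step mentioned in the Answer Grading System and Results rows can be illustrated with a minimal sketch. It fits a non-decreasing step function from raw system scores to grades using the pool-adjacent-violators procedure, which is the standard way to fit isotonic regression; the excerpts do not say which implementation the authors used. The names `fit_isotonic` and `predict_isotonic` and the toy scores and grades are hypothetical, and the sketch assumes distinct scores.

```python
# Minimal pool-adjacent-violators sketch (hypothetical helper names, toy data).
# Assumption: scores are distinct; tied scores would need to be pre-aggregated.

def fit_isotonic(scores, grades):
    """Fit a non-decreasing step function mapping scores to grades."""
    pairs = sorted(zip(scores, grades))
    # Each block stores [sum of grades, number of points, largest score in block].
    blocks = []
    for score, grade in pairs:
        blocks.append([grade, 1, score])
        # Merge backwards while the previous block's mean exceeds the last block's mean.
        while (len(blocks) > 1
               and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            total, count, upper = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
            blocks[-1][2] = upper
    # The model is a list of (upper score bound, fitted grade) steps.
    return [(upper, total / count) for total, count, upper in blocks]


def predict_isotonic(model, score):
    """Return the fitted grade of the first step whose upper bound covers the score."""
    for upper, value in model:
        if score <= upper:
            return value
    # Scores above every bound get the value of the last (highest) step.
    return model[-1][1]


if __name__ == "__main__":
    # Toy held-out fold: raw system scores and the corresponding gold grades.
    held_out_scores = [0.1, 0.2, 0.3, 0.4, 0.5]
    held_out_grades = [1.0, 3.0, 2.0, 4.0, 5.0]
    model = fit_isotonic(held_out_scores, held_out_grades)
    print(predict_isotonic(model, 0.25))  # prints 2.5
```

In this toy run, the out-of-order grades at scores 0.2 and 0.3 are pooled into one step with their mean, so the fitted mapping never decreases as the score increases.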