Abstract | For each bigram, a regression model is used to estimate its frequency in the reference summary.
Abstract | The regression model uses a variety of indicative features and is trained discriminatively to minimize the distance between the estimated and the ground truth bigram frequency in the reference summary. |
Experiment and Analysis | We used the estimated value from the regression model; the ICSI system just uses the bigram’s document frequency in the original text as the weight.
Experiment and Analysis | Number of bigrams used in our regression model (i.e., in the selected sentences): 2140.7
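Experiment and Analysis | For concreteness, here is a minimal sketch of an ICSI-style concept-based ILP decoder written with the PuLP solver; the bigram weights, sentence lengths, and occurrence map are invented placeholders, and the real systems use a richer constraint set. The two configurations compared above differ only in what `weights` holds: document frequencies (ICSI) or the regression estimates (ours).

    import pulp

    # Placeholder inputs: weight per bigram, sentence lengths in words,
    # and which bigrams each sentence contains.
    weights = {"b1": 3.0, "b2": 1.5, "b3": 2.2}
    lengths = {"s1": 12, "s2": 20, "s3": 9}
    occurs = {"s1": {"b1", "b2"}, "s2": {"b2", "b3"}, "s3": {"b1"}}
    LENGTH_LIMIT = 25

    prob = pulp.LpProblem("bigram_ilp", pulp.LpMaximize)
    s = {i: pulp.LpVariable(f"s_{i}", cat="Binary") for i in lengths}
    b = {j: pulp.LpVariable(f"b_{j}", cat="Binary") for j in weights}

    # Objective: total weight of the bigrams covered by the selected sentences.
    prob += pulp.lpSum(weights[j] * b[j] for j in weights)
    # Length budget on the summary.
    prob += pulp.lpSum(lengths[i] * s[i] for i in lengths) <= LENGTH_LIMIT
    # Consistency: a bigram counts as covered iff a selected sentence contains it.
    for j in weights:
        prob += b[j] <= pulp.lpSum(s[i] for i in lengths if j in occurs[i])
        for i in lengths:
            if j in occurs[i]:
                prob += b[j] >= s[i]

    prob.solve()
    print([i for i in lengths if s[i].value() == 1])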
Experiments | In our method, we first extract all the bigrams from the selected sentences and then estimate each bigram’s frequency in the reference summary using the regression model.
Experiments | When training our bigram regression model, we use each of the 4 reference summaries separately, i.e., the bigram frequency is obtained from one reference summary.
Introduction | To estimate the bigram frequency in the summary, we propose to use a supervised regression model that is discriminatively trained using a variety of features. |
Proposed Method 2.1 Bigram Gain Maximization by ILP | 2.2 Regression Model for Bigram Frequency Estimation |
Proposed Method 2.1 Bigram Gain Maximization by ILP | We propose to use a regression model for this. |
Proposed Method 2.1 Bigram Gain Maximization by ILP | To train this regression model using the given reference abstractive summaries, rather than trying to minimize the squared error as typically done, we propose a new objective function. |
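Proposed Method 2.1 Bigram Gain Maximization by ILP | The new objective function is not reproduced in this excerpt; as a point of reference, the sketch below shows the standard squared-error regression it departs from, using sklearn's Ridge as a stand-in learner on placeholder bigram features and reference frequencies.

    import numpy as np
    from sklearn.linear_model import Ridge

    rng = np.random.default_rng(0)
    X = rng.random((200, 10))                 # placeholder bigram feature vectors
    y = rng.poisson(1.0, 200).astype(float)   # placeholder reference-summary frequencies

    # Squared-error baseline: the paper replaces this training criterion.
    model = Ridge(alpha=1.0).fit(X, y)
    estimated_freq = model.predict(X)         # would be plugged into the ILP as weights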
Conclusion | Through experiments carried out on the developed datasets, we showed that the proposed polarity classification and valence regression models significantly improve over the baselines (by 11.90% to 39.69%, depending on the language) and work well for all four languages.
Task B: Valence Prediction | 5.2 Regression Model |
Task B: Valence Prediction | Full details of the regression model and its implementation are beyond the scope of this paper; for more details see (Schölkopf and Smola, 2001; Smola et al., 2003).
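Task B: Valence Prediction | The works cited suggest the model is a support vector regression; under that assumption, a minimal sklearn sketch looks as follows, with placeholder features, placeholder valence targets, and guessed hyperparameters.

    import numpy as np
    from sklearn.svm import SVR

    rng = np.random.default_rng(0)
    X = rng.random((100, 5))        # placeholder metaphor feature vectors
    y = rng.uniform(-3, 3, 100)     # placeholder valence scores

    # Kernel SVR; the kernel and C/epsilon values here are assumptions.
    svr = SVR(kernel="rbf", C=1.0, epsilon=0.1).fit(X, y)
    predicted = svr.predict(X)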
Task B: Valence Prediction | Evaluation Measures: To evaluate the quality of the valence prediction model, we compare the actual valence score of the metaphor given by human annotators, denoted y, against the valence scores predicted by the regression model, denoted x.
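Task B: Valence Prediction | The excerpt does not name the comparison measure itself, so the sketch below computes two common choices, Pearson correlation and mean squared error, on placeholder score vectors.

    import numpy as np
    from scipy.stats import pearsonr

    y = np.array([2.0, -1.5, 0.5, 3.0])        # human valence scores (placeholder)
    y_hat = np.array([1.7, -1.0, 0.2, 2.4])    # regression predictions (placeholder)

    r, _ = pearsonr(y, y_hat)                  # agreement with human judgments
    mse = np.mean((y - y_hat) ** 2)            # average squared prediction error
    print(f"Pearson r = {r:.3f}, MSE = {mse:.3f}")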
Evaluation | The difference between the persona regression model and the Dirichlet persona model here is not |
Evaluation | by the persona regression model, along with links
Evaluation | In practice, we find that while the Dirichlet model distinguishes between character personas in different movies, the persona regression model helps distinguish between different personas within the same movie. |
Exploratory Data Analysis | To illustrate this, we present results from the persona regression model learned above, with 50 latent lexical classes and 100 latent personas. |
Models | Notation: a distribution over topics for persona p in role r; θ_d: movie d’s distribution over personas; p_e: character e’s persona (an integer, p ∈ {1..P}); j: a specific (r, w) tuple in the data; z_j: word topic for tuple j; w_j: word for tuple j; α: concentration parameter for the Dirichlet model; β: feature weights for the regression model; μ, σ²: Gaussian mean and variance (for regularizing β); m_d: movie features (from movie metadata); m_e: entity features (from movie actor metadata); ν, γ: Dirichlet concentration parameters.
Models | Figure 2: Above: Dirichlet persona model (left) and persona regression model (right). |
Experiments | Figure 3 shows a Precision-Recall (PR) curve for MATCHER and three baselines: a “Frequency” model that ranks candidate matches for TD by their frequency during the candidate identification step; a “Pattern” model that uses MATCHER’s linear regression model for ranking, but is restricted to only the pattern-based features; and an “Extractions” model that similarly restricts the ranking model to ReVerb features. |
Experiments | All regression models for learning alignments outperform the Frequency ranking by a wide margin. |
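Experiments | A PR curve such as the one in Figure 3 is traced by sweeping a threshold over each model’s ranking scores; a minimal sketch with invented scores and gold alignment labels follows.

    import numpy as np
    from sklearn.metrics import precision_recall_curve

    gold = np.array([1, 0, 1, 1, 0, 0])                 # placeholder gold alignments
    scores = np.array([0.9, 0.8, 0.7, 0.4, 0.3, 0.1])   # placeholder ranking scores

    # One (precision, recall) point per score threshold.
    precision, recall, _ = precision_recall_curve(gold, scores)
    print(list(zip(recall, precision)))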
Extending a Semantic Parser Using a Schema Alignment | For W, we use a linear regression model whose features are the score from MATCHER, the probabilities from the Syn and Sem NBC models, and the average weight of all lexical entries in UBL with matching syntax and semantics. |
Textual Schema Matching | 3.5 Regression models for scoring candidates |
Textual Schema Matching | MATCHER uses a regression model to combine these various statistics into a score for each candidate pair.
Textual Schema Matching | The regression model is a linear regression with least-squares parameter estimation; we also experimented with support vector regression models with nonlinear kernels, but found no significant improvements in accuracy.
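Textual Schema Matching | A minimal sketch of this scoring step, ordinary least squares with a bias term over a few candidate-level statistics; the feature columns and labels below are invented placeholders, not MATCHER’s actual features.

    import numpy as np

    # Each row holds placeholder statistics for one candidate pair,
    # e.g. [pattern_stat, extraction_stat, frequency_stat].
    X = np.array([[0.8, 0.2, 5.0],
                  [0.1, 0.9, 2.0],
                  [0.5, 0.5, 7.0]])
    y = np.array([1.0, 0.0, 1.0])    # gold match labels used as regression targets

    # Least-squares parameter estimation with an appended bias column.
    Xb = np.hstack([X, np.ones((len(X), 1))])
    w, *_ = np.linalg.lstsq(Xb, y, rcond=None)
    scores = Xb @ w                  # one score per candidate pair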
Application to Essay Scoring | From this set, p1-p6 were used for feature selection, data visualization, and estimation of the regression models (training), while sets p7-p9 were reserved for a blind test.
Application to Essay Scoring | To evaluate the usefulness of WAP in improving automated scoring of essays, we estimate a linear regression model using the human score as the dependent variable (label) and the e-rater score and the HAT as the two independent variables (features).
Application to Essay Scoring | We estimate a regression model on each of setA-pi, i ∈ {1, ..., 6}, evaluate it on each of setA-pj, j ∈ {7, ..., 9}, and compare the performance with that of e-rater alone on setA-pj.
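Application to Essay Scoring | A minimal sketch of this setup, assuming a two-column feature matrix [e-rater score, HAT] per prompt set and human scores as labels; the data and the correlation-based agreement measure are placeholders.

    import numpy as np
    from sklearn.linear_model import LinearRegression

    rng = np.random.default_rng(0)
    # Placeholder train (one of p1-p6) and test (one of p7-p9) prompt sets.
    X_train, y_train = rng.random((50, 2)), rng.random(50) * 6
    X_test, y_test = rng.random((20, 2)), rng.random(20) * 6

    # Two features: e-rater score and HAT; label: human score.
    model = LinearRegression().fit(X_train, y_train)
    pred = model.predict(X_test)
    print(np.corrcoef(pred, y_test)[0, 1])   # one plausible agreement measure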
Related Work | We also performed a cross-validation test on setA p1-p6, where we estimated a regression model on setA-pi and evaluated it on setA-pj, for all i, j ∈ {1, ..., 6}, i ≠ j, and compared the performance with that of e-rater alone on setA-pj, yielding 30 different train-test combinations.
Automatically Identifying Biased Language | We trained a logistic regression model on a feature vector for every word that appears in the NPOV sentences from the training set, with the bias-inducing words as the positive class, and all the other words as the negative class. |
Automatically Identifying Biased Language | The types of features used in the logistic regression model are listed in Table 3, together with their value space. |
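Automatically Identifying Biased Language | A minimal sketch of such a word-level classifier with sklearn; the three feature dictionaries are invented stand-ins, only loosely modeled on the feature types of Table 3.

    from sklearn.feature_extraction import DictVectorizer
    from sklearn.linear_model import LogisticRegression

    # One placeholder feature dict per word occurring in the NPOV sentences.
    words = [
        {"lemma": "claim", "POS": "VB", "in_subjectivity_lexicon": True},
        {"lemma": "say", "POS": "VB", "in_subjectivity_lexicon": False},
        {"lemma": "the", "POS": "DT", "in_subjectivity_lexicon": False},
    ]
    labels = [1, 0, 0]   # 1 = bias-inducing word, 0 = all other words

    X = DictVectorizer().fit_transform(words)
    clf = LogisticRegression().fit(X, labels)
    print(clf.predict(X))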
Automatically Identifying Biased Language | A logistic regression model that uses only the features based on Liu et al.’s (2005) lexicons of positive and negative words (i.e., features 26-29).
Conclusions | However, our logistic regression model reveals that epistemological and other features can usefully augment the traditional sentiment and subjectivity features for addressing the difficult task of identifying the bias-inducing word in a biased sentence. |