Index of papers in Proc. ACL that mention
  • scoring function
Zhang, Yuan and Lei, Tao and Barzilay, Regina and Jaakkola, Tommi and Globerson, Amir
Abstract
Much of the recent work on dependency parsing has been focused on solving inherent combinatorial problems associated with rich scoring functions.
Abstract
In contrast, we demonstrate that highly expressive scoring functions can be used with substantially simpler inference procedures.
Introduction
Dependency parsing is commonly cast as a maximization problem over a parameterized scoring function.
Introduction
In this view, the use of more expressive scoring functions leads to more challenging combinatorial problems of finding the maximizing parse.
Introduction
We depart from this view and instead focus on using highly expressive scoring functions with substantially simpler inference procedures.
scoring function is mentioned in 29 sentences in this paper.
Cortes, Corinna and Kuznetsov, Vitaly and Mohri, Mehryar
Boosting-style algorithm
The predictor $H_{\text{Boost}}$ returned by our boosting algorithm is based on a scoring function $h: \mathcal{X} \times \mathcal{Y} \to \mathbb{R}$, which, as for standard ensemble algorithms such as AdaBoost, is a convex combination of base scoring functions $h_t$: $h = \sum_{t=1}^{T} \alpha_t h_t$, with $\alpha_t \geq 0$.
Boosting-style algorithm
The base scoring functions used in our algorithm have the form
Boosting-style algorithm
Thus, the score assigned to $y$ by the base scoring function $h_t$ is the number of positions at which $y$ matches the prediction of path expert $h_t$ given input $x$. $H_{\text{Boost}}$ is defined as follows in terms of $h$ or the $h_t$s:
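For readers who want to see the quoted definition in miniature, here is a small Python sketch (invented path-expert predictions and weights, not the paper's code) of a base score that counts matching positions and of the convex combination of base scorers:

    # Minimal sketch of the quoted definitions (hypothetical data, not the paper's code):
    # each base scoring function h_t counts positions where y matches path expert t's
    # prediction for x, and the ensemble score is sum_t alpha_t * h_t with alpha_t >= 0.

    def base_score(expert_prediction, y):
        """h_t(x, y): number of positions at which y matches the expert's prediction."""
        return sum(1 for p, q in zip(expert_prediction, y) if p == q)

    def ensemble_score(expert_predictions, alphas, y):
        """h(x, y) = sum_t alpha_t * h_t(x, y)."""
        return sum(a * base_score(pred, y) for a, pred in zip(alphas, expert_predictions))

    # Toy example: three path experts labelling a 4-token input.
    experts = [["N", "V", "D", "N"], ["N", "V", "N", "N"], ["D", "V", "D", "N"]]
    alphas = [0.5, 0.3, 0.2]          # convex combination weights
    candidate = ["N", "V", "D", "N"]
    print(ensemble_score(experts, alphas, candidate))   # 0.5*4 + 0.3*3 + 0.2*3 = 3.5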
Online learning approach
A collection of distributions 1P can also be used to define a deterministic prediction rule based on the scoring function approach.
Online learning approach
The majority vote scoring function is defined by
scoring function is mentioned in 8 sentences in this paper.
Lee, Cheongjae and Jung, Sangkeun and Lee, Gary Geunbae
Abstract
Given the agenda graph and n-best hypotheses, the system can predict the next system actions to maximize multilevel score functions.
Agenda Graph
the score function based on current input and discourse structure given the focus stack.
Greedy Selection with n-best Hypotheses
Therefore, we need to select the hypothesis that maximizes the scoring function among a set of n-best hypotheses of each utterance.
Greedy Selection with n-best Hypotheses
Secondly, the multilevel score functions are computed for each candidate node $C_i$ given a hypothesis $h_i$.
Greedy Selection with n-best Hypotheses
Otherwise, the best node which would be pushed onto the focus stack must be selected using multilevel score functions.
scoring function is mentioned in 8 sentences in this paper.
Tomasoni, Mattia and Huang, Minlie
Discussion and Future Directions
Table 2: A summarized answer composed of five different portions of text generated with the SH scoring function; the chosen best answer is presented for comparison.
Experiments
At this point a second version of the dataset was created to evaluate the summarization performance under scoring functions (6) and (7); it was generated by manually selecting questions that aroused subjective, human interest from the previous 89,814 question-answer pairs.
Experiments
Figure 2: Increase in ROUGE-L, ROUGE-1 and ROUGE-2 performances of the SH system as more measures are taken into consideration in the scoring function, starting from Relevance alone (R) to the complete system (RQNC).
Experiments
In order to determine what influence the single measures had on the overall performance, we conducted a final experiment on the filtered dataset to evaluate (the SH scoring function was used).
The summarization framework
2.5 The concept scoring functions
The summarization framework
Analogously to what had been done with scoring function (6), the $\Phi$ space was augmented with a dimension representing the
The summarization framework
The concept score for the same BE in two separate answers is very likely to be different because it belongs to answers with their own Quality and Coverage values: this only makes the scoring function context-dependent and does not interfere with the calculation of the Coverage, Relevance and Novelty measures, which are based on information overlap and will regard two BEs with overlapping equivalence classes as being the same, regardless of their scores being different.
scoring function is mentioned in 7 sentences in this paper.
Lang, Joel and Lapata, Mirella
Related Work
We operationalize these notions using a scoring function that quantifies the compatibility between arbitrary cluster pairs.
Split-Merge Role Induction
Besides being inefficient, it requires a scoring function with comparable scores for arbitrary pairs of clusters.
Split-Merge Role Induction
After each completion of the inner loop, the thresholds contained in the scoring function (discussed below) are adjusted and this is repeated until some termination criterion is met (discussed in Section 5.2.3).
Split-Merge Role Induction
5.2.2 Scoring Function
scoring function is mentioned in 6 sentences in this paper.
Zhao, Xin and Jiang, Jing and He, Jing and Song, Yang and Achanauparp, Palakorn and Lim, Ee-Peng and Li, Xiaoming
Abstract
We propose a context-sensitive topical PageRank method for keyword ranking and a probabilistic scoring function that considers both relevance and interestingness of keyphrases for keyphrase ranking.
Experiments
We have proposed a context-sensitive topical PageRank method (cTPR) for the first step of keyword ranking, and a probabilistic scoring function for the third step of keyphrase ranking.
Method
While a standard method is to simply aggregate the scores of keywords inside a candidate keyphrase as the score for the keyphrase, here we propose a different probabilistic scoring function.
Method
Equation (8) into Equation (3) and obtain the following scoring function for ranking:
Method
Our preliminary experiments with Equation (9) show that this scoring function usually ranks longer keyphrases higher than shorter ones.
scoring function is mentioned in 6 sentences in this paper.
Sun, Jun and Zhang, Min and Tan, Chew Lim
Introduction
and Imamura (2001) propose some score functions based on the lexical similarity and co-occurrence.
Substructure Spaces for BTKs
The baseline system uses many heuristics in searching the optimal solutions with alternative score functions.
Substructure Spaces for BTKs
The baseline method proposes two score functions based on the lexical translation probability.
Substructure Spaces for BTKs
They also compute the score function by splitting the tree into the internal and external components.
scoring function is mentioned in 6 sentences in this paper.
Li, Mu and Duan, Nan and Zhang, Dongdong and Li, Chi-Ho and Zhou, Ming
Collaborative Decoding
2.2 Generic Collaborative Decoding Model. For a given source sentence $f$, a member model in co-decoding finds the best translation $e^*$ among the set of possible candidate translations $\mathcal{H}(f)$ based on a scoring function $F$:
Collaborative Decoding
where $\Phi_m(f, e)$ is the score function of the $m$-th baseline model, and each $\Psi_k(e, \mathcal{H}_k(f))$ is a partial consensus score function with respect to $d_k$ and is defined over $e$ and $\mathcal{H}_k(f)$:
Collaborative Decoding
Note that in Equation 2, though the baseline score function $\Phi_m(f, e)$ can be computed inside each decoder, the case of $\Psi_k(e, \mathcal{H}_k(f))$ is more complicated.
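As a rough illustration of the combination these excerpts describe, the sketch below (Python, with a toy token-overlap consensus and assumed interpolation weights, not the authors' decoder) adds weighted partial consensus scores against another member's n-best list to a member model's baseline score:

    # Illustrative sketch only: a member model scores a candidate e by adding weighted
    # partial consensus scores, here approximated as average unigram overlap with
    # another member's n-best list H_k(f). The weights lambda_k are assumed tuning values.

    def consensus(candidate, nbest):
        """Toy partial consensus score: average token overlap with the other member's n-best."""
        cand = set(candidate.split())
        return sum(len(cand & set(h.split())) for h in nbest) / len(nbest)

    def member_score(baseline, candidate, other_nbests, lambdas):
        """F(f, e) ~ baseline model score + sum_k lambda_k * Psi_k(e, H_k(f))."""
        return baseline + sum(l * consensus(candidate, nb) for l, nb in zip(lambdas, other_nbests))

    nbest_other = ["the cat sat on the mat", "a cat sat on the mat"]
    print(member_score(-12.4, "the cat sat on a mat", [nbest_other], [0.8]))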
scoring function is mentioned in 6 sentences in this paper.
Nivre, Joakim and McDonald, Ryan
Integrated Models
As explained in section 2, both models essentially learn a scoring function $s : \mathcal{X} \to \mathbb{R}$, where the domain $\mathcal{X}$ is different for the two models.
Integrated Models
The graph-based model, MSTParser, learns a scoring function $s(i, j, l) \in \mathbb{R}$ over labeled dependencies.
Integrated Models
The transition-based model, MaltParser, learns a scoring function $s(c, t) \in \mathbb{R}$ over configurations and transitions.
Two Models for Dependency Parsing
The simplest parameterization is the arc-factored model that defines a real-valued score function for arcs $s(i, j, l)$ and further defines the score of a dependency graph as the sum of the
Two Models for Dependency Parsing
Given a real-valued score function $s(c, t)$ (for transition $t$ out of configuration $c$), parsing can be performed by starting from the initial configuration and taking the optimal transition $t^* = \arg\max_{t \in T} s(c, t)$ out of every configuration $c$ until a terminal configuration is reached.
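A minimal Python sketch of the greedy loop this excerpt describes, with an invented configuration type and a toy score function standing in for a learned $s(c, t)$:

    # Greedy transition-based parsing loop: repeatedly take t* = argmax_t s(c, t)
    # until a terminal configuration is reached. The configuration, transitions,
    # and score below are toy stand-ins, not MaltParser's.

    def greedy_parse(config, legal_transitions, score, is_terminal, apply_t):
        while not is_terminal(config):
            t_star = max(legal_transitions(config), key=lambda t: score(config, t))
            config = apply_t(config, t_star)
        return config

    # Toy instantiation: a "configuration" is (stack, buffer, arcs); the toy score
    # prefers SHIFT while the buffer is long and an attachment move otherwise.
    def legal(c):
        stack, buf, arcs = c
        ts = []
        if buf: ts.append("SHIFT")
        if len(stack) >= 2: ts.append("ATTACH")
        return ts

    def score(c, t):
        stack, buf, arcs = c
        return len(buf) if t == "SHIFT" else len(stack)

    def apply_t(c, t):
        stack, buf, arcs = c
        if t == "SHIFT":
            return (stack + [buf[0]], buf[1:], arcs)
        head, dep = stack[-2], stack[-1]
        return (stack[:-1], buf, arcs + [(head, dep)])

    final = greedy_parse(([], ["I", "saw", "her"], []), legal, score,
                         lambda c: not legal(c), apply_t)
    print(final[2])   # list of (head, dependent) arcs built by the toy parser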
Two Models for Dependency Parsing
To learn a scoring function on transitions, these systems rely on discriminative learning methods, such as memory-based learning or support vector machines, using a strictly local learning procedure where only single transitions are scored (not complete transition sequences).
scoring function is mentioned in 6 sentences in this paper.
Kaji, Nobuhiro and Fujiwara, Yasuhiro and Yoshinaga, Naoki and Kitsuregawa, Masaru
Introduction
defining a score function $f(x, y)$ and locating the
Introduction
In HMMs, the score function $f(x, y)$ is the joint probability distribution over $(x, y)$. If we assume a one-to-one correspondence between the hidden states and the labels, the score function can be written as:
Introduction
In the perceptron, the score function $f(x, y)$ is given as $f(x, y) = \mathbf{w} \cdot \phi(x, y)$, where $\mathbf{w}$ is the weight vector and $\phi(x, y)$ is the feature vector representation of the pair $(x, y)$. By making the first-order Markov assumption, we have
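To make the linear form concrete, here is a small Python sketch of $f(x, y) = \mathbf{w} \cdot \phi(x, y)$ with made-up sparse features and weights (including first-order Markov transition features):

    # Toy illustration of the linear score f(x, y) = w . phi(x, y) with a sparse
    # feature map; the feature templates and weights are invented for this sketch.

    def phi(x, y):
        """Sparse feature vector for a (token sequence, label sequence) pair."""
        feats = {}
        for token, label in zip(x, y):
            key = f"word={token}|label={label}"
            feats[key] = feats.get(key, 0) + 1
        for prev, curr in zip(y, y[1:]):                 # first-order Markov features
            key = f"trans={prev}->{curr}"
            feats[key] = feats.get(key, 0) + 1
        return feats

    def score(w, x, y):
        """f(x, y) = w . phi(x, y)"""
        return sum(w.get(k, 0.0) * v for k, v in phi(x, y).items())

    w = {"word=dog|label=NOUN": 1.2, "trans=DET->NOUN": 0.8}
    print(score(w, ["the", "dog"], ["DET", "NOUN"]))      # 1.2 + 0.8 = 2.0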
scoring function is mentioned in 5 sentences in this paper.
Zhao, Shiqi and Lan, Xiang and Liu, Ting and Li, Sheng
Statistical Paraphrase Generation
The PTs used in this work are constructed using different corpora and different score functions (Section 3.5).
Statistical Paraphrase Generation
Let $(\bar{s}_i, \bar{t}_i)$ be a pair of paraphrase units; their paraphrase likelihood is computed using a score function $\phi_{pm}(\bar{s}_i, \bar{t}_i)$.
Statistical Paraphrase Generation
Suppose we have $K$ PTs; $(\bar{s}_{ki}, \bar{t}_{ki})$ is a pair of paraphrase units from the $k$-th PT with the score function $\phi_k(\bar{s}_{ki}, \bar{t}_{ki})$.
scoring function is mentioned in 5 sentences in this paper.
Bansal, Mohit and Burkett, David and de Melo, Gerard and Klein, Dan
Structured Taxonomy Induction
Each factor $F$ has an associated scoring function $\psi_F$, with the probability of a total assignment determined by the product of all these scores:
Structured Taxonomy Induction
We score each edge by extracting a set of features $f(x_i, x_j)$ and weighting them by the (learned) weight vector $\mathbf{w}$. So, the factor scoring function is:
Structured Taxonomy Induction
The scoring function is similar to the one above:
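The excerpts above suggest a log-linear factor score; the sketch below assumes the common form $\exp(\mathbf{w} \cdot f(x_i, x_j))$ with invented features and weights, and scores a total assignment as the product over its edge factors:

    # Hedged sketch of the kind of factor scoring described above: each edge factor's
    # score is assumed to be exp(w . f(x_i, x_j)), and a total assignment is scored by
    # the product of its factor scores. Feature names and weights are invented here.
    import math

    def edge_features(parent, child):
        return {"caps_match": float(parent[0].isupper() == child[0].isupper()),
                "suffix_overlap": float(parent[-2:] == child[-2:])}

    def factor_score(w, parent, child):
        """exp(w . f(parent, child)) for one candidate taxonomy edge."""
        return math.exp(sum(w.get(k, 0.0) * v for k, v in edge_features(parent, child).items()))

    def assignment_score(w, edges):
        """Unnormalized score of a total assignment: product over its edge factors."""
        total = 1.0
        for parent, child in edges:
            total *= factor_score(w, parent, child)
        return total

    w = {"caps_match": 0.5, "suffix_overlap": 1.0}
    print(assignment_score(w, [("animal", "mammal"), ("mammal", "dog")]))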
scoring function is mentioned in 5 sentences in this paper.
Almeida, Miguel and Martins, Andre
Compressive Summarization
Here, we follow the latter work, by combining a coverage score function $g$ with sentence-level compression score functions $h_1, \ldots$
Compressive Summarization
For the compression score function, we follow Martins and Smith (2009) and decompose it as a sum of local score functions defined on dependency arcs:
Extractive Summarization
By designing a quality score function $g : \{0, 1\}^N \to \mathbb{R}$, this can be cast as a global optimization problem with a knapsack constraint:
Extractive Summarization
Then, the following quality score function is defined:
scoring function is mentioned in 4 sentences in this paper.
Morita, Hajime and Sasano, Ryohei and Takamura, Hiroya and Okumura, Manabu
Conclusions and Future Work
Since our algorithm requires that the objective function is the sum of word score functions, our proposed method has a restriction that we cannot use an arbitrary monotone submodular function as the objective function for the summary.
Introduction
By formalizing the subtree extraction problem as this new maximization problem, we can treat the constraints regarding the grammaticality of the compressed sentences in a straightforward way and use an arbitrary monotone submodular word score function for words including our word score function (shown later).
Joint Model of Extraction and Compression
The score function is supermodular as a score function of subtree extraction, because the union of two subtrees can have extra word pairs that are not included in either subtree.
Joint Model of Extraction and Compression
Our score function for a summary $S$ is as follows:
scoring function is mentioned in 4 sentences in this paper.
Sartorio, Francesco and Satta, Giorgio and Nivre, Joakim
Dependency Parser
We present here an algorithm that runs the parser in pseudo-deterministic mode, greedily choosing at each configuration the transition that maximizes some score function.
Dependency Parser
Algorithm 1 takes as input a string $w$ and a scoring function score() defined over parser transitions and parser configurations.
Dependency Parser
The scoring function will be the subject of §4 and is not discussed here.
Model and Training
We use a linear model for the score function in Algorithm 1, and define $\mathrm{score}(t, c) = \vec{w} \cdot \phi(t, c)$.
scoring function is mentioned in 4 sentences in this paper.
Wang, Lu and Raghavan, Hema and Castelli, Vittorio and Florian, Radu and Cardie, Claire
Abstract
Under this framework, we show how to integrate various indicative metrics such as linguistic motivation and query relevance into the compression process by deriving a novel formulation of a compression scoring function.
Introduction
Our tree-based methods rely on a scoring function that allows for easy and flexible tailoring of sentence compression to the summarization task, ultimately resulting in significant improvements for MDS, while at the same time remaining competitive with existing methods in terms of sentence compression, as discussed next.
Sentence Compression
postorder) as a sequence of nodes in $T$, the set $L$ of possible node labels, a scoring function $S$ for evaluating each sentence compression hypothesis, and a beam size $N$. Specifically, $O$ is a permutation on the set $\{0, 1, \ldots$
Sentence Compression
Thus, the decoder is quite flexible — its learned scoring function allows us to incorporate features salient for sentence compression while its language model guarantees the linguistic quality of the compressed string.
scoring function is mentioned in 4 sentences in this paper.
Zeng, Xiaodong and Wong, Derek F. and Chao, Lidia S. and Trancoso, Isabel
Abstract
The segmentation for an input sentence is decoded by using a joint scoring function combining the two induced models.
Introduction
Moreover, in order to better combine the strengths of the two models, the proposed approach uses a joint scoring function in a log-linear combination form for the decoding in the segmentation phase.
Semi-supervised Learning via Co-regularizing Both Models
3.4 The Joint Score Function for Decoding
Semi-supervised Learning via Co-regularizing Both Models
This paper employs a log-linear interpolation combination (Bishop, 2006) to formulate a joint scoring function based on character-based and word-based models in the decoding:
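As a toy illustration of a log-linear interpolation of two segmentation models' scores, the sketch below uses an assumed interpolation weight alpha and invented probabilities; the actual combination and features are the paper's own:

    # Log-linear interpolation of a character-based and a word-based model score
    # for candidate segmentations (toy numbers, assumed weight alpha in [0, 1]).
    import math

    def joint_score(char_model_prob, word_model_prob, alpha=0.6):
        """log P_joint = alpha * log P_char + (1 - alpha) * log P_word"""
        return alpha * math.log(char_model_prob) + (1 - alpha) * math.log(word_model_prob)

    # Toy probabilities for two candidate segmentations of the same sentence.
    candidates = {"seg_A": (0.012, 0.030), "seg_B": (0.020, 0.015)}
    best = max(candidates, key=lambda s: joint_score(*candidates[s]))
    print(best)   # the candidate with the higher interpolated score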
scoring function is mentioned in 4 sentences in this paper.
Kothari, Govind and Negi, Sumit and Faruquie, Tanveer A. and Chakaravarthy, Venkatesan T. and Subramaniam, L. Venkata
Introduction
the retrieved questions is formalized using a scoring function.
Problem Formulation
Based on the weight function, we define a scoring function for assigning a score to each question in the corpus Q.
Problem Formulation
For each token $s_i$, the scoring function chooses the term from $Q$ having the maximum weight; then the weights of the $n$ chosen terms are summed up to get the score.
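A minimal sketch of the max-then-sum rule described here, with a toy weight function standing in for the paper's learned weights:

    # For each SMS token, take the maximum-weight matching term in the question,
    # then sum those maxima to score the question. Weights here are invented.

    def score_question(sms_tokens, question_terms, weight):
        """Sum over tokens of the best-weighted term in the question."""
        total = 0.0
        for tok in sms_tokens:
            total += max((weight(tok, term) for term in question_terms), default=0.0)
        return total

    # Toy weight: 1.0 for an exact match, 0.5 if the token is a prefix of the term.
    def weight(tok, term):
        if tok == term:
            return 1.0
        if term.startswith(tok):
            return 0.5
        return 0.0

    print(score_question(["bal", "enquiry"], ["balance", "enquiry", "number"], weight))  # 0.5 + 1.0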
scoring function is mentioned in 3 sentences in this paper.
Gao, Wei and Blitzer, John and Zhou, Ming and Wong, Kam-Fai
Introduction
We use this ranking to learn a linear scoring function on pairs of documents given a bilingual query.
Introduction
these heuristics and our learned pairwise scoring function, we can derive a ranking for new, unseen bilingual queries.
Learning to Rank Using Bilingual Information
Then we learn a linear scoring function for pairs of documents that exploits monolingual information (in both languages) and bilingual information.
scoring function is mentioned in 3 sentences in this paper.
Özbal, Gözde and Pighin, Daniele and Strapparava, Carlo
Architecture of BRAINSUP
Each partially lexicalized solution is scored by a battery of scoring functions that compete to generate creative sentences respecting the user specification U, as explained in Section 3.3.
Architecture of BRAINSUP
Concerning the scoring of partial solutions and complete sentences, we adopt a simple linear combination of scoring functions.
Architecture of BRAINSUP
$\ldots, f_k]$ be the vector of scoring functions and $\mathbf{w} = [w_0, \ldots$
scoring function is mentioned in 3 sentences in this paper.
Druck, Gregory and Mann, Gideon and McCallum, Andrew
Generalized Expectation Criteria
unlabeled data), a model distribution $p_\Lambda(y|x)$, and a score function $S$:
Generalized Expectation Criteria
In this paper, we use a score function that is the squared difference of the model expectation of $G$ and some target expectation $\tilde{G}$:
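As a rough illustration, and assuming the squared difference is negated so that higher scores are better, a tiny Python sketch of such a GE score (toy numbers, not the paper's constraints):

    # Squared-difference GE-style score between a target expectation and the model
    # expectation of a constraint feature G, negated under the assumed sign convention.

    def ge_score(target_expectation, model_expectation):
        """S = -(G_target - E_model[G])^2, summed over constraint features."""
        return -sum((t - m) ** 2 for t, m in zip(target_expectation, model_expectation))

    print(ge_score([0.9, 0.1], [0.7, 0.3]))   # -(0.2^2 + 0.2^2) = -0.08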
Generalized Expectation Criteria
The partial derivative of the KL divergence score function includes the same covariance term as above but substitutes a different multiplicative term: $\tilde{G} / G_\Lambda$.
scoring function is mentioned in 3 sentences in this paper.