Abstract | The basic idea of S3H is to learn the optimal feature weights from prior knowledge to relocate the data such that similar data have similar hash codes. |
Introduction | Unlike SSH that tries to find a sequence of hash functions, S3H fixes the random projection directions and seeks the optimal feature weights from prior knowledge to relocate the objects such that similar objects have similar fingerprints. |
Semi-Supervised SimHash | where $w \in \mathbb{R}^M$ is the feature weight to be determined and $d_l$ is the $l$-th column of the matrix $D$. |
The direction is determined by concatenating w L times. | Instead, S3H seeks the optimal feature weights via L-BFGS, which is still efficient even for very high-dimensional data. |
The direction is determined by concatenating w L times. | S3H learns the optimal feature weights from prior knowledge to relocate the data such that similar objects have similar fingerprints. |
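The excerpts above describe fixed random projection directions combined with a learned feature weight vector. A minimal sketch of how such a weighted fingerprint could be computed follows. The function names, the toy dimensions, and the exact form sign(w · (x ∘ d_l)) are assumptions made for illustration, not taken from the excerpts. The weights here are simply set to ones; learning them (for example with L-BFGS) is not shown.

```python
import numpy as np

def weighted_simhash(x, w, D):
    """Return an L-bit fingerprint (array of 0/1) for feature vector x.

    x: (M,) feature vector
    w: (M,) feature weights (assumed already learned)
    D: (M, L) matrix whose l-th column d_l is a fixed random direction
    """
    # Bit l is 1 when sum_i w_i * x_i * D[i, l] is positive (assumed form).
    projections = (w * x) @ D
    return (projections > 0).astype(np.uint8)

def hamming(a, b):
    """Number of differing bits between two fingerprints."""
    return int(np.count_nonzero(a != b))

rng = np.random.default_rng(0)
M, L = 8, 16                      # hypothetical feature and fingerprint sizes
D = rng.standard_normal((M, L))   # fixed random projection directions
w = np.ones(M)                    # placeholder weights; all ones means no reweighting
x = rng.random(M)                 # toy feature vector

fp = weighted_simhash(x, w, D)
```

With all weights equal to one, the fingerprint reduces to the sign of the plain random projections; changing `w` rescales individual features before projection, which is what moves similar objects toward similar fingerprints.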
A Joint Model with Unlabeled Parallel Text | where $\vec{\theta}$ is a real-valued vector of feature weights and j?
A Joint Model with Unlabeled Parallel Text | where $\vec{\theta}_1$ and $\vec{\theta}_2$ are the vectors of feature weights for L1 and L2, respectively (for brevity we denote them as $\theta_1$ and $\theta_2$ in the remaining sections). |
A Joint Model with Unlabeled Parallel Text | where the first term on the right-hand side is the log likelihood of the labeled data from both D1 and D2; the second is the log likelihood of the unlabeled parallel data U, multiplied by $\lambda_1 \geq 0$, a constant that controls the contribution of the unlabeled data; and $\lambda_2 \geq 0$ is a regularization constant that penalizes model complexity or large feature weights. |
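The objective described in the excerpt above is not reproduced here, so the following is only one way to write it that matches the three terms named in the text, with the two constants written as $\lambda_1$ and $\lambda_2$. The symbol $R$ for the penalty, and the squared-norm choice shown for it, are assumptions; the excerpt says only that the term penalizes model complexity or large feature weights.

```latex
\begin{align*}
\mathcal{L}(\theta_1, \theta_2)
  &= \sum_{i \in \{1, 2\}} \; \sum_{(x, y) \in D_i} \log p(y \mid x; \theta_i)
     && \text{(labeled data from } D_1 \text{ and } D_2\text{)} \\
  &\quad + \lambda_1 \log p(U; \theta_1, \theta_2)
     && \text{(unlabeled parallel data, } \lambda_1 \geq 0\text{)} \\
  &\quad - \lambda_2 \, R(\theta_1, \theta_2),
     && \text{(regularization, } \lambda_2 \geq 0\text{)} \\
R(\theta_1, \theta_2)
  &= \lVert \theta_1 \rVert_2^2 + \lVert \theta_2 \rVert_2^2
     && \text{(assumed form of the penalty)}
\end{align*}
```

Setting $\lambda_1 = 0$ removes the unlabeled term, leaving two independently regularized monolingual models.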
Evaluation | One of the main approaches adopted by previous systems involves the identification of features that measure writing skill, and then the application of linear or stepwise regression to find optimal feature weights so that the correlation with manually assigned scores is maximised. |
Previous work | Linear regression is used to assign optimal feature weights that maximise the correlation with the examiner’s scores. |
Previous work | Feature weights and/or scores can be fitted to a marking scheme by stepwise or linear regression. |
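The regression approach mentioned in the excerpts above can be sketched as an ordinary least-squares fit. The two features (word count and error rate), the toy scores, and the function names are all made up for illustration and do not come from any of the systems discussed. Stepwise regression is not shown.

```python
import numpy as np

def fit_feature_weights(X, y):
    """Least-squares feature weights and intercept mapping features to scores."""
    # Append a column of ones so the last coefficient acts as the intercept.
    A = np.column_stack([X, np.ones(len(X))])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef[:-1], coef[-1]

def predict(X, weights, intercept):
    """Predicted score for each row of X."""
    return X @ weights + intercept

# Hypothetical data: columns are word count and error rate per essay.
X = np.array([[120, 0.02],
              [250, 0.05],
              [310, 0.01],
              [400, 0.03],
              [180, 0.04]], dtype=float)
y = np.array([2.0, 3.0, 4.5, 5.0, 2.5])   # hypothetical examiner scores

weights, intercept = fit_feature_weights(X, y)
pred = predict(X, weights, intercept)

# Pearson correlation between predicted and manually assigned scores.
r = np.corrcoef(pred, y)[0, 1]
```

On the data used for fitting, the least-squares prediction correlates with the scores at least as strongly as any single feature does, which is the sense in which the fitted weights maximise correlation.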