Abstract | The process flow architecture we present here utilizes standard image processing techniques and the multi-tiered application of classification models such as support vector machines (SVM). |
Bright-Field Head Identification | However, in addition to informative feature selection and the curation of a representative training set, the performance of SVM classification models is subject to several parameters associated with the model itself and its kernel function [34, 48]. |
Bright-Field Head Identification | Thus, to ensure good performance of the final SVM model, we first optimize model parameters based on fivefold cross-validation on the training set (Fig 3A and 3B, Materials and Methods). |
Bright-Field Head Identification | Therefore, we optimize the SVM parameters via the minimization of an adjusted error rate that penalizes false negatives more than false positives (Fig 3B). |
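As a minimal sketch of the idea in the preceding excerpt, the functions below compute an error rate that weights false negatives more heavily than false positives and then pick the parameter setting with the lowest mean value across cross-validation folds. The names `adjusted_error_rate` and `pick_best`, the 0/1 label coding, and the weight values are illustrative assumptions; the source does not state the actual weights or formula.

```python
def adjusted_error_rate(y_true, y_pred, fn_weight=2.0, fp_weight=1.0):
    """Weighted error rate for binary labels (1 = positive, 0 = negative).

    fn_weight and fp_weight are hypothetical; the source only says that
    false negatives are penalized more than false positives.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("at least one sample is required")
    false_neg = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    false_pos = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return (fn_weight * false_neg + fp_weight * false_pos) / len(y_true)


def pick_best(fold_errors):
    """Return the parameter setting with the lowest mean per-fold error.

    fold_errors maps a parameter setting (any hashable value) to a list
    of adjusted error rates, one per cross-validation fold.
    """
    return min(fold_errors, key=lambda p: sum(fold_errors[p]) / len(fold_errors[p]))


if __name__ == "__main__":
    # One false negative and one false positive among four samples.
    print(adjusted_error_rate([1, 1, 0, 0], [0, 1, 1, 0]))
    # Hypothetical (C, gamma) settings with two folds each.
    print(pick_best({(1, 0.1): [0.2, 0.3], (10, 0.01): [0.1, 0.15]}))
```

With fivefold cross-validation, each list passed to `pick_best` would hold five values, one per held-out fold.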
Discussion | The use of this multivariate information with a classification model such as SVM obviates the need for manually assessing rectilinear thresholds for classification. |
Discussion | Moreover, the performance of our classifiers demonstrates that the potentially nonlinear, multidimensional classification provided by SVM proves more powerful than rectilinear thresholding of individual features or dimensionality reduction techniques (Fig 3C and 3F). |
Identification of Fluorescently Labeled Cells | Using this feature set, we optimize and train a layer 1 SVM classifier using a manually annotated training set (n = 218) (S4A Fig, Materials and Methods) and show that it is sufficient for identifying cellular regions with relatively high sensitivity and specificity (Fig 5D and S4A Fig). |
Identification of Fluorescently Labeled Cells | To construct our layer 2 classifier, we optimize and train an SVM model based on these pairwise relational features (S4B Fig). |
Identification of Fluorescently Labeled Cells | In this case, the probability estimates from the SVM classifier [37, 49] |
Supervised learning: Classification | The PLR approach was generally superior, with RF quite comparable and SVM somewhat degraded but still yielding good performance. |
Supervised learning: Classification | SVM is a kernel-based nonlinear classifier that finds a separating hyperplane (in a space defined by the kernel) between the classes, so as to minimize the risk of classification error. |
Supervised learning: Regression | SVR is based on the same theory as SVM, discussed above, but uses the kernel-based approach to fit a regression model to reduce the quantitative prediction error. |
Supervised learning: Regression | As with SVM, we evaluated the standard linear, polynomial, and radial basis kernels and presented the results for the radial basis function. |
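For reference, the three kernel families named above can be written in their textbook forms as follows. This is a sketch only: the function names and the parameters `degree`, `coef0`, and `gamma` are illustrative assumptions, and the source does not report the values or the library that were used.

```python
import math


def linear_kernel(x, y):
    """Linear kernel: the plain dot product of two feature vectors."""
    return sum(a * b for a, b in zip(x, y))


def polynomial_kernel(x, y, degree=3, coef0=1.0):
    """Polynomial kernel: (x . y + coef0) ** degree (assumed defaults)."""
    return (linear_kernel(x, y) + coef0) ** degree


def rbf_kernel(x, y, gamma=1.0):
    """Radial basis function kernel: exp(-gamma * ||x - y||^2)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


if __name__ == "__main__":
    u, v = [1.0, 2.0], [3.0, 4.0]
    print(linear_kernel(u, v))
    print(polynomial_kernel(u, v, degree=2))
    print(rbf_kernel(u, v, gamma=0.5))
```

The radial basis kernel equals 1 for identical inputs and decays toward 0 as the inputs move apart, which is what lets a kernel method fit nonlinear relationships.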
Parameter settings, implementation, and availability | For the multi-class SVM, a one-versus-rest (OVR) approach was used, in which a binary classifier is built for each class to separate that class label from the rest. |
Parameter settings, implementation, and availability | Each binary SVM was built using a Gaussian Radial Basis Function (RBF) kernel with the default sigma factor of 1. |
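The one-versus-rest scheme can be sketched as follows: each class has its own binary scorer, and the predicted label is the class whose scorer returns the highest decision value. The function name `predict_one_vs_rest` and the toy scorers are hypothetical stand-ins; in practice each scorer would be the decision function of a trained binary SVM.

```python
def predict_one_vs_rest(x, scorers):
    """Return the label whose 'class versus rest' scorer gives the top score.

    scorers maps each class label to a callable that returns a decision
    value for x (higher means more likely to belong to that class).
    """
    if not scorers:
        raise ValueError("at least one class scorer is required")
    return max(scorers, key=lambda label: scorers[label](x))


if __name__ == "__main__":
    # Toy scorers (assumptions): score is the negative distance to a class center.
    toy_scorers = {
        "a": lambda value: -abs(value - 0.0),
        "b": lambda value: -abs(value - 5.0),
        "c": lambda value: -abs(value - 10.0),
    }
    print(predict_one_vs_rest(4.0, toy_scorers))
```

Taking the maximum decision value is one common way to resolve cases where several binary classifiers, or none, claim a sample.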
Supporting Information | In each iteration, the phase of all samples that were originally unannotated is predicted, based on an ensemble of 4 machine learning methods (Naive Bayes, SVM, Decision Tree, KNN) that produce a consensus outcome, as described in the Methods section of the manuscript. |
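One plausible way to form a consensus from four classifiers is a vote with a minimum agreement threshold, sketched below. This is an assumption for illustration: the excerpt defers the actual consensus rule to the Methods section, and the name `consensus_label`, the threshold of 3, and the example labels are hypothetical.

```python
from collections import Counter


def consensus_label(predictions, min_agreement=3):
    """Return the most frequent label if enough classifiers agree, else None.

    predictions holds one predicted label per classifier. A None result
    means that no consensus was reached and the sample stays unannotated.
    """
    if not predictions:
        raise ValueError("at least one prediction is required")
    label, votes = Counter(predictions).most_common(1)[0]
    return label if votes >= min_agreement else None


if __name__ == "__main__":
    # Predictions from four hypothetical classifiers for one sample.
    print(consensus_label(["A", "A", "B", "A"]))
    print(consensus_label(["A", "A", "B", "B"]))
```

Requiring three of four votes rules out ties, so a sample is labeled only when a clear majority of the methods agree.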