The first thing to be aware of is that evaluating a classifier is not just about how many instances are classified correctly or incorrectly. In most cases, we also need a clear understanding of FP (false positives: instances classified as positive that are actually negative) and FN (false negatives: instances classified as negative that are actually positive). For example, when developing a classifier that helps a lender decide cautiously which loans to fund, we should pay more attention to lowering the false positive rate….
Usually, we use four metrics to measure the performance of a (binary) classifier:
Accuracy = (TP + TN)/(TP + TN + FP + FN) = Pr(C), the probability of a correct classification.
Sensitivity = TP/(TP + FN) = the ability of the test to detect disease in a population of diseased individuals (also known as recall).
Specificity = TN/(TN + FP) = the ability of the test to correctly rule out the disease in a disease-free population.
Precision = TP/(TP + FP) = the proportion of instances classified as positive that are actually positive.
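The four formulas above can be sketched in a few lines of plain Python. This is only an illustration: the function names `confusion_counts` and `binary_metrics` and the label lists are made up for the example, and labels are assumed to be encoded as 1 (positive) and 0 (negative).

```python
def confusion_counts(y_true, y_pred):
    """Count TP, TN, FP, FN, assuming labels are 1 (positive) or 0 (negative)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn


def binary_metrics(y_true, y_pred):
    """Compute accuracy, sensitivity, specificity and precision from the counts."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)

    def ratio(num, den):
        # Return NaN instead of raising when a denominator is zero.
        return num / den if den else float("nan")

    return {
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "precision": ratio(tp, tp + fp),
    }


# Hypothetical labels: 4 actual positives, 6 actual negatives.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

metrics = binary_metrics(y_true, y_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With these made-up labels there are 3 TP, 5 TN, 1 FP and 1 FN, so accuracy is 0.8 while sensitivity and precision are both 0.75. Accuracy alone hides the single false positive and the single false negative.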
When we plot the true positive rate (sensitivity) against the false positive rate (1 - specificity) for each threshold used, the resulting graph is called a Receiver Operating Characteristic (ROC) curve. See the following example:
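As a rough sketch of how the points of such a curve can be traced out, the snippet below sweeps every distinct score as a threshold and records the false positive rate and true positive rate at each one. The function name `roc_points`, the labels and the scores are all hypothetical. It assumes that a higher score means "more likely positive", that an instance is predicted positive when its score is at or above the threshold, and that both classes are present in the data.

```python
def roc_points(y_true, scores):
    """Return a list of (threshold, fpr, tpr), one entry per distinct score.

    Assumes labels are 1/0 and that both classes occur at least once.
    An instance is predicted positive when score >= threshold.
    """
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    points = []
    # Go from the strictest threshold (few positives) to the loosest.
    for thr in sorted(set(scores), reverse=True):
        tp = sum(1 for t, s in zip(y_true, scores) if s >= thr and t == 1)
        fp = sum(1 for t, s in zip(y_true, scores) if s >= thr and t == 0)
        points.append((thr, fp / n_neg, tp / n_pos))
    return points


# Hypothetical classifier scores for 4 positives and 4 negatives.
y_true = [1, 1, 0, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2]

points = roc_points(y_true, scores)
for thr, fpr, tpr in points:
    print(f"threshold={thr:.1f}  FPR={fpr:.2f}  TPR={tpr:.2f}")
```

Plotting the FPR values on the X axis against the TPR values on the Y axis gives the ROC curve for this toy data. Lowering the threshold moves along the curve toward the upper right, trading more false positives for more true positives.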
The ROC curve is a powerful tool for selecting a threshold that keeps false positives low while keeping true positives high. The right threshold depends on the scenario, so different rules apply in different situations. Take the example from the reference:
"For a cancer screening test, for example, we may be prepared to put up with a relatively high false positive rate in order to get a high true positive, it is most important to identify possible cancer sufferers. For a follow-up test after treatment, however, a different threshold might be more desirable, since we want to minimize false negatives, we don’t want to tell a patient they’re clear if this is not actually the case."
Ideally, the ROC curve would go straight up the Y axis to a true positive rate of 1 and then run horizontally across the top of the plot, giving an area under the curve (AUC) of 1. In other words, we always try to maximize the AUC.
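To make the threshold trade-off and the AUC more concrete, here is a minimal sketch in plain Python. Everything in it is an assumption made for illustration: the ROC points are invented, and the helper names `pick_by_max_fpr`, `pick_by_min_tpr` and `auc_trapezoid` are not from any library. The two selection rules are just two possible ways to encode "we care more about false positives" versus "we care more about catching every positive". The AUC is approximated with the trapezoidal rule.

```python
# Hypothetical ROC points as (threshold, fpr, tpr).
roc = [
    (0.9, 0.00, 0.25),
    (0.8, 0.00, 0.50),
    (0.6, 0.25, 0.75),
    (0.4, 0.50, 1.00),
    (0.2, 1.00, 1.00),
]


def pick_by_max_fpr(points, max_fpr):
    """Among thresholds with FPR <= max_fpr, return the one with the highest TPR."""
    allowed = [p for p in points if p[1] <= max_fpr]
    return max(allowed, key=lambda p: p[2]) if allowed else None


def pick_by_min_tpr(points, min_tpr):
    """Among thresholds with TPR >= min_tpr, return the one with the lowest FPR."""
    allowed = [p for p in points if p[2] >= min_tpr]
    return min(allowed, key=lambda p: p[1]) if allowed else None


def auc_trapezoid(points):
    """Approximate the area under the ROC curve with the trapezoidal rule.

    The end points (0, 0) and (1, 1) are added so the curve spans the
    whole FPR range.
    """
    xy = sorted({(fpr, tpr) for _, fpr, tpr in points} | {(0.0, 0.0), (1.0, 1.0)})
    area = 0.0
    for (x0, y0), (x1, y1) in zip(xy, xy[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Rule 1: false positives are costly, so allow none at all.
print(pick_by_max_fpr(roc, max_fpr=0.0))
# Rule 2: missing a positive is costly, so require every positive to be found.
print(pick_by_min_tpr(roc, min_tpr=1.0))
# Area under this toy curve, and under an ideal curve through (FPR=0, TPR=1).
print(auc_trapezoid(roc))
print(auc_trapezoid([(0.5, 0.0, 1.0)]))
```

On this toy data the first rule selects the threshold 0.8 (no false positives, but only half of the positives found), while the second selects 0.4 (all positives found, at the cost of a 0.5 false positive rate). The ideal curve through the top-left corner has an AUC of 1, and the toy curve comes out lower.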