ROC Curve and AUC — Explained


What they mean and when they are useful

Photo by Markus Spiske on Unsplash

ROC (receiver operating characteristic) curve and AUC (area under the curve) are performance measures that provide a comprehensive evaluation of classification models.

The ROC curve summarizes the performance of a model by combining confusion matrices at all threshold values. AUC turns the ROC curve into a single numeric measure of binary classifier performance: it is the area under the ROC curve and takes a value between 0 and 1. AUC indicates how well a model separates the positive and negative classes.

Before going into detail, let’s first explain the confusion matrix and how different threshold values change its outcome.

A confusion matrix is not an evaluation metric by itself, but it provides insight into the predictions. It goes deeper than classification accuracy by showing the correct and incorrect (i.e. true and false) predictions for each class. In the case of a binary classification task, the confusion matrix is a 2×2 matrix. If there are three different classes, it is a 3×3 matrix, and so on.

Confusion matrix of a binary classification (Image by author)

Let’s assume class A is the positive class and class B is the negative class. The key terms of the confusion matrix are as follows (a short code sketch after the list shows how to compute them):

  • True positive (TP): predicting the positive class as positive (correct)
  • False positive (FP): predicting the negative class as positive (incorrect)
  • False negative (FN): predicting the positive class as negative (incorrect)
  • True negative (TN): predicting the negative class as negative (correct)
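To make these terms concrete, here is a minimal sketch using scikit-learn’s `confusion_matrix`; the `y_true` and `y_pred` labels below are made up purely for illustration:

```python
from sklearn.metrics import confusion_matrix

y_true = [1, 0, 1, 1, 0, 0, 1, 0]  # actual classes: 1 = positive (A), 0 = negative (B)
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]  # hypothetical model predictions

# For binary labels, scikit-learn lays the matrix out as [[TN, FP], [FN, TP]]
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
print(f"TP={tp}, FP={fp}, FN={fn}, TN={tn}")  # TP=3, FP=1, FN=1, TN=3
```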

Algorithms like logistic regression return probabilities rather than discrete class labels. We set a threshold value on these probabilities to distinguish between the positive and negative classes. Depending on the threshold value, the predicted class of some observations may change.
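As a quick illustration (the probability values below are hypothetical), the same predicted probabilities yield different class predictions at different thresholds:

```python
import numpy as np

probs = np.array([0.1, 0.4, 0.35, 0.8, 0.65])  # predicted probabilities of the positive class

# Observations with probability >= threshold are predicted as positive
print((probs >= 0.5).astype(int))  # [0 0 0 1 1]
print((probs >= 0.3).astype(int))  # [0 1 1 1 1]
```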

How threshold value can change the predicted class (Image by author)

As we can see in the image above, adjusting the threshold value changes the predictions and thus results in a different confusion matrix. When the elements of the confusion matrix change, precision and recall change as well.

Precision and recall take classification accuracy one step further and give a more detailed picture of model performance.

The focus of precision is the positive predictions: it indicates how many of the predicted positives are actually positive.

The focus of recall is the actual positive class: it indicates how many of the actual positives the model predicts correctly.
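In terms of the confusion matrix elements, the two metrics are defined as:

$$\text{Precision} = \frac{TP}{TP + FP} \qquad\qquad \text{Recall} = \frac{TP}{TP + FN}$$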

Note: We cannot maximize both precision and recall at the same time because there is a trade-off between them: increasing precision decreases recall and vice versa. We can aim to maximize one or the other depending on the task. For an email spam detection model, we try to maximize precision because we want to be correct when an email is flagged as spam; we do not want to label a normal email as spam (a false positive). For a tumor detection task, on the other hand, we need to maximize recall because we want to detect as many positive cases as possible.
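Both metrics are easy to compute with scikit-learn; a minimal sketch reusing the made-up labels from the confusion matrix example above:

```python
from sklearn.metrics import precision_score, recall_score

y_true = [1, 0, 1, 1, 0, 0, 1, 0]  # actual classes
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]  # hypothetical model predictions

print(precision_score(y_true, y_pred))  # TP / (TP + FP) = 3 / 4 = 0.75
print(recall_score(y_true, y_pred))     # TP / (TP + FN) = 3 / 4 = 0.75
```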

What the ROC curve does is provide a summary of model performance by combining the confusion matrices at all threshold values.

ROC curve (image source)

The ROC curve has two axes, both of which take values between 0 and 1. The y-axis is the true positive rate (TPR), also known as sensitivity. It is the same as recall and measures the proportion of the positive class that is correctly predicted as positive. The x-axis is the false positive rate (FPR), which is equal to 1 − specificity. Specificity is analogous to sensitivity but focuses on the negative class: it measures the proportion of the negative class that is correctly predicted as negative.

Sensitivity vs Specificity (Image by author)
TPR and FPR in terms of confusion matrix elements (Image by author)
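Written out with the confusion matrix elements, the two axes are:

$$TPR = \text{Sensitivity} = \frac{TP}{TP + FN} \qquad\qquad FPR = 1 - \text{Specificity} = \frac{FP}{FP + TN}$$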

If the threshold is set to 0, the model predicts all samples as positive, so we only have true positives and false positives. In this case, both TPR and FPR are 1. If the threshold is set to 1, there are no positive predictions; TP and FP are 0, so TPR and FPR are both 0. Hence, setting the threshold to 0 or 1 is not a good choice.

We aim to increase the true positive rate (TPR) while keeping the false positive rate (FPR) low. As the ROC curve shows, when TPR increases, FPR also increases, so it comes down to deciding how many false positives we can tolerate.

The ROC curve gives us an overview of model performance at different threshold values. AUC is the area under the ROC curve between (0,0) and (1,1), which can be calculated using integral calculus. AUC aggregates the performance of the model over all threshold values. The best possible AUC is 1, which indicates a perfect classifier; AUC is 0 if all the predictions are wrong.
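In practice we do not compute this integral by hand; libraries do it numerically. Here is a minimal sketch with scikit-learn’s `roc_curve` and `roc_auc_score` (the labels and scores below are made up for illustration):

```python
from sklearn.metrics import roc_curve, roc_auc_score

y_true = [0, 0, 1, 1]            # actual classes
y_score = [0.1, 0.4, 0.35, 0.8]  # predicted probabilities of the positive class

fpr, tpr, thresholds = roc_curve(y_true, y_score)  # points of the ROC curve
print(roc_auc_score(y_true, y_score))              # 0.75
```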

Note: AUC does not depend on the classification threshold value. Changing the threshold does not change AUC because it is an aggregate measure over the entire ROC curve.

AUC for two different classifiers (image source)

The figure above shows the ROC curves of classifiers A and B. A is clearly the better classifier: its AUC is higher, and for the same FPR values A achieves a higher TPR. Similarly, for the same TPR values, A has a lower FPR.

AUC is classification-threshold invariant, and for this very reason it is not the optimal evaluation metric for certain tasks. For instance, in email spam detection we do not want any false positives, and in tumor detection we cannot afford any false negatives. In such cases, we can tune the model to our needs by adjusting the classification threshold. Since AUC is not affected by the threshold value, it is not a good metric choice here; precision or recall should be used as the evaluation metric instead.

