In this vignette we present plots for evaluating classification models.
We work with the titanic dataset from the DALEX package.
# drop rows with missing values
titanic <- na.omit(DALEX::titanic)
# recode the target as numeric 0/1
titanic$survived <- as.numeric(titanic$survived) - 1
head(titanic)
##   gender age class    embarked       country  fare sibsp parch survived
## 1   male  42   3rd Southampton United States  7.11     0     0        0
## 2   male  13   3rd Southampton United States 20.05     0     2        0
## 3   male  16   3rd Southampton United States 20.05     1     1        0
## 4 female  39   3rd Southampton       England 20.05     1     1        1
## 5 female  16   3rd Southampton        Norway  7.13     0     0        1
## 6   male  25   3rd Southampton United States  7.13     0     0        1
We fit two models: a logistic regression (glm) and a support vector machine (svm).
model_glm <- glm(survived ~ ., data = titanic, family = binomial)
library(e1071)
model_svm <- svm(survived ~ ., data = titanic)
The first step is to create an explainer object with the DALEX package. An explainer wraps a model together with its metadata, so that the model can be audited.
exp_glm <- DALEX::explain(model_glm, data = titanic, y = titanic$survived)
exp_svm <- DALEX::explain(model_svm, data = titanic, y = titanic$survived, label = "svm")
The second step is to create an auditor_model_evaluation object, which can then be used to validate a model.
library(auditor)
eva_glm <- model_evaluation(exp_glm)
eva_svm <- model_evaluation(exp_svm)
The receiver operating characteristic (ROC) curve is a tool for visualising a classifier’s performance. It answers the question of how well the model discriminates between the two classes. The boundary between the classes is determined by a threshold value, and the ROC curve illustrates the performance of a classification model at various threshold settings.
The diagonal line y = x corresponds to a classifier that guesses the class at random. Any model that appears in the lower right part of the plot performs worse than random guessing. The closer the curve is to the left and top borders of the plot, the more accurate the classifier is.
plot(eva_glm, eva_svm, type = "roc")
# or
# plot_roc(eva_glm, eva_svm)
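To make the role of the threshold concrete, here is a minimal base R sketch of how the points of a ROC curve can be computed by hand. It is not the auditor implementation; the toy vectors `y` and `score` and the helper `roc_points()` are hypothetical names introduced only for illustration.

```r
# Hypothetical toy data: true labels (1 = positive) and predicted scores.
y     <- c(1, 1, 0, 1, 0, 0)
score <- c(0.9, 0.8, 0.7, 0.6, 0.3, 0.1)

# Hypothetical helper: for each threshold t, an observation is predicted
# positive when score >= t; we then record the false positive rate (FPR)
# and the true positive rate (TPR) at that threshold.
roc_points <- function(y, score) {
  # Inf gives the starting point (0, 0), where nothing is predicted positive
  thresholds <- c(Inf, sort(unique(score), decreasing = TRUE))
  tpr <- sapply(thresholds, function(t) sum(score >= t & y == 1) / sum(y == 1))
  fpr <- sapply(thresholds, function(t) sum(score >= t & y == 0) / sum(y == 0))
  data.frame(threshold = thresholds, fpr = fpr, tpr = tpr)
}

roc <- roc_points(y, score)
print(roc)
```

Plotting `roc$tpr` against `roc$fpr` and adding the diagonal with `abline(0, 1, lty = 2)` would give a hand-made version of the chart above for the toy data.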
The LIFT chart shows the relationship between the rate of positive prediction (RPP) and the number of true positives (TP) as the threshold t varies.
The chart illustrates how the performance of the model varies across thresholds. Random and ideal models are represented by dashed curves (lower and upper, respectively). The closer the LIFT curve gets to the upper dashed curve (ideal model), the better the model is.
plot(eva_glm, eva_svm, type = "lift")
# or
# plot_lift(eva_glm, eva_svm)
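The quantities behind the LIFT chart can be sketched in base R as follows. This is an illustration under stated assumptions, not the auditor implementation: the toy data and the helper `lift_points()` are hypothetical, and the `ideal` and `random` columns use one common way of defining such reference curves, which may differ in detail from what the package draws.

```r
# Hypothetical toy data: true labels (1 = positive) and predicted scores.
y     <- c(1, 1, 0, 1, 0, 0)
score <- c(0.9, 0.8, 0.7, 0.6, 0.3, 0.1)

# Hypothetical helper: for each threshold t, compute
#   rpp = share of observations predicted positive (score >= t)
#   tp  = number of true positives among them
lift_points <- function(y, score) {
  thresholds <- sort(unique(score), decreasing = TRUE)
  rpp <- sapply(thresholds, function(t) mean(score >= t))
  tp  <- sapply(thresholds, function(t) sum(score >= t & y == 1))
  data.frame(threshold = thresholds, rpp = rpp, tp = tp)
}

lift <- lift_points(y, score)

# Assumed reference curves:
# an ideal model ranks every positive first, a random model finds
# positives in proportion to the share of observations it selects.
lift$ideal  <- pmin(lift$rpp * length(y), sum(y))
lift$random <- lift$rpp * sum(y)
print(lift)
```

On the toy data, the `tp` column stays between the `random` and `ideal` columns for most thresholds, which is the pattern the chart shows for a model that is better than chance but not perfect.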
Other methods and plots are described in vignettes: