corrgrapher

Overview

When exploring data or models we often examine variables one by one. This analysis is incomplete if the relationship between these variables is not taken into account. The corrgrapher package facilitates simultaneous exploration of the Partial Dependence Profiles and the correlation between variables in the model.

The package corrgrapher is a part of the DrWhy.AI universe.

The solution - fewer numbers, more insights

This package plots the correlations between variables in the form of a graph. Each node is associated with a single variable. Strongly correlated variables (positively or negatively) are placed close together, and weakly correlated variables are placed far apart.

This layout is achieved through a physical simulation in which the nodes are treated as points with mass that push each other away, and the edges are treated as massless springs. The length of a spring depends on the absolute value of the correlation between the connected nodes: the stronger the correlation, the shorter the spring.
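To make the spring analogy concrete, here is a minimal sketch in base R that turns absolute correlations into target spring lengths. The helper `spring_length()` and its linear mapping are assumptions made for illustration only; the formula corrgrapher actually uses may differ.

```r
# Illustrative sketch only: the linear mapping below is an assumption,
# not necessarily the formula used inside corrgrapher.
df <- as.data.frame(datasets::Seatbelts)
cors <- cor(df)  # Pearson correlation matrix of all columns

# Hypothetical helper: |r| = 1 gives the shortest spring, |r| = 0 the longest
spring_length <- function(r, min_len = 0.1, max_len = 1) {
  max_len - (max_len - min_len) * abs(r)
}

spring_len <- spring_length(cors)

# Collect each unordered pair of variables once (upper triangle only)
idx <- which(upper.tri(spring_len), arr.ind = TRUE)
pairs <- data.frame(
  var1 = rownames(spring_len)[idx[, "row"]],
  var2 = colnames(spring_len)[idx[, "col"]],
  correlation = cors[idx],
  spring = spring_len[idx]
)

# Shortest springs first, i.e. the most strongly correlated pairs
pairs <- pairs[order(pairs$spring), ]
head(pairs)
```

Under this assumed mapping, pairs at the top of the table would be drawn close together in the graph, regardless of the sign of their correlation.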

When you click a node in the graph, you can view the distribution or the Partial Dependence Profile of the selected variable.

Installation

The easiest way to get corrgrapher is to install it from CRAN:

install.packages("corrgrapher")

Or install the development version from GitHub:

devtools::install_github("ModelOriented/corrgrapher")

Examples

First, load the package:

library('corrgrapher')

For data sets

For data frames, corrgrapher shows a correlation network and a histogram or distribution plot for each feature.

df <- as.data.frame(datasets::Seatbelts)
cgr <- corrgrapher(df)
cgr

For models

For models, corrgrapher shows Partial Dependence Profiles. Use the DALEX::explain() function to create an adapter for any predictive model.

library(DALEX)
library(ranger)

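# Fit a random forest classifier on the Titanic data shipped with DALEX
titanic_rgr <- ranger(survived ~ ., data = titanic_imputed, classification = TRUE)
# Wrap the model in a DALEX explainer; `data` holds the predictors only, without the target
titanic_features <- titanic_imputed[, colnames(titanic_imputed) != "survived"]
titanic_exp <- explain(titanic_rgr, data = titanic_features, y = titanic_imputed$survived, verbose = FALSE)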
cgr <- corrgrapher(titanic_exp)
cgr
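Because the explainer is model-agnostic, the same workflow should apply to other model classes. The sketch below swaps the random forest for a logistic regression fitted with base R's glm(); the object names `titanic_glm`, `titanic_glm_exp`, and `cgr_glm` are made up for this example, and it assumes DALEX and corrgrapher are installed.

```r
library(DALEX)
library(corrgrapher)

# Logistic regression on the same Titanic data (shipped with DALEX)
titanic_glm <- glm(survived ~ ., data = titanic_imputed, family = binomial)

# Predictors only; the target is passed separately through `y`
titanic_features <- titanic_imputed[, colnames(titanic_imputed) != "survived"]

titanic_glm_exp <- explain(titanic_glm,
                           data = titanic_features,
                           y = titanic_imputed$survived,
                           verbose = FALSE)

# Same call as for the ranger model; only the explainer changes
cgr_glm <- corrgrapher(titanic_glm_exp)
cgr_glm
```

Only the model-fitting line differs from the ranger example, which is the point of going through DALEX::explain() rather than passing the model directly.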

Acknowledgments

Work on this package was financially supported by the Polish National Science Centre under Opus Grant number 2017/27/B/ST6/0130.