dimnames.shapviz() has received a replacement method. You can thus change the column names of the SHAP matrix and the feature data (as well as SHAP interactions) via colnames(x) <- ..., see https://github.com/ModelOriented/shapviz/issues/98
package_version() applied to numeric value will be deprecated in the future)
sv_dependence2D(): x and y coordinates are two features, while their summed SHAP values are shown on the color scale. If interactions = TRUE, SHAP interaction values are shown on the color scale instead. The function is vectorized in x and/or y. This visualization is especially useful for models with geographic components.
split(x, f) splits a “shapviz” object x into a “mshapviz” object.
fastshap::explain() offers the option shap_only. To conveniently construct the “shapviz” object, use shapviz(fastshap::explain(..., shap_only = FALSE)). This not only passes the SHAP matrix but also the feature data and the baseline. Thanks, Brandon Greenwell!
Sometimes, you will find it necessary to work with several “shapviz” objects at the same time.
To simplify the workflow, {shapviz} introduces the “mshapviz” object (“m” like “multi”). You can create it in different ways:
shapviz() on multiclass XGBoost or LightGBM models.
shapviz() on “kernelshap” objects created from multiclass/multioutput models.
c(Mod_1 = s1, Mod_2 = s2, ...) on “shapviz” objects s1, s2, …
mshapviz(list(Mod_1 = s1, Mod_2 = s2, ...))
The sv_*() functions use the {patchwork} package to glue the individual plots together.
See the new vignette for more info and specific examples.
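As a rough illustration of the c() and mshapviz(list(...)) routes listed above, the following sketch builds two toy “shapviz” objects from a made-up SHAP matrix and combines them. The data, the object names s1 and s2, and the labels Mod_1 and Mod_2 are purely illustrative, and the code assumes a {shapviz} version that provides “mshapviz”; it is skipped if the package is not installed.

```r
# Sketch: two ways of combining "shapviz" objects into one "mshapviz" object.
# All data below are made up for illustration.
if (requireNamespace("shapviz", quietly = TRUE)) {
  library(shapviz)

  # Toy SHAP matrix and matching feature data (3 rows, 2 features)
  S <- matrix(
    c(1, -1, 0.5, 2, 0.3, -0.2),
    nrow = 3,
    dimnames = list(NULL, c("a", "b"))
  )
  X <- data.frame(a = c(1, 2, 3), b = c(0.1, 0.2, 0.3))

  s1 <- shapviz(S, X = X, baseline = 4)      # stands in for model 1
  s2 <- shapviz(2 * S, X = X, baseline = 4)  # stands in for model 2

  # Route 1: concatenate with c(); route 2: wrap a named list
  mx_c <- c(Mod_1 = s1, Mod_2 = s2)
  mx_list <- mshapviz(list(Mod_1 = s1, Mod_2 = s2))

  print(names(mx_c))
  print(class(mx_list))
}
```

Either object could then be handed to one of the sv_*() functions, which glue the per-model plots together via {patchwork}.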
sv_dependence() now allows multiple v and/or color_var to be plotted (glued via {patchwork}).
row_id of sv_waterfall() and sv_force() now also allows a vector of integers or a logical vector. If more than one row is selected, SHAP values and predictions are averaged before plotting (aggregated SHAP values in {DALEX}).
x1, x2 can now be concatenated in rowwise manner using x1 + x2 or rbind(x1, x2), again thanks to Adrian.
colnames(): “shapviz” objects x have received a dimnames() function, so you can now, e.g., use colnames(x) to see the feature names.
x can now be subsetted using x[cond, features].
sv_dependence(), sv_importance(kind="bee"), and sv_interaction().
sv_dependence() has been shortened to “SHAP interaction”.
show_other of sv_importance() has been removed.
S_inter.
print.shapviz() is much more compact, use summary.shapviz() for more info.
sv_waterfall(): Using order_fun() would not work as expected with max_display. This has been fixed.
sv_dependence(): Passing viridis_args = NULL would hide the color guide title. This has been fixed. But please pass viridis_args = list() instead.
sv_dependence() now uses color_var = "auto" instead of color_var = NULL.
sv_dependence() now uses “SHAP value” as y label (instead of the more verbose “SHAP value of [feature]”).
S_inter (3D array):
shapviz(object, ..., S_inter = NULL)
shapviz(object, ..., interactions = TRUE)
shapviz(object, ...)
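To make the 3D array mentioned above more concrete, here is a small base R sketch of what such an interaction array might look like. It assumes the usual layout of SHAP interaction values (observations x features x features, main effects on the diagonal, and interactions of a feature summing to its SHAP value); the numbers and feature names are invented.

```r
# Toy SHAP interaction array for n = 2 observations and p = 2 features.
# Assumed layout: S_inter[i, j, k] is the interaction of features j and k
# for observation i, with main effects on the diagonal (j == k).
feats <- c("x1", "x2")
S_inter <- array(
  0,
  dim = c(2, 2, 2),
  dimnames = list(NULL, feats, feats)
)
S_inter[1, , ] <- matrix(c(0.5, 0.1, 0.1, -0.3), nrow = 2)
S_inter[2, , ] <- matrix(c(-0.2, 0.05, 0.05, 0.4), nrow = 2)

# Summing over the third dimension recovers one SHAP value per feature
S <- apply(S_inter, 1:2, sum)
print(S)
```

Under this assumed layout, S together with matching feature data and S_inter could be passed to the matrix interface via the S_inter argument shown above.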
sv_interaction(x) shows a matrix of beeswarm plots.
sv_dependence(x, v = "x1", color_var = "x2", interactions = TRUE) plots SHAP interaction values.
sv_dependence(x, v = "x1", interactions = TRUE) plots pure main effects of “x1”.
sv_dependence(..., color_var = "auto") uses those to determine the most interacting color variable.
collapse_shap() also works for SHAP interaction arrays.
get_shap_interactions().
sv_importance(): In case of too many features, sv_importance() used to collapse the remaining features into an additional bar/beeswarm. This logic has been removed, and the show_other argument has been deprecated.
sv_dependence() automatically adds horizontal jitter for discrete v. This now also works if v is numeric with at most seven unique values, not only for logicals, factors, and character v.
sv_importance() does not use a flipped coordinate system anymore.
sv_importance() has received a new argument show_others = TRUE. Set to FALSE to hide the “other” bar/beeswarm.
The following dependencies have been removed:
sv_importance()
bee_width: Relative width of the beeswarms. The default is 0.4. It replaces the width argument formerly passed through the ... arguments.
bee_adjust: Relative adjustment factor of the bandwidth used in estimating the density of the beeswarms. Default is 0.5.
... arguments are now passed to geom_point().
plotly::ggplotly() now works for most functionalities of sv_importance(), including beeswarms.
X of the constructor of shapviz() is now less picky. If it contains columns not present in the SHAP matrix, they are silently dropped. Furthermore, the column order of the SHAP matrix and X is now determined by the SHAP matrix.
shapviz_from_lgb_predict() and shapviz_from_xgb_predict()
format_fun argument in sv_force() and sv_waterfall()
sort_fun argument in sv_waterfall()
collapse_shap() is no longer an S3 method. It is just a normal function that can be applied to a matrix.
sv_importance() would return an error.
X_pred from matrix to xgb.DMatrix in shapviz.xgb.Booster().
treeshap() example to a ranger() model.
collapse argument in shapviz(). This is a named list specifying which columns in the SHAP matrix are to be collapsed by rowwise summation. A typical application will be to combine the SHAP values of one-hot-encoded dummies and explain them by the corresponding factor variable.
sv_importance(), see next section.
sv_importance()
The calculations behind sv_importance()
are unchanged,
but defaults and some plot aspects have been reworked.
sv_importance() now shows a bar plot by default. Use kind = "beeswarm" to get a beeswarm plot.
sv_importance() does not show SHAP feature importances as text anymore. Use show_numbers = TRUE to get them back. Furthermore, the numbers are now printed on top of the bars instead of at their bottom.
show_numbers can be used to add SHAP feature importance values for all plot types.
max_display has been increased from 10 to 15.
bar_width.
color_bar_title. Set to NULL to remove the color bar altogether.
format_fun now uses a right-aligned number formatter with aligned decimal separator by default.
dim() method for “shapviz” object, implying nrow() and ncol()
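The rowwise summation behind the collapse argument described above can be sketched in base R. The helper collapse_sketch() and the toy matrix are hypothetical and only mimic the idea; this is not the package's implementation.

```r
# Toy SHAP matrix with one numeric feature and a one-hot-encoded factor "color"
S <- cbind(
  age = c(0.2, -0.1, 0.4),
  color_red = c(0.3, 0.0, -0.2),
  color_blue = c(-0.1, 0.5, 0.1)
)

# Named list: new column name -> columns to be summed row-wise
collapse <- list(color = c("color_red", "color_blue"))

# Hypothetical helper mimicking the idea; not the package's implementation
collapse_sketch <- function(S, collapse) {
  keep <- setdiff(colnames(S), unlist(collapse))
  summed <- vapply(
    collapse,
    function(cols) rowSums(S[, cols, drop = FALSE]),
    FUN.VALUE = numeric(nrow(S))
  )
  cbind(S[, keep, drop = FALSE], summed)
}

S_collapsed <- collapse_sketch(S, collapse)
print(S_collapsed)
```

Because the dummies are summed, each row's total SHAP contribution stays the same. In the package itself, such a named list would be passed as shapviz(..., collapse = ...), with the feature data holding the original factor column.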
format_fun argument of sv_waterfall() and sv_force() has been replaced by format_shap to format SHAP values and format_feat to format numeric feature values. By default, they use the new global options “shapviz.format_shap” and “shapviz.format_feat”, both with default function(z) prettyNum(z, digits = 3, scientific = FALSE).
sv_waterfall() now uses the more consistent argument order_fun = function(s) order(abs(s)) instead of the original sort_fun = function(shap) abs(shap) that was then passed to order().
viridis_args = getOption("shapviz.viridis_args") to sv_dependence() and sv_importance() to control the viridis color scale options. The default global option equals list(begin = 0.25, end = 0.85, option = "inferno"). For example, to switch to a standard viridis scale, you can either change the default with options(shapviz.viridis_args = NULL) or set viridis_args = NULL.
shapviz_from_lgb_predict() and shapviz_from_xgb_predict() in favour of the collapsing logic (see above). The functions will be removed in version 0.3.0.
predict() arguments of LightGBM (data -> newdata, predcontrib = TRUE -> type = "contrib").
This is the initial CRAN release.
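The switch from sort_fun to order_fun noted above for sv_waterfall() can be checked with a few lines of base R. The SHAP values below are made up; the two functions and the formatter are copied from the defaults quoted in the notes.

```r
s <- c(a = 0.5, b = -2, c = 0.1, d = 1.2)  # made-up SHAP values of one row

# Old style: sort_fun returned sort keys, which were then passed to order()
sort_fun <- function(shap) abs(shap)
ord_old <- order(sort_fun(s))

# New style: order_fun returns the ordering itself
order_fun <- function(s) order(abs(s))
ord_new <- order_fun(s)

# Features from smallest to largest absolute SHAP value
print(names(s)[ord_new])

# Default formatter quoted for the format_shap / format_feat options
fmt <- function(z) prettyNum(z, digits = 3, scientific = FALSE)
print(fmt(s))
```

Both styles give the same ordering here; the newer argument simply hands the full ordering decision to the user-supplied function.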