{effectplots} is an R package for calculating and plotting feature effects of any model.
The main function feature_effects() crunches the following statistics per feature X over values/bins:

- Average observed: the mean response y per bin, a descriptive measure of association.
- Average predicted: the mean prediction per bin, reflecting the combined effect of X and correlated features.
- Partial dependence: the effect of X with all other features held fixed.
- Accumulated local effects (ALE): an alternative to partial dependence.

Furthermore, it calculates counts, average residuals, and standard deviations of observed y and residuals, optionally accounting for case weights. We highly recommend Christoph Molnar’s book [3] for more information on feature effects.
On a laptop, it takes about two seconds to compute all statistics for ten features of a 10 million row dataset (plus prediction time).
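As a hedged illustration (a sketch, not the package’s documented output format): feature_effects() returns a named list with one set of per-bin statistics per feature. Column names such as “pd” and “y_mean” are assumptions based on the statistics listed above.

# Hedged sketch: inspect the result object on a toy model.
# Column names like "pd" and "y_mean" are assumptions, not documented API.
library(effectplots)

fit <- lm(Sepal.Length ~ ., data = iris)
fx <- feature_effects(fit, v = c("Sepal.Width", "Species"), data = iris, y = iris$Sepal.Length)

str(fx, max.level = 1)  # one element per feature
fx$Species              # per-level statistics for a categorical feature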
Workflow
1. Crunch: feature_effects() or one of the little helpers average_observed(), partial_dependence(), etc.
2. Update: update() combines rare levels of categorical features, sorts results by importance, etc.
3. Plot: plot() lets you choose between ggplot2/patchwork and plotly.
Installation

You can install the development version of {effectplots} from GitHub with:
# install.packages("pak")
pak::pak("mayer79/effectplots", dependencies = TRUE)
Usage

We use a 1 million row dataset on motor third-party liability (TPL) insurance. The aim is to model claim frequency. Before modeling, we want to study the association between features and the response.
library(effectplots)
library(OpenML)
library(lightgbm)
set.seed(1)
df <- getOMLDataSet(data.id = 45106L)$data
xvars <- c("year", "town", "driver_age", "car_weight", "car_power", "car_age")
# 0.1s on laptop
average_observed(df[xvars], y = df$claim_nb) |>
  plot(share_y = "all")
A shared y axis helps to compare the strength of the association across features.
Next, let’s fit a boosted trees model.
ix <- sample(nrow(df), 0.8 * nrow(df))
train <- df[ix, ]
test <- df[-ix, ]
X_train <- data.matrix(train[xvars])
X_test <- data.matrix(test[xvars])
# Training, using slightly optimized parameters found via cross-validation
params <- list(
  learning_rate = 0.05,
  objective = "poisson",
  num_leaves = 7,
  min_data_in_leaf = 50,
  min_sum_hessian_in_leaf = 0.001,
  colsample_bynode = 0.8,
  bagging_fraction = 0.8,
  lambda_l1 = 3,
  lambda_l2 = 5,
  num_threads = 7
)

fit <- lgb.train(
  params = params,
  data = lgb.Dataset(X_train, label = train$claim_nb),
  nrounds = 300
)
Let’s crunch all statistics on the test data. Sorting is done by the weighted variance of the partial dependence, a main-effect importance measure closely related to [4]. In the resulting plots, the average predictions closely follow the average observed values, i.e., the model does a good job. Comparing partial dependence/ALE with the average predictions shows whether an effect comes from the feature on the x axis or from other, correlated features.
# 0.3s on laptop
feature_effects(fit, v = xvars, data = X_test, y = test$claim_nb) |>
  update(sort_by = "pd") |>
  plot()
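To make the sorting criterion concrete, here is a hedged sketch (not the package’s internal code) that approximates the importance as the count-weighted variance of the partial dependence values per feature. The column names “pd” and “N” are assumptions about the crunched data.frames.

# Hedged sketch of the sorting criterion: weighted variance of partial
# dependence per feature ("pd" and "N" column names are assumptions).
effects <- feature_effects(fit, v = xvars, data = X_test, y = test$claim_nb)
imp <- sapply(effects, function(e) {
  mu <- weighted.mean(e$pd, w = e$N)
  weighted.mean((e$pd - mu)^2, w = e$N)
})
sort(imp, decreasing = TRUE)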
What about combining training and test results? Or comparing different models or subgroups? No problem:
m_train <- feature_effects(fit, v = xvars, data = X_train, y = train$claim_nb)
m_test <- feature_effects(fit, v = xvars, data = X_test, y = test$claim_nb)

# Pick top 3 based on train
m_train <- m_train |>
  update(sort_by = "pd") |>
  head(3)
m_test <- m_test[names(m_train)]
# Concatenate train and test results and plot them
c(m_train, m_test) |>
  plot(
    share_y = "rows",
    ncol = 2,
    byrow = FALSE,
    stats = c("y_mean", "pred_mean"),
    subplot_titles = FALSE,
    title = "Left: Train - Right: Test"
  )
In case we want to dig deeper into bias, we can use “resid_mean” as the statistic and show pointwise 95% confidence intervals for the true bias.
c(m_train, m_test) |>
  update(drop_below_n = 50) |>
  plot(
    ylim = c(-0.06, 0.09),
    ncol = 2,
    byrow = FALSE,
    stats = "resid_mean",
    subplot_titles = FALSE,
    title = "Left: Train - Right: Test",
    interval = "ci"
  )
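For intuition, such a pointwise 95% interval for the per-bin mean residual can be derived from the crunched statistics via the normal approximation. A hedged sketch, assuming columns resid_mean, resid_sd, and N exist in each element:

# Hedged sketch, not the package's implementation: per-bin 95% CI for the
# mean residual (column names resid_mean, resid_sd, N are assumptions).
ci <- lapply(m_test, function(e) {
  half <- 1.96 * e$resid_sd / sqrt(e$N)
  data.frame(lower = e$resid_mean - half, upper = e$resid_mean + half)
})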
Most models work out of the box. If not, a tailored prediction function can be specified.
library(DALEX)
library(ranger)

set.seed(1)

fit <- ranger(Sepal.Length ~ ., data = iris)
ex <- DALEX::explain(fit, data = iris[, -1], y = iris[, 1])

feature_effects(ex, breaks = 5) |>
  plot(share_y = "all")
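And a hedged sketch of the second route, passing a tailored prediction function directly. The pred_fun argument and its function(model, data) signature are assumptions about the API, illustrated on the same ranger model:

# Hedged sketch: pred_fun and its (model, data) signature are assumptions.
# ranger's predict() returns a list whose $predictions holds the numbers.
fit_rf <- ranger(Sepal.Length ~ ., data = iris)

feature_effects(
  fit_rf,
  v = colnames(iris)[-1],
  data = iris,
  y = iris$Sepal.Length,
  pred_fun = function(m, X) predict(m, X)$predictions
) |>
  plot(share_y = "all")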