Local Goodness-of-Fit Plots / Wangkardu Explanations – a new DALEX companion

The next DALEX workshop will take place in 4 days at UseR. In the meantime I am working on a new explainer for a single observation.
Something like a diagnostic plot for a single observation. Something that extends Ceteris Paribus Plots. Something similar to Individual Conditional Expectation (ICE) Plots. An experimental version is implemented in ceterisParibus package.
 
Intro

For a single observation, Ceteris Paribus Plots (What-If plots) show how predictions for a model change along a single variable. But they do not tell if the model is well fitted around this observation.

Here is an idea how to fix this:
(1) Take N points from validation dataset, points that are closest to a selected observation (Gower distance is used by default).
(2) Plot N Ceteris Paribus Plots for these points,
(3) Since we know the true y for these points, then we can plot model residuals in these points.
 
Examples

Here we have an example for a random forest model. The validation dataset has 9000 observations. We use N=18 observations closest to the observation of interest to show the model stability and the local goodness-of-fit.


(click to enlarge)

The empty circle in the middle stands for the observation of interest. We may read its surface component (OX axis, around 85 sq meters), and the model prediction (OY axis, around 3260 EUR).
The thick line stands for Ceteris Paribus Plot for the observation of interest.
Grey points stands for 18 closest observations from the validation dataset while grey lines are their Ceteris Paribus Plots. 
Red and blue lines stand for residuals for these neighbours. Corresponding true values of y are marked with red and blue circles. 

Red and blue intervals are short and symmetric so one may say that the model is well fitted around the observation of interest.
Czytaj dalej Local Goodness-of-Fit Plots / Wangkardu Explanations – a new DALEX companion