What’s new in DALEX v 0.4.9?

A few days ago a new version of DALEX (v 0.4.9) was accepted by CRAN. Here you will find a short overview of what was added or changed.

DALEX is an R package with methods for visual explanation and exploration of predictive models.
Here you will find a short overview with examples based on Titanic data.
For real-world use cases:
Here you will find a conference talk related to credit scoring based on FICO data.
Here you will find an example use case for insurance pricing.

Major changes in the last version

Verbose model wrapping

The explain() function is now more verbose. During model wrapping it checks the consistency of its arguments, which works like unit tests for a model. This way the most common problems with model exploration can be identified early.

Support for new families of models

We have added more convenient support for gbm models. The ntrees argument for predict_function is guessed from the model structure.
Support for mlr, scikit-learn, h2o and mljar was moved to DALEXtra in order to limit the number of dependencies.

Integration with other packages

DALEX now has better integration with the auditor package. DALEX explainers can be used with any function from the auditor package, so you can now easily create an ROC plot or a LIFT chart, or perform an analysis of residuals. This also gives access to a large number of loss functions.

Richer explainers

Explainers now have new elements. They store information about the packages that were used for model development, along with their versions.
The latest version of explainers also stores sampling weights for the data argument.

A bit of philosophy

Cross-comparison of models is tricky because predictive models may have very different structures and interfaces. DALEX is based on the idea of a universal adapter that can transform any model into a wrapper with a unified interface that can be digested by any model-agnostic tool.

In this Medium article you will find a longer overview of this philosophy.
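The adapter idea can be sketched in a few lines of base R. This is only an illustration under stated assumptions: make_adapter() and rmse() are hypothetical helpers invented for this example, not the DALEX API, and the checks inside make_adapter() merely imitate the kind of consistency checks a verbose wrapper might run.

```r
# Hypothetical adapter: wraps a fitted model into a list with a unified interface.
# This is NOT DALEX::explain(), only an illustration of the idea.
make_adapter <- function(model, data, y, predict_function, label) {
  # Consistency checks, in the spirit of a verbose wrapper
  stopifnot(is.data.frame(data), nrow(data) == length(y),
            is.function(predict_function))
  preds <- predict_function(model, data)
  stopifnot(is.numeric(preds), length(preds) == nrow(data))
  list(model = model, data = data, y = y,
       predict_function = predict_function, label = label)
}

# A model-agnostic tool: it only talks to the adapter, never to the model class
rmse <- function(adapter) {
  preds <- adapter$predict_function(adapter$model, adapter$data)
  sqrt(mean((adapter$y - preds)^2))
}

X <- mtcars[, c("wt", "hp")]
y <- mtcars$mpg

# Two models with different predict() conventions
fit_lm  <- lm(mpg ~ wt + hp, data = mtcars)
fit_glm <- glm(mpg ~ wt + hp, data = mtcars, family = Gamma(link = "log"))

adapters <- list(
  make_adapter(fit_lm, X, y,
               function(m, d) as.numeric(predict(m, d)), "lm"),
  make_adapter(fit_glm, X, y,
               function(m, d) as.numeric(predict(m, d, type = "response")),
               "glm gamma")
)

# The same tool is applied to both models through the unified interface
scores <- sapply(adapters, rmse)
names(scores) <- sapply(adapters, function(a) a$label)
print(scores)
```

The point of the design is that rmse() never needs to know which modelling package produced the model; everything model-specific lives in predict_function.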

Tell Me a Story: How to Generate Textual Explanations for Predictive Models

TL;DR: If you are going to explain predictions of a black-box model, you should combine statistical charts with natural language descriptions. This combination is more powerful than SHAP/LIME/PDP/Break Down charts alone. This summer Adam Izdebski implemented this feature for explanations generated in R with the DALEX library. How did he do it? Find out here:

Long version:
Amazing things were created during summer internships at MI2DataLab this year. One of them is the generator of natural language descriptions for DALEX explainers developed by Adam Izdebski.

Is text better than charts for explanations?
Packages from the DrWhy.AI toolbox generate lots of graphical explanations for predictive models. The available statistical charts help to better understand how a model works in general (global perspective) or for a specific prediction (local perspective).
Yet for domain experts without training in mathematics or computer science, graphical explanations may be insufficient. Charts are great for exploration and discovery, but for explanations they introduce some ambiguity. Have I read everything? Maybe I missed something?
To address this problem we introduced the describe() function, which automatically generates textual explanations for predictive models. Right now these natural language descriptions are implemented in the R packages ingredients and iBreakDown.

Insufficient interpretability
Domain experts without formal training in mathematics or computer science often find statistical explanations hard to interpret. There are various reasons for this. First of all, explanations are often displayed as complex plots without instructions, and often there is no clear narration or interpretation. Plots use different scales, mixing probabilities with relative changes in the model’s prediction. The order of variables may also be misleading. See for example a Break-Down plot.

The figure displays the prediction, generated with a random forest, that a selected passenger survived the sinking of the Titanic. The model’s average response on the Titanic data set (the intercept) is equal to 0.324. The model predicts that the selected passenger survived with probability 0.639. The figure also displays all the variables that contributed to that prediction. Once the plot is described it is easy to interpret, as it has a very clear graphical layout.
However, interpreting it for the first time may be tricky.

Properties of a good description
Effective communication and argumentation is a difficult craft. For that reason, we use strategies for winning debates as a guide for generating persuasive textual explanations. First of all, any description should be intelligible and persuasive. We achieve this by using:

  • Fixed structure: Effective communication requires a rigid structure. Thus we generate descriptions from a fixed template that always includes a proper introduction, argumentation and conclusion. This makes the description more predictable, hence more intelligible.
  • Situation recognition: In order to make a description more trustworthy, we begin generating the text by identifying which scenario we are dealing with. Currently, the following scenarios are available:
    • The model prediction is significantly higher than the average model prediction. In this case, the description should convince the reader why the prediction is higher than the average.
    • The model prediction is significantly lower than the average model prediction. In this case, the description should convince the reader why the prediction is lower than the average.
    • The model prediction is close to the average. In this case the description should convince the reader that either the variables contradict each other or the variables are insignificant.

Identifying what should be justified is a crucial step in generating persuasive descriptions.

Description template for persuasive argumentation
As noted before, to achieve clarity we generate descriptions with three separate components: an introduction, an argumentation part, and a summary.

An introduction should provide a claim, that is, the basic point that an arguer wishes to make. In our case, it is the model’s prediction. Displaying additional information about the distribution of predictions helps to place it in context: is it low, high or close to the average?

An argumentation part should provide evidence and a reason that connects the evidence to the claim. In a normal setting it works like this: this particular passenger survived the catastrophe (claim) because they were a child (evidence no. 1) and children were evacuated from the ship first, as in the phrase women and children first (reason no. 1). What is more, the child was traveling in the 1st class (evidence no. 2) and first-class passengers had the best cabins, which were close to the rescue boats (reason no. 2).

The tricky part is that we are not able to make up a reason automatically, as it is a matter of context and interpretation. What we can do, however, is highlight the main evidence that made the model produce the claim. If a model makes its predictions for the right reasons, the evidence should make sense and it should be easy for the reader to build a story connecting the evidence to the claim. If the model displays evidence that makes little sense, that is a clear signal that the model may not be trustworthy.

A summary is just the rest of the justification. It states that the other pieces of evidence are of less importance and thus may be omitted. A good rule of thumb is to display the three most important pieces of evidence, so as not to make the picture too complex. This scheme corresponds to what guides to winning debates call creating relational arguments.
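The three-part template and the situation recognition step can be sketched in base R as follows. This is a toy illustration, not the implementation used in iBreakDown: describe_prediction(), its threshold argument and the per-variable contributions are all hypothetical (the contributions are made-up numbers chosen to add up to the difference between 0.639 and 0.324 quoted above).

```r
# Hypothetical sketch of the template logic; not the iBreakDown implementation.
describe_prediction <- function(prediction, average, contributions,
                                threshold = 0.1, n_evidence = 3) {
  # Situation recognition: decide what has to be justified
  scenario <- if (prediction - average > threshold) {
    "higher than"
  } else if (average - prediction > threshold) {
    "lower than"
  } else {
    "close to"
  }

  # Introduction: the claim, placed in the context of the average response
  introduction <- sprintf(
    "The model predicts %.3f, which is %s the average model response of %.3f.",
    prediction, scenario, average)

  # Argumentation: the strongest pieces of evidence, by absolute contribution
  ord <- order(abs(contributions), decreasing = TRUE)
  top <- contributions[ord][seq_len(min(n_evidence, length(contributions)))]
  argumentation <- paste0(
    "The most important variables are: ",
    paste(sprintf("%s (%+.3f)", names(top), top), collapse = ", "), ".")

  # Summary: everything else is declared less important
  summary_part <- "Other variables are of less importance."

  paste(introduction, argumentation, summary_part)
}

# Made-up contributions; they sum to 0.639 - 0.324 = 0.315
contributions <- c(gender = 0.20, age = 0.15, class = 0.05,
                   fare = -0.06, embarked = -0.025)
text <- describe_prediction(0.639, 0.324, contributions)
cat(text, "\n")
```

Note that the sketch only lists evidence; as discussed above, the reasons connecting evidence to the claim are left to the reader.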

The logic described above is implemented in the ingredients and iBreakDown packages.

To generate a description, pass the explanation generated by ceteris_paribus(), break_down() or shap() to the describe() function.

There are various parameters that control the display of the description making it more flexible, thus suited for more applications. They include:

  • generating a short version of descriptions,
  • displaying predictions’ distribution details,
  • generating more detailed argumentation.

While explanations generated by iBreakDown are feature attribution explanations that aim at providing interpretable reasons for the model’s prediction, explanations generated by ingredients are rather speculative. In short, they explain how the model’s prediction would change if we perturbed the instance being explained. For example, a ceteris_paribus() explanation explores how the prediction would change if we changed the values of a single feature while keeping the other features unchanged.
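The idea behind a ceteris paribus profile can be shown without any extra packages. The sketch below is an assumption-laden illustration, not the ingredients implementation: ceteris_paribus_profile() is a hypothetical helper, and a linear model on the built-in mtcars data stands in for an arbitrary predictive model.

```r
# A simple model on a built-in data set (a stand-in for any predictive model)
model <- lm(mpg ~ wt + hp, data = mtcars)

# Hypothetical helper: vary one variable of a single observation over a grid
# while all other variables keep their original values
ceteris_paribus_profile <- function(model, observation, variable, grid) {
  # Repeat the observation once per grid value, then overwrite one column
  profile <- observation[rep(1, length(grid)), , drop = FALSE]
  profile[[variable]] <- grid
  profile$prediction <- as.numeric(predict(model, newdata = profile))
  rownames(profile) <- NULL
  profile[, c(variable, "prediction")]
}

observation <- mtcars["Mazda RX4", c("wt", "hp")]
grid <- seq(min(mtcars$wt), max(mtcars$wt), length.out = 5)

profile <- ceteris_paribus_profile(model, observation, "wt", grid)
print(profile)
```

Each row answers a what-if question for the same car: what would the model predict if only its weight were different?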

Applications and future work

Generating natural language explanations is a sensitive task, as interpretability always depends on the end user’s cognition. For this reason, experiments should be designed to assess the usefulness of the generated descriptions. Furthermore, more vocabulary flexibility could be added to make the descriptions more human-like. Descriptions could also be integrated with a chatbot that would explain predictions interactively, using the framework described here. Finally, better discretization techniques could be used to generate better textual explanations for continuous ceteris paribus and aggregated profiles.

dime: Deep Interactive Model Explanations

Hubert Baniecki created an awesome package, dime, for serverless interactive HTML model exploration. The experimental version is on GitHub, and here is the pkgdown website. It is a part of the DrWhy.AI project.

How does it work?

With the DALEX package you can create local and global model explanations for machine learning models. Each explanation can be visualized with a generic plot() function.
Hubert created a generic plotD3() function which turns each explanation into an interactive D3 plot (with the help of the r2d3 package). With the dime package you can combine a few interactive explanations into a single dashboard. And since the dashboard is serverless, you can host it on GitHub or anywhere else.

For example, the gif below shows how to combine a break down plot (local feature attribution) with ceteris paribus profiles (detailed analysis of a single variable). You can click a variable of interest to activate an appropriate ceteris paribus profile (click to play).

With the dime package you can combine any number of interactive widgets into a single dashboard. You can connect local and global explanations or EDA tools like histograms or barplots.

It’s very easy to generate such a website. Just create an explainer and call the modelStudio() function.

Find examples and R code here: https://github.com/ModelOriented/dime/blob/master/README.md

The dime package is still in the experimental phase. Your feedback is welcome. Feel free to submit an issue with comments or ideas.

Learn about XAI in R with “Predictive Models: Explore, Explain, and Debug”

XAI (eXplainable artificial intelligence) is a fast-growing and super interesting area.
Working with complex models generates lots of problems with model validation (performance is great on test data but drops in production), model bias, lack of stability and many others. We need more than just local explanations for predictive models.

The more complex the models, the better the tools needed to understand how they work, explore model behaviour and debug potential errors.

Two years ago I started work on the DALEX package, a library packed with functions for local and global model exploration.
Over time the package went through a few architectural changes, and now it is part of a larger universe of tools for model exploration developed at MI2DataLab with increasing support from external contributors (join us).

To explain our philosophy behind model exploration, we (together with Tomasz Burzykowski from Hasselt) started a book, “Predictive Models: Explore, Explain, and Debug”.

The first part, devoted to local exploration, is ready to read. It explains how to use DALEX with the iBreakDown and ingredients packages for instance-level explanations.
Later we will describe other packages from our universe.

Find the bookdown version here.

Find a one-page-cheatsheet here.

Help us improve these descriptions by submitting pull requests or issues at the GitHub repo.
One day there will be a paper version 😉

How many points do you need to get into a secondary school in Warsaw?

In this Polityka article I read that over 3 thousand students did not get into any of their chosen secondary schools in Warsaw, despite the schools' efforts to admit as many students as possible.

Marcin Luckner (MiNI PW) sent me an interesting analysis of admission thresholds for different classes in Warsaw. Below I include selected charts with minor changes. The data come from edukacja.warszawa.pl. Along the way we will also be able to compare several ways of showing distributions.

The data contain information on how many points were needed to get into a given class in a given secondary school, broken down by school type and by whether the applicants came from primary schools or from gimnazjum (lower secondary) schools.
The chart below (a histogram) shows the admission thresholds in different types of classes. Sports schools are not included, because they awarded additional points for physical fitness, which makes their thresholds hard to compare.

Various media reported on a student who had 190 points and did not get into any of the chosen schools. But there were also schools with much lower admission thresholds. Very many classes had admission thresholds around 160 points.

Years ago John Tukey proposed describing distributions with five numbers: the minimum, the maximum, the median and the two quartiles. These five numbers split the values into four equally sized groups. They can be shown with box plots.
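In base R the five-number summary is available as fivenum(). The thresholds below are made-up numbers used only for illustration; they are not the Warsaw data. Note that fivenum() returns Tukey's hinges, which can differ slightly from the quartiles computed by quantile().

```r
# Hypothetical admission thresholds (points); NOT the real Warsaw data
thresholds <- c(120, 135, 142, 150, 155, 160, 162, 165, 171, 178, 190)

# Tukey's five-number summary: minimum, lower hinge, median, upper hinge, maximum
five <- fivenum(thresholds)
names(five) <- c("min", "lower hinge", "median", "upper hinge", "max")
print(five)

# boxplot(thresholds) would draw the same five numbers as a box plot
```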

Below are box plots of the distribution of admission thresholds by district. The wider the box, the more schools there are in the group. The highest thresholds were in schools in Śródmieście (more than half of the classes had an admission threshold above 165 points). It was easier to get into secondary schools in Praga or Ursus.

It turns out that both Marcin's and my favourite technique for showing distributions is the empirical cumulative distribution function. The chart below shows what percentage of classes have an admission threshold lower than x.

For example, the grey line corresponds to a threshold of 150 points. That many points were enough to get into practically all integration classes, but only into about 60% of general classes (1 in 3 general classes has a higher admission threshold) and into about 33% of classes in bilingual schools (2 in 3 classes in bilingual schools have a higher admission threshold). It is not enough for schools with the International Baccalaureate.
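Reading such a chart amounts to evaluating the empirical cumulative distribution function at a given number of points. A minimal sketch in base R follows; the thresholds are invented so that the results mirror the percentages quoted above, and they are not the real data. Note that ecdf() counts values less than or equal to x.

```r
# Hypothetical thresholds for two types of classes; NOT the real data
general   <- c(118, 125, 133, 140, 146, 150, 155, 161, 167, 172)
bilingual <- c(140, 148, 156, 160, 165, 170)

# ecdf() returns a function: the share of values less than or equal to x
F_general   <- ecdf(general)
F_bilingual <- ecdf(bilingual)

# Share of classes that a student with 150 points could get into
share_general   <- F_general(150)
share_bilingual <- F_bilingual(150)
print(c(general = share_general, bilingual = share_bilingual))

# plot(F_general) would draw the step function shown on such charts
```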

So what is your favourite way of showing distributions?

modelDown is now on CRAN!

The modelDown package turns classification or regression models into static HTML websites.
With one command you can convert one or more models into a website with visual and tabular model summaries, such as model performance, feature importance, single feature response profiles and basic model audits.

modelDown uses DALEX explainers, so it is model agnostic (feel free to combine random forest with glm) and easy to extend and parameterise.

Here you can browse an example website automatically created for 4 classification models (random forest, gradient boosting, support vector machines, k-nearest neighbours). The R code behind this example is here.

Fun facts:

– archivist hooks are generated for every documented object. So you can easily extract R objects from the HTML website. Try


– Session info is automatically recorded, so you can check the versions of the packages available at model development (https://github.com/MI2DataLab/modelDown_example/blob/master/docs/session_info/session_info.txt).

– This package was initially created by Magda Tatarynowicz, Kamil Romaszko and Mateusz Urbański from Warsaw University of Technology as a student project.

xaibot – conversations with predictive models!

If you could talk to a predictive machine learning model, what would you ask?

Try it! Michał Kuźba is developing a mind-blowing project: an xai chatbot, a dialog-based system that helps to explore and understand predictive models through natural language conversations (type, speak or phone the model 😉 ).

For example, imagine that you have a random forest model that predicts survival for Titanic data. With xai-bot you can chat about your chances of survival, variables that influence survival, options that you have to increase your odds or just chat about life models.

The chatbot is based on Google's Dialogflow infrastructure. It communicates with DALEX explainers written in R through a plumber REST API.

Find the chatbot here: https://kmichael08.github.io.

The project is under development, but the bot is already pretty smart.

So, have fun!

How to design a model visualisation @ Gdansk satRdays

I had an amazing weekend in Gdansk thanks to the satRday conference organized by Olgun Aydin, Ania Rybinska and Michal Maj.

Together with Hanna Piotrowska we gave a talk, “Machine learning meets design. Design meets machine learning”. Hanna redesigned the DALEX visualisations (DALEX is a set of tools for visual explanation of predictive ML models). During the talk she explained what was changed and why.

See for example the metamorphosis of the Break Down explainer. How many differences can you spot?

Every change (axis, reading order, spacing, colors, descriptions, background, annotations) serves some purpose.

Find our presentation at slideshare.

List of satRday talks (machine learning was quite popular).

Hanna's design is implemented in ggplot2 thanks to Tomasz Mikołajczyk and in D3 thanks to Hubert Baniecki! Find more examples of how to use the new plots here.

Make it explainable!

Most people make the mistake of thinking design is what it looks like… People think it’s this veneer — that the designers are handed this box and told, ‚Make it look good!’ That’s not what we think design is. It’s not just what it looks like and feels like. Design is how it works.

Steve Jobs, The New York Times, 2003.

The same goes for interpretable machine learning.
Recently I have been talking a lot about interpretations and explainability. And sometimes I get the impression that techniques like SHAP, Break Down, LIME and SAFE are treated like magical incantations that convert complex predictive models into “something interpretable”.

But interpretability/explainability is not a binary feature that you either have or not. It’s a process. The goal is to increase our understanding of the model's behavior. Try different techniques to broaden your knowledge about the model or about its predictions.
Maybe you will never explain 100%, but you will understand more.

XAI/IML (eXplainable Artificial Intelligence/Interpretable Machine Learning) techniques can be used not only for post-hoc explainability, but also for model maintenance, debugging or in early phases of crisp modeling. Visual tools like PDP/ALE/CeterisParibus will change the way we approach modeling and how we interact with models, whether as model developers, model auditors or users.

Together with Tomasz Burzykowski from UHasselt we are working on a book about the methodology for visual exploration, explanation and debugging of predictive models.

Find the early version here https://pbiecek.github.io/PM_VEE/.

There are a lot of R snippets that show how to use DALEX (and sometimes other packages like shapper, ingredients, iml, iBreakDown, condvis, localModel, pdp) to better understand some aspects of your predictive model.

It’s a work in progress and still in an early, rough phase (even though we started a year ago).
Feel free to comment on it or suggest improvements. The easiest way to do this is to add a new issue.

Code snippets are fully reproducible thanks to archivist hooks. I think that it’s the first book that uses archivist hooks for a blended experience. You can read about a model online and in just one line of code you can download an object to your R console.

First chapters show how to use Ceteris Paribus Profiles / Individual Conditional Expectations to perform what-if/sensitivity analysis of a model.

DALEX for keras and parsnip

DALEX is a set of tools for explanation, exploration and debugging of predictive models. The nice thing about it is that it can be easily connected to different model factories.

Recently Michal Maj wrote a nice vignette on how to use DALEX with models created in keras (an open-source neural-network library in Python with an R interface created by RStudio). Find the vignette here.
Michal compared a keras model against deeplearning from the h2o package, so you can check which model won on the Titanic dataset.

The next nice vignette was created by Szymon Maksymiuk. In this vignette Szymon shows how to use DALEX with parsnip models (parsnip is a part of the tidymodels ecosystem, created by Max Kuhn and Davis Vaughan). Models like boost_tree, mlp and svm_rbf compete on the Titanic data.

These two new vignettes add to our collection of examples of how to use DALEX with mlr, caret, h2o and other model factories.