Techniques for Analyzing ML models

This article sat in my drafts for quite a while. As part of my yearly cleanup, I published it without finishing it, so it may be incomplete or contain other problems.

Techniques for model analysis:

Prediction-Based:

* Decision boundaries
* LIME
* Feature importance
* SHAP values
* Partial Dependence Plots
* Sensitivity analysis / perturbation importance
* Model parameter analysis
* ELI5
* Attention mapping / saliency mapping

Error-Based:

* Confusion matrix

Data-Based:

* Dimensionality reduction
* Feature correlations

If you're interested in the analysis of CNNs, have a look at my master's thesis:

Analysis and Optimization of Convolutional Neural Network Architectures

Decision boundaries

Drawing decision boundaries is only an option if you have three or fewer features, so it is not really useful in most problem settings.
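To make the idea concrete, here is a minimal sketch for the two-feature case. It assumes a hypothetical toy dataset and a simple nearest-centroid classifier (neither comes from a real project), evaluates the classifier on a regular grid, and renders the predicted class per grid cell as text. With a plotting library you would colour the grid cells instead, but the principle is the same.

```python
# Hypothetical toy data: two features, two classes (made up for illustration).
POINTS = {
    0: [(1.0, 1.0), (1.5, 2.0), (2.0, 1.0)],
    1: [(4.0, 4.0), (4.5, 3.0), (5.0, 4.5)],
}


def centroid(points):
    """Mean of a list of 2D points."""
    xs, ys = zip(*points)
    return (sum(xs) / len(xs), sum(ys) / len(ys))


CENTROIDS = {label: centroid(pts) for label, pts in POINTS.items()}


def predict(x, y):
    """Nearest-centroid classifier: label of the closest class centroid."""
    def squared_distance(label):
        cx, cy = CENTROIDS[label]
        return (x - cx) ** 2 + (y - cy) ** 2

    return min(CENTROIDS, key=squared_distance)


def decision_grid(x_range=(0.0, 6.0), y_range=(0.0, 6.0), steps=13):
    """Predict the class for every point of a regular grid.

    The first row corresponds to the largest y value, so the result
    reads like a plot with the origin in the bottom-left corner.
    """
    x_step = (x_range[1] - x_range[0]) / (steps - 1)
    y_step = (y_range[1] - y_range[0]) / (steps - 1)
    rows = []
    for j in range(steps):
        y = y_range[1] - j * y_step
        rows.append([predict(x_range[0] + i * x_step, y) for i in range(steps)])
    return rows


def render(rows):
    """Draw class 0 as '.' and class 1 as '#'."""
    symbols = {0: ".", 1: "#"}
    return "\n".join("".join(symbols[label] for label in row) for row in rows)


if __name__ == "__main__":
    print(render(decision_grid()))
```

The border between the `.` region and the `#` region is the decision boundary. The grid grows exponentially with the number of features, which is one more reason why this only works in very low dimensions.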

SHAP values

SHAP values (short for SHapley Additive exPlanations) are related to feature importance: they attribute a single prediction to the individual features.

Let me explain them using the Titanic dataset as an example: the model predicts a survival probability for a given person, e.g. 76%, and you want to understand why it is 76%.

One thing you can do is vary the features. How does the survival probability change when the person has fewer or more siblings? What about when the person has the median number of siblings?

The shap package calculates SHAP values.
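The idea of swapping feature values in and out can be sketched without any library. The following is a brute-force Shapley value computation, not the shap package's implementation (which uses more efficient estimators). Everything in it is an assumption for illustration: `survival_probability` is a hand-written toy model that was not fitted to the Titanic data, the feature names are made up, and the baseline stands in for a "median" passenger. The toy coefficients are chosen so that the example passenger gets a survival probability of about 0.76.

```python
from itertools import combinations
from math import factorial


def survival_probability(person):
    """Hypothetical hand-written model, NOT fitted to the Titanic data."""
    p = 0.21
    p += 0.40 * person["is_female"]
    p += 0.10 * (3 - person["pclass"])
    # Interaction term: the class bonus is larger for women.
    p += 0.10 * person["is_female"] * (3 - person["pclass"])
    p -= 0.05 * person["siblings"]
    return min(max(p, 0.0), 1.0)


def shapley_values(model, person, baseline):
    """Exact Shapley values by enumerating all feature subsets.

    Features inside a subset take the person's value, all other
    features take the baseline value. The cost grows exponentially
    with the number of features, so this only works for tiny examples.
    """
    names = list(person)
    n = len(names)

    def value(subset):
        mixed = {
            name: person[name] if name in subset else baseline[name]
            for name in names
        }
        return model(mixed)

    result = {}
    for name in names:
        others = [other for other in names if other != name]
        total = 0.0
        for size in range(n):
            # Shapley weight for subsets of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                without = set(subset)
                total += weight * (value(without | {name}) - value(without))
        result[name] = total
    return result


# Assumed "median" passenger used as the reference point.
BASELINE = {"is_female": 0, "pclass": 3, "siblings": 0}
# The passenger whose prediction we want to explain.
PERSON = {"is_female": 1, "pclass": 2, "siblings": 1}

if __name__ == "__main__":
    print("prediction:", survival_probability(PERSON))
    print("baseline:  ", survival_probability(BASELINE))
    for feature, contribution in shapley_values(
        survival_probability, PERSON, BASELINE
    ).items():
        print(f"{feature}: {contribution:+.3f}")
```

The contributions add up to the difference between the person's prediction and the baseline prediction. That additivity is what makes the values readable as "this feature moved the probability up or down by this much".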
