
Techniques for Analyzing ML models

Contents

  • Decision boundaries
  • SHAP values
This is an article I had as a draft for quite a while. As part of my yearly cleanup, I've published it without finishing it. It might be incomplete or have other problems.

Techniques for model analysis:

Prediction-Based:

  • Decision boundaries
  • LIME
  • Feature importance
  • SHAP values
  • Partial Dependence Plots
  • Sensitivity analysis / perturbation importance
  • Model parameter analysis
  • ELI5
  • Attention mapping / saliency mapping

Error-Based:

  • Confusion matrix

Data-Based:

  • Dimensionality reduction
  • Feature correlations

If you're interested in the analysis of CNNs, have a look at my master's thesis:

Analysis and Optimization of Convolutional Neural Network Architectures

Decision boundaries

Drawing the decision boundary is only an option if you have three or fewer features, so it is not really useful in most problem settings.
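
Here is a minimal sketch of how you could draw one, assuming a toy two-feature dataset and scikit-learn (the dataset and classifier here are illustrative choices): evaluate the classifier on a dense grid and color each grid point by its predicted class.

```python
import matplotlib.pyplot as plt
import numpy as np
from sklearn.datasets import make_moons
from sklearn.svm import SVC

# Toy two-feature dataset, so the boundary can actually be drawn
X, y = make_moons(noise=0.2, random_state=0)
clf = SVC(kernel="rbf").fit(X, y)

# Evaluate the classifier on a dense grid covering the feature space
xx, yy = np.meshgrid(
    np.linspace(X[:, 0].min() - 0.5, X[:, 0].max() + 0.5, 200),
    np.linspace(X[:, 1].min() - 0.5, X[:, 1].max() + 0.5, 200),
)
Z = clf.predict(np.c_[xx.ravel(), yy.ravel()]).reshape(xx.shape)

# Color the regions by predicted class and overlay the training points
plt.contourf(xx, yy, Z, alpha=0.3)
plt.scatter(X[:, 0], X[:, 1], c=y, edgecolors="k")
plt.show()
```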

SHAP values

SHAP values (an acronym for SHapley Additive exPlanations) go in the direction of feature importance: they quantify how much each feature contributed to a single prediction.

Let me explain them with an example from the Titanic dataset: a given person has a survival probability of, say, 76%, and you want to understand why it is 76%.

So what you can do is twiddle the features: How does the survival probability change when the person has fewer or more siblings? When the person has the median number of siblings?
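
A minimal sketch of this twiddling, assuming a model trained with scikit-learn on the copy of the Titanic data that ships with seaborn (the feature selection and the random forest are my illustrative choices, not from the original analysis):

```python
import seaborn as sns
from sklearn.ensemble import RandomForestClassifier

# Load the Titanic dataset (seaborn bundles a copy) and keep numeric features
df = sns.load_dataset("titanic").dropna(subset=["age"])
features = ["pclass", "age", "sibsp", "fare"]
X, y = df[features], df["survived"]
model = RandomForestClassifier(random_state=0).fit(X, y)

# Take one passenger and vary the number of siblings/spouses aboard (sibsp)
passenger = X.iloc[[0]].copy()
print("baseline:", model.predict_proba(passenger)[0, 1])
for sibsp in range(5):
    passenger["sibsp"] = sibsp
    p = model.predict_proba(passenger)[0, 1]
    print(f"sibsp={sibsp}: survival probability {p:.2f}")
```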

There is the shap package for calculating SHAP values.
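
A minimal usage sketch, assuming a tree-based model; shap bundles example datasets and provides a TreeExplainer for tree ensembles (the XGBoost model here is an illustrative choice):

```python
import shap
import xgboost

# Train a model on the adult census dataset that ships with shap
X, y = shap.datasets.adult()
model = xgboost.XGBClassifier().fit(X, y.astype(int))

# TreeExplainer computes SHAP values for tree ensembles
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X)

# For one prediction: how much each feature pushed the output up or down
shap.force_plot(explainer.expected_value, shap_values[0], X.iloc[0], matplotlib=True)

# Global view: distribution of SHAP values per feature
shap.summary_plot(shap_values, X)
```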

Published

Dec 30, 2018
by Martin Thoma

Category

Machine Learning

Tags

  • Machine Learning
