Explainable Artificial Intelligence (XAI) is an important subfield of AI that focuses on making the decision-making processes of machine learning models understandable to humans. This is especially important for black-box models, such as ensemble or deep learning algorithms, which are too complex even for experts to understand (see also my article on Explainable Artificial Intelligence). A better understanding of such black-box models not only increases user trust but also gives developers hints on how to improve a model.
A popular method of Explainable Artificial Intelligence is SHAP (SHapley Additive ExPlanations). SHAP can be applied to any machine learning model and explains how individual features affect a model’s predictions. This series of articles will give you an easy-to-understand introduction to SHAP and the game-theoretic concept of Shapley values that is central to it. You will also learn how to explain the predictions of black-box models using the Python library SHAP with a practical programming example. The series consists of 3 articles:
- How to better understand black box models with Shapley values?
- The ultimate guide that makes mastering SHAP easy
- A comprehensive Python tutorial for quickly diving into SHAP
This is the first article in the series. In this article, you will learn how the concept of Shapley values from game theory can be used to explain machine learning algorithms. Using a simple example, you will learn how to calculate and interpret Shapley values for machine learning models. By the end of this article, you will have a solid understanding of Shapley values and how they can help you better understand black-box models.
Overview
This blog post is divided into the following sections:
- What are Shapley values?
- How to calculate Shapley values?
- What is the mathematical definition of Shapley values?
What are Shapley values?
Shapley values are a method from game theory for distributing the payoff of a cooperative game among a group of players. Typically, the players do not contribute equally to the payoff of the game. Therefore, it would be unfair to distribute the payoff equally among all players. In 1953, Lloyd S. Shapley proposed Shapley values as a solution to this problem. Using Shapley values, each player can be assigned a value that corresponds to their contribution to the total payoff of the group. But what does all this have to do with Explainable Artificial Intelligence?
Shapley values as an explanation method for machine learning algorithms
Shapley values can also be used to explain the behavior of machine learning models. In particular, Shapley values can be used to determine how different features affect the individual predictions of a model. Shapley values thus reflect the collective contributions of features to a model’s prediction. Analogous to the definition of Shapley values in game theory, the “game” here is the machine learning task of making a prediction for a single instance of the underlying data set. The “payoff” is the difference between the model’s actual prediction for that instance and the expected value of the model’s predictions. This payoff should be fairly distributed across the contributions of each feature. The “players” are the features of the machine learning algorithm that work together to predict a particular value.
The great advantage of Shapley values is that they can be applied to any machine learning model. Moreover, they can be used to explain individual predictions as well as the overall behavior of the model (as described in detail in my article here). It should be noted, however, that Shapley values only describe the behavior of the model and not necessarily causal relationships in the real world. Machine learning algorithms typically learn correlations in the training data and use them to make their predictions. Shapley values can make these learned correlations transparent. However, correlations do not necessarily allow you to infer causal relationships, as described in this article.
In the next section, we will use a simple example to learn how to calculate Shapley values for machine learning models.
How to calculate Shapley values?
To better understand Shapley values, let us first explain how they are calculated with a simple example. Suppose we have trained a regression model that predicts the property value of a castle based on three different features: the square footage of the castle (feature A), the number of rooms (feature B), and the age of the castle (feature C). For Neuschwanstein Castle, a 154-year-old castle with 6000 square feet of floor space and 15 rooms, the model estimates a value of $100 million. In pseudo-mathematical terms, this means that:
\[
\text{model}(\text{Neuschwanstein Castle}) = \text{model}(A = 6000, B = 15, C = 154) = 100,
\]
where the unit of measure is “million dollars”.
Estimating the base value
We now want to calculate the Shapley values for this model prediction to see how the different features affected it. To calculate the Shapley values, we must first estimate a base value. The base value is the expected value of all the model predictions. In practice, we can estimate this expected value by taking the average of all of our model’s predictions for all of the instances in the data set. The base value is a constant and is independent of which model prediction we are computing Shapley values for. The base value depends only on our model and the data set we are using to compute the average model prediction.
To estimate the base value, we must first compute predictions for the property values of all the castles in our data set using our model. We then calculate the average of all these predictions. In our example, the model predicts an average property value of $70 million. The prediction for Neuschwanstein Castle of $100 million is therefore $30 million higher than the average prediction of the model, see Table 1.
Quantity | Value (in millions of dollars) |
---|---|
Model prediction for Neuschwanstein Castle | 100 |
Base value: average model prediction | 70 |
Difference | 30 |
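In code, the base value can be estimated in one line. The following is a minimal sketch, assuming a regression model with a scikit-learn-style `predict()` method; `model` and `X` are hypothetical placeholders for your own trained model and data set:

```python
import numpy as np

def estimate_base_value(model, X):
    """Estimate the base value as the average model prediction over the data set X."""
    return float(np.mean(model.predict(X)))

# For our castle example, this would return 70 (million dollars).
```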
Relationship between Shapley values, base value, and model prediction
Now that we have calculated the base value, we can calculate the Shapley values of the model prediction for Neuschwanstein Castle. To do this, we calculate a Shapley value for each of the three features. The Shapley value of a feature describes the contribution of the feature to the difference between the actual prediction (100) and the base value (70). Consequently, the sum of the Shapley values of all the features is always exactly equal to the difference between the model's actual prediction and the base value. The relationship between the model prediction and the Shapley values of the three features can thus be expressed as follows:
\[
\begin{align}
\text{model}(\text{Neuschwanstein Castle}) - \text{base value} = \ &\text{Shapley}_A(\text{Neuschwanstein Castle}) \\
&+ \text{Shapley}_B(\text{Neuschwanstein Castle}) \\
&+ \text{Shapley}_C(\text{Neuschwanstein Castle})
\end{align}
\]
Rearranging the equation yields the following relationship:
\[
\begin{align}
\text{model}(\text{Neuschwanstein Castle}) = \ &\text{base value} \\
&+ \text{Shapley}_A(\text{Neuschwanstein Castle}) \\
&+ \text{Shapley}_B(\text{Neuschwanstein Castle}) \\
&+ \text{Shapley}_C(\text{Neuschwanstein Castle})
\end{align}
\]
Thus, the model's prediction for Neuschwanstein Castle is the sum of the model's average prediction and the Shapley values of the feature values used to predict the value of Neuschwanstein Castle. In other words, the Shapley values describe how the prediction for Neuschwanstein Castle is built up from the base value through the contributions of the three features.
Calculate the marginal contributions of each feature
The next step is to compute the corresponding Shapley value for each of the three features A, B, and C. The Shapley value of a feature is the weighted average of all the marginal contributions of the feature. To compute this weighted average, we must first consider the power set of our features and compute the marginal contribution of each feature for each subset.
In our example with three features, the power set is the set of all possible combinations of these three features with 0, 1, 2, and 3 elements. In total, we have 8 possible combinations, namely the sets { }, {A}, {B}, {C}, {A, B}, {A, C}, {B, C}, and {A, B, C}. To compute the Shapley values of the three features, we need to retrain our model for each of these feature combinations. In this process, only the features included in the combination are used to train each model, and the remaining features are removed from the training data set. We then use each of these models to predict the property value of Neuschwanstein Castle and calculate the marginal contributions.
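As a small illustration, the power set of the features can be enumerated in Python with `itertools.combinations` (a sketch using the feature names A, B, and C from the example):

```python
from itertools import combinations

features = ["A", "B", "C"]

# All subsets of the feature set, from the empty set to the full set
power_set = [
    frozenset(combo)
    for size in range(len(features) + 1)
    for combo in combinations(features, size)
]

print(len(power_set))  # 8 combinations: { }, {A}, {B}, {C}, {A,B}, {A,C}, {B,C}, {A,B,C}
```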
Model for the combination without features
However, for the combination without features, i.e. { }, it is not possible to train a model, because a machine learning algorithm always needs a training data set with at least one feature to learn from. To get around this problem, the combination { } is assigned a model whose prediction is always constant and equal to the base value. This model therefore predicts a value of $70 million for every castle, including Neuschwanstein Castle.
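Conceptually, this "model" for the empty combination is just a constant predictor. A minimal sketch (the class name `NullModel` is my own, not from any library):

```python
class NullModel:
    """A model that ignores all features and always predicts the base value."""

    def __init__(self, base_value):
        self.base_value = base_value

    def predict(self, X):
        # One constant prediction per instance, regardless of the features
        return [self.base_value] * len(X)

null_model = NullModel(base_value=70)
print(null_model.predict(["Neuschwanstein Castle"]))  # [70]
```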
For the newly trained models of the eight different feature combinations, the predictions for the property value of Neuschwanstein Castle in our example are as follows (see Table 2):
Combination | Prediction for Neuschwanstein Castle (in millions of dollars) |
---|---|
{ } | 70 |
{A} | 80 |
{B} | 30 |
{C} | 90 |
{A, B} | 85 |
{A, C} | 80 |
{B, C} | 65 |
{A, B, C} | 100 |
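To make the following calculations reproducible in code, here are the predictions from Table 2 as a small Python dictionary, keyed by feature combination (the variable name `predictions` is my own choice):

```python
# Predictions (in millions of dollars) from Table 2, keyed by feature combination
predictions = {
    frozenset():                 70,   # { }: the base value
    frozenset({"A"}):            80,
    frozenset({"B"}):            30,
    frozenset({"C"}):            90,
    frozenset({"A", "B"}):       85,
    frozenset({"A", "C"}):       80,
    frozenset({"B", "C"}):       65,
    frozenset({"A", "B", "C"}): 100,
}
```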
Calculate the marginal contributions of feature A
We now show how to calculate the Shapley value for feature A. To do so, we need to determine, for every feature combination that includes feature A, how the prediction for Neuschwanstein Castle changes when feature A is removed from that combination. To obtain the marginal contribution of feature A to a feature combination, we therefore compute the predicted value of the castle using the two respective models for the feature combination with and without feature A. The marginal contribution of feature A to the corresponding feature combination is then the difference between these two predictions. The calculation of the marginal contributions for features B and C is analogous, see Figure 1.

Calculate the marginal contribution of feature A to the combination {A, B}
To illustrate, we now compute this for a combination of features. For example, to compute the marginal contribution of feature A to the combination {A, B}, we need to subtract the prediction of the model trained on feature B only from the prediction of the model trained on both features A and B. The marginal contribution of feature A to the combination {A, B} is thus
\[
\text{model}_{\{A, B\}}(\text{Neuschwanstein Castle}) - \text{model}_{\{B\}}(\text{Neuschwanstein Castle}) = 85 - 30 = 55.
\]
For the other feature combinations, the calculation of the marginal contributions of feature A is analogous. Table 3 shows the calculation of all marginal contributions of feature A.
Combination with A | Prediction with A | Combination without A | Prediction without A | Marginal contribution of A | Weight |
---|---|---|---|---|---|
{A} | model_{A}(x_{A}) = 80 | { } | model_{∅}(x_{∅}) = 70 | 80 - 70 = 10 | 1/3 |
{A, B} | model_{A,B}(x_{A,B}) = 85 | {B} | model_{B}(x_{B}) = 30 | 85 - 30 = 55 | 1/6 |
{A, C} | model_{A,C}(x_{A,C}) = 80 | {C} | model_{C}(x_{C}) = 90 | 80 - 90 = -10 | 1/6 |
{A, B, C} | model_{A,B,C}(x_{A,B,C}) = 100 | {B, C} | model_{B,C}(x_{B,C}) = 65 | 100 - 65 = 35 | 1/3 |
Calculate the Shapley value for feature A
The Shapley value for feature A can now be calculated from these marginal contributions. It is the weighted average of the marginal contributions of feature A to the different combinations of features. The weight of a marginal contribution depends on the number of elements in the combination: for a feature combination with k elements, the weight of feature A's marginal contribution is one divided by the product of the total number of features and the number of k-element combinations that contain A.
For example, consider the marginal contribution of feature A to the combination {A, B}. The combination {A, B} consists of two elements, the two features A and B. In total, there are two feature combinations consisting of two elements that contain feature A. These are the combinations {A, B} and {A, C}. Since we have a total of three features (A, B, and C), the weight for the marginal contributions of feature A to each of these two combinations is 1/(2*3) = 1/6. Similarly, the weights for the marginal contributions of A to the remaining feature combinations can be calculated. These weights are also shown in Table 3.
To calculate the Shapley value of feature A for Neuschwanstein Castle, we simply multiply the marginal contributions from Table 3 by their respective weights and add them together:
\[
\text{Shapley}_A = \frac{1}{3} \cdot 10 + \frac{1}{6} \cdot 55 + \frac{1}{6} \cdot (-10) + \frac{1}{3} \cdot 35 = 22.5.
\]
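The same weighted average can be reproduced in a few lines of Python. This sketch repeats the predictions dictionary from above so it is self-contained:

```python
from itertools import combinations
from math import comb

predictions = {
    frozenset(): 70, frozenset({"A"}): 80, frozenset({"B"}): 30, frozenset({"C"}): 90,
    frozenset({"A", "B"}): 85, frozenset({"A", "C"}): 80,
    frozenset({"B", "C"}): 65, frozenset({"A", "B", "C"}): 100,
}

features = ["A", "B", "C"]
n = len(features)

shapley_a = 0.0
for size in range(1, n + 1):
    for combo in combinations(features, size):
        S = frozenset(combo)
        if "A" not in S:
            continue
        marginal = predictions[S] - predictions[S - {"A"}]  # marginal contribution of A
        weight = 1 / (n * comb(n - 1, len(S) - 1))          # 1/3, 1/6, 1/6, 1/3
        shapley_a += weight * marginal

print(shapley_a)  # 22.5
```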
Shapley values of all features
Similarly, we can calculate the Shapley values for features B and C. We then obtain the Shapley values for features A, B, and C, as shown in Table 4.
Feature A | Feature B | Feature C | Total |
---|---|---|---|
22.5 | -10 | 17.5 | 30 |
In the model prediction for Neuschwanstein Castle, feature A made a positive contribution of $22.5 million, feature B made a negative contribution of $10 million, and feature C made a positive contribution of $17.5 million. Based on these Shapley values, we can see that both the area (feature A) and the age (feature C) of Neuschwanstein Castle contributed to an above-average predicted property value. The Shapley value for the number of rooms (feature B), however, is negative and thus reduces the predicted property value of the castle.
The sum of the contributions of the three features is $30 million, which is exactly the difference between the model prediction for Neuschwanstein Castle ($100 million) and the base value ($70 million). This is an important property of Shapley values that is not unique to this simple example: in general, the sum of the Shapley values of all features is always exactly equal to the difference between the model prediction and the base value.
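Written out for our example, this additivity property reads:

\[
\text{Shapley}_A + \text{Shapley}_B + \text{Shapley}_C = 22.5 + (-10) + 17.5 = 30 = 100 - 70.
\]

The mathematical definition of Shapley values is discussed in more detail in the next section.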
What is the mathematical definition of Shapley values?
In theory, Shapley values can be computed for any number of features, not just three. Shapley values help us understand how different features affect a model’s prediction. As we saw in the previous example, the Shapley value of a feature is the weighted average of all the marginal contributions of the feature to the different combinations of features. Thus, Shapley values take into account that the influence of a feature on the model prediction also depends on the other features. Mathematically, the calculation of the Shapley value of a feature i for a model prediction of an instance x can be expressed by the following formula:
\[
\text{Shapley}_i(x) = \sum_{S \in N_i} \frac{1}{n} \binom{n-1}{\vert S \vert - 1}^{-1} \left( \text{model}_S(x_S) - \text{model}_{S \setminus \{i\}}(x_{S \setminus \{i\}}) \right).
\]
Here, n is the number of features, N_i is the set of all feature combinations that contain feature i, S is a feature combination, and S \ {i} is the combination S without feature i. model_S(x_S) is the prediction for instance x of the model trained only on the features contained in S, where x_S denotes the instance x restricted to the feature values that are also contained in the feature combination S.
The marginal contribution of a feature i to a feature combination S is given by
\[
\text{model}_S(x_S) - \text{model}_{S \setminus \{i\}}(x_{S \setminus \{i\}}).
\]
The other terms in the formula are just the weights of the different marginal contributions of the feature.
The definition of Shapley values in simple terms
Don’t be confused by this formula. It is simply a mathematical generalization of the example above. It can be used to compute Shapley values for models with any number of features. In simple terms, this formula means nothing more than
\[
\text{Shapley}_i(x) =
\sum_{\text{combinations with feature } i}
\small
\frac{\text{marginal contribution of feature } i \text{ to the combination}}{(\text{number of features}) \cdot (\text{number of combinations of this size containing feature } i)}.
\]
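To make the formula concrete, here is a sketch of it as a general Python function. It assumes you can supply a prediction for every feature combination, for example via the `predictions` dictionary from the castle example; in practice, each combination would require its own retrained model. The function name and signature are my own, not from any library:

```python
from itertools import combinations
from math import comb

def shapley_value(feature, features, predict):
    """Shapley value of `feature` for one instance.

    `predict` maps a frozenset of features to the model prediction
    for that instance (the empty set maps to the base value).
    """
    n = len(features)
    others = [f for f in features if f != feature]
    value = 0.0
    for size in range(n):  # size of the combination without `feature`
        for combo in combinations(others, size):
            S = frozenset(combo) | {feature}
            weight = 1 / (n * comb(n - 1, len(S) - 1))
            value += weight * (predict(S) - predict(S - {feature}))
    return value

# Usage with the castle example (predictions dictionary from above):
# shapley_value("A", ["A", "B", "C"], predictions.get)  # 22.5
# The three Shapley values sum to 30, the difference from the base value.
```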
Disadvantages of Shapley values
A major drawback of Shapley values is that their exact computation quickly becomes intractable. The computational cost grows exponentially with the number of features, since 2^n models must be trained for n features. While only 32 models need to be trained for 5 features, 1,024 models are needed for 10 features. Now imagine how many models are required if the features represent an image of 28×28 pixels:
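\[
2^{28 \times 28} = 2^{784} \approx 10^{236} \text{ models}.
\]

This is extremely resource- and time-intensive and often not feasible in practice.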
Fortunately, there are several approximation methods to quickly compute Shapley values even for large numbers of features and instances. Probably the best known approximation method is SHAP, which will be described in detail in the next blog post.
Further reading
In this section you will find additional literature to help you learn more about Shapley values.
Scientific publications
- Shapley, Lloyd S. “17. A value for n-person games.” Contributions to the Theory of Games (AM-28), Volume II. Princeton University Press, 2016. 307-318.
- Lundberg, Scott M., and Su-In Lee. “A unified approach to interpreting model predictions.” Advances in Neural Information Processing Systems (2017).
Summary
In this blog post, you learned how to use Shapley values to explain how machine learning models work.
Specifically, you learned:
- Shapley values describe the contributions of different features to a machine learning model’s prediction relative to a base value.
- The base value is the expected value of all model predictions.
- A feature’s Shapley value is calculated as the weighted average of the feature’s marginal contributions.
Do you have any questions?
Feel free to leave them in the comments below and I will do my best to answer them.
P.S.: Of course I also appreciate constructive feedback on this blog post 😊

Hi, my name is René Heinrich. I am a data scientist doing my PhD in the area of trustworthy artificial intelligence. On this blog I share my experiences and everything I have learned on my own knowledge journey.