An easy introduction to Explainable Artificial Intelligence

Artificial Intelligence (AI) is one of the most important disruptive technologies of this century. AI is now delivering significant economic and societal value across many industries. AI-based systems have become sophisticated enough to automate many tasks and minimize the need for human intervention. From virtual assistants like Siri and Alexa to movie recommendations on Netflix and chatbots like ChatGPT, the possibilities for AI seem endless, and these systems are now having a noticeable impact on society. However, these AI systems lack transparency, which is why Explainable AI has gained traction in many areas in recent years.

A growing number of companies and organizations are deploying AI in critical infrastructures such as healthcare, finance, transportation, and energy systems. The decisions made by AI systems in these high-risk areas affect people’s lives. It is therefore essential that AI in these sensitive applications performs flawlessly at all times. To that end, humans must be able to understand how AI systems make decisions and the reasons behind those decisions. It would be extremely careless and dangerous to entrust important decisions to a system that cannot explain the reasons for its decisions.

This is where Explainable Artificial Intelligence (XAI) comes in. XAI is an important subfield of artificial intelligence that focuses on making the decision-making process of machine learning algorithms understandable to humans.

In this article, you’ll learn the basics of Explainable AI, why it’s important, and the types of explanation methods that exist.

Overview

This blog post is divided into the following sections:

  • What is Explainable Artificial Intelligence?
  • Why is Explainable Artificial Intelligence important?
  • Who are the target audiences for Explainable Artificial Intelligence?
  • Categorization of explanation methods

What is Explainable Artificial Intelligence?

The field of Explainable Artificial Intelligence (XAI) is concerned with the development of AI systems that provide details or reasons to make their operation clear and easy to understand for a specific audience. Explainable Artificial Intelligence therefore encompasses various techniques that lead to more explainable AI models. These AI models are capable of providing explanations for their decisions.

Explainable AI also draws on insights from the social sciences and takes into account the psychology of explanations. This is important so that the AI system is able to provide the best and most helpful explanations that help humans understand why the AI system made certain decisions. At the same time, Explainable AI aims to create AI systems that are as powerful as black-box models. Thus, the added explainability of an AI system should not come at the expense of its performance.

The need for Explainable Artificial Intelligence is greater than ever

The problem of explainability in machine learning has been around since the mid-1970s, when researchers tried to explain expert systems. However, this topic has regained much of its relevance in recent years due to disruptive developments in the field of AI. This is because most machine learning techniques used today in many fields to process unstructured data such as images, text, or audio are black-box models. Deep learning methods (e.g., deep neural networks) are typically used in these areas because they generally have very high performance in processing unstructured data.

Due to their exceptional performance, deep learning algorithms are increasingly used across industries and in critical decision-making processes. However, deep learning algorithms have the major disadvantage of being black-box models that offer very little insight into how they work. Because of their many model parameters and nonlinearities, black-box models represent such a complex mathematical function that their inner workings are beyond human comprehension. As a result, they do not provide detailed information about how they arrive at certain decisions, recommendations, predictions, or actions.

Imagine that a bank has rejected your application for a loan. You realize that this decision was not made by a bank employee, but by an artificial intelligence. If the bank uses a pure black-box model without recourse to Explainable Artificial Intelligence techniques, it cannot tell you exactly why your loan application was rejected. That would be pretty unsatisfying, wouldn’t it? However, using XAI methods, the bank could tell you exactly what the main reasons were for rejecting your loan application. They could also tell you what you should have done differently to get your loan approved.

Current challenges in Explainable Artificial Intelligence

Explainable AI still faces many challenges. For example, due to the complexity of black-box models, often only parts of these models can be explained, such as the reasons for individual decisions or the learned concepts of individual layers of a deep neural network. Another challenge is to produce good explanations that can be easily understood and processed by humans. An incomprehensible or misleading explanation would miss the point. In this context, defining general metrics to evaluate the quality of explanations is also a major challenge. In fact, assessing the quality of an explanation is anything but straightforward.

Whether or not an explanation is helpful to a person cannot usually be judged objectively. Explanation quality is highly subjective. Different stakeholders have individual requirements for the format and scope of explanations and need them for different purposes. In addition, the comprehensibility of an explanation depends heavily on a person’s cognitive abilities and prior knowledge. Therefore, it is important that explanations for an AI system are always tailored to the intended audience. We discuss the different target audiences for Explainable Artificial Intelligence in more detail below.

Why is Explainable Artificial Intelligence important?

In the past, companies and organizations usually considered only the performance of the AI algorithms when evaluating AI systems. However, other aspects matter as well, such as security, robustness, and, not least, the explainability of an AI system.

The explainability of an AI system is important for several reasons. For example, explainability can help to correct misbehavior of the AI system more easily, gain the trust of stakeholders, and discover new knowledge. In addition, legal or regulatory frameworks may demand that an AI system be explainable. Below we discuss these four main reasons for the need for Explainable AI, and then introduce the different audiences for XAI.

1. Quality assurance and troubleshooting

Explainable AI is particularly important for high-stakes applications where the decisions of AI systems can affect people’s lives, such as lending, medical diagnosis, or legal decisions. In these areas, poor decisions by AI systems can lead to economic or social harm, such as discrimination, social inequality, or even loss of life.

Explainable AI can help detect biases in training data and misbehavior of an AI system early on. Because explainable AI systems also provide reasons for their decisions, the cause of the error can be better identified, making it easier to correct the error. Explainable AI thus contributes to the development of better AI systems and helps minimize the risk of incorrect or discriminatory decisions.

For example, an autonomous car should be able to correctly detect and react to cyclists in any situation. Explainable AI provides information about which features the AI system is specifically using to detect cyclists. This can be used to verify that these features are causal and meaningful. For example, one would expect the AI to recognize a cyclist by the presence of two wheels, among other things.

However, if an Explainable AI analysis shows that an AI system recognizes a cyclist primarily based on the bike lane markings, this indicates that the AI is behaving incorrectly. In this case, the AI system may not correctly detect cyclists who are not in the bike lane. This can lead to dangerous situations and put cyclists’ lives at risk. In this case, the AI must be corrected immediately.

2. Building trust

Explainability can also build trust in AI systems. This trust is necessary for AI systems to be accepted by society and used by individuals and businesses. This is because people are generally very reluctant to use technology that they do not understand. Explainable AI, on the other hand, can show human users the reasons behind certain decisions made by an AI system. If these reasons are consistent with the logic or intuition of human users, this increases the acceptance of an AI system.

3. Discovery of new scientific knowledge

Machine learning methods typically gather a lot of knowledge from their training data, which in the case of black-box models is initially encoded in the model weights. Explainable Artificial Intelligence can help extract this knowledge and use it to gain new scientific insights. For example, in the future, XAI could help discover new laws in biology, chemistry, and physics.

It should be noted, however, that machine learning methods only learn correlations in the training data. These correlations are not necessarily causal relationships in the real world. Thus, Explainable AI does not necessarily uncover causal relationships in the real world. However, it can provide initial indications of causality that can then be further explored and verified by domain experts.

4. Regulatory requirements

In addition, legal or regulatory requirements may require the explainability of AI systems in the future. For example, on April 21, 2021, the European Commission published its proposal for the Artificial Intelligence Act, a first legislative framework for increased regulation of AI systems. The proposal includes explainability requirements for AI systems:

High-risk AI systems shall be designed and developed in such a way to ensure that their operation is sufficiently transparent to enable users to interpret the system’s output and use it appropriately.

– Artificial Intelligence Act, European Commission

Although the Artificial Intelligence Act is still very vague, it suggests that Explainable AI will become much more important in the coming years. For AI applications in high-risk areas such as healthcare and finance, explainability of AI systems will likely be essential in the future.

Who are the target audiences for Explainable Artificial Intelligence?

The need for Explainable Artificial Intelligence can also be justified from the perspective of different stakeholders. After all, explanations are provided to specific stakeholders who benefit from or even depend on the explanations. There are several types of stakeholders at different stages in the development and use of AI systems. Each of these stakeholders needs the explanations of an AI system for different purposes and therefore has individual requirements for the type of explanations.

The five most common target audiences for Explainable AI are developers, domain experts, managers, regulators, and people affected by an AI system’s decisions (see Figure 1).

Figure 1: The different target audiences for Explainable Artificial Intelligence. (This graphic was created using icons by juicy_fish from www.flaticon.com.)

Different target audiences and their needs

  • Developers, data scientists, and product managers: They need explanations to ensure the quality of their AI models. Explainable AI helps them identify and correct unknown weaknesses and sources of error (e.g., incorrect model behavior or bad data). In addition, the explanations can provide clues to potential improvements, leading to increased product efficiency or new functionality.
  • Users and domain experts: For users of an AI system and for domain experts (e.g., physicians in the case of AI for medical applications), explanations of the AI system are important so that they can develop trust in the system (especially with respect to its functionality, fairness, and ethical defensibility). In addition, Explainable AI can help subject matter experts gain new scientific insights.
  • Managers and board members: Explainable AI is also of interest to executives and board members. It helps them better understand the various AI applications in the organization and assess their alignment with the organization’s goals and values. In addition, an explainable AI system enables a better assessment of its compliance. Thus, explanations can help to better assess the risks and impacts of the AI system.
  • Regulators: For supervisors and regulators, explainability of AI systems is important primarily because it increases the transparency of the systems. This makes it easier for regulators to audit AI systems and verify compliance. In this way, it is possible to ensure that an AI model complies with applicable laws.
  • Affected persons: Explanations provided by an AI system are also relevant to people who are directly or indirectly affected by the system’s decisions. Explanations help these people better understand their situation. They can also use the explanations to verify that the AI’s decisions are fair and protect their rights and interests. Explainable AI thus also contributes to the social acceptance of AI systems.

The need for audience-centered explanations

As we have seen, Explainable AI is essential for commercial, ethical, and regulatory reasons. It helps different audiences understand, appropriately trust, and effectively use the results of an AI system.

However, each of these target audiences requires explanations of an AI system that are specific to their needs. For example, each audience has different background knowledge about the application domain of the AI system and about the AI system itself. In addition, each audience needs the explanations for a different purpose. Therefore, a good explanation takes into account the needs of the target audience with respect to the subject matter to be explained. For this reason, there are a variety of different categories of explanation methods, which we explain in more detail in the following section.

Categorization of explanation methods

Explainable AI offers a variety of explanation methods, which can be categorized by their type, scope, purpose, and format (see Figure 2). We discuss this taxonomy for categorizing explanation methods in more detail below.

Figure 2: Taxonomy for categorizing the different explanation methods of Explainable Artificial Intelligence. (This figure was created using icons from Freepik by www.flaticon.com.)

Type of explanation method

First, explanation methods can be categorized according to their methodological approach. Thus, two broad categories of Explainable Artificial Intelligence can be distinguished, namely post hoc explanation methods and inherently interpretable models.

Inherently interpretable models

Inherently interpretable models are machine learning algorithms whose model architecture is designed to be inherently transparent and understandable to humans. In this type of Explainable AI, the complexity of the machine learning models is limited to achieve better model explainability. Inherently interpretable models use very simple model architectures or include components embedded in the model architecture to help explain their decisions.

Inherently interpretable models provide an explanation of their behavior on their own. This means that explanations do not need to be generated by downstream techniques. Examples of such models include classic linear regression or simple decision trees. With such models, it is easy for humans to read from the model weights how the model makes its decisions. Importantly, it is not only the model architecture that matters for inherent interpretability, but also the features of the input data. Thus, it is important that the features themselves are meaningful and understandable to humans.
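To make this concrete, here is a minimal sketch (using scikit-learn and an invented credit-scoring setup; the feature names and data are placeholders) of how the coefficients of a logistic regression can be read directly as an explanation:

```python
# A minimal sketch of an inherently interpretable model: logistic regression
# on a hypothetical credit-scoring dataset. The learned coefficients can be
# read directly to see how each feature pushes the decision.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
feature_names = ["income", "debt", "years_employed"]  # illustrative features
X = rng.normal(size=(500, 3))
y = (X[:, 0] - X[:, 1] + 0.5 * X[:, 2] + rng.normal(scale=0.5, size=500) > 0).astype(int)

model = LogisticRegression().fit(X, y)

# Each coefficient shows the direction and strength of a feature's influence
# on the log-odds of loan approval.
for name, coef in zip(feature_names, model.coef_[0]):
    print(f"{name}: {coef:+.2f}")
```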

Post hoc explanation methods

While the challenge with inherently interpretable models is to design model architectures that are not black boxes, post hoc explanation methods face a different problem. They focus on already developed AI models that cannot be easily explained a priori due to their complexity. The question is therefore how to explain such black-box models without changing their model architecture and model weights.

The field of post hoc explainability therefore involves the development of additional, separate techniques that are applied to a black-box model after it has been trained. In this way, the black-box model can be analyzed and its decisions explained after the fact. An example of a post hoc explanation method is SHAP.
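As a rough sketch of how such a method is used in practice, the snippet below applies SHAP to an already trained random forest. It assumes the third-party shap package is installed; the dataset and model are placeholders chosen purely for illustration.

```python
# Post hoc explanation of a black-box model with SHAP (requires `pip install shap`).
import shap
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Train an opaque model first; the explanation is added afterwards.
X, y = make_classification(n_samples=300, n_features=5, random_state=0)
black_box = RandomForestClassifier(random_state=0).fit(X, y)

# The explainer wraps the already trained model without changing it.
explainer = shap.Explainer(black_box, X)
explanation = explainer(X[:10])   # explain the first ten predictions
print(explanation.values[0])      # per-feature attributions for the first instance
```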

Post hoc explanation methods are further divided into model-specific and model-agnostic explanation methods. Model-agnostic explanation methods make no assumptions about the architecture or weights of the underlying machine learning model; they typically analyze only a model’s inputs and outputs to explain its behavior. Model-agnostic explanation methods can therefore be applied to all types of black-box models.

In contrast, model-specific post hoc explanation methods are tailored to specific machine learning algorithms. Therefore, their applicability is limited to these specific classes of models and cannot be generalized to other models. For example, there are some model-specific post hoc explanation methods that compute gradients to generate explanations for a model’s decisions. Such gradient-based explanation methods work very well for neural networks whose training is itself gradient-based. However, gradients cannot be computed for ensembles of decision trees, such as random forests, so gradient-based post hoc explanation methods are not applicable in these cases.
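The following is a minimal sketch of such a gradient-based explanation, a simple saliency computation for a toy PyTorch network. The network and input are invented for illustration; real saliency methods add refinements on top of this basic idea.

```python
# A minimal gradient-based explanation (saliency) for a toy PyTorch network.
# The gradient of the predicted score with respect to the input shows how
# sensitive the prediction is to each input feature.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 2))
model.eval()

x = torch.randn(1, 4, requires_grad=True)   # a single input instance
score = model(x)[0].max()                   # score of the predicted class
score.backward()

saliency = x.grad.abs().squeeze()
print(saliency)  # larger values = features the prediction reacts to more strongly
```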

Inherently interpretable models vs. post hoc explanation methods

Post hoc explanation methods usually approximate the original model in order to explain it. Their explanations are therefore less accurate and less reliable than those of inherently interpretable models. This is because the explanations of inherently interpretable models are part of the decision-making process of the model itself and are therefore not affected by approximation errors.

Scope of an explanation method

Explanation methods can also be categorized according to the scope of the explanation. A distinction is made between local explanation methods, which explain individual model predictions, and global explanation methods, which describe the entire model behavior.

Local explanations

Local explanations provide information about the reasons for individual predictions of a machine learning model. Let’s take the example of AI-based credit scoring again. A local explanation would describe why an individual applicant’s credit application was rejected. For example, it could reveal which features of the input data were particularly important to the model’s decision. Was it the applicant’s low income? Or their outstanding debts?

Local explanations can also shed light on exactly how certain features influenced the AI model’s prediction. For example, was the loan applicant’s income more likely to cause the model to reject or approve the loan application? In addition, local explanations can also provide information about how the features would need to be changed for the AI model to make a different decision. For example, a local explanation might lead to the conclusion that the loan application would have been approved if the applicant’s income had been $1,000 higher.

Global explanations

Unlike local explanations, global explanations do not focus on individual predictions made by a machine learning model. Instead, global explanations help understand the overall behavior of an AI model and the mechanisms by which the model operates. They identify common patterns in a model’s decision making across a large set of input data (e.g., an entire dataset).

Global explanations can help understand which features are most important to an AI model’s decisions overall. For example, is a loan applicant’s income generally the most important feature in assessing his or her creditworthiness? Or rather his or her freedom from debt? Global explanations can also shed light on what concepts a machine learning model has learned (e.g., stripe patterns are important for recognizing zebras) and what criteria a model uses to make its decisions.
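One common model-agnostic way to obtain such a global view is permutation feature importance: shuffle one feature at a time and measure how much the model’s performance drops. A small sketch with scikit-learn on synthetic data (all names and data are illustrative):

```python
# Global explanation via permutation feature importance: shuffle each feature
# across the whole dataset and measure how much the model's score drops.
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.inspection import permutation_importance

X, y = make_classification(n_samples=500, n_features=4, n_informative=2, random_state=0)
model = GradientBoostingClassifier(random_state=0).fit(X, y)

result = permutation_importance(model, X, y, n_repeats=10, random_state=0)
for i, importance in enumerate(result.importances_mean):
    print(f"feature {i}: {importance:.3f}")  # higher = more important to the model overall
```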

Purpose of an explanation method

Explanation methods can serve several purposes. In general, these methods either attempt to identify the features learned by an AI model, determine feature attributions, provide example-based or counterfactual explanations, or describe model behavior.

Feature attributions

Explanation methods for identifying feature attributions assign a value to each feature in the input data. This value quantifies the importance of the feature for model predictions. Such explanation methods also provide information about how specific features affect model predictions. For example, feature attributions can be used to determine which areas of an image an AI model focuses on when classifying the image. Classic explanation methods such as SHAP, LIME, and Integrated Gradients all fall into the category of feature attribution methods.
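As an illustration, the sketch below uses LIME to attribute a single tabular prediction to its features. It assumes the third-party lime package is installed; the data, model, and feature names are placeholders.

```python
# Local feature attributions with LIME (requires `pip install lime`).
from lime.lime_tabular import LimeTabularExplainer
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=300, n_features=4, random_state=0)
model = RandomForestClassifier(random_state=0).fit(X, y)

explainer = LimeTabularExplainer(
    X,
    feature_names=["f0", "f1", "f2", "f3"],
    class_names=["rejected", "approved"],
    mode="classification",
)

# Explain one individual prediction with a locally fitted surrogate model.
explanation = explainer.explain_instance(X[0], model.predict_proba, num_features=4)
print(explanation.as_list())  # (feature condition, attribution) pairs
```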

Learned features

Explanation methods can also aim to identify the features that an AI model has learned during training. In these methods, individual parts of a machine learning model (e.g., individual neurons or layers of a neural network) are assigned a set of features or concepts that they can recognize.
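A very simple illustration of this idea is to inspect the first-layer filters of a pretrained convolutional network, which typically correspond to low-level features such as edges and color contrasts. A sketch with torchvision (assuming a recent torchvision version; the pretrained weights are downloaded on first use):

```python
# Inspecting learned features: the first-layer filters of a pretrained CNN.
# Early convolutional filters typically respond to low-level patterns such as
# edges and color contrasts.
import torch
from torchvision.models import resnet18, ResNet18_Weights

model = resnet18(weights=ResNet18_Weights.DEFAULT)  # downloads weights on first use
model.eval()

filters = model.conv1.weight.detach()   # shape: (64, 3, 7, 7)
print(filters.shape)

# Each of the 64 filters is a small 3-channel template that the first layer
# matches against the input image; plotting them reveals the learned features.
```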

Example-based explanations

Example-based explanation methods extract representative instances from the training data and use them as examples to explain the behavior of a machine learning model. The example instances show patterns that the model sees as similar and for which the model behaves similarly. Example-based explanations thus work much like the way humans explain things, namely by pointing to similar cases. This category includes, for example, the k-Nearest Neighbors algorithm, which finds the k most similar instances in the training data for each input and then makes a prediction by majority vote.
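Here is a tiny sketch of this idea with scikit-learn’s k-nearest-neighbors classifier, where the retrieved training instances themselves serve as the explanation (the data are synthetic placeholders):

```python
# Example-based explanation with k-nearest neighbors: the prediction for a new
# instance is justified by the most similar training examples.
from sklearn.datasets import make_classification
from sklearn.neighbors import KNeighborsClassifier

X, y = make_classification(n_samples=200, n_features=4, random_state=0)
model = KNeighborsClassifier(n_neighbors=3).fit(X, y)

new_instance = X[:1]
prediction = model.predict(new_instance)

# The neighbors themselves serve as the explanation: "the model predicted this
# class because these three training examples are most similar."
distances, indices = model.kneighbors(new_instance)
print("prediction:", prediction[0])
print("most similar training instances:", indices[0], "with labels", y[indices[0]])
```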

Counterfactual explanations

Counterfactual explanation methods aim to identify the minimum necessary change to an instance that will result in a different prediction for that instance. Thus, they help to understand which features of the instance need to be changed to obtain a different model prediction. Counterfactual explanations find counterfactual example instances that are as similar as possible to the original instance, but lead to a different prediction.
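The following toy sketch illustrates the idea: starting from one instance, a single feature is nudged in small steps until the model’s prediction flips. Real counterfactual methods optimize over all features and additional constraints; the model, feature choice, and step size here are purely illustrative.

```python
# Toy counterfactual search: nudge a single feature of one instance in small
# steps until the black-box model changes its prediction.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression


def single_feature_counterfactual(model, instance, feature, step=0.05, max_steps=200):
    """Search along one feature for a small change that flips the prediction."""
    original = model.predict(instance.reshape(1, -1))[0]
    for direction in (+1, -1):                  # try increasing, then decreasing
        candidate = instance.copy()
        for _ in range(max_steps):
            candidate[feature] += direction * step
            if model.predict(candidate.reshape(1, -1))[0] != original:
                return candidate[feature] - instance[feature]
    return None


X, y = make_classification(n_samples=300, n_features=4, random_state=0)
model = LogisticRegression().fit(X, y)

change = single_feature_counterfactual(model, X[0], feature=0)
if change is not None:
    print(f"Prediction flips if feature 0 changes by {change:+.2f}")
else:
    print("No counterfactual found along this feature.")
```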

Explaining model behavior

Explanation methods can also try to figure out how an AI model behaves in a given situation. They do this by identifying patterns in the model’s behavior. An example of this is explanation methods that try to explain misclassifications caused by adversarial examples.

Format of an explanation method

Explanations of machine learning models can be presented in a variety of formats. The most common formats are statistics, diagrams, human language, and data points.

  • Statistics: Explanation methods often provide explanations in the form of summary statistics. These statistics present an explanation in numbers and tables. An example is the assignment of scores for the importance of individual features of the input data, e.g., feature A has an importance of 3, feature B has an importance of 1.5, and feature C has an importance of 0.5.
  • Diagrams: Explanations of AI models can also be visualized and presented in the form of diagrams. Diagrams provide a graphical representation of model behavior that is easy for humans to understand. For example, the importance of different features can be illustrated in the form of a bar chart (see the sketch after this list).
  • Human language: Explanation methods can also provide explanations in human language. Human language explanations can take the form of text, audio, or, in the case of sign language, visuals. An example of this is an explanation of the denial of a loan application in the form “This loan application was denied due to the low monthly net income of $1,000”.
  • Data points: Explanations of AI models can also be given in the form of representative example instances, as is the case with example-based explanations. These example instances can be either real data points from the data set or artificially generated data points. It is important that the data points given as explanations can be understood by humans. Otherwise, the explanation is of little use. Explanations in the form of data points therefore work particularly well when the data points are images, text, or audio.
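As a small illustration of the diagram format, the hypothetical importance scores from the statistics example above could be rendered as a bar chart with matplotlib:

```python
# Presenting an explanation as a diagram: a bar chart of (hypothetical)
# feature-importance scores.
import matplotlib.pyplot as plt

features = ["feature A", "feature B", "feature C"]
importances = [3.0, 1.5, 0.5]   # the scores from the statistics example above

plt.barh(features, importances)
plt.xlabel("Importance")
plt.title("Which features drive the model's decisions?")
plt.tight_layout()
plt.show()
```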

Different explanation methods for different audiences

Each type of explanation has its own advantages and disadvantages, and may be appropriate for different use cases and audiences. For example, board members need global and not too detailed explanations in the form of plain language or visualizations to develop a rough understanding of an AI system. In contrast, developers of AI algorithms need very detailed explanations on a local and global level to be able to detect and correct misbehavior of a model.


Summary

In this article, you studied different types of explanation methods and learned more about what Explainable Artificial Intelligence is and why it is important.

Specifically, you learned:

  • Explainable Artificial Intelligence is concerned with the development of AI systems that provide details or reasoning to make their operation clear or easy to understand for a particular audience.
  • Explainable AI is important for correcting AI system misbehavior, gaining stakeholder trust, discovering new knowledge, and complying with regulatory requirements.
  • Explainable AI has different target audiences depending on the application, including developers, users, managers, regulators, and people affected by AI decisions.
  • Explanations generated by Explainable Artificial Intelligence can be distinguished by their type, scope, purpose, and format.

Do you have any questions?

Feel free to leave them in the comments below and I will do my best to answer them.

P.S.: Of course I also appreciate constructive feedback on this blog post 😊
