
Published: 2022-07-01

Trust in AI: Looking into the black box

The decisions made by artificial intelligence are often difficult to understand, which can act as a deterrent. Businesses and researchers are thus striving to make AI more transparent. VDE, for example, is contributing a specification that will make it possible to measure whether systems conform to certain values.

By Markus Strehlitz

Artificial intelligence has an image problem. Its great potential is constantly touted, yet it also tends to evoke something vaguely threatening. AI is often seen as a mysterious, uncanny force, and reports of discriminatory algorithms reinforce this image. AI systems that showed prejudice against people of certain backgrounds, for example in companies’ recruiting or in banks’ lending decisions, have caused a stir. Detailed studies were required to reveal the underlying reasons. In many of these cases, the training data itself was already biased, and the AI perpetuated those biases.

The fundamental challenge here is that it’s often impossible to understand how an AI produces its results. Such systems aren’t programmed in the conventional way; they are trained. And the more complex the AI is, the less transparent its decision-making processes become. The neural networks used in deep learning are like a black box. This makes it difficult to trust solutions based on such technology.

As a result, many companies, research institutes and initiatives are currently working on explainable AI. Various approaches are being taken to let some light into the black box, or at least to find ways to increase trust in AI. IBM, for example, has compiled a set of tools to make AI decisions comprehensible. This toolkit, AI Explainability 360, uses a variety of approaches, just as we explain decisions to one another in different ways in everyday life. We use examples and counter-examples, refer to rules or highlight certain characteristics. For instance, a doctor using AI for diagnostics may find it helpful if the system shows cases that are similar to or completely different from their current patient. A bank customer whose credit application has been rejected will want to know what the underlying criteria were and which parts of the application would need to change.

[Image: a man holding up a credit card. Credit: stock.adobe.com/Peshkova]

Credit applications: Knowing whether an AI application is attuned to and compliant with certain values helps in deciding whether to use it.

For potential users, AI needs to be trustworthy

IBM’s set of tools includes algorithms for case-based reasoning and “post-hoc explanations”, which make certain decisions comprehensible in retrospect. For example, one of the algorithms explains not only why a given result was obtained, but also why another was ruled out.

The US healthcare network Highmark Health uses this technology to analyze patient data. The aim is to identify the risk of sepsis in patients at an early stage and initiate appropriate countermeasures. To this end, data scientists created a corresponding model that also includes insurance data. The explainable AI technology makes the results transparent while also ensuring that any biases are detected.

In principle, trust in AI is relevant in every area in which it’s used. Those who use AI in industrial settings need to be able to rely on these systems just like healthcare and finance employees do. That’s where the startup IconPro comes in. It uses AI methods to evaluate image data for quality control and to safeguard production processes. The results users receive might include a forecast of expected quality or a suggested optimization.

CEO Markus Ohlenforst knows that user confidence in AI is important. IconPro therefore lists the biggest influencing factors after each round of AI model training and shows how they will affect the model’s output. “If a company is interested, we offer this initial analysis for free,” Ohlenforst explains. “The results we produce then include the most important contributing factors.” These could include certain settings on a manufacturing machine that are causing it to produce defective goods. “When our customers see that this information is accurate, it helps them to start trusting the technology.”
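One common, model-agnostic way to produce such a ranking of influencing factors is permutation importance: disturb one input at a time and measure how much the model's error grows. The sketch below is purely illustrative. The article does not say which method IconPro uses, and the machine settings, the data and the `predict_defect_rate` model are hypothetical stand-ins.

```python
from typing import Callable, Dict, List

Row = Dict[str, float]


def predict_defect_rate(row: Row) -> float:
    """Hypothetical stand-in for a trained model (not IconPro's)."""
    # Assumption: temperature dominates, pressure matters a little,
    # and belt speed is ignored entirely.
    return 0.02 * row["temperature"] + 0.005 * row["pressure"]


def mean_abs_error(model: Callable[[Row], float],
                   rows: List[Row], targets: List[float]) -> float:
    return sum(abs(model(r) - t) for r, t in zip(rows, targets)) / len(rows)


def permutation_importance(model: Callable[[Row], float],
                           rows: List[Row],
                           targets: List[float]) -> Dict[str, float]:
    """Error increase when one feature column is permuted."""
    baseline = mean_abs_error(model, rows, targets)
    scores = {}
    for feature in rows[0]:
        values = [r[feature] for r in rows]
        # Cyclic shift by one: a fixed permutation that keeps the sketch
        # reproducible. Real tools shuffle randomly and average repeats.
        shifted = values[1:] + values[:1]
        permuted = [{**r, feature: v} for r, v in zip(rows, shifted)]
        scores[feature] = mean_abs_error(model, permuted, targets) - baseline
    return scores


# Made-up machine settings for four production runs.
rows = [
    {"temperature": 200.0, "pressure": 10.0, "belt_speed": 1.0},
    {"temperature": 220.0, "pressure": 30.0, "belt_speed": 1.5},
    {"temperature": 240.0, "pressure": 20.0, "belt_speed": 2.0},
    {"temperature": 260.0, "pressure": 40.0, "belt_speed": 2.5},
]
targets = [predict_defect_rate(r) for r in rows]  # model fits this data exactly

scores = permutation_importance(predict_defect_rate, rows, targets)
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
```

A ranking like this only shows which inputs the model relies on. Whether those inputs really cause defects on the shop floor is something the customer still has to confirm, which is the plausibility check Ohlenforst describes.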

Safe, reliable AI can save lives

A factory that produces defective goods leads to unnecessary costs and stress; when AI is used in vehicles, however, human lives are at stake. Autonomous driving in particular would be impossible without artificial intelligence. Everyone on the road needs to be sure that the AI will properly analyze its surroundings and make the right decisions. The Fraunhofer Institute for Cognitive Systems (IKS) is among the research institutes working on this. Here, the goal is less about demystifying the black box and more about making the entire system safe and reliable by installing appropriate monitoring channels. The AI calculates several route alternatives based on information from various vehicle sensors. Conventional algorithms then check whether the routes are safe.
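The monitoring idea described above can be sketched as a two-stage pipeline: a learned component proposes several candidate routes, and a simple rule check that can be inspected line by line decides which of them may be used. This is a minimal illustration, not the Fraunhofer IKS implementation. The `Route` fields, the thresholds and the `propose_routes` stand-in for the AI planner are all assumptions.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Route:
    name: str
    min_obstacle_distance_m: float  # closest approach to any detected obstacle
    max_speed_kmh: float            # highest planned speed on the route
    estimated_time_s: float         # planner's travel-time estimate


# Hypothetical safety limits, chosen only for this example.
MIN_CLEARANCE_M = 1.5
SPEED_LIMIT_KMH = 50.0


def propose_routes() -> List[Route]:
    """Stand-in for the AI planner: returns several route alternatives."""
    return [
        Route("fast_lane_change", 0.8, 48.0, 30.0),
        Route("keep_lane", 2.5, 45.0, 36.0),
        Route("overtake", 2.0, 62.0, 28.0),
        Route("slow_detour", 3.0, 30.0, 55.0),
    ]


def is_safe(route: Route) -> bool:
    """Conventional, rule-based check on clearance and speed."""
    return (route.min_obstacle_distance_m >= MIN_CLEARANCE_M
            and route.max_speed_kmh <= SPEED_LIMIT_KMH)


def select_route(candidates: List[Route]) -> Optional[Route]:
    """Return the fastest route that passes the safety check, if any."""
    safe = [r for r in candidates if is_safe(r)]
    if not safe:
        return None  # the caller must fall back to a safe stop
    return min(safe, key=lambda r: r.estimated_time_s)


chosen = select_route(propose_routes())
print(chosen.name if chosen else "no safe route: perform safe stop")
```

The point of the design is that trust rests on the small, conventional checker rather than on understanding the planner: even if the AI proposes an unsafe route, it is filtered out before it can be executed.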

Other institutions researching the use of AI in engineering include the Fraunhofer Institute of Optronics, System Technologies and Image Exploitation (IOSB), KIT and the FZI Research Center for Information Technology, which have joined forces to found the Competence Center Karlsruhe for AI Systems Engineering (CC-KING). “AI engineering aims to make artificial intelligence and machine learning usable in a manner similar to conventional engineering,” says Professor Jürgen Beyerer, the center’s scientific director. Among other subjects, CC-KING focuses on making system behavior predictable and ensuring that decisions can be explained. The objective is to develop a standard procedure for AI engineering.

[Chart: bar diagram on the acceptance of algorithms in everyday life. Source: Institut für Demoskopie Allensbach, Bertelsmann Stiftung]

Increasing acceptance of algorithms in everyday life. Can a computer make its own decision in these areas? The Bertelsmann Foundation posed this question in representative studies in 2018 and 2022. The trajectory of the “yes” answers (see chart) shows that acceptance of automated decisions has grown in almost every application area. The data also reveals that people have fewer reservations about (partially) automated decisions in areas that have relatively little social impact.

Can artificial intelligence act in line with certain values?

Trust in AI, however, first requires the ability to assess how transparent a system actually is, and to do so according to standardized criteria. VDE has decided to tackle this challenge. The association wants to create a VDE SPEC that will make it possible to measure how well AI systems comply with certain values. “We’re using the well-known energy efficiency labels for household appliances as a model,” says Sebastian Hallensleben, VDE’s leading AI expert and head of the VDE SPEC AI Ethics project. “We use a scale from A to G to show how a system meets specific requirements.” This scale is then applied to different categories such as transparency, fairness and robustness to show users how closely a solution conforms to relevant specifications.

As a scientific basis, the experts are relying on the VCIO model, which defines values, criteria, indicators and observables in a tree structure. At the top is a particular value, such as transparency. This is followed by specified criteria, for instance the origin and characteristics of the training data or the comprehensibility of the algorithm. “A set of questions is then defined for each of these criteria in order to refine everything. There’s also a whole range of possible answers for each question,” Hallensleben explains. This is how a result is produced for a single category. These findings are of interest not only to users; AI developers can also see what is needed to improve the system in question and raise its fairness rating, for example, from level E to level A.
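The tree structure of the VCIO model maps naturally onto nested records. The sketch below shows one value with two criteria, each with questions (indicators) whose answers (observables) are expressed on the A to G scale. The example questions, the answers and especially the aggregation rule, in which a node is only as good as its weakest child, are assumptions made for illustration. The article does not describe how the VDE SPEC combines answers into a rating.

```python
from dataclasses import dataclass
from typing import List

LEVELS = "ABCDEFG"  # A is the best level, G the worst


@dataclass
class Indicator:
    question: str
    observable: str  # the observed answer, expressed as a level from A to G


@dataclass
class Criterion:
    name: str
    indicators: List[Indicator]

    def rating(self) -> str:
        # Assumed rule: the worst answer determines the criterion's level.
        return max((i.observable for i in self.indicators), key=LEVELS.index)


@dataclass
class Value:
    name: str
    criteria: List[Criterion]

    def rating(self) -> str:
        # Assumed rule: the worst criterion determines the value's level.
        return max((c.rating() for c in self.criteria), key=LEVELS.index)


# Hypothetical assessment of one system for the value "transparency".
transparency = Value("transparency", [
    Criterion("origin and characteristics of the training data", [
        Indicator("Is the source of the data documented?", "B"),
        Indicator("Are the data characteristics documented?", "C"),
    ]),
    Criterion("comprehensibility of the algorithm", [
        Indicator("Can individual outputs be explained?", "E"),
    ]),
])

for criterion in transparency.criteria:
    print(criterion.name, "->", criterion.rating())
print(transparency.name, "->", transparency.rating())
```

Under this assumed rule, the per-criterion ratings show a developer where improvement would pay off: here the overall level is held at E by the comprehensibility of the algorithm, not by the documentation of the training data.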

Explainability has its limits, especially in individual cases

Bosch, Siemens, SAP, BASF and TÜV Süd are involved in the SPEC development. On the scientific side, the participants include the think tank iRights.Lab, the Karlsruhe Institute of Technology and the Ferdinand Steinbeis Institute, as well as the universities of Tübingen and Darmstadt. The consortium aims to develop a universally binding, internationally recognized trust label for AI. An initial version of the label published by VDE at the end of April drew plenty of attention.

Hallensleben emphasizes that the VDE SPEC is not a standard specifying how fair or transparent AI systems must be. That depends not only on the application in question, but also on political decisions. Instead, the SPEC is meant to ensure that compliance with such values becomes measurable in the first place.

“This will provide a better basis for deciding whether or not to use AI for a specific application,” Hallensleben says. For example, if a company wants to use a particular system to process credit applications, it can look to the transparency classification on the label – and reconsider its choice if necessary. However, lawmakers could also specify the levels a solution must achieve on the AI trust scale in order to be used for a certain purpose.

It’s clear that there are many approaches to making artificial intelligence a bit more comprehensible. Explainability does have its limits, however. While it may be possible to create transparency by revealing things like the data used to train an AI, we still can’t explain why this data leads an AI to reject loan applicants in individual cases – at least when complex systems like neural networks are involved, as Hallensleben reports. “We have no idea whether this is even fundamentally possible with neural networks,” he admits.

In the end, opening up the black box entirely may not be absolutely necessary in every case. After all, we use AI because it delivers results or does things that other systems can’t, even if its mode of operation isn’t readily apparent. Perhaps artificial intelligence will always remain a bit mysterious in that regard.

Markus Strehlitz is a freelance journalist and editor for VDE dialog.
