Transparency and Trust in Artificial Intelligence Systems

By Philipp Schmidt, Felix Biessmann and Timm Teubner

Many of us use Artificial Intelligence (AI) systems based on Machine Learning (ML) every day. When judges, police officers, or doctors use assistive AI, the proper level of trust in such systems is especially critical for their responsible use. Too much, or even blind, trust can lead to ill-considered decisions, while too little trust means that valuable information is ignored. In recent years, many methods have been proposed to render AI systems and their predictions more transparent in order to foster trust in these systems. To what extent transparency actually increases user trust, however, has remained largely unexplored.

In a recent study, BCCP Senior Fellow Felix Biessmann, BCCP Fellow Timm Teubner, and co-author Philipp Schmidt investigate whether and when transparency in assistive AI actually increases trust in AI systems.

In a behavioral experiment, the authors asked 200 subjects to classify short texts as either “positive” or “negative.” Subjects were paid for each correctly classified text. In addition, subjects could draw on an ML-based decision support tool for text classification, which provided its own assessment (positive or negative). The authors then experimentally varied the information that subjects received in a 2-by-2 treatment design: the AI system “explained” its decision by 1) highlighting the most relevant words in the text (e.g., “wonderful” as an indication of a positive assessment) and/or 2) providing a score for the confidence of its prediction (e.g., 65% or 98%).
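
This summary does not specify the underlying model, but a minimal sketch can illustrate how such a decision support tool might produce both kinds of transparency information. The example below assumes a simple bag-of-words logistic regression (an assumption for illustration, not the authors' implementation); the toy training texts and the assess helper are purely hypothetical. The predicted class probability plays the role of the confidence score, and the per-word model weights play the role of the word highlighting.

```python
# Minimal sketch (not the authors' implementation): a bag-of-words sentiment
# classifier whose per-word weights serve as a word-highlighting explanation
# and whose predicted class probability serves as a confidence score.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

# Toy training data standing in for labeled review texts (illustrative only).
texts = [
    "a wonderful and touching film", "great acting, truly enjoyable",
    "simply wonderful", "a boring and predictable plot",
    "terrible pacing, not enjoyable", "dull and boring throughout",
]
labels = [1, 1, 1, 0, 0, 0]  # 1 = positive, 0 = negative

vectorizer = CountVectorizer()
X = vectorizer.fit_transform(texts)
clf = LogisticRegression().fit(X, labels)

def assess(text):
    """Return predicted label, confidence score, and most relevant words."""
    x = vectorizer.transform([text])
    proba = clf.predict_proba(x)[0]           # class probabilities
    label = "positive" if proba[1] >= 0.5 else "negative"
    confidence = max(proba)                   # e.g. 0.65 or 0.98
    # Relevance of each word = its count times the model coefficient;
    # the highest-weighted words would be highlighted for the user.
    vocab = vectorizer.get_feature_names_out()
    relevance = {vocab[i]: x[0, i] * clf.coef_[0, i] for i in x.nonzero()[1]}
    top_words = sorted(relevance, key=lambda w: abs(relevance[w]), reverse=True)[:3]
    return label, confidence, top_words

print(assess("a wonderful but slightly predictable film"))
```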

In contrast to the common assumption that transparency is always beneficial, the results demonstrate that increased transparency does not necessarily increase trust in an AI system. Quite to the contrary, subjects relied on the AI's prediction significantly less often, deviated from its assessment in their own judgments, and were wrong more often when doing so. Interestingly, there were many cases in which the AI was correct but attributed high uncertainty to its prediction. Communicating this uncertainty to subjects resulted in them not following the AI's suggestion.

The right amount of trust also implies not following incorrect AI predictions, and this is precisely what transparency should achieve. However, the results indicate that humans made up to six times more mistakes by following incorrect AI predictions than by ignoring correct ones. The results thus show that transparency in AI systems does not always increase trust in such systems. Furthermore, transparency often does not lead humans to recognize incorrect AI assessments. As a next step, the authors seek to investigate whether and how quickly trust in AI systems can be restored after incorrect AI predictions have led to a loss of trust.

The full paper “Transparency and Trust in Artificial Intelligence Systems” is forthcoming in the Journal of Decision Systems and is available online here.