What do algorithms explain? The issue of the goals and capabilities of Explainable Artificial Intelligence (XAI)


M. Renftle, H. Trittenbach, et al.

Discover the intriguing world of Explainable AI as Moritz Renftle, Holger Trittenbach, Michael Poznic, and Reinhard Heil examine flaws in a common reasoning scheme in the XAI literature. They challenge conventional understandings of interpretability and trust, proposing a user-centric view of representing machine learning models as simple functions over interpreted attributes. The paper highlights the two central challenges of this endeavor: approximation and translation.

Introduction
The paper addresses growing concerns about using ML systems without understanding their behavior and limitations. It critiques a prevalent reasoning scheme in XAI that assumes algorithms can imbue ML models with capabilities (interpretability, explainability, transparency) which in turn yield goals like trust, despite ambiguous definitions and unclear causal links. The authors propose focusing on modest, concrete capabilities that XAI can realistically deliver from the user’s perspective: (1) using interpreted attributes and (2) providing simple functional representations. They adopt an interdisciplinary (computer science, philosophy, technology assessment) and user-centered approach, using a thought experiment (an email spam filter) to elicit reasonable user questions about ML models. By analyzing which questions XAI can genuinely address, they reformulate the core, answerable question as how to represent an ML model as a simple function over interpreted attributes. This reframing sets up two central challenges for XAI: approximation and translation.
Literature Review
The paper surveys and synthesizes debates on XAI capabilities and goals. It highlights longstanding concerns over the definitional ambiguity of terms like interpretability and explainability (Lipton, 2018; Krishnan, 2020; Robbins, 2019; Erasmus et al., 2021), and critiques the assumed linkage between explainability and trust (Ribeiro et al., 2016; Mittelstadt et al., 2019; Gilpin et al., 2018; Molnar, 2020). The authors reconstruct a common reasoning scheme connecting explainability to justification and trust, arguing that explainability is neither sufficient nor necessary for achieving trust. They note stakeholder- and question-dependent perspectives in XAI (Tomsett et al., 2018; Zednik, 2021; Liao et al., 2020). On methods, they reference surveys and techniques for approximating complex models via surrogates or local explanations (Guidotti et al., 2019; Adadi & Berrada, 2018; LIME: Ribeiro et al., 2016; SHAP: Lundberg & Lee, 2017; Integrated Gradients: Sundararajan et al., 2017), as well as model extraction and rule-based surrogates (Craven & Shavlik, 1995; Bastani et al., 2019; Bénard et al., 2021). For translation from technical to interpreted attributes, they review concept-based and interpretability work in vision and language (Bau et al., 2017; Zhou et al., 2019; Kim et al., 2018; Cammarata et al., 2020; Goh et al., 2021; Ghorbani et al., 2019; Poerner et al., 2018; Krug et al., 2018) and discuss neurosymbolic directions (Garcez & Lamb, 2020; Nauta et al., 2021). Philosophical accounts relating approximation and interpretation (Erasmus et al., 2021) are contrasted with the authors' framing of approximation as surrogate modeling that need not itself be interpreted.
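To make the local-surrogate idea behind methods like LIME concrete, the following is a minimal from-scratch sketch, not the library's implementation: it perturbs a single instance, queries a black-box classifier, and fits a proximity-weighted linear model around that instance. The synthetic dataset, kernel width, and model choices are illustrative assumptions.

```python
# Minimal sketch of the local-surrogate idea behind LIME (Ribeiro et al., 2016):
# fit a proximity-weighted linear model on perturbations around one instance
# of a black-box classifier. Data, model, and kernel width are illustrative.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import Ridge

X, y = make_classification(n_samples=1000, n_features=8, random_state=0)
black_box = RandomForestClassifier(random_state=0).fit(X, y)

x0 = X[0]                                   # instance to explain
rng = np.random.default_rng(0)
Z = x0 + rng.normal(scale=0.3, size=(500, X.shape[1]))  # local perturbations
p = black_box.predict_proba(Z)[:, 1]        # black-box outputs on perturbations

# Proximity kernel: nearby perturbations count more.
d = np.linalg.norm(Z - x0, axis=1)
w = np.exp(-(d ** 2) / (2 * 0.75 ** 2))

surrogate = Ridge(alpha=1.0).fit(Z, p, sample_weight=w)
for i, coef in enumerate(surrogate.coef_):
    print(f"feature_{i}: local weight {coef:+.3f}")
```

The coefficients of the weighted linear surrogate serve as a locally faithful, simple representation of the black box; whether the features themselves are interpreted attributes is a separate (translation) question.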
Methodology
The study is a philosophical and conceptual analysis employing a user-centered thought experiment and question-driven framework. The method proceeds in steps:
(1) Adopt the perspective of a lay user ("Alice") interacting with a supervised ML spam filter (S).
(2) Elicit a reasonable set of user questions about ML models (Q1–Q7), spanning why a specific classification occurred, how the model distinguishes the classes, what defines the classes, how the model works, model-to-model comparisons, disagreement between user and model, and universal definitions.
(3) Disambiguate questions where needed (e.g., ex-ante vs. ex-post explanations; types of philosophical explanations; the distinction between interpreted and technical attributes).
(4) Analyze which of these questions existing XAI algorithms can and cannot answer, showing that only Q2 (how S distinguishes spam from no spam) is addressed.
(5) Generalize Q2 into a core question (Q*): how to represent an ML model as a simple function using interpreted attributes.
(6) From Q*, identify two key challenges for XAI: approximation (surrogate modeling with fidelity-complexity trade-offs) and translation (mapping technical attributes to interpreted attributes).
(7) Provide a targeted review of XAI methods relative to these challenges, noting current strengths in approximation and gaps in translation, and discuss combined approaches (e.g., neurosymbolic models).
No empirical experiments are conducted; the contribution is analytical and integrative across computer science and philosophy.
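As a purely illustrative companion to the thought experiment, the sketch below contrasts a hypothetical black-box spam filter S over technical attributes (TF-IDF terms) with a simple surrogate over hand-picked interpreted attributes. The emails, attribute names (num_links, mentions_money, caps_ratio), and model choices are assumptions for illustration, not the authors' setup.

```python
# Hedged sketch of the spam-filter thought experiment: a black-box model S
# works on technical attributes (TF-IDF dimensions), while a surrogate
# answers Q* with a simple function over interpreted attributes.
# Emails, attribute definitions, and models are illustrative assumptions.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.tree import DecisionTreeClassifier, export_text

emails = ["win money now click http://x.io", "meeting agenda attached",
          "FREE prize!!! claim at http://y.io", "lunch tomorrow?"] * 50
labels = np.array([1, 0, 1, 0] * 50)        # 1 = spam

# "Technical" attributes: many TF-IDF dimensions, hard for Alice to interpret.
X_tech = TfidfVectorizer().fit_transform(emails)
S = GradientBoostingClassifier(random_state=0).fit(X_tech, labels)

# "Interpreted" attributes a user like Alice can reason about (translation step).
def interpreted(e):
    return [e.count("http"),
            int("money" in e.lower() or "prize" in e.lower()),
            sum(c.isupper() for c in e) / max(len(e), 1)]
X_int = np.array([interpreted(e) for e in emails])

# Approximation step: a small tree over interpreted attributes mimics S's output.
surrogate = DecisionTreeClassifier(max_depth=2, random_state=0)
surrogate.fit(X_int, S.predict(X_tech))
print(export_text(surrogate,
                  feature_names=["num_links", "mentions_money", "caps_ratio"]))
```

The printed tree is a simple function over attributes Alice understands, which is exactly the kind of answer to Q2/Q* that the analysis identifies as realistic for XAI.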
Key Findings
- The common reasoning scheme linking XAI capabilities (e.g., explainability) to goals (e.g., trust) via justification is under-specified: capabilities are ambiguously defined, and causal links to goals are neither necessary nor sufficient.
- From a user-questions perspective, existing XAI algorithms primarily address one core question (Q*): how to represent a complex ML model as a simple function that uses interpreted attributes.
- Other reasonable user questions (Q1, Q3–Q7) are generally not addressed by current XAI:
  • Q7 (what is spam?) concerns real-world phenomena beyond model explanations;
  • Q1 and Q4 require knowledge of model creation mechanisms not typically provided by XAI;
  • Q5 and Q6 are trivial or intractable for complex models;
  • Q3 is only partially addressed via Q2-like approximations or feature-engineered simple models.
- Two central challenges define XAI's realistic capabilities:
  • Approximation: building surrogate models that balance fidelity to the original model against simplicity; "complexity" has no single definition and depends on the user (a minimal sketch follows this list).
  • Translation: mapping technical attributes (inputs or intermediate representations) to interpreted attributes is context-dependent and ranges from trivial to unsolvable.
- State of methods:
  • A rich toolkit exists for approximation, both global and local: decision-tree surrogates, rule lists, LIME, SHAP, Integrated Gradients, and model extraction; fidelity-versus-complexity trade-offs are standard.
  • Translation is less developed: concept-based methods and neuron-level analyses in vision and NLP show promise but often require annotated concept datasets; automated concept discovery exists but is limited.
- Joint approximation-and-translation is an open challenge in domains where inputs sit at lower abstraction levels than user concepts (vision, speech, NLP, scientific ML). Neurosymbolic approaches and prototype-based models (e.g., Neural Prototype Trees) are promising directions.
- Overreliance on normative terms (trust, fairness) without shared definitions risks obfuscating XAI results and inflating expectations among policymakers and society.
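The following minimal sketch illustrates the fidelity-complexity trade-off named under the approximation challenge: global decision-tree surrogates of increasing depth are fit to a black-box model's predictions, and their agreement with the black box (fidelity) is reported alongside leaf count as a stand-in for complexity. Data, models, and the complexity measure are illustrative assumptions.

```python
# Minimal sketch of the approximation challenge: global decision-tree
# surrogates of increasing depth, traded off against fidelity (agreement
# with the black box on held-out data). Everything here is illustrative.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=2000, n_features=10, random_state=0)
X_tr, X_te, y_tr, _ = train_test_split(X, y, random_state=0)

black_box = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)
bb_train, bb_test = black_box.predict(X_tr), black_box.predict(X_te)

for depth in (1, 2, 3, 5, 8):
    surrogate = DecisionTreeClassifier(max_depth=depth, random_state=0)
    surrogate.fit(X_tr, bb_train)                    # imitate the black box
    fidelity = (surrogate.predict(X_te) == bb_test).mean()
    print(f"depth={depth}  leaves={surrogate.get_n_leaves()}  fidelity={fidelity:.3f}")
```

How much fidelity is enough, and which notion of complexity matters, are user- and context-dependent choices, which is precisely the point made above.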
Discussion
By reframing XAI around the questions users can reasonably ask and algorithms can answer, the paper clarifies that current XAI provides, at best, simplified functional representations grounded in interpreted attributes (Q*). This addresses the need for user-oriented understanding without overpromising normative outcomes such as trust or fairness. The identification of approximation and translation as the two key challenges explains why some applications are amenable to XAI (e.g., when interpreted attributes are available and surrogates can be accurate enough) and why others remain difficult or impossible (e.g., when mapping from technical to interpreted attributes lacks a shared vocabulary). This perspective disentangles capabilities from goals, guiding more precise expectations: XAI can support understanding by offering simpler, interpretable surrogates with known fidelity; it does not, by itself, guarantee trust or justification. The analysis also highlights implications for design: choosing surrogates requires explicit fidelity-complexity trade-off decisions tailored to stakeholders; translation methods must be domain- and context-sensitive; and integrated approaches that jointly translate and approximate may deliver more meaningful user-facing explanations. Finally, situating XAI capabilities within this framework encourages more rigorous discourse and prevents conflation of technical explanations with broader ethical or social objectives.
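To ground the translation challenge mentioned above, here is a hedged sketch in the spirit of concept-based methods such as Kim et al. (2018): a linear probe learns a direction for an interpreted concept inside a network's hidden representation from annotated concept examples. The network, the choice of hidden layer, and the synthetic "concept" data are assumptions for illustration only.

```python
# Hedged sketch of concept-based translation: learn a direction for an
# interpreted concept in a model's intermediate representation from
# annotated examples. Network, layer, and concept data are synthetic.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 20))
y = (X[:, 0] + X[:, 1] > 0).astype(int)
net = MLPClassifier(hidden_layer_sizes=(16,), max_iter=500,
                    random_state=0).fit(X, y)

def hidden(x):
    # Activations of the first hidden layer (the "technical" representation).
    return np.maximum(0, x @ net.coefs_[0] + net.intercepts_[0])

# Annotated concept dataset: inputs where an interpreted concept is present/absent.
concept_pos = rng.normal(loc=[1.5] + [0] * 19, size=(200, 20))
concept_neg = rng.normal(size=(200, 20))
A = np.vstack([hidden(concept_pos), hidden(concept_neg)])
c = np.array([1] * 200 + [0] * 200)

# The linear separator's normal vector serves as the concept direction.
probe = LogisticRegression(max_iter=1000).fit(A, c)
direction = probe.coef_[0] / np.linalg.norm(probe.coef_[0])
print("concept direction in hidden space:", np.round(direction[:5], 2), "...")
```

Such a probe presupposes an annotated concept dataset and a shared vocabulary between user and developer, which is why translation remains the harder of the two challenges.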
Conclusion
The paper concludes that current XAI algorithms primarily answer a single, technical question: how to represent complex ML models as simple functions over interpreted attributes. Emphasizing this modest, user-centered capability helps avoid inflated expectations tied to ambiguous or normative terminology (e.g., trust, fairness). The authors identify two key research challenges—approximation and translation—as central to XAI’s progress. While many approximation approaches exist, translation remains comparatively underdeveloped; future, holistic methods that address both simultaneously (e.g., neurosymbolic and prototype-based models) are likely to be pivotal. Focusing XAI research and communication on the concrete questions algorithms can answer will better align scientific outcomes with stakeholder understanding and needs.
Limitations
- The work is a philosophical and conceptual analysis without empirical studies; no user experiments or quantitative evaluations are presented.
- The review of XAI methods is selective rather than comprehensive.
- The thought experiment (spam filtering) serves as a guiding example; generalization to all ML contexts, stakeholders, or domains may be context-dependent.
- The paper does not resolve the broader normative relationship between XAI capabilities and goals like trust or fairness; it intentionally sets these aside to focus on capabilities.