The usefulness of machine learning algorithms has led to their widespread adoption prior to the development of a conceptual framework for making sense of them. One common response to this situation is to say that machine learning suffers from a “black box problem.” That is, machine learning algorithms are “opaque” to human users, failing to be “interpretable” or “explicable” in terms that would render categorization procedures “understandable.” The purpose of this paper is to challenge the widespread agreement about the existence and importance of a black box problem. The first section argues that “interpretability” and cognates lack precise meanings when applied to algorithms. This makes the concepts difficult to use when trying to solve the problems that have motivated the call for interpretability (etc.). Furthermore, since there is no adequate account of the concepts themselves, it is not possible to assess whether particular technical features supply formal definitions of those concepts. The second section argues that there are ways of being a responsible user of these algorithms that do not require interpretability (etc.). In many cases in which a black box problem is cited, interpretability is a means to a further end such as justification or non-discrimination. Since addressing these problems need not involve something that looks like an “interpretation” (etc.) of an algorithm, the focus on interpretability artificially constrains the solution space by characterizing one possible solution as the problem itself. Where possible, discussion should be reformulated in terms of the ends of interpretability.
Krishnan, M. (2020). Against interpretability: A critical examination of the interpretability problem in machine learning. Philosophy & Technology, 33(3), 487–502. https://doi.org/10.1007/s13347-019-00372-9