Illustrations by: Olek Modzelewski
Artificial intelligence captures our imagination like almost no other technology: from fears about killer robots to dreams of a fully-automated, frictionless future. As numerous authors have documented, the idea of creating artificial, intelligent machines has entranced and scandalized people for millennia. Indeed, part of what makes the history of ‘artificial intelligence’ so fascinating is the mix of genuine scientific achievement with myth-making and outright deception.
A certain amount of hype and myth-making can be harmless, and might even help to fuel real progress in the field. However, the fact that ‘AI systems’ are now being integrated into essential public services and other high-risk processes means that we must be especially vigilant about combatting misconceptions about AI.
At various points throughout 2019, we saw users of Amazon’s Alexa, Google’s Assistant, and Apple’s Siri being shocked to discover that recordings of their private family conversations were being reviewed by real living humans. This was hardly surprising to anyone familiar with how these voice assistants are trained. But to the majority of customers, who do not question the presentation of these systems as 100% automated, it came as a shock that poorly paid overseas workers had access to what were often intimate and sensitive conversations. Concerns about how such systems operate are only sharpened when we see contracts between Amazon and Britain’s National Health Service for Alexa to provide medical advice and, of course, to thereby have access to patient data.
The myths and misconceptions of the black box
There are many myths and misconceptions about AI, but in cases where these systems are being used in sensitive, high-risk scenarios such as public health and criminal justice, arguably the most damaging misconception is that these systems are ‘black boxes’ about which we simply cannot know anything. Worse still, the claim is often made that non-interpretable models have superior performance (which, as Cynthia Rudin points out in her work, is often false) and that demanding explanations will lead to a reduction in accuracy and effectiveness. We should thus simply trust, and even learn to embrace, the black box.
But what precisely do we mean when we say that an AI system is a black box? In their recently published guidance on ‘Explaining AI decisions’ (hereafter ICO-AT Guidance), the UK’s ICO and Alan Turing Institute define black boxes as “any AI system whose inner workings and rationale are opaque or inaccessible to human understanding” and then provide a detailed explanation of the interpretability of different technical approaches. There are, however, different causes of such opacity.
On the technical side, we typically refer to certain approaches to AI (such as neural networks) as black boxes because they operate in a manner which is simply too complex for a human to follow in detail or retrace after the fact. However, an AI system can also become a ‘black box’ because of its proprietary nature: we are prevented from knowing about how it works due to concerns about protecting trade secrets, or to stop the system ‘being gamed’ (the possibility of which is often a sign of a badly designed, arbitrary system). There are, of course, proprietary deep learning systems which combine both forms of opacity.
What we want to demonstrate in this article is that neither form of opacity, the technical or proprietary, is entirely inevitable, and that even when some essential element of opacity remains about a system, there are a whole host of transparency measures which can and should be taken to remove as much opacity as possible.
Before we get into the technical and even legal sources of AI opacity, there is one fundamental source of opacity that we must address: the very definition of AI. The term ‘artificial intelligence’ is currently used to refer to a bewildering range of technologies. At the most mundane end of the scale, the hype around AI has led companies to describe things as banal as recommendation features in Microsoft PowerPoint as AI, while at the most outrageous end of the scale we have speculations about non-value aligned Superintelligent AI overlords engaging in conflicts with alien species. What, if anything, unites the range of technologies that now fall under this label?
There is arguably one basic idea underlying the project of creating ‘artificial intelligence’: to make a machine that can accomplish a complex task. Such complex tasks are typically ones that require human intelligence to solve, but there is no reason to define AI solely in terms of mimicking human intelligence. We may want to develop machines to replicate how the human brain works as a way to understand our own minds better, but also to create machines that can solve problems which would be impossible for humans and which have an entirely ‘inhuman intelligence’. In highly speculative cases where the goal would be to develop an AI system with actual self-awareness or consciousness, we could potentially speak of the AI system itself having aims, goals or intentions. Of course, nothing of the sort is on the horizon; it remains the stuff of science fiction.
The AI models which are the subject of today’s hype, meanwhile, are just statistical models (also known as machine learning models) – not unlike those which have been in use by social scientists, biologists, statisticians and psychologists for years.
Those statistical models have, for decades, been used for tasks such as predicting future values of financial stocks, estimating the effects of treatments on health outcomes based on collected data and recognizing written text from images.
The main difference in this new “AI era” is the increased efficacy that some machine learning models were able to achieve, thanks to technological advances in processing power and access to large collections of data. These models include neural networks – a class of algorithms which have been studied since the 1950s, but which only recently found successful applications to tasks such as image recognition and machine translation. However, other statistical models which have been around for decades – such as linear or logistic regression, decision trees and support vector machines – are now being rebranded as “AI”, leading to increased enthusiasm about their efficacy, even though their uses and capabilities remain largely identical.
Many well-known applications which have been publicly discussed because of exhibited bias were in fact *not* technical black boxes (i.e. did not rely on neural networks or other highly complex machine learning techniques).
In such cases we are looking at much less sophisticated, interpretable techniques such as linear and logistic regression, decision trees/rule lists or case-based reasoning. There is certainly some level of mystification going on when these techniques are all lumped under the umbrella term of AI, and the first question we should ask when presented with ‘AI systems’ is precisely what techniques they are employing.
It is a common mistake made by non-expert commentators and journalists to apply the same ‘black box’ narrative to simple and complex systems alike. As a result, designers of simple systems also get excused for the lack of transparency. In many cases, the public is kept in the dark not because the inner workings of the system are obscure but because transparency would threaten trade secrets or expose controversial choices made by the owners of the ‘AI system’.
This dynamic is a good reason in itself to question the ‘black box’ narrative and educate the public, so that not all statistical models land in the same black box. Bearing in mind then that today’s AI, and indeed the only type of AI that is on any realistic horizon of development, is nothing more than advanced statistical models, let’s first examine how non-technical factors can turn potentially interpretable AI systems into black boxes.
iBorderCtrl: when even the ethics report is opaque
A perfect example of this type of non-technical opacity can be seen in the case of iBorderCtrl, a project funded by the European Commission’s Horizon 2020 funding initiative. iBorderCtrl claims to offer an AI-based lie detector service to help police the borders of the European Union. Numerous commentators have pointed to the lack of scientific basis for such technology, and Pirate Party MEP and civil liberties activist Patrick Breyer has launched a freedom of information request, which the European Commission has tried to dismiss.
Breyer requested that the relevant authorities make public certain documents, including an ethics assessment, but they refused on the grounds that the ethics report and PR strategy are “commercial information” of the companies involved and of “commercial value”. In this case, an extremely controversial and scientifically dubious technology (AI lie detection) is being used in an extremely sensitive context (migration) and is being funded by public money. It seems to defy common sense that any company providing such technology could be allowed to escape public scrutiny here.
Neither the European Commission nor the developers of iBorderCtrl have provided any evidence to suggest that there is any technical reason why we cannot inspect how their software works.
However, this system becomes a ‘black box’ because the software is being protected from scrutiny due to concerns about trade secrets. This seems particularly egregious in a case where the technology is so controversial and the risks of discrimination and injustice are so high. Moreover, such a flagrant disregard for transparency is totally at odds with the idea of ‘Trustworthy AI’ which the EU is so keen to promote.
In contrast to this approach of keeping ethics assessments hidden from public scrutiny, the Council of Europe has recommended that human rights impact assessments (HRIAs) be conducted by public authorities and be made publicly available:
Public authorities should not acquire AI systems from third parties in circumstances where the third party is unwilling to waive restrictions on information (e.g. confidentiality or trade secrets) where such restrictions impede or frustrate the process of (i) carrying out HRIAs (including carrying out external research/review), and (ii) making HRIAs available to the public.
Such measures would do a great deal to eliminate unnecessary opacity from the use of AI systems by public authorities.
The sad reality, however, is that we not only have no access to ethics reports, HRIAs, or the technical specifications of such systems – we do not even know if and where such systems are in use. Much of our knowledge of what systems are actually in use comes from ground work done by investigative journalists and civil society organisations such as Algorithm Watch, a German NGO which has produced a report mapping the use of ‘automated decision making’ in several European countries.
Eliminating these unnecessary obstacles to transparency would be a basic first step that public authorities could take. Knowing what systems are being used in the public realm (and often being funded by public money) is a necessary condition for effective oversight, but there are also many more ways in which opacity can be tackled in AI systems.
Why we need to prioritize explainability in AI systems
The discussion about what explanations should be required and at what level of detail is ongoing. In their draft guidance on ‘Explaining AI decisions’ (the ICO-AT Guidance introduced above), the UK’s ICO and Alan Turing Institute advise that the intelligibility and interpretability of an AI model should be prioritised from the outset and that end-to-end transparency and accountability should be optimised above other parameters. In other words, whenever possible, a company or a public institution that aims to automate decisions that will affect humans should use a model that can be interpreted – not a technical black box.
Making this choice from the outset would address some of our problems – making it easier for those using AI systems as decision support to reason about their limitations, for those affected by the systems to dispute their incorrect decisions, and for society to have better oversight over the way in which they are used. But we can safely assume that there will be companies and public institutions alike willing to make a different choice: because of a dubious belief that opaque systems automatically lead to improved accuracy; for PR purposes, to attract funding and interest by claiming to use the most advanced models; out of convenience, letting the neural network do all the work to avoid complex preprocessing; or even to evade scrutiny by deliberately picking an opaque model. What can we, the concerned public, do in these cases? Should we simply acknowledge that there will always be systems which function in opaque ways?
Indeed, it can be extremely difficult and time consuming to reverse engineer exactly how an AI system has made a decision – for example, how exactly the combinations of the individual pixel patterns of an image lead to the system deciding that it has seen a cat. Machine learning practitioners use the term ‘local explanation’ to refer to an interpretation of individual predictions or classifications. Such explanations can have different levels of granularity, and would typically approximate the process through which the algorithm generates a response, rather than attempting to reconstruct it in all its minutiae.
To provide such an explanation, one can focus on identifying specific input variables that had the most influence in generating a particular prediction or classification – often a difficult task if a prediction or classification was produced by a neural network (although for some neural network models, tools do already exist that can help generate local explanations). This difficulty is often used as a general caveat to rebuff any further conversation about the logic behind an AI system or its fairness. But we should never accept an obstacle in producing “local”, case-by-case explanations as a reason to not produce any explanation at all.
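For an interpretable model, a local explanation of this kind can be read off directly. The sketch below is a minimal illustration, assuming a hypothetical logistic regression with three invented input variables (the names, weights and bias are ours and do not describe any real system). It breaks a single prediction down into the additive contribution of each input.

```python
import math

# Hypothetical model: the feature names, weights and bias are invented for
# illustration and do not come from any real system.
WEIGHTS = {"chronic_conditions": 0.8, "age_decades": 0.3, "recent_admissions": 1.1}
BIAS = -4.0


def predict_probability(features):
    """Logistic regression: sigmoid of the bias plus the weighted sum of the inputs."""
    score = BIAS + sum(WEIGHTS[name] * value for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-score))


def local_explanation(features):
    """Return each input's additive contribution to the score, largest first."""
    contributions = {name: WEIGHTS[name] * value for name, value in features.items()}
    return sorted(contributions.items(), key=lambda item: abs(item[1]), reverse=True)


# One invented individual case to explain.
patient = {"chronic_conditions": 3, "age_decades": 6, "recent_admissions": 1}
probability = predict_probability(patient)
explanation = local_explanation(patient)

print(f"predicted probability: {probability:.2f}")
for name, contribution in explanation:
    print(f"{name}: {contribution:+.2f}")
```

Each contribution is simply the weight multiplied by the input value, so the ranking shows which variables pushed this particular prediction up or down. For a neural network no such exact decomposition exists, and that should not be the end of the conversation.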
Instead, following the ICO-AT Guidance once more, we can demand that owners of AI systems that qualify as ‘black boxes’ provide supplementary explanations which can shed light on the logic behind the results and behaviour of their system. These should include internal model explanations, including model type and architecture, the data used to train the models, the results of “stress tests” – describing the model sensitivity under a variety of cases – as well as the results of internal reverse-engineering, providing examples of local explanations for chosen inputs whenever possible. Frameworks for generating such descriptions (including Model Cards for Model Reporting or Datasheets for Datasets) are widely known among data scientists and often used for companies’ internal purposes.
The ICO-AT Guidance shows that a lot can be done to explain the inner workings of any AI-assisted system, if those involved are just willing to make the effort. We will not go into the technical details of how to produce internal or post-hoc explanations in this text, as these have been extensively covered by the ICO-AT Guidance.
The point that we want to reiterate, however, echoing the reasoning of the UK authorities, is that there are numerous decisions that owners or designers of an AI-assisted system need to make, starting with the framing of the problem they want to solve and ending with the choice of model they will use and its evaluation method. These decisions should be well-documented and justified because this is where the conversation about the values embedded into AI systems really starts: with human decisions regarding the design and optimisation of the whole system.
How human decisions shape every AI system
The key point that we are making here is that explanations of automated decisions need not hinge on the general public understanding how algorithmic systems function. In order to maintain a level of scrutiny over an AI system we do not have to understand every step in the machine learning process. What we do have to understand, however, are the choices, assumptions and trade-offs made by the people who designed this system – which all shape the behavior of the algorithm. This level of explainability can be achieved without opening the technical black box.
Every algorithm and every ML-assisted decision-making system is designed by humans to achieve certain goals. These goals are not defined by ‘artificial intelligence’, but rather by humans who designed it, who decide why an automated decision-making system is needed in the first place and what problem or question it is supposed to solve. Such a system can be employed to help humans find patterns in vast amounts of data, which is what AI systems do best. AI systems can also be employed to support humans in making judgements, based on predictions – or to replace a human decision-maker completely. There are relatively benign examples of automating judgement, such as spam detection or the choice of targeted advertising for consumer goods, and more controversial ones such as choosing targeted housing or job advertisements or detecting hate speech – where historical patterns of discrimination are more likely to be replicated, and where potential errors have serious consequences.
Finally, and most controversially, these systems can also be used to predict certain outcomes, including future human behaviour. Assuming that those outcomes follow clear patterns and that we have sufficient and appropriate data to discover them, this may be an achievable task. Such ideal conditions are rarely present, however – leading many to be opposed to the use of AI systems to predict things such as criminal recidivism or to replace human interviewers in choosing the ideal candidates for jobs.
Regardless of the task or function attributed to an AI system, engineers always begin with a problem to be solved: identifying cats and dogs in a huge dataset composed of pictures, for the sake of producing better search results; predicting the current interest or mood of a person browsing the internet, in order to maximise the chance that he or she will click on suggested content and remain engaged; or predicting whether a defendant is likely to commit a crime in the future.
Unpacking what the system really does: technical choices
In the case of each and every ML system, its goals (what it has been set to achieve/optimise for) can be extrapolated from a set of technical and design decisions. In order to understand how a machine learning system functions – and under what circumstances it is likely to fail – we need to first understand the task it was given or the question it was designed to answer. We can use sophisticated methods to ‘interrogate’ the system to detect the actual task it was given, but we can also simply ask the humans who designed it to tell us. What we want to argue is that this information should be available to the public – without having to ask for it.
So let us assume that we know why an AI-assisted system was implemented – to identify customers most likely to buy a certain product, to maximize click-through rate, or to maximize how long users spend on a website. Are we satisfied? Not yet. There is usually a long path between defining an overall objective and calibrating the system so that it delivers specific, desirable outcomes. In order to verify whether the system achieves its original aim, and is unlikely to behave in undesirable (e.g. discriminatory) ways, we have to understand and verify key technical and design choices.
Whenever we are dealing with a complex task or question, system designers have to make certain assumptions when translating (or “operationalizing”) a general goal into mathematical formulas or functions. It is one thing to say “we want users to stay engaged with online content”; it is another to define what “engagement” really means, how it is measured and what data about an individual can be used to predict his or her online behavior. It is one thing to define a goal as “matching social benefits with individuals who really need them”, and another to actually define who is “in need”, formulate the optimal allocation of outcomes, and encode unacceptable worst-case outcomes, acceptable tradeoffs and evaluation procedures in order to produce a fair result.
A general goal behind an AI system must be translated from business or political language into the language of mathematical formulas. These decisions and choices made “internally” by data scientists are no less meaningful than the choice of a general goal – just as the details of how a public policy was implemented are no less meaningful than the overall objective of that policy.
In fact, the success or failure of an AI system in fulfilling its general goal depends to a large extent on these smaller, “internal” decisions, often made by data scientists alone, and never communicated to the public. We argue that these technical decisions are especially crucial to understand when we reason about the fairness and impact of ML systems that are applied to humans.
What are these choices? To illustrate a typical decision-making process in the development of an AI-system, we imagined the story of a data scientist tasked with building a model meant to help hospital administrators select people who will be enrolled in a health management program.
One such model – used in hospitals across the United States – has recently fallen under public scrutiny after a Science article revealed that the model systematically recommends healthier White patients over Black patients with higher medical need for enrollment. The effect, as the authors explain, can be traced back to the choice to define patients with high health needs as those who also generate high healthcare costs – an assumption that is known to be wrong for Black patients, on whose treatment American hospitals spend less.
Our story is fictionalized and simplified, but, as a matter of fact, similar scenarios happen every day.
The management agrees that a custom model would be more appropriate in this case – they want to avoid using the proprietary system, especially given the recent Science article which showed using cost prediction as a proxy for health leads to an unfair allocation of resources between people of different ethnicities.
But some key questions remain. The meeting time is coming to an end, but Jasmine asks everybody to stay a few extra minutes and writes out a few bullet points on the whiteboard. What is the definition of a “needy” patient we can all agree on? A model will not stand ambiguity, she says – and it is best that doctors, not data scientists, make the decision. What potential errors do we need to make sure to avoid? Models are known to perpetuate discrimination, and the hospital cannot afford to be called out as discriminatory for not assigning women, people of color or poorer patients to the program, she warns. What are the acceptable trade offs – should we prioritize finding all patients in need, even if some will be included incorrectly, or should we avoid including healthy patients, knowing this will lead to us missing some sick patients? And what data should we be using – will using sensitive data in the analysis make it more difficult for management to access and use the model, given privacy regulations?
All these questions need to be answered in the course of Jasmine’s team’s work, and the answers they settle on will get translated into mathematical formulas that will define the functioning of the system.
Let’s now break those decisions – and their mathematical formulations – into separate components to see how they are embedded throughout the process of building an automated prediction model.
There will be trade-offs
The process of translating real-world problems into mathematical formulas always includes simplifications, assumptions and tradeoffs. AI metrics must correspond to quantifiable phenomena, which do not always correspond to human definitions of success in a given scenario, and their reliability is limited by the mathematical properties of the models we use. For example, we have to accept that a system calibrated to achieve maximum accuracy will often not perform equally well in making sure that no group is discriminated against (as in the case of the COMPAS algorithm). As Cathy O’Neil put it in a RedTail interview: “You can’t minimize false positives, maximize accuracy and minimize false negatives all at once. There are always trade-offs, so this is a way of surfacing those trade-offs in ways that it doesn’t take a math PhD to do.”
No matter how well-intentioned system owners are, and how much they might wish for a fair outcome, the outcome the system delivers will be determined not by their wishful thinking but by mathematical and statistical constraints.
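The trade-off between false positives and false negatives can be made concrete with a few lines of code. The sketch below is only an illustration under stated assumptions: the ten risk scores and labels are invented, and `error_counts` is a hypothetical helper of ours. It counts both kinds of error for three different enrollment thresholds.

```python
# Synthetic risk scores and true labels (1 = patient really needs the program).
# These numbers are invented purely for illustration.
SCORES = [0.95, 0.85, 0.80, 0.70, 0.60, 0.55, 0.40, 0.30, 0.20, 0.10]
LABELS = [1, 1, 0, 1, 0, 1, 0, 0, 1, 0]


def error_counts(threshold):
    """Count false positives and false negatives when enrolling scores >= threshold."""
    false_positives = sum(1 for s, y in zip(SCORES, LABELS) if s >= threshold and y == 0)
    false_negatives = sum(1 for s, y in zip(SCORES, LABELS) if s < threshold and y == 1)
    return false_positives, false_negatives


# Evaluate a lenient, a middle and a strict threshold.
results = {t: error_counts(t) for t in (0.25, 0.50, 0.75)}
for threshold, (fp, fn) in results.items():
    accuracy = 1 - (fp + fn) / len(LABELS)
    print(f"threshold={threshold:.2f}  FP={fp}  FN={fn}  accuracy={accuracy:.1f}")
```

On this toy data, raising the threshold reduces the number of healthy patients enrolled by mistake but increases the number of patients in need who are missed. Which point on that curve is acceptable is a human choice, not something the model can settle.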
Data does not always reflect the real world
AI systems are only as good as the data they have been trained on. Above we explained how human decisions shape an AI system, but there are also limits to what can be done. Even the best model will not perform well if it was trained on datasets that are only remotely connected to the phenomena the system is supposed to predict. We have to keep in mind that many phenomena that people may be interested in predicting – such as “health level” or “recidivism” – do not have simple definitions corresponding to measurements present in datasets. In these cases, data scientists depend on proxy variables – ‘close replacements’ for the variables of interest, such as the number of active chronic conditions or number of arrests (we can measure if people were rearrested after being released on bail, but ‘being arrested’ is not a good substitute for ‘having committed a crime,’ especially when we take over-policing of minorities into account).
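How much the choice of proxy matters can be shown with a toy example. The sketch below assumes eight invented patients in two hypothetical groups, where group "B" has lower recorded spending at the same number of chronic conditions. It then fills four program places twice, once ranking by cost and once ranking by chronic conditions. None of the numbers come from a real dataset.

```python
# Synthetic patients: (group, active_chronic_conditions, annual_cost).
# Group "B" is given lower spending at the same level of need, mirroring the
# pattern described in the text; every number here is invented.
PATIENTS = [
    ("A", 5, 10000), ("A", 4, 8000), ("A", 3, 6000), ("A", 2, 4000),
    ("B", 5, 7000), ("B", 4, 5600), ("B", 3, 4200), ("B", 2, 2800),
]


def enrolled_by(key_index, places):
    """Enroll the `places` patients who rank highest on the chosen column."""
    ranked = sorted(PATIENTS, key=lambda patient: patient[key_index], reverse=True)
    return ranked[:places]


def group_counts(patients):
    """Count how many enrolled patients belong to each group."""
    counts = {"A": 0, "B": 0}
    for group, _, _ in patients:
        counts[group] += 1
    return counts


by_cost = group_counts(enrolled_by(2, places=4))  # proxy: annual cost
by_need = group_counts(enrolled_by(1, places=4))  # proxy: chronic conditions
print("ranked by cost:", by_cost)
print("ranked by chronic conditions:", by_need)
```

With cost as the proxy, a group "A" patient with three chronic conditions is enrolled ahead of a group "B" patient with four, simply because less was spent on the latter. The model has done exactly what it was asked to do; the problem lies in what it was asked.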
If there is bias, there was a human decision behind it
Many studies that look at “discriminatory” algorithms explain bias by pointing to external factors such as poor data quality (“the only data that was available to train the system reflected the same bias” or “data scientists missed data about a certain group”) or a flawed choice of objective. Although such reasons may seem extrinsic, arising from external realities that designers of the system in question could not control, the truth is that there is a human decision behind each of these choices.
Owners of AI systems and the data scientists they employ are responsible for the choices made at each stage of development: for the choice of the data, even if that choice was very limited; for deploying the system despite the fact that they could not avoid bias; for not revising their main objective, regardless of the fact that the fair outcome they hoped for could not be achieved; and, finally, they are responsible for choosing to use an automated system in the first place, despite being aware of its limitations and possible consequences.
Not an isolated decision, but a decision-making cycle
Every well-managed process of designing and training an AI system involves rethinking objectives, improving the quality of data, changing the model and re-calibrating the system so that it does what it was supposed to be doing – or abandoning it altogether if it does not achieve desired objectives. It is not a one-way process.
Designers and data scientists have to iterate, come back and rethink the design. Although there are no legal requirements for logging the reasons behind design decisions, in practice this circular process is quite common and consistent among data scientists. However, the decision about when to stop this iterative process is not merely a technical or business decision.
Indeed, in cases where the AI system will have significant impact on people’s lives, this decision becomes political. For example, how can a software engineer alone decide that a predictive model is sufficiently good in not discriminating against minorities?
How can such decisions be left to the intuitions of people without expertise in these topics and, most importantly, how can it be that there is no possibility of public scrutiny about such significant decisions?
Conclusion: If the ‘technical’ is political, we need democratic control over these decisions
At the end of her book, Automating Inequality, Virginia Eubanks quotes a data science evangelist who dreams of replacing civil servants with AI systems:
“The information and insights will be immediate, real time, bespoke and easy to compare over time. And, ideally, agreed by all to be perfectly apolitical.”
What would it mean to replace civil servants, who are responsible for incredibly sensitive decisions with a massive impact on people’s lives, with perfectly apolitical AI systems? It would in fact mean hiding politics in the black box. It would mean obscuring the fact that arbitrary human decisions went into constructing these systems at every stage of their development, and it would mean masking these decisions – whether they be about granting bail or deciding if a child should be separated from their parents – beneath a facade of objectivity. In the process, such decisions are seemingly depoliticised, and this depoliticisation is very much in the interest of those who wish to avoid accountability for their political choices.
Enough cases of dangerous errors leading to discrimination and safety concerns have been reported in recent years for us to understand the potential implications, which in the most acute circumstances may mean life or death. A lot has been said about the negative impact of these errors after the fact, but key choices made by the owners of these systems or their teams – with regard to (sources of) data, the model used or the loss function that led to such results – were not debated in public. As Ruha Benjamin points out, we have a tendency to see glitches and failures in such systems as “a fleeting interruption of an otherwise benign system,” as merely technological problems, whereas we should see them as “rather a kind of signal of how the system operates.” If these decisions had been transparent from the start instead of only when things went wrong, and the key stakeholders had been involved and consulted, many of these negative impacts could have been avoided.
Many designers of AI systems care deeply about the impact of their work, and are often drawn to AI by imagining the potential of using these technologies for social good. Researchers and practitioners alarmed by publicized cases of machine learning bias now take extra steps to ensure their algorithms are fair. However, while admirable, these motivations are not enough – especially as claims of “social good” and “fairness” implicate normative, political choices, as there is no single agreed upon definition of “social good” and there are at least 21 definitions of “fairness” – many of them mutually exclusive. While in low-stakes scenarios those decisions may be innocuous, they become political in nature once they are made in the context of systems that will impact human lives. But while we might never be able to agree on the definition of “good” or “fair” AI, we should agree on the process of evaluating and discussing the political choices made by the designers of these systems. Currently the field lacks the language and perspective to evaluate and debate these decisions. This allows computer scientists to make broad claims about solving social challenges while avoiding rigorous engagement with the social and political impacts – which often leads to adding complexity to, if not exacerbating, the very challenges they hoped to address.
If we agree that, in the context of AI implementations, the technical often becomes political, the next step is to call for transparency of these decisions and some form of democratic oversight.
While full public scrutiny over every design decision in every AI system being designed today is neither feasible nor necessary, we can – and should – demand transparency in the technical choices behind AI systems used to make decisions that affect humans so that we can see, and potentially challenge, the political decisions that inform technical choices – particularly when used in the public sector. Without such transparency, we are denied the possibility of political participation, as contentious decisions are hidden inside the ‘black box’ narrative.
Initiatives and guidelines that inspired us:
- Explaining AI decisions in practice
- Model Cards for Model Reporting
- Datasheets for Datasets
- Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead
- Accountability of AI Under the Law: The Role of Explanation
- A Framework for Understanding Unintended Consequences of Machine Learning
- Problem Formulation and Fairness
- Roles for Computing in Social Change
- The Measure and Mismeasure of Fairness: A Critical Review of Fair Machine Learning
- The fallacy of inscrutability