IGF 2020: Aiming for AI explainability: lessons from the field. Summary of the session


AI systems will soon determine our rights and freedoms, shape our economic situation and physical wellbeing, and affect market behaviour and the natural environment. Amid the hype around ‘problem-solving’ AI, calls for (more) accountability in this field are gaining urgency.

On the other hand, one of the recurrent themes in AI discussions is what – for the purpose of this session – we called “the black box argument”. It is often argued that the logic of the most sophisticated (and allegedly most efficient) AI systems cannot be explained and we simply have to accept that; that explainability would come at the price of stalling progress in developing successful AI applications.

In this session we reflected on the value and implications of demanding explainability from AI, especially for those AI systems that have a serious impact on individuals and society. At the outset of the discussion we noted that explainability is only one of the demands formulated by those who advocate for responsible AI, and it should not be seen as a goal in itself.

[Explanatory note: in the scientific debate, "explainability" is understood as an active feature of a learning model: a description of the processes the model undertakes, with the intent of clarifying its inner workings. It is related to the notion of an argument or explanation, where there is an interface between the user and the decision-maker.]

During the session we discussed explainability in a broader sense, as a set of demands made by various stakeholders (incl. researchers, data scientists and civil society organisations) towards machine learning models in order to increase their accountability. In this broad sense the demand for "AI explainability" includes:

  • interpretability of individual decisions
  • interpretability of the AI model
  • transparency with regard to the purpose of the system, input data, training methods and key decisions taken in the training process
  • availability of documentation, which should include the results of tests performed to check the robustness and other key features of the system.

The session was composed of interviews with invited speakers (first half) and open, moderated discussion (second half). We discussed the following topics and made the following arguments:

Practical approach to explainability: how can it be achieved?

While there are tests and edge cases for hardware and software, there is no "testing standard" for AI. Speakers agreed that responsibility for the impact of AI systems needs to start inside the organisations that produce them. AI system developers should perform tests to detect potential flaws and risks ("surface unintended consequences" – Hong Qu) before implementation, and they should make the results of these tests public (e.g. using the model cards framework).
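The model cards idea mentioned above can be pictured as a small, structured record published alongside a model. Below is a minimal sketch; the field names are loosely inspired by the model cards proposal, and every model name, dataset description and metric value is invented for illustration:

```python
from dataclasses import dataclass

# Hypothetical, minimal "model card" record; fields and values are
# illustrative, not a real published card.
@dataclass
class ModelCard:
    model_name: str
    intended_use: str
    out_of_scope_uses: list
    training_data: str
    evaluation_results: dict   # metric name -> score, ideally per subgroup
    known_limitations: list

card = ModelCard(
    model_name="loan-risk-classifier-v2",   # hypothetical system
    intended_use="Pre-screening of consumer loan applications",
    out_of_scope_uses=["employment decisions", "criminal justice"],
    training_data="2015-2019 anonymised loan outcomes (internal dataset)",
    evaluation_results={
        "accuracy_overall": 0.87,
        "accuracy_group_a": 0.90,   # disaggregated scores help surface
        "accuracy_group_b": 0.81,   # disparate performance pre-deployment
    },
    known_limitations=["undertested on applicants under 21"],
)

# Publishing such a card lets any stakeholder check, for example,
# whether performance differs across groups:
gap = (card.evaluation_results["accuracy_group_a"]
       - card.evaluation_results["accuracy_group_b"])
print(f"subgroup accuracy gap: {gap:.2f}")
```

The point of the structure is exactly the "surfacing" the speakers called for: disaggregated test results and explicit out-of-scope uses become visible before deployment, not after harm occurs.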

AI can have a significant impact on society, not only on individuals. In order to capture this impact we need to design different types of explanations (the ICO proposed such a framework in its guidance on explainability). It is also advisable to distinguish local from global explainability, e.g. explaining a single decision (“why it has been reached and what can I do about that”) vs the entire system (understanding how the model functions, whether it discriminates against certain groups, whether it is biased, etc.). We must take a human-centred design approach to monitor for disparate impacts which harm real people, and for second-order externalities that might affect vulnerable populations.
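The local/global distinction is easiest to see on a deliberately simple model. In the hypothetical linear scorer below, the global explanation is the weight table itself, while a local explanation lists how much each feature contributed to one particular decision (for real black-box models, techniques such as LIME or SHAP approximate such per-decision contributions). Feature names and weights are invented for the example:

```python
# Toy linear scoring model: transparent enough that both kinds of
# explanation can be computed exactly.
weights = {"income": 0.5, "debt": -0.8, "years_employed": 0.3}

def score(applicant: dict) -> float:
    """The model's decision score for one applicant."""
    return sum(weights[f] * applicant[f] for f in weights)

def local_explanation(applicant: dict) -> dict:
    # "Why did *this* person get this score?" -- per-feature contributions.
    return {f: weights[f] * applicant[f] for f in weights}

def global_explanation() -> dict:
    # "How does the model behave overall?" -- here, simply its weights,
    # which e.g. reveal that debt always counts against the applicant.
    return dict(weights)

applicant = {"income": 4.0, "debt": 3.0, "years_employed": 2.0}
print(score(applicant))              # the decision itself
print(local_explanation(applicant))  # which features drove it, and by how much
print(global_explanation())          # the system-wide behaviour
```

The global view is what lets an auditor ask whether the model systematically disadvantages a group; the local view is what an affected individual needs in order to contest their own decision.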

The most important aspect to be explained is not technical, but rather political: what is the purpose of the system and what are its success factors ("this is often the most striking blind-spot" – Hong Qu).

Even black-boxed systems can be explained, given the choice of the right type of explanation. On the other hand, speakers acknowledged that 'unknown unknowns' in sophisticated AI systems are very complex. There will always be emergent risks in complex systems that no one can explain or anticipate. There is no "full" explanation: results (lived experiences) keep coming in, and are in turn used to continuously tune (both manually and automatically) the AI.

Auditing the systems that companies and governments deploy on millions of people is only a starting point for more substantial reform. In many cases, the legal regulations we need to challenge corporate and government uses of AI are already in place; they just need to be enforced or pursued through litigation. Authorities and courts are prepared to handle confidential information and can therefore be employed to audit and investigate AI systems. They also have the democratic mandate to do so. The only element often missing is the political will to challenge unfair AI systems.

Value of explainability for people affected by AI systems

People should know when they are being dealt with by a computer rather than a human being. It's disrespectful to allow people to think they are interacting with a person when you are actually processing them through a machine. We also see the inverse: sham 'AI' start-ups which pretend to be doing AI when they are actually relying on human labour, often poorly paid in the digital equivalent of a sweatshop.

On the other hand, explainability as we see it today is not always actionable for end users. Most stakeholders don't need to understand this level of technical information. Also, the way field experts talk about explainability often misses the bigger picture. It tends to focus on the interpretation of an individual result (i.e. how the AI system went from general features to an individual decision), but ignores other essential features of the system that are equally important, such as its robustness, safety or fairness.

Different kinds of explanations of how AI decisions are made might not have that much effect on whether people think they are fair; their intuitions about whether it's ok to make a decision about someone using a statistical generalisation tend to override any differences between different ways of explaining those generalisations.

Affected individuals are often more concerned with transparency and obtaining access to basic information, such as: what data is used by the system, who may have access to their data, what types of decisions are produced, and how to challenge them (rather than interrogating the model itself). It is crucial to give affected individuals access to justice – a chance to contest an unfair or erroneous decision – rather than a mere explanation of why the system produced a certain result.

Explainability should be a means to establish trust, which is the end goal. Everyone has a mental model of how the system should work. Whenever there is a mismatch between people's mental models and how the system actually works, the onus is on the system operator to offer transparent processes for appeals, recourse, oversight, and community-led challenges to improve or shut down the AI system.

Immaturity of the field and need for further action

Explainability, as an evolving field of research, remains relatively immature and is geared towards AI practitioners. There are still no standard approaches, benchmarks, or indeed definitions of the field, though different standardisation processes, including ISO/IEC JTC 1/SC 42, are making progress.

The good news is that explainable AI is becoming attractive in the B2B environment: business clients are starting to see value in explainability and transparency, because it improves AI systems' performance, increases trust and helps them maintain credibility. Speakers noted an interesting outside-inside dynamic: tech workers need external pressure on their companies so they can effectively advocate for improvements internally.

Participatory design, which engages affected individuals and communities throughout the process, seems to be a very promising approach. Inclusion of people should be one of the key objectives in the design process, as it is the best way to learn about their needs and check how they receive (grasp) information about the functioning of AI systems. However, the real question in this context is how to be inclusive and tackle obvious power imbalances. In order to solve this problem we really need to work together – ethicists, lawyers, computer scientists, etc.

Among other things, we need more research on AI accountability and the harms arising in non-English speaking countries and less developed countries, especially the global South. Further research is also needed to develop effective forms of human oversight, to make sure that humans are prepared and well-equipped to challenge the recommendations made by AI systems.


In conclusion, speakers agreed that explainability is not a silver bullet. In particular it “does not solve fairness”. But, if implemented well, it may empower system users to question bias and lead to other desirable results, such as increased reliability and trust. We need to continue to exercise caution in integrating automated or AI systems into more complex social decision-making processes, given that they are not yet reliably understandable or actionable by the public (Philip Dawson).

“Transparency should be proportional to power. So where AI systems are used by the powerful in ways which affect people, especially people without power, then we should insist on more transparency. AI systems which can't be explained should not be used in those circumstances. But we need to think beyond transparency, as well as thinking about multiple forms of transparency. Perhaps more important than breaking open the black box of the AI model, is breaking open the corporate or governmental black box” (Reuben Binns).

Katarzyna Szymielewicz