Data-driven technologies are not neutral. A decision to collect, analyse and process specific kinds of information is structured and motivated by social, economic and political factors. Such data operations may not only violate the right to privacy but also lead to discrimination against, and oppression of, socially marginalised communities. Discriminatory data processes and algorithms are a massive challenge for the modern human rights movement, one that requires non-standard solutions. The report “Between Anti-discrimination and Data” tries to shed light on this problem from the perspective of European civil society organisations.
Data of oppression
Debate on the potential impact of automated computer systems in the field of human rights may find shared ground in the history of surveillance of marginalised communities. Through the centuries, various institutions have created registers to identify “undesirable”, “abnormal” or “dangerous” individuals and populations. Usually, these were groups that we would now call socially marginalised. Governments in totalitarian as well as democratic countries have used census data to target ethnic, racial and linguistic minorities and indigenous populations, and to commit systematic abuses such as crimes against humanity, genocide or forced migration. The most extreme examples of such misuse are tied to the Holocaust, the Cultural Revolution in China, Apartheid in South Africa and the genocide in Rwanda.
A recent example of such discriminatory data collection comes from Sweden. In 2013, the press revealed that police in the Skåne region were using an electronic register containing data on people of Roma origin. The database included information on over 4,000 people, including one thousand children. Officially, the registry was used to combat crime and enforce immigration policy. At least 70 officers had access to the database. The press reports caused public outrage and led to a public investigation. As a result, the register was closed and, thanks to the intervention of a civil society organisation, some of the affected individuals were compensated.
While the examples above feature “low-tech” tools of oppression, today's rapid transformation in data storage, processing and transmission is giving rise to concerns about “hi-tech” discrimination. The vast amount of data in today's world means that institutions and companies need ever more sophisticated methods of analysing it. Various types of algorithms and models allow categorisation, the search for correlations, regularities and patterns, and automated decision-making.
These techniques differ in character, purpose and level of sophistication. For example, banks use algorithms to assess the creditworthiness of their clients. Welfare administrations use automated systems to try to predict fraud among people receiving benefits. The dominant narratives about these digital systems stress their ability to modernise decision-making, making it more efficient and less time-consuming. However, techno-analytical progress also has a darker side. There is an ongoing discussion about whether big data and algorithms may lead to discrimination and social exclusion.
There are two problems here. The first relates to poorly selected data: data that is incomplete or outdated, or in which specific social groups are, for example, underrepresented. As K. Crawford points out, data is not collected “equally” from everyone. There are situations in which groups or individuals are omitted because of their financial status, place of residence, knowledge of digital technologies or lifestyle. Data can also replicate historical biases. One of the early examples of this problem was the case of St. George’s Medical School in the United Kingdom. An automated system was used to screen incoming applications from potential students. The system was modelled on the school's previous selection decisions. Unfortunately, it incorporated the historical biases in those decisions into its analytical process and discriminated against women and people with non-European names.
The second major problem concerns not the quality of the input data but the design of the algorithm that uses it. Programming decisions are essentially human judgments and reflect a vision of how the world ought to be. For example, humans must decide on error types and rates for algorithmic models. In other words, someone has to decide whether the “reliability” of an algorithm is measured by the cases wrongly included in an algorithmic decision (false positives) or by the cases wrongly excluded from it (false negatives). Someone also needs to decide what level of wrongful inclusion or exclusion is acceptable.
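The choice between error types can be made concrete with a small calculation. The following is only a minimal sketch in Python, not the method of any system discussed in this article: the function name `error_rates_by_group`, the group labels "A" and "B" and all of the records are invented for illustration. It counts, separately for each group, how often people were wrongly flagged (false positives) and wrongly cleared (false negatives).

```python
from collections import defaultdict


def error_rates_by_group(records):
    """Compute false positive and false negative rates per group.

    `records` is an iterable of (group, predicted, actual) tuples, where
    `predicted` is the model's flag and `actual` is what really happened.
    All names here are hypothetical and used only for illustration.
    """
    counts = defaultdict(lambda: {"fp": 0, "fn": 0, "neg": 0, "pos": 0})
    for group, predicted, actual in records:
        c = counts[group]
        if actual:
            c["pos"] += 1
            if not predicted:
                c["fn"] += 1  # wrongly excluded from the flagged set
        else:
            c["neg"] += 1
            if predicted:
                c["fp"] += 1  # wrongly included in the flagged set

    rates = {}
    for group, c in counts.items():
        rates[group] = {
            "false_positive_rate": c["fp"] / c["neg"] if c["neg"] else None,
            "false_negative_rate": c["fn"] / c["pos"] if c["pos"] else None,
        }
    return rates


# Invented toy data: (group, flagged as high risk?, outcome occurred?)
records = [
    ("A", True, False), ("A", True, False), ("A", False, False),
    ("A", False, False), ("A", True, True), ("A", False, True),
    ("B", True, False), ("B", False, False), ("B", False, False),
    ("B", False, False), ("B", True, True), ("B", True, True),
]

rates = error_rates_by_group(records)
for group in sorted(rates):
    print(group, rates[group])
```

In this invented data set, half of the people in group A who did nothing wrong are flagged, compared with a quarter in group B. Which of these rates should be minimised, and how large a gap between groups is tolerable, is precisely the kind of human judgment described above.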
One of the best examples of this problem is the widely discussed case of a risk assessment algorithm used by courts in the USA, described in a story published by ProPublica. The system assesses the risk that a defendant will commit more crimes in the future, which influences the decision on the defendant’s temporary detention. ProPublica journalists found that the system may treat black people less fairly than their white counterparts.
The box full of errors
Unequal treatment related to data processing can occur in many different situations. Very often, these cases will be difficult to detect. The system is a black box: a non-transparent mechanism that performs unintelligible mathematical operations on data. This lack of transparency means that people affected by such decisions (for example, those who were refused credit, did not get a place at university or did not get a job) have limited opportunities to learn the reasons behind them and to contest them.
A discriminatory or dangerous black box is rarely the result of deliberate human action. More often, developers simply make mistakes. A telling example is the system used in the state of Michigan in the USA. The state's unemployment agency introduced an automated system to detect welfare fraud. The system, which operated between 2013 and 2015, was full of mistakes. Because of bad design, it erroneously accused more than 20,000 claimants of fraud, stopped their unemployment payments and imposed fines reported to be as high as $100,000. The case ended with multiple lawsuits filed in 2015 against the state administration.
A challenge for civil society?
Oppressive data collection and discriminatory algorithms may lead to violations of human rights on a massive scale. To fight this, human rights organisations will have to develop new, non-standard approaches and out-of-the-box tactics. Detecting and neutralising data misuse requires from civil society both technological expertise and sensitivity to social justice. However, the two do not always go hand in hand. Very often, civil society organisations operate in silos, with digital rights, anti-discrimination and the protection of minority rights handled separately. In times of algorithmic discrimination, these forces should combine.
In our report, we recommend a few ways to deal with the problem of algorithmic discrimination. We see potential in working with investigative journalists, in making progressive use of the newly adopted GDPR and in building coalitions between organisations from different sectors. There is also a tremendous need to develop new narratives and strategies that combine digital privacy and social justice concerns. Generally, we see three ways to support CSOs in engaging with the problem of automated discrimination:
- Resource digital rights or data privacy advocates to recognise anti-discrimination as a critical concern for data protection, and to take on automated discrimination as a priority for their work;
- Support anti-discrimination groups and other groups focused on equity and justice in recognising connections between their core work and values and “high-tech” discrimination; and,
- Acknowledge, cultivate, and support a flexible approach to highlighting and problem-solving for automated discrimination.
Jędrzej Niklas – PhD, expert in the Department of Media and Communications, London School of Economics and Political Science, former expert of Panoptykon Foundation.