AI Portraits

Facing the Truth

New Follies in Artificial Intelligence

Every human being has a more or less unique appearance – and can be identified from a picture or recording of them. Automated facial recognition, too, makes use of this circumstance. Because governments and corporations increasingly deploy these technologies, critics have been vocal about their ethical and political implications. Critiques of the technical aspects of facial recognition are less frequent: Photography still gets the benefit of the doubt as a representation of reality. Not only is this confidence no longer warranted in an age of ubiquitous photo manipulation; we also overlook that artificial intelligence has produced techniques of “racial recognition” which create new ethnic categories. These are no less dubious than their 19th and 20th century predecessors.

 

Automated facial recognition relies heavily on the indexical nature of photography: the assumption that what you see is what is. The credibility of photographs in turn rests on their objectivity, their proximity to the event they depict, even at one remove. Roland Barthes begins Camera Lucida with an exclamation about a photograph of Napoleon’s youngest brother Jérôme: “I am looking at the eyes that looked at the emperor.” Crucial here is the photograph’s capacity to authenticate: We presume that every photograph is co-natural with its referent, more so than is the case with paintings or drawings.

But as always, things are neither that simple, nor can we understand contemporary developments in artificial intelligence without drawing on much older insights into questions of representation. Hence, my intervention approaches this issue’s theme of “truth” in artificial intelligence from the perspective of what happens to humans who are physically caught up in the machinery of algorithmic logics. In these instances, especially where digital images are concerned, the true/false binary is of limited value in clarifying ethical problems in artificial intelligence. More than that, we come face to face with questions that threaten to explode the fundamental unit of social scientific analysis: What is a person, and how do we recognize them?

Essays on truth in images frequently begin with a vignette on the world’s first famous battlefield photograph: Roger Fenton’s Valley of the Shadow of Death. The 1855 photograph shows a bleak landscape and directs our gaze towards a cannonball-strewn road snaking through shallow hills. We are meant to consider the desolate aftermath of war: material debris. But as various critics have pointed out, the image is likely a fake; Fenton is presumed to have moved the cannonballs from the ditch onto the road for dramatic effect.

Fenton’s critics’ point is that the testimonial value of photographs should be interrogated because the photographer can alter the social world through addition or subtraction at different stages of the production process. And the testimonial and representational value of photographs has been under attack from multiple angles. In Facial Weaponization Suite, the artist Zach Blas builds amorphous masks from aggregate facial data that disguise the wearer’s identity. Think tanks have been warning about the significant risks of deepfakes – digitally altered videos created using deep learning that appear realistic but depict things that never occurred. Compared to photographs, videos long appeared relatively immune to falsification, but now even their testimonial backstop function is in question. This allows political actors to use them strategically for purposes of disinformation. Fenton’s photographs, Blas’ masks and the by now widespread deepfakes all show how the causal link between referent and likeness can break.

Manipulation made easy

In the age of big and digital data, this ease and apparent ubiquity of manipulation has rightfully given us pause – and scholars of disinformation have been doing an admirable job at drawing clarifying lines between true and false, authentic and fake. But there remain instances of manipulation that are harder to adjudicate. Take for example cases of personal identification for ascriptive identity categories like gender or race, where research has shown just how incompetent AI is when making sense of what it considers edge or ambiguous cases. In these instances where a machine “reads” a photo and offers up ethnoracial classification as output, we have to ask ourselves: What do images produce? Or put differently: What happens when artificial intelligence is unleashed on photographic images of human faces?

Answers that follow the true/false logic of Fenton’s cannonball discussion will lead to a dead end: Because a category like “race” is socially constructed, there cannot exist a “truthful” image of it. Let us try a different approach and time travel to 1842, when the recalcitrant French novelist Honoré de Balzac was persuaded to have his daguerreotype taken. As the photographer Nadar later recounted, Balzac balked at photographic portraits because he believed them to strip layers off their subjects’ bodies. This fear that something essential is being extracted has carried through the history of photography. More than that, the history of photography is enmeshed with the history of violence, a point that has since been taken up to narrate the coevolution of the medium alongside capitalism and colonialism.

Balzac’s metaphor of a body stripped layer by layer may strike us as hopelessly superstitious. But once we remove the element of superstition, his worries remain prescient in two senses. The lack of consent remains a problem today, and much like Balzac, we might worry about what is left of the person once our likenesses are turned into vectors and matrices. This moves us away from concerns about whether photographs provide accurate representations of the world and towards the question of what layers of meaning statistical ways of seeing impose on faces. Which returns us to the earlier question: What makes a person, and how can we capture them by scientific means?

How algorithms work in facial recognition

Automated Facial Recognition (AFR) is increasingly used by companies and states, for example in stop-and-frisk policing, but also at borders to identify refugees who are presumed to make fraudulent claims about their ethnicity. At the same time, the use cases are not necessarily explicitly racist: Facial recognition can also serve mundane marketing aims that increase company profits. And yet, automated facial recognition’s allure consists in no small part of its casual non-consensuality: Computer scientists praise it as a method of identification precisely because it can be performed from a distance, without the consent, and even without the knowledge, of the person photographed. Less efficient at one-to-one identity verification than iris scans, it is nonetheless valued for its ease of use on populations presumed recalcitrant. The issue of missing consent in the creation of image databases extends to other domains, too, for example when police mugshots are put to secondary uses in training new algorithms that are then used for facial recognition.

That these methods are extremely problematic is clear; as is the fact that they may subvert what we take to be “true.” For one, the missing consent creates an ethical or moral distance between camera and person; and second, automated facial recognition for “race” reveals an increasing distance from the person as a person: In comparison to other forms of artificial intelligence, the data remains surprisingly low-dimensional. Although algorithms use masses of images, they do so in ways that are less complex than prior methods of classification. This is perplexing at first: The moment we talk about artificial intelligence, machine learning and training sets, we talk about big data in the sense of a lot of input. This might lead us to presume that the complexity of analysis ought to match the vastness of the data. But as I show elsewhere, contemporary computer vision technologies are more data-poor than earlier technologies of racial persecution. Compared to National Socialist racial assessments, for example, they work largely on 2D representations of the face and ignore other types of data that could be deployed for classification. And before you take this to be good news: The fact that contemporary technologies are flatter and may be less obviously racially motivated does not mean that the selection and discrimination processes they afford are less “effective”.
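To make the talk of “vectors and matrices” and low-dimensional data more concrete, the following Python sketch illustrates the classic eigenfaces idea that underlies much early facial recognition: face images are flattened into vectors, projected onto a handful of principal components, and compared by distance. It is a toy illustration with synthetic data and invented dimensions, not a reconstruction of any of the systems discussed here.

```python
# A minimal, self-contained sketch of the "eigenfaces" approach to facial
# recognition. All data here is synthetic; real systems differ in detail.
import numpy as np

rng = np.random.default_rng(0)

# Pretend gallery: 200 grayscale face crops of 64x64 pixels, each flattened
# into a 4096-dimensional vector.
gallery = rng.normal(size=(200, 64 * 64))

# Centre the data and compute the principal directions ("eigenfaces").
mean_face = gallery.mean(axis=0)
centred = gallery - mean_face
_, _, components = np.linalg.svd(centred, full_matrices=False)
eigenfaces = components[:32]          # keep only 32 of 4096 dimensions

def embed(image_vector: np.ndarray) -> np.ndarray:
    """Project a flattened face image onto the low-dimensional eigenface basis."""
    return eigenfaces @ (image_vector - mean_face)

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# "Identification" then reduces to comparing 32-dimensional vectors.
probe = rng.normal(size=64 * 64)      # a new, unlabelled face crop
probe_code = embed(probe)
scores = [cosine_similarity(probe_code, embed(g)) for g in gallery]
best_match = int(np.argmax(scores))
print(f"closest gallery entry: {best_match}, similarity {scores[best_match]:.3f}")
```

The point of the sketch is the reduction itself: whatever a person is, the system retains only a short vector of numbers and a distance measure between such vectors.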

Balzac’s second worry concerns a reservation that economic sociologists describe as the problem of the category of the person. The argument goes something like this: An individual attains their status on the basis of group membership, but algorithmic logics break this link of shared class characteristics, leaving individuals without clear pathways towards collective association. Population groups that were previously recognizable and in sharp focus are watered down and may become statistically unrecognizable. This not only weakens their potential to mobilize politically; their lack of “legibility” can also translate directly into redistributive disparities. The sociologists Greta Krippner and Dan Hirschman have shown this with regard to gender and credit markets: Where actuarial logics previously consolidated risk on the basis of specific and socially salient group criteria, algorithmic logics determine status on the basis of attributes that are not necessarily shared across the group, and may in fact allocate each individual their own unique value. This leads to allocations of creditworthiness independent of sociologically meaningful criteria. Such an abstract form of evaluation refracts mathematical, but not social, logics. This in turn produces a new opacity in credit decisions that makes it harder than before for disadvantaged groups to prove that they are being disadvantaged or discriminated against. A stylized contrast between the two logics is sketched below.
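The following Python sketch contrasts the two logics in stylized form; the numbers and features are invented for illustration and are not drawn from Krippner and Hirschman’s work.

```python
# A stylized contrast between an actuarial logic, which pools risk over a
# socially salient group, and an algorithmic logic, which scores each
# individual on many weakly shared attributes. All values are invented.
import numpy as np

rng = np.random.default_rng(1)
n = 6

# Actuarial logic: everyone in the same category receives the same group rate.
group = np.array(["A", "A", "A", "B", "B", "B"])
group_rate = {"A": 0.04, "B": 0.07}
actuarial_scores = np.array([group_rate[g] for g in group])

# Algorithmic logic: a weighted sum over hundreds of behavioural features
# produces a unique score per person, untethered from group membership.
features = rng.normal(size=(n, 300))
weights = rng.normal(size=300)
algorithmic_scores = 0.05 + 0.02 * np.tanh(features @ weights / 300)

print("actuarial  :", np.round(actuarial_scores, 3))   # identical within groups
print("algorithmic:", np.round(algorithmic_scores, 3)) # unique per individual
```

In the first case, members of a disadvantaged group share a score and can point to it; in the second, each person receives an individual value whose relationship to any group is opaque.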

Computer vision strips the person layer by layer in both a statistical and a metaphorical sense, yet with different consequences than Balzac feared. The result: a perfidious new invention of visual stereotypes in digital space that is astonishingly less complex – and therefore even more stupid – than its 20th century predecessors.

References

Barthes, Roland. 1981. Camera Lucida: Reflections on Photography. London: Hill and Wang.

Coleman, Kevin, and Daniel James. 2021. Capitalism and the Camera. New York: Verso.

Lee-Morrison, Lila. 2019. Portraits of Automated Facial Recognition: On Machinic Ways of Seeing the Face. Bielefeld: transcript-Verlag.

Rini, Regina. 2020. "Deepfakes and the Epistemic Backstop." Philosophers' Imprint 20 (24).

Skarpelis, A.K.M. 2022. "What do Computer Vision Engineers Do All Day? On the Making of Ethnoracial Categorization in Computer Vision Practice."

—. forthcoming. "Horror Vacui: Racial Misalignment, Symbolic Repair and Imperial Legitimation in German National Socialist Portrait Photography." American Journal of Sociology.

Sontag, Susan. 2008. On Photography. London: Penguin Books.

13.12.2022

Image description: To illustrate this contribution on automated facial recognition, we have gone one step further with the help of artificial intelligence: these persons do not exist. Their faces were created by image editor Gesine Born using the software DALL-E 2. Her input: “portrait photograph of {… description, e.g. woman with a baby}, looking worried, street photography, Leica style, 35 mm, warm colors”. More information about the images can be found here (PDF in German).

 

This text is licensed under a Creative Commons Attribution 4.0 International License.