Because of You (2024)

Datafication and Distortion in Generative AI

In a 2022 paper, Can There Be Art Without an Artist?, Dr. Avijit Ghosh and Genoveva Fossas discussed the work of human artists within training data for generative AI tools. In the appendix, they connect the practice of scraping training data without consent to its precedents in the biological sphere, citing the case of Henrietta Lacks:

“Henrietta Lacks is known as “immortal” for a reason – though she died of cervical cancer at age 30, scientists have used her remarkable cells countless times since. HeLa cells, that never stop dividing and hence are functionally immortal, have played a role in some of the most important medical advancements of our time. They were used to develop the polio vaccine, chemotherapy and cloning technology, among others. However, the original cells that started the immortal HeLa cell line were taken from her without her consent or the awareness of her family. Now her family is demanding compensation from Johns Hopkins University who first took the cells. The HeLa cell controversy is yet another cautionary tale about the dangers of cutting out human creators in the pursuit of technology and a lesson in ethics, privacy and consent in technological progress.”

Because of You is a digital video work inspired by this connection, and subsequent conversations between Eryk Salvaggio and Dr. Avijit Ghosh, which began at a presentation on AI and art at SXSW in 2023.

The piece is defined by its soundtrack, a recording of jazz musician Tab Smith’s 1951 instrumental ballad, “Because of You,” slowed down to a quarter of its speed while isolating specific harmonic features. The song is both a historical marker, released the year that Lacks’ cells were taken without her consent, and a reflection of Lacks’ love of dancing: family members recalled that Lacks frequented dance halls at the time that these cells were being studied and analyzed without her knowledge.

In its title, and in the act of stretching this song out across a broader expanse of time, we emphasize the reach of that event, but also the extension of time through Lacks’ cells and the distortions created by time and the reprocessing of memory into the abstraction of data. Who is Henrietta Lacks? She is a person, and she is the progenitor of a series of cells that survived her without her name — data about her body without reference to the person.

The video piece explores links between the datafication of the body and the datafication of human expression. AI generated images cannot faithfully generate an image of Lacks’ face, as there is sparse documentation of who she was. In this piece, we draw from an AI generated image of Henrietta Lacks, which appears close to the original but then drifts toward the image of a black woman, clearly styled for 2024. We revive the memory of Henrietta Lacks by using her name in the prompt, but that memory is distant from Lacks herself, much as it is with her cells.

A still from Because of You.

The film is narrated by a digital voice double of Dr. Avijit Ghosh — trained on a short sample of his voice. This regenerated voice describes the abstraction of Lacks’ life into cellular data, drawing parallels to the distortions introduced to his own voice — including the removal of his Bengali Indian accent, unexpectedly replaced with the inflections of a statistically averaged North American.

Noise is a central element of this piece. In a sense, it envisions the training of a diffusion model as an in-between place: between information, contributed by humans, and the complete stripping away of that information, which is an aspect of AI model training.

Here, digital static stands in for the role of Gaussian noise in the training and generation of AI images. Noise is deeply embedded into the visual language of AI images, in that training a model requires the removal of information from human archives in order to learn how to rebuild images from nothing. This process of information removal and restoration (the addition and subtraction of noise) is repeated millions of times while training a diffusion-based model.
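As a rough illustration of the addition and subtraction of noise described above, the sketch below blends a tiny "image" with Gaussian noise and then recovers it. It is a simplified, assumed version of the forward step used in diffusion-style training, not the pipeline behind this piece; the function names, the linear schedule, and its default values are all hypothetical choices made for the example.

```python
import math
import random


def linear_beta_schedule(steps, beta_start=1e-4, beta_end=0.02):
    """Hypothetical linear noise schedule: one small noise variance per step."""
    if steps == 1:
        return [beta_start]
    increment = (beta_end - beta_start) / (steps - 1)
    return [beta_start + i * increment for i in range(steps)]


def signal_fraction(betas):
    """Cumulative product of (1 - beta): how much of the original survives by each step."""
    kept = 1.0
    fractions = []
    for beta in betas:
        kept *= 1.0 - beta
        fractions.append(kept)
    return fractions


def add_noise(pixels, kept, rng):
    """Blend a flat list of pixel values with Gaussian noise.

    Returns the noisy pixels and the noise itself, which is what a model
    would be trained to predict.
    """
    noise = [rng.gauss(0.0, 1.0) for _ in pixels]
    noisy = [
        math.sqrt(kept) * p + math.sqrt(1.0 - kept) * n
        for p, n in zip(pixels, noise)
    ]
    return noisy, noise


def remove_noise(noisy, noise, kept):
    """Undo add_noise using the exact noise (a trained model only estimates it)."""
    return [
        (x - math.sqrt(1.0 - kept) * n) / math.sqrt(kept)
        for x, n in zip(noisy, noise)
    ]


if __name__ == "__main__":
    rng = random.Random(0)
    image = [0.0, 0.25, 0.5, 1.0]  # stand-in for real pixel data
    fractions = signal_fraction(linear_beta_schedule(1000))
    for step in (0, 250, 999):
        noisy, _ = add_noise(image, fractions[step], rng)
        print(step, round(fractions[step], 5), [round(v, 2) for v in noisy])
```

Under this assumed schedule, almost none of the original image survives by the final step. What remains is close to pure static, which is the starting point for generation.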

Here, Salvaggio draws parallels to the “removal of information” from Lacks, in her cells and in her visage, and the removal of information from the images created by humans for training data: images which, Salvaggio has noted, include not only visual art, but private photographs, snapshots, and even documentation of child abuse, trauma, and historic atrocities. Salvaggio’s work often emphasizes this abstract destruction and recombination of all visual culture in AI models.


You can view this project on Hugging Face as well! Because of You was accepted in the CVPR 2024 AI Art Gallery running Jun 19-21, 2024 as part of the 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition.


Eryk Salvaggio

Eryk Salvaggio (cyberneticforests.com) is a researcher and new media artist interested in the social and cultural impacts of artificial intelligence. His work, which is centered in creative misuse and the right to refuse, critiques the mythologies and ideologies of technologies that ignore the gaps between datasets and the world they claim to represent. A blend of hacker, policy researcher, designer, and artist, he has been published in academic journals such as Patterns, Leonardo, and Interactions of the ACM, spoken at music and media festivals, and has consulted on tech policy at the national level.

Dr. Avijit Ghosh

Dr. Avijit Ghosh is an Applied Policy Researcher in the Machine Learning and Society Team at Hugging Face 🤗. He works at the intersection of machine learning, ethics, and policy, aiming to implement fair ML algorithms into the real world. He has published and peer-reviewed several research papers in top ML and AI Ethics venues, and has organized academic workshops as a member of QueerInAI. His work has been covered in the press, including articles in The New York Times, Forbes, The Guardian, ProPublica, Wired, and the MIT Tech Review. Dr. Ghosh has been an invited speaker as a Responsible AI expert at events held by organizations such as SXSW, Trustworthy ML Initiative, AI Risk and Vulnerability Alliance, and AI Village. He has also engaged with policymakers, having spoken to US Congressional Staffers and to the UK Government Centre for Data Ethics and Innovation. His research and outreach have led to real-world impact, such as helping shape regulation in New York City and causing Facebook to remove their biased ad targeting algorithm.