In 2018, Laura Ghianda posted an image of the Venus of Willendorf on Facebook. Facebook flagged and removed the image, deeming the artwork “inappropriate content,” and the decision stood despite four attempts to appeal it. In October 2021, in response to Instagram’s similarly heavy-handed content moderation practices, Vienna’s tourist board announced it was starting an OnlyFans account where visitors could view sexually suggestive artworks from the collections of several prominent Viennese museums.
As museums digitize and disseminate their collections through networked platforms, curatorial departments face off against precariously employed content moderators and pornography detection algorithms to negotiate cultural, moral, and artistic distinctions — between art and pornography, original and copy — in real time.
But in this new image economy, much remains unchanged. Wealthy institutions continue to generate enormous social and economic capital by displaying, classifying, and quantifying images of bodies. As the power to determine the bounds of appropriate expression shifts from cultural gatekeepers to libertarian technocrats, whose gaze — colonial, computerized, or some hybrid of the two — is encoded?
These images sit uncomfortably between photograph and data. I found them by searching GitHub’s ‘nudity detection’ tag, in an open source repository called NudeNet: Neural Nets for Nudity Classification, Detection and Selective Censoring.
Tucked away at the end of the repository’s README, following instructions in several programming languages for using the algorithm to classify, detect, and selectively censor nudity on one’s own machine or website, the algorithm’s author had placed a link to a .zip file containing 20,000 images. The images, bedapudi6788 wrote, were a small fraction of “the auto-labelled data” he “used to train the base Detector.”
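The README’s instructions amount to only a few lines of Python. The sketch below follows the class and method names the repository documents; they have shifted between releases, and the file path is a placeholder, so treat this as an illustration rather than current documentation.

```python
# A minimal sketch of the usage the NudeNet README describes.
# Class and method names follow the repository's documentation and may
# differ across versions; "photo.jpg" is a placeholder path.
from nudenet import NudeClassifier, NudeDetector

classifier = NudeClassifier()
print(classifier.classify("photo.jpg"))
# Expected to return per-image 'safe'/'unsafe' probabilities,
# e.g. {'photo.jpg': {'safe': 0.02, 'unsafe': 0.98}}

detector = NudeDetector()
print(detector.detect("photo.jpg"))
# Expected to return a list of detections, each with a label,
# a confidence score, and bounding-box coordinates.

# "Selective censoring": cover the detected regions and write the
# result to a new file.
detector.censor("photo.jpg", out_path="photo_censored.jpg")
```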
I clicked the link and downloaded the file, and there, in a folder titled detector_auto_generated_data, was 696.3 MB of pornography and two spreadsheets. One spreadsheet consists of a simple key to the algorithm’s ‘classes,’ coded 0–5: exposed_belly, exposed_buttocks, exposed_breast_f, exposed_genitalia_f, exposed_breast_m, exposed_genitalia_m. The other spreadsheet contains 35,502 lines, one or more for each of the 20,000 images, with information about which category of nudity was detected and its precise coordinates.
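To make that structure concrete, here is a sketch of how the two spreadsheets might be read together in Python. The class key follows the 0–5 coding listed above; the annotation file’s name and column order are my assumptions, since the archive itself does not document them.

```python
import csv
from collections import Counter

# The class key from the first spreadsheet, coded 0-5 in the order
# listed above (an assumption about how codes map to labels).
CLASSES = {
    0: "exposed_belly",
    1: "exposed_buttocks",
    2: "exposed_breast_f",
    3: "exposed_genitalia_f",
    4: "exposed_breast_m",
    5: "exposed_genitalia_m",
}

# Hypothetical layout for the second spreadsheet: one row per detection,
# holding a filename, a class code, and bounding-box coordinates
# (x_min, y_min, x_max, y_max). The real file may name or order its
# columns differently.
counts = Counter()
with open("detector_auto_generated_data/annotations.csv", newline="") as f:
    for row in csv.reader(f):
        filename, class_id = row[0], int(row[1])
        x_min, y_min, x_max, y_max = (float(v) for v in row[2:6])
        counts[CLASSES[class_id]] += 1

# One tally per class across the 35,502 annotation lines.
for label, n in counts.most_common():
    print(f"{label}: {n}")
```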
Pornography is notoriously difficult to define. Justice Potter Stewart’s infamous phrase, “I know it when I see it,” uttered in the 1964 Supreme Court case Jacobellis v. Ohio, remains more or less the state of the art. “Any explicit sexual matter with the purpose of eliciting arousal” is the vague definition often cited in academic papers where computer scientists detail the design and training of their pornography-detecting algorithms — when they bother to define it at all.
In a literature review of papers describing methods for automatically detecting internet porn, researchers found that 84% simply “did not define pornography,” and, among those that did, “no studies gave the same definition.” They attributed this to the “difficulties in developing a universal definition” of pornography, voicing hope that “a better understanding” will lead to “more consistent and standardized ways of measuring these issues.”
Although the review was published in 2010, the hoped-for insights have not emerged in over a decade of intensive research and widespread public use. A paper published in 2020 admits that the “currently available method to detect nude image[s] is still crude,” while another concedes that “the term ‘pornographic’ itself is ambiguous.” In a 2014 paper on the evocatively titled “Skin Sheriff” algorithm, the authors, in a brief aside, lament that determining whether an image is pornographic “is not always possible. Even for humans it can be a subjective decision.” Nevertheless, the paper goes on to describe in precise detail the algorithmic process their “sheriff” uses to definitively classify an image into one of two binary categories: pornographic or non-pornographic.
These images are stolen. bedapudi6788, the algorithm’s author, doesn’t mention where the images used to train NudeNet come from; their source is simply taken for granted. Almost every computer vision algorithm — the technology underlying applications built for tasks like recognizing faces, detecting emotions, and reading license plates — is trained on a massive dataset of images scraped, without consent, from the internet.
On GitHub, it’s easy to find repositories with names like NSFW Data Scraper and NSFW data source URLs. The latter, described as “lists of URLs that will help you download NSFW images” for “building big enough dataset to train robust NSFW classification model,” boasts that “after downloading and cleaning it's possible to have ~ 500GB or in other words ~ 1 300 000 of NSFW images.”
As I scroll through the detector_auto_generated_data folder, I see photographs of thousands of people. Some of them likely posted their images on Tube sites and Reddit forums to advertise their paid content. Some may have shared their photographs freely, as expressions of their sexuality. Many of the images were undoubtedly stolen: reposted from subscription sites by anonymous fans, perhaps, or from phones and hard drives by blackmailers or abusive exes.
No matter how these photographs got here, it’s safe to say that the people laboring and loving in them would not consent to their use in training NudeNet, an algorithm built expressly to purge the internet of images like them.
There is a direct line from the colonial archive to the machine learning data set. From phrenology to sexology, scholars have traced the integral role photographic archives and bodily measurement played in colonial constructions of race and gender, intelligence and morality, human and something less-than. Today, once again, scientists are using technologies of vision and quantification to transform bodies into data — and using that data to classify, predict, discipline, and erase.
The people captured in NudeNet’s training data are not enslaved. But in their stolen images they are rendered inhuman. Their bodies are painstakingly measured, labeled, and sorted into detailed, hierarchical taxonomies.
NudeNet and other algorithms like it learn from these bodies-that-are-data how to flatten complex gradients into binaries — pornographic or non-pornographic, male-presenting nipple or female-presenting nipple, acceptable or censored — whose enforcement brings material consequences. As always, the most marginalized are most harmed, online and off.
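The flattening itself is mechanically trivial. The sketch below is schematic rather than any particular system’s code: a model emits a continuous score, and a threshold, chosen as a matter of policy, collapses it into one of two labels.

```python
def flatten_to_binary(unsafe_score: float, threshold: float = 0.5) -> str:
    """Collapse a continuous model score into one of two labels.

    Schematic only: the threshold is a policy choice, not a property
    of the image. Nothing in the score says where 'pornographic' begins.
    """
    return "pornographic" if unsafe_score >= threshold else "non-pornographic"


# The same photograph can fall on either side of the line,
# depending on where the threshold is drawn.
print(flatten_to_binary(0.51))                 # pornographic
print(flatten_to_binary(0.51, threshold=0.6))  # non-pornographic
```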
Machines only know what we teach them. In the archive of millions and millions of images that constitute their training, what are they learning to remember?
These images are not mine to use. Their existence, in this context, is a violation. The privacy, labor, and agency of the people captured in them have been so thoroughly denied that their consent, or lack thereof, never even occurred to the researchers who used their intimate photographs to train an algorithm.
To the algorithm and its designers, the people in these stolen images matter only in aggregate, each body consumed and reconstituted as one data point among millions. Although they are painfully exposed, they were never meant to be seen individually. As I scroll and scroll, I want to acknowledge and address the people captured in the 20,000 images that make up a vanishingly small slice of NudeNet’s training data.
How do you show something that shouldn’t exist? How do you show something that’s not yours to show? How do you hold the archive accountable without reinscribing the violence that produced it?
For decades, artists, activists, and scholars have asked these questions as they confront and untangle the violations of the colonial archive. Their clearest answers — repatriation and reparations — may be impossible in the machine learning archive, filled with digital objects that are endlessly copied and circulated. But other, less material, practices may offer answers.
Stephanie Syjuco, a contemporary Filipina artist, works with archives of anthropological photographs from the Philippines. She uses her body and a range of formal methods to intervene in them by shielding the subjects from the camera’s extractive gaze. In a recent work, Shutter/Release, Syjuco uses a Photoshop tool called the “healing brush” to digitally remove the subjects from old prison mugshots.
In the now-“healed” images, the documents, along with spectral traces of their inhabitants, remain. But the people have been, in Syjuco’s words, “liberated” from their colonial and carceral environments.
In NSFW Venus, I apply Syjuco’s healing brush strategy to these images that are not mine to use. I draw my finger across the trackpad, moving my cursor over some of the people captured in NudeNet’s archive, erasing them from view. Photoshop’s healing brush also uses a computer vision algorithm. The algorithm reads the pixels surrounding the areas I’ve covered, and, drawing on its memories of the millions of images it’s seen, guesses which pixels should fill the void where a person was.
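Photoshop’s tool is proprietary, but an open-source analogue gives a sense of the gesture. The sketch below uses OpenCV’s classical inpainting, which fills a masked region purely from the pixels around it; the image path and mask coordinates are placeholders.

```python
import cv2
import numpy as np

# Classical inpainting: estimate the masked pixels from the pixels
# around them. (An analogue, not Photoshop's own algorithm; learned
# inpainting models additionally draw on their training images.)
image = cv2.imread("frame.png")             # placeholder image path
mask = np.zeros(image.shape[:2], dtype=np.uint8)
cv2.circle(mask, (220, 180), 60, 255, -1)   # placeholder region traced over a figure

healed = cv2.inpaint(image, mask, 3, cv2.INPAINT_TELEA)
cv2.imwrite("healed.png", healed)
```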
But, as Syjuco implicitly asks, can the harms of the algorithmic archive be healed?