SLSA 2021 – medieninitiative

DeepFake videos pose significant challenges to conventional modes of viewing. Indeed, the use of machine learning algorithms in these videos’ production complicates not only traditional forms of moving-image media but also deeply anchored phenomenological categories and structures. By paying close attention to the exchange of energies around these videos, including the consumption of energy in their production but especially the investment of energy on the part of the viewer struggling to discern the provenance and veracity of such images, we discover a mode of viewing that recalls pre-cinematic forms of fascination while relocating them in a decisively post-cinematic field. The human perceiver no longer stands clearly opposite the image object but instead interfaces with the spectacle at a pre-subjective level that approximates the nonhuman processing of visual information known as machine vision. While the depth referenced in the name “deep fake” is that of “deep learning,” the aesthetic engagement with these videos implicates an intervention in the depths of embodied sensibility—at the level of what Merleau-Ponty referred to as the “inner diaphragm” that precedes stimulus and response or the distinction of subject and intentional object. While the overt visual thematics of these videos is often highly gendered (their most prominent examples being so-called “involuntary synthetic pornography” targeting mostly women), viewers are also subject to affective syntheses and pre-subjective blurrings that, beyond the level of representation, open their bodies to fleshly “ungenderings” (Hortense Spillers) and re-typifications with far-reaching consequences for both race and gender.

Let me try to demonstrate these claims. To begin with, DeepFake videos are a species of what I have called discorrelated images, in that they trade crucially on the incommensurable scales and temporalities of computational processing, which altogether defies capture as the object of human perception (or the “fundamental correlation between noesis and noema,” as Husserl puts it). To be sure, DeepFakes, like many other forms of discorrelated images, still present something to us that is recognizable as an image. But in them, perception has become something of a by-product, a precipitate form or supplement to the invisible operations that occur in and through them. We can get a glimpse of such discorrelation by noticing how such images fail to conform or settle into stable forms or patterns, how they resist their own condensation into integral perceptual objects—for example, the way that they blur figure/ground distinctions.

The article widely credited with making the DeepFake phenomenon known to a wider public in December 2017 notes with regard to a fake porn video featuring Gal Gadot: “a box occasionally appeared around her face where the original image peeks through, and her mouth and eyes don’t quite line up to the words the actress is saying—but if you squint a little and suspend your belief, it might as well be Gadot.” There’s something telling about the formulation, which hinges the success of the DeepFake not on the suspension of disbelief—a suppression of active resistance—but on the suspension of belief—seemingly, a more casual form of affirmation—whereby the flickering reversals of figure and ground, or of subject and object, are flattened out into a smooth indifference.

In this regard, DeepFake videos are worth comparing to another type of discorrelated image: the digital lens flare, which is both to-be-looked-at (as a virtuosic display of technical achievement) and to-be-overlooked (after all, the height of its technical achievement is reached when it can appear as a transparently naturalized simulation of a physical camera’s optical properties). The tension between opacity and transparency, or objecthood and invisibility, is never fully resolved, thus undermining a clear distinction between diegetic and medial or material levels of reality. Is the virtual camera that registers the simulated lens flare to be seen as part of the world represented on screen, or as part of the machinery responsible for revealing it to us? The answer, it seems, must be both. And in this, such images embody something like what Neil Harris termed the “operational aesthetic” that characterized nineteenth-century science and technology expos, magic shows, and early cinema alike; in these contexts, spectatorial attention oscillated between the surface phenomenon, the visual spectacle of a machine or a magician in motion, and the hidden operations that made the spectacle possible.

It was such a dual or split attention that powered early film as a “cinema of attractions,” where viewers came to see the Cinematographe in action, as much as or more than they came to see images of workers leaving the factory or a train arriving at the station. And it is in light of this operational aesthetic that spectators found themselves focusing on the wind rustling in the trees or the waves lapping at the rocks—phenomena supposedly marginal to the main objects of visual interest.

DeepFakes also trade essentially on an operational aesthetic, or a dispersal of attention between visual surface and the algorithmic operation of machine learning. However, I would argue that the post-cinematic processes to whose operation DeepFakes refer our attention fundamentally transform the operational aesthetic, relocating it from the oscillations of attention that we see in the cinema to a deep, pre-attentional level that computation taps into with its microtemporal speed.

Consider the way digital glitches undo figure/ground distinctions. Whereas the cinematic image offered viewers opportunities to shift their attention from one figure to another and from these figures to the ground of the screen and projector enabling them, the digital glitch refuses to settle into the role either of figure or of ground. It is, simply, both—it stands out, figurally, as the pixely appearance of the substratal ground itself. Even more fundamentally, though, it points to the inadequacy, which is not to say dispensability, of human perception and attention with respect to algorithmic processing. While the glitch’s visual appearance effects a deformation of the spatial categories of figure and ground, it does so on the basis of a temporal mismatch between human perception and algorithmic processing. The latter, operating at a scale measured in nanoseconds, by far outstrips the window of perception and subjectivity, so that by the time the subject shows up to perceive the glitch, the “object” (so to speak) has already acted upon our presubjective sensibilities and moved on. This is why glitches, compression artifacts, and other discorrelated images are not even bound to appear to us as visual phenomena in the first place in order to exert a material force on us. Another way to account for this is to say that the visually-subjectively delineated distinction between figure and ground itself depends on the deeper ground of presubjective embodiment, and it is the latter that defines for us our spatial situations and temporal potentialities. DeepFakes, like other discorrelated images, are able to dis-integrate coherent spatial forms so radically because they undercut the temporal window within which visual perception occurs. The operation at the heart of their operational aesthetic is itself an operationalization of the flesh, prior to its delineation into subjective and objective forms of corporeality.
The seamfulness of DeepFakes—their occasional glitchy appearance or just the threat or presentiment that they might announce themselves as such—points to our fleshly imbrication with technical images today, which is to say: to the recoding not only of aesthetic form but of embodied aesthesis itself.

In other words: especially and as long as they still routinely fail to cohere as seamless suturings of viewing subjects together with visible objects, but instead retain their potential to fall apart at the seams and thus still require a suspension of belief, DeepFake videos are capable of calling attention to the ways that attention itself is bypassed, providing aesthetic form to the substratal interface between contemporary technics and embodied aesthesis. To be clear, and lest there be any mistake about it, I in no way wish to celebrate DeepFakes as a liberating media-technology, the way that the disruption of narrative by cinematic self-reflexivity was sometimes celebrated as opening a space where structuring ideologies gave way to an experience of materiality and the dissolution of the subject positions inscribed and interpellated by the apparatus. No amount of glitchy seamfulness will undo the gendered violence inflicted, mostly upon women, in involuntary synthetic pornography. Not only that, but the pleasure taken by viewers in their consumption of this violence seems to depend, at least in part, precisely on the failure or incompleteness of the spectacle: what such viewers desire is not to be tricked into actually believing that it is Gal Gadot or their ex-girlfriend that they are seeing on the screen, but precisely that it is a fake likeness or simulation, still open to glitches, upon which the operational aesthetic depends. Nevertheless, we should not look away from the paradoxical opening signaled by these viewers’ suspension of belief. The fact that they have to “squint a little” to complete the gendered fantasy of domination also means that they have to compromise, at least to a certain degree or for a short duration, their subjective mastery of the visual object, that they have to abdicate their own subjective ownership of their bodies as the bearers of experience. 
Though it is hard to believe that any trace of conscious awareness of it remains, much less that viewers will be reformed as a result of the experience, it seems reasonable to believe that viewers of DeepFake videos must experience at least an inkling of their own undoing as their de-subjectivized vision interfaces with the ahuman operation of machine vision.

What I am saying, then, and I am trying to be careful about how I say it, is that DeepFake videos open the door, experientially, to a highly problematic space in which our predictive technologies participate in processes of subjectivation by outpacing the subject, anticipating the subject, and intervening materially in the pre-personal realm of the flesh, out of which subjectivized and socially “typified” bodies emerge. The late Sartre, writing in the Critique of Dialectical Reason, defined commodities and the built environment in terms of the “practico-inert,” in light of the ways that “worked matter” stored past human praxis but condensed it into inert physical form. Around these objects, increasingly standardized through industrial capitalism’s serialized production processes, are arrayed alienated and impotent social collectives of interchangeable, fungible subjects. Compellingly, feminist philosopher Iris Marion Young takes Sartre’s argument as the basis for rethinking gender as a non-essentialist formation, a nascent collectivity, that is imposed on bodies materially—through architecture, clothing, and gender-specific objects that serve to enforce patriarchy and heterosexism. The practico-inert, in other words, participated in the gendered typification of the body—and we could extend the argument to racialization processes as well. But the computational infrastructures of today’s built environment are no longer adequately captured by the concept of the practico-inert. These infrastructures and objects are still the products of praxis, but they are far from inert. In their predictive and interactive operations, they are better thought of under the concept of the practico-alert—they are highly active, always on alert, and like the viewers of DeepFake videos on the lookout for a telling glitch, so are we ever and exhaustingly on the alert. 
In these circuits, which are located deeper than subjective attention, the standardization and typification processes I just mentioned are more fine-grained, more “personalized” or targeted, operating directly on the presubjective flesh. In this sense, the flattening of subjectivity, the suspension of belief and depersonalization of vision in DeepFake videos, points towards the contemporary “ungendering” of the flesh, as Hortense Spillers calls it in a different context, that marks a preliminary step in the computational intensification of racialized and gendered subjectivization. This is a truly insidious aesthetics of the flesh.

Sartre and practico-inert — updated to practico-alert; cf. gender via Iris Marion Young: typification (or serialization) via practico-inert. Now a more direct, because immeasurably fast, operation on presubjective flesh.

On Saturday, October 2, 2021, at 1pm Eastern / 10am Pacific, I will be participating along with Hannah Zeavin, Casey Boyle, and Hank Gerba in a panel on “DeepFake Energies” at the Society for Literature, Science, and the Arts (SLSA) conference (via Zoom).

The panel thinks about the energies invested and expended in DeepFake phenomena: the embodied, cognitive, emotional, inventive, and other energies associated with creating and consuming machine-learning enabled media (video, text, etc.) that simulate human expression, re-create dead persons, or place living people into fake situations. Drawing on resources from phenomenology, psychoanalysis, media theory, and computational exploration, panelists trace the ways that the generative energies at the heart of these AI-powered media transform subjective and collective experiences, with significant consequences for gender, race, and other determinants of political existence in the age of DeepFakes.

Here are the abstracts:

On the Embodied Phenomenology of DeepFakes (Shane Denson, Stanford)

DeepFake videos pose significant challenges to conventional modes of viewing. Indeed, the use of machine learning algorithms in these videos’ production complicates not only traditional forms of moving-image media but also deeply anchored phenomenological categories and structures. By paying close attention to the exchange of energies around these videos, including the consumption of energy in their production but especially the investment of energy on the part of the viewer struggling to discern the provenance and veracity of such images, we discover a mode of viewing that recalls pre-cinematic forms of fascination while relocating them in a decisively post-cinematic field. The human perceiver no longer stands clearly opposite the image object but instead interfaces with the spectacle at a pre-subjective level that approximates the nonhuman processing of visual information known as machine vision. While the depth referenced in the name “DeepFake” is that of “deep learning,” the aesthetic engagement with these videos implicates an intervention in the depths of embodied sensibility—at the level of what Merleau-Ponty referred to as the “inner diaphragm” that precedes stimulus and response or the distinction of subject and intentional object. While the overt visual thematics of these videos is often highly gendered (their most prominent examples being involuntary synthetic pornography targeting mostly women), viewers are also subject to affective syntheses and pre-subjective blurrings that, beyond the level of representation, open their bodies to fleshly “ungenderings” (Hortense Spillers) and re-typifications with far-reaching consequences for both race and gender.

No More Dying (Hannah Zeavin, UC Berkeley)

“No More Dying” concerns itself with the status of DeepFakes in psychic life on the grounds of DeepFakes that reprise the dead. In order to think about whether DeepFakes as surrogates constitute an attempt at eluding pain—a psychotic technology—or are a new form of an ancient capacity to symbolize pain for oneself (Bion 1962), I will return to the status of objects as melancholic media and what this digital partial-revivification might do to and for a psyche. Is creating a virtual agent in the likeness of a lost object a new terrain (a new expression of omnipotent fantasy) or is it more akin to the wish fulfillment at the center of transitional phenomena and dreaming? Does a literal enactment and acting out lead to, as Freud would have it, a mastery and working through—or does the concrete nature of gaming trauma lead to a melancholic preservation of an internal object via an investment in the mediatized external object? Beyond the psychical implications of this form of reviving the dead, the paper troubles the assumptions and politics of this nascent practice by asking whose dead, and whose trauma, are remediated and remedied this way. More simply, which dead are eligible for reliving and, recalling Judith Butler’s question—which lives are grievable?

Low Fidelity in High Definition (Casey Boyle, UT Austin)

When thinking about DeepFakes, it is easy to also think about theorist Jean Baudrillard. It was Baudrillard who, early and often, rang alarm bells regarding the propensity of images and/as information to become unmoored from any direct referent. DeepFakes seem to render literal the general unease with the ongoing mediatization that Baudrillard traced. However, the uncertainty about a “real” is not only because of this severing of real from fake, but is also because of a prior condition of media since, as Baudrillard claims, “… a completely new species of uncertainty results not from the lack of information but from information itself and even from an excess of information” (Baudrillard, 1985). The excessive overload of mediatization enables DeepFakes to persist as a threat because the energy and effort required to validate any given piece of media is an unsustainable practice when there are so many to verify. It seems then the only response to overload is to generate…more. This presentation reports on an ongoing project to re-energize Baudrillard by computationally generating new texts. Using an instance of GPT-3 machine learning—one trained on Baudrillard’s texts—the presenter will rely on “new” primary texts to comment on the rise of DeepFakes, Post-Truth, and Fake News. Ultimately, this presentation, relying on “new” primary work from Baudrillard, argues that we are not entering an era of Post-Truth but of Post-Piety, which is an era in which we have failed to spend energy building agreement and commonplace.

A Gestural Technics of Individuation as Descent (Hank Gerba)

Googling “What is a DeepFake?” returns a vertiginous list of results detailing the technical processes involved in their production. Operational images par excellence, DeepFakes have spawned an industry of verification practices meant to buttress the epistemological doubt their existence sows. It would seem then that to be concerned with DeepFakes is to be concerned with veridicality, but, as this presentation argues, this problematic is derivative of, and entangled with, an aesthetic encounter. What if we approach DeepFakes otherwise, arriving at, rather than departing from, a causal understanding of their technicity? When a DeepFake “works,” it succeeds in satisfactorily producing gestures characteristic of the person it has “learned” to perform—through these gestures it means them, and only them. The question DeepFakes pose, then, is no longer simply “Is this video a true representation of X?” but “Is this performance true to X?” Gestures therefore plunge us into the aesthetics of personhood; they are, as Vilém Flusser argues, that which mediates personhood by bringing it into the social manifold of meaning. By linking Flusser’s theory of gesture with Gilbert Simondon’s theory of individuation, this presentation concludes by arguing that DeepFakes are a gestural technics of individuation—machinic operations which enfold personhood within the topological logic of gradient descent.

On the Embodied Phenomenology of DeepFakes — Full Text of Talk from #SLSA21

SLSA Panel: “DeepFake Energies” #SLSA21