Polymorph II is a result of research into multisensory, distributed AI working across different forms of matter, using the indeterminacy of complex physical systems to fine-tune generative AI models and produce emergent outcomes in immersive media.
It was produced using a recursively fine-tuned Stable Diffusion model and a real-time audio synthesis model (RAVE), alongside distorted steel plates functioning as sensors and sound resonators. In its data collection phases, thin strands of conductive ‘hair’, moving with shifts in air currents in the room, operated as sensors, together with live camera and audio input.
The data which drives these audiovisual outcomes is the result of the confluence of small changes in air current, the movement of bodies, and fluctuating electromagnetic interference entangled with fine-tuned generative AI models. Transforming across formats and forms of matter, the dataset generating the visible and auditory components of the work is merged with the environment of the data collection architecture.
This technique has enabled ‘leaps’ and ‘leaks’ - discontinuous transfers across the AI environment, where data formats and structural layers intersect at multiple temporal scales and across material components, generating differential intensities across sensing, auditory, and optical processes that do not resolve into a single hierarchy.
Sensing, auditory, and optical elements operate simultaneously as inputs and outputs, producing a continuously reconfiguring manifold of feedback relations.
In this iteration, Polymorph II integrates a RAVE audio model, a fine-tuned Stable Diffusion model, two steel plates (2) that function simultaneously as sensors and sound resonators, and a suspended conductive steel thread (1) that shifts closer to and further from a curved metal sheet. These changing proximities register as continuous variations in signal, producing data that is fed into TouchDesigner.
Subtle changes in air currents, body movement, and electromagnetic interference modulate the system. The steel plates translate vibrational activity into both acoustic output through surface resonators (3) and data input, while the conductive thread introduces low-intensity fluctuations through its movement relative to the metal surface. These streams condition the activation of the generative models.
The Stable Diffusion model generates images that are projected onto the metal sheet and surrounding walls. The sheet’s curvature (2) (8) redistributes the projection, producing a shifting optical field that is captured and reintroduced into the system through cameras (10). The resulting transient, creature-like figures act as markers of the system’s state, continuously shaped by incoming data.
Generated outputs are stored in local folders and used to further fine-tune the model, allowing the dataset to expand through its own activity. A small monitor displays a live imprint of these changes (6), indicating shifts in operational phase.
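The way the dataset expands through its own activity can be sketched as a folder scan that appends newly generated images to a fine-tuning manifest. This is only a minimal sketch under stated assumptions: the folder layout, the `.png` extension, and the function name `collect_new_outputs` are hypothetical and not taken from the installation, and the fine-tuning step itself is left out.

```python
from pathlib import Path


def collect_new_outputs(output_dir, manifest):
    """Append image files not yet in `manifest`; return the newly added paths.

    output_dir: local folder where generated images are saved (hypothetical).
    manifest:   list of paths already queued for fine-tuning, updated in place.
    """
    known = set(manifest)
    # Sort by file name so repeated scans yield a stable order.
    candidates = sorted(Path(output_dir).glob("*.png"))
    new_paths = [str(p) for p in candidates if str(p) not in known]
    manifest.extend(new_paths)
    return new_paths
```

Calling the function after each generation cycle would grow the manifest only by images that have not been seen before, so each fine-tuning round could draw on everything produced so far.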
A microphone records the composite acoustic environment (5). This signal conditions the RAVE model, which generates sound played through speakers (4), feeding further variation back into the system.
Sensing, generation, and modulation remain entangled, with signals continuously circulating across material, acoustic, and visual states.
The physical structures and embedded sensors tuned the system to its own material conditions, so that its behaviour shifted in response to micro-variations in air currents and its own internal dynamics.
These changes registered as subtle but persistent modulations across the monitors.
Curved metal sheets within the installation distorted reflections of both the generated outputs and the camera feeds; these distortions were captured and fed back into the system, contributing to ongoing adjustments of the model.
As data passes through different formats and material states, the dataset generating the visual and auditory components of the work gradually expands, merging with the environment in which the system operates.
Systemic phase change - Polymorph’s activity maps
The system operates as a multivariate, adaptive setup in which sensors, models, and material feedback continuously influence one another. Under these conditions, small variations can accumulate and push the system across critical thresholds, leading to a change in how its activity is coordinated.
These regimes formed through the system’s own dynamics as it moved between more and less coherent states.
The maps make these shifts readable. They indicate when the system reorganises itself, and when distributed interactions settle into a different configuration of stability.
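The idea that small variations accumulate until the system crosses a threshold can be illustrated with a toy model. The sketch below is not how the activity maps were produced; it is a hypothetical leaky accumulator with two thresholds (hysteresis), and the names `label_regimes`, `upper`, `lower`, and `decay` are assumptions made for illustration.

```python
def label_regimes(variations, upper, lower, decay=0.9):
    """Return one regime label (0 or 1) per input sample.

    variations: sequence of small input changes (for example sensor deltas).
    upper:      level above which the toy system switches to regime 1.
    lower:      level below which it falls back to regime 0 (lower < upper).
    decay:      fraction of the accumulated level kept at each step.
    """
    level = 0.0
    regime = 0
    labels = []
    for v in variations:
        # Leaky accumulation: old activity fades, new variation adds up.
        level = decay * level + v
        if regime == 0 and level > upper:
            regime = 1
        elif regime == 1 and level < lower:
            regime = 0
        labels.append(regime)
    return labels
```

Because the two thresholds differ, the toy system does not flip back as soon as the input stops; it stays in the new regime until the accumulated level has decayed well below the point at which it switched.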
Datasets and multiagent complex dynamics
Plot of image data across two arbitrary semantic dimensions.
Polymorph’s continuous training outputs are not unambiguously separable: individual images produce strong activations across disparate dimensions, shown here for the classes “nipple” and “pretzel”.
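A plot of this kind can be sketched by picking two coordinates out of each image's class-score vector. The sketch assumes each image has already been turned into a list of per-class scores and that a matching list of category names is available; the function names and the `threshold` cutoff are hypothetical.

```python
def two_class_coordinates(scores, categories, class_x, class_y):
    """Pick the scores for two named classes from each score vector.

    scores:     list of per-image score vectors, one value per category.
    categories: category names in the same order as the score values.
    """
    ix = categories.index(class_x)
    iy = categories.index(class_y)
    return [(vec[ix], vec[iy]) for vec in scores]


def is_ambiguous(point, threshold):
    """True when both class scores exceed `threshold` (a hypothetical cutoff)."""
    x, y = point
    return x > threshold and y > threshold
```

Each returned pair can be drawn as one point on a scatter plot; points that score highly on both axes correspond to images that do not separate cleanly into either class.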
Plot of t-SNE mapping of Polymorph outputs. To illustrate the various phases of Polymorph, the outputs from different stages of its activity were classified using a 152-layer ResNet model and mapped with the t-SNE algorithm to reduce dimensionality. The grouping of outputs expresses various functional shifts while maintaining cross-generational similarity.
The ResNet classification was performed using a pre-trained model from PyTorch. Each generated image was transformed to produce an output embedding of size 1,000, with each element corresponding to a distinct semantic category from ImageNet (for example “magpie”, “coffee mug”, or “screwdriver”).
The t-SNE algorithm reduces the 1,000-dimensional data to a two-dimensional plane while preserving local similarities between elements. Clusters form where outputs share features.
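The two steps described above (ResNet scoring, then t-SNE) could look roughly like the following. This is a sketch under stated assumptions rather than the code used for the plots: it assumes the pre-trained model comes from torchvision (`resnet152` with `ResNet152_Weights.DEFAULT`), that images are loaded with Pillow, and that t-SNE comes from scikit-learn; the function names, the perplexity value, and the random seed are illustrative choices.

```python
import numpy as np
from sklearn.manifold import TSNE


def embed_images(image_paths):
    """Return an (n_images, 1000) array of ResNet-152 class scores."""
    # Imported lazily so the t-SNE step below can be tried without PyTorch.
    import torch
    from PIL import Image
    from torchvision.models import ResNet152_Weights, resnet152

    weights = ResNet152_Weights.DEFAULT  # pre-trained ImageNet weights
    model = resnet152(weights=weights).eval()
    preprocess = weights.transforms()  # resize, crop, normalise

    rows = []
    with torch.no_grad():
        for path in image_paths:
            image = Image.open(path).convert("RGB")
            batch = preprocess(image).unsqueeze(0)  # add a batch dimension
            rows.append(model(batch).squeeze(0).numpy())
    return np.stack(rows)


def project_2d(embeddings, perplexity=30.0, seed=0):
    """Map high-dimensional rows to 2-D points with t-SNE."""
    data = np.asarray(embeddings, dtype=np.float32)
    # scikit-learn requires perplexity to be smaller than the sample count.
    perplexity = min(perplexity, max(1.0, (len(data) - 1) / 3))
    tsne = TSNE(n_components=2, perplexity=perplexity,
                init="pca", random_state=seed)
    return tsne.fit_transform(data)
```

Calling `project_2d(embed_images(paths))` would give one 2-D point per generated image; colouring the points by the stage of activity they came from would show how the phases group. t-SNE layouts depend on perplexity and seed, so distances between clusters should be read with caution.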
Systemic taxonomies: “Creatures”
To observe the system as it gradually increased in complexity, a set of entities was generated that functioned as discrete markers of its behaviour. These evolving forms registered shifts in sensor input and internal dynamics, taking shape through the interaction between environmental data and the system’s ongoing processes.
Their morphology was continuously modulated by incoming signals and by the outputs of the fine-tuned models. The creatures, along with other generated forms, were produced within the system and then fed back into it as part of the ongoing training process.