Self-Organising Textures

Niklasson, Eyvind; Mordvintsev, Alexander; Randazzo, Ettore; Levin, Michael

doi:10.23915/distill.00027.003

Distill

Neural Cellular Automata Model of Pattern Formation

Authors

Affiliations

Eyvind Niklasson

Google

Alexander Mordvintsev

Ettore Randazzo

Published

Feb. 11, 2021

This article is part of the Differentiable Self-organizing Systems Thread, an experimental format collecting invited short articles delving into differentiable self-organizing systems, interspersed with critical commentary from several experts in adjacent fields.

Neural Cellular Automata (NCA; we use NCA to refer to both Neural Cellular Automata and Neural Cellular Automaton) are capable of learning a diverse set of behaviours: from generating stable, regenerating, static images, to segmenting images, to learning to “self-classify” shapes. The inductive bias imposed by using cellular automata is powerful. A system of individual agents running the same learned local rule can solve surprisingly complex tasks. Moreover, individual agents, or cells, can learn to coordinate their behavior even when separated by large distances. By construction, they solve these tasks in a massively parallel and inherently degenerate way (degenerate here refers to the biological concept of degeneracy). Each cell must be able to take on the role of any other cell; as a result, they tend to generalize well to unseen situations.

In this work, we apply NCA to the task of texture synthesis. This task involves reproducing the general appearance of a texture template, as opposed to making pixel-perfect copies. We are going to focus on texture losses that allow for a degree of ambiguity. After training NCA models to reproduce textures, we subsequently investigate their learned behaviors and observe a few surprising effects. Starting from these investigations, we make the case that the cells learn distributed, local, algorithms.

To do this, we apply an old trick: we employ neural cellular automata as a differentiable image parameterization.

Patterns, textures and physical processes

A pair of Zebra. Zebra are said to have unique stripes.

Zebra stripes are an iconic texture. Ask almost anyone to identify zebra stripes in a set of images, and they will have no trouble doing so. Ask them to describe what zebra stripes look like, and they will gladly tell you that they are parallel stripes of slightly varying width, alternating in black and white. And yet, they may also tell you that no two zebra have the same set of stripes. (Perhaps an apocryphal claim, but at the very lowest level every zebra will be unique. Our point is that “zebra stripes” as a concept in human understanding refers to the general structure of a black and white striped pattern and not to a specific mapping from location to colour.) This is because evolution has programmed the cells responsible for creating the zebra pattern to generate a pattern of a certain quality, with certain characteristics, as opposed to programming them with the blueprints for an exact bitmap of the edges and locations of stripes to be moulded to the surface of the zebra’s body.

Put another way, patterns and textures are ill-defined concepts. The Cambridge English Dictionary defines a pattern as “any regularly repeated arrangement, especially a design made from repeated lines, shapes, or colours on a surface”. This definition falls apart rather quickly when looking at patterns and textures that impart a feeling or quality, rather than a specific repeating property. A coloured fuzzy rug, for instance, can be considered a pattern or a texture, but it is composed of strands pointing in random directions with small random variations in size and color, and there is no discernible regularity to the pattern. Penrose tilings do not repeat (they are not translationally invariant), but show them to anyone and they’ll describe them as a pattern or a texture. Most patterns in nature are outputs of locally interacting processes that may or may not be stochastic in nature, but are often based on fairly simple rules. There is a large body of work on models which give rise to such patterns in nature; most of it is inspired by Turing’s seminal paper on morphogenesis.

Such patterns are very common in developmental biology. In addition to coat colors and skin pigmentation, invariant large-scale patterns, arising in spite of stochastic low-level dynamics, are a key feature of peripheral nerve networks, vascular networks, somites (blocks of tissue demarcated in embryogenesis that give rise to many organs), and segments of anatomical and genetic-level features, including whole body plans (e.g., snakes and centipedes) and appendages (such as demarcation of digit fields within the vertebrate limb). These kinds of patterns are generated by reaction-diffusion processes, bioelectric signaling, planar polarity, and other cell-to-cell communication mechanisms. Patterns in biology are not only structural, but also physiological, as in the waves of electrical activity in the brain and the dynamics of gene regulatory networks. These gene regulatory networks, for example, can support computation sufficiently sophisticated as to be subject to Liar paradoxes. (In principle, gene regulatory networks can express paradoxical behaviour, such as the expression of factor A repressing the expression of factor A. One result of such a paradox can be that a certain factor will oscillate with time.) Studying the emergence and control of such patterns can help us to understand not only their evolutionary origins, but also how they are recognized (either in the visual system of a second observer or in adjacent cells during regeneration) and how they can be modulated for the purposes of regenerative medicine.

As a result, when having any model learn to produce textures or patterns, we want it to learn a generative process for the pattern. We can think of such a process as a means of sampling from the distribution governing this pattern. The first hurdle is to choose an appropriate loss function, or qualitative measure of the pattern. To do so, we employ ideas from Gatys et al. The NCA becomes the parametrization for an image which we “stylize” in the style of the target pattern. In this case, instead of restyling an existing image, we begin with a fully unconstrained setting: the output of an untrained, randomly initialized NCA. The NCA serves as the “renderer” or “generator”, and a pre-trained differentiable model serves as a distinguisher of the patterns, providing the gradient necessary for the renderer to learn to produce a pattern of a certain style.

From Turing, to Cellular Automata, to Neural Networks

NCA are well suited for generating textures. To understand why, we’ll demonstrate parallels between texture generation in nature and NCA. Given these parallels, we argue that NCA are a good model class for texture generation.

PDEs

In “The Chemical Basis of Morphogenesis”, Alan Turing suggested that simple physical processes of reaction and diffusion, modelled by partial differential equations, lie behind pattern formation in nature, such as the aforementioned zebra stripes. Extensive work has since been done to identify PDEs that model reaction-diffusion and to evaluate their behaviour. One of the more celebrated examples is the Gray-Scott model of reaction diffusion. This process has a veritable zoo of interesting behaviour, explorable by simply tuning the two parameters. We strongly encourage readers to visit this interactive atlas of the different regions of the Gray-Scott reaction diffusion model to get a sense for the extreme variety of behaviour hidden behind two simple knobs. The more adventurous can even play with a simulation locally or in the browser.

To tackle the problem of reproducing our textures, we propose a more general version of the above systems, described by a simple Partial Differential Equation (PDE) over the state space of an image.

$$
\frac{\partial \mathbf{s}}{\partial t} = f(\mathbf{s}, \nabla_\mathbf{x} \mathbf{s}, \nabla_\mathbf{x}^{2}\mathbf{s})
$$

Here, $f$ is a function of the state $\mathbf{s}$, its spatial gradient ($\nabla_\mathbf{x} \mathbf{s}$) and its Laplacian ($\nabla_\mathbf{x}^{2}\mathbf{s}$), and it determines the time evolution of the state. $\mathbf{s}$ is a $k$-dimensional vector whose first three components correspond to the visible RGB color channels.
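As a concrete instance of this general form, the Gray-Scott model mentioned above can be written with a two-component state $\mathbf{s} = (u, v)$. The equations below are the commonly quoted form of that model; the symbols $D_u$, $D_v$ (diffusion rates), $F$ (feed rate) and $k$ (kill rate) are not defined in this article and are introduced here only for illustration.

$$
\frac{\partial u}{\partial t} = D_u \nabla_\mathbf{x}^{2} u - u v^{2} + F(1 - u), \qquad
\frac{\partial v}{\partial t} = D_v \nabla_\mathbf{x}^{2} v + u v^{2} - (F + k)\, v
$$

In this special case $f$ depends only on the state and its Laplacian; the gradient term goes unused. The general form lets a learned $f$ use all three inputs.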

Intuitively, we have defined a system where every point of the image changes with time, in a way that depends on how the image currently changes across space, with respect to its immediate neighbourhood. Readers may start to recognize the resemblance between this and another system based on immediately local interactions.

To CAs

Differential equations governing natural phenomena are usually evaluated using numerical differential equation solvers. Indeed, this is sometimes the only way to solve them, as many PDEs and ODEs of interest do not have closed-form solutions. This is even the case for some deceptively simple ones, such as the three-body problem. Numerically solving PDEs and ODEs is a vast and well-established field. One of the biggest hammers in the metaphorical toolkit for numerically evaluating differential equations is discretization: the process of converting the variables of the system from a continuous space to a discrete space, where numerical integration is tractable. When using ODEs to model the change in a phenomenon over time, for example, it makes sense to advance through time in discrete steps, possibly of variable size.

We now show that numerically integrating the aforementioned PDE is equivalent to reframing the problem as a Neural Cellular Automaton, with $f$ assuming the role of the NCA rule.

The logical approach to discretizing the space the PDE operates on is to discretize the continuous 2D image space into a 2D raster grid. Boundary conditions are of concern but we can address them by moving to a toroidal world where each dimension wraps around on itself.
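A minimal sketch of what a toroidal world means on a raster grid, assuming NumPy. The helper name `torus_neighbours` is hypothetical and is not taken from the reference implementations.

```python
import numpy as np

def torus_neighbours(grid, dy, dx):
    """Return, for every cell, the value of the cell at offset (dy, dx),
    wrapping around the edges so that the grid behaves like a torus."""
    # np.roll moves content in the direction of the shift, so shifting by
    # (-dy, -dx) brings the neighbour at offset (dy, dx) onto each cell.
    return np.roll(grid, shift=(-dy, -dx), axis=(0, 1))

grid = np.arange(12).reshape(3, 4)
right = torus_neighbours(grid, 0, 1)   # neighbour one column to the right
up = torus_neighbours(grid, -1, 0)     # neighbour one row up
```

With this convention the right-hand neighbour of the last column is the first column, and the upper neighbour of the first row is the last row, so no cell needs special treatment at the boundary.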

Similarly to space, we choose to treat time in a discretized fashion and evaluate our NCA at fixed-sized time steps. This is equivalent to explicit Euler integration. However, here we make an important deviation from traditional PDE numerical integration methods for two reasons. First, if all cells are updated synchronously, initial conditions $s_0$ must vary from cell-to-cell in order to break the symmetry. Second, the physical implementation of the synchronous model would require the existence of a global clock, shared by all cells. One way to work around the former is by initializing the grid with random noise, but in the spirit of self organisation we instead choose to decouple the cell updates by asynchronously evaluating the CA. We sample a subset of all cells at each time-step to update. This introduces both asynchronicity in time (cells will sometimes operate on information from their neighbours that is several timesteps old), and asymmetry in space, solving both aforementioned issues.
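The update described above can be sketched as an explicit Euler step in which each cell fires with some probability. This is an illustration assuming NumPy; the function name, the `fire_rate` value and the toy rule are assumptions, not the article's settings.

```python
import numpy as np

def stochastic_euler_step(state, f, rng, dt=1.0, fire_rate=0.5):
    """One explicit Euler step in which only a random subset of cells updates.

    state: array of shape (H, W, C)
    f: callable returning ds/dt with the same shape as state
    fire_rate: probability that a given cell updates (assumed value)
    """
    ds = f(state)
    # One Bernoulli draw per cell, shared by all of that cell's channels.
    mask = rng.random(state.shape[:2] + (1,)) < fire_rate
    return state + dt * ds * mask

rng = np.random.default_rng(0)
state = np.zeros((8, 8, 4))
relax_to_one = lambda s: 1.0 - s   # toy rule: every cell relaxes towards 1
new_state = stochastic_euler_step(state, relax_to_one, rng)
```

After one step some cells have jumped to 1 while the rest are still 0, so an initially uniform grid is no longer symmetric even though every cell runs the same rule.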

Our next step towards representing a PDE with cellular automata is to discretize the gradient and Laplacian operators. For this we use the Sobel operator and the 9-point variant of the discrete Laplace operator, as below.

$$
\begin{array}{ c c c } \begin{bmatrix} -1 & 0 & 1\\-2 & 0 & 2 \\-1 & 0 & 1 \end{bmatrix} & \begin{bmatrix} -1 & -2 & -1\\ 0 & 0 & 0 \\1 & 2 & 1 \end{bmatrix} & \begin{bmatrix} 1 & 2 & 1\\2 & -12 & 2 \\1 & 2 & 1 \end{bmatrix} \\ \text{Sobel}_x & \text{Sobel}_y & \text{Laplacian} \end{array}
$$
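To make these operators concrete, here is a small sketch that applies them to a single channel on a toroidal grid, assuming NumPy. The helper `apply_kernel` is a hypothetical name, and the loop-based cross-correlation is written for readability rather than speed.

```python
import numpy as np

SOBEL_X = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
SOBEL_Y = SOBEL_X.T
LAPLACIAN = np.array([[1, 2, 1], [2, -12, 2], [1, 2, 1]], dtype=float)

def apply_kernel(channel, kernel):
    """Cross-correlate one 2D channel with a 3x3 kernel, wrapping at the edges."""
    out = np.zeros_like(channel, dtype=float)
    for i in range(3):
        for j in range(3):
            # kernel[i, j] weights the neighbour at offset (i - 1, j - 1).
            out += kernel[i, j] * np.roll(channel, shift=(1 - i, 1 - j), axis=(0, 1))
    return out

# A horizontal ramp: the value of each cell equals its column index.
ramp = np.tile(np.arange(6.0), (6, 1))
grad_x = apply_kernel(ramp, SOBEL_X)
grad_y = apply_kernel(ramp, SOBEL_Y)
lap = apply_kernel(ramp, LAPLACIAN)
```

Away from the wrap-around seam, `grad_x` is constant, while `grad_y` and `lap` vanish, as expected for a linear ramp. The kernels are unnormalised: on a unit-spaced grid the Sobel response is 8 times the derivative and the Laplacian response is 4 times the continuous Laplacian, constant factors that a learned rule can absorb.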

With all the pieces in place, we now have a space-discretized version of our PDE that looks very much like a Cellular Automaton: the time evolution of each discrete point in the raster grid depends only on its immediate neighbours. These discrete operators allow us to formalize our PDE as a CA. To double check that this is true, simply observe that as our grid becomes very fine, and the asynchronous updates approach uniformity, the dynamics of these discrete operators will reproduce the continuous dynamics of the original PDE as we defined it.

To Neural Networks

The final step in implementing the above general PDE for texture generation is to translate it to the language of deep learning. Fortunately, all the operations involved in iteratively evaluating the generalized PDE exist as common operations in most deep learning frameworks. We provide both a TensorFlow and a minimal PyTorch implementation for reference, and refer readers to these for details on our implementation.
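For readers who want the overall shape before opening the reference implementations, the following is a rough sketch of one NCA step, written in plain NumPy so that every operation is visible. It is not the authors' code: the channel count, hidden size, fire rate and the two-layer per-cell network are assumptions chosen for illustration.

```python
import numpy as np

KERNELS = np.array([
    [[0, 0, 0], [0, 1, 0], [0, 0, 0]],       # identity: the cell's own state
    [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]],    # Sobel x
    [[-1, -2, -1], [0, 0, 0], [1, 2, 1]],    # Sobel y
    [[1, 2, 1], [2, -12, 2], [1, 2, 1]],     # 9-point Laplacian
], dtype=float)

def perceive(state):
    """Filter every channel with every kernel: (H, W, C) -> (H, W, 4 * C)."""
    feats = []
    for kernel in KERNELS:
        out = np.zeros_like(state)
        for i in range(3):
            for j in range(3):
                out += kernel[i, j] * np.roll(state, (1 - i, 1 - j), axis=(0, 1))
        feats.append(out)
    return np.concatenate(feats, axis=-1)

def nca_step(state, w1, b1, w2, rng, fire_rate=0.5):
    """Apply the same small per-cell network everywhere, then update a random subset."""
    hidden = np.maximum(perceive(state) @ w1 + b1, 0.0)   # ReLU
    update = hidden @ w2
    mask = rng.random(state.shape[:2] + (1,)) < fire_rate
    return state + update * mask

rng = np.random.default_rng(0)
channels, hidden_size = 12, 96               # hypothetical sizes
w1 = rng.normal(0, 0.1, (4 * channels, hidden_size))
b1 = np.zeros(hidden_size)
w2 = np.zeros((hidden_size, channels))       # zero output layer: the step is a no-op
state = rng.normal(0, 1, (32, 32, channels))
next_state = nca_step(state, w1, b1, w2, rng)
```

Here the function $f$ is the pair of matrix multiplications applied identically at every cell. In a deep learning framework the same computation is typically expressed as a fixed depthwise convolution followed by two 1x1 convolutions.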

NCA as pattern generators

Model:

We build on the Growing CA NCA model, complete with built-in quantization of weights, stochastic updates, and the batch pool mechanism to approximate long-term training. For further details on the model and motivation, we refer readers to this work.
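The batch pool mechanism can be sketched as follows, assuming NumPy. The gradient update is left out, `step_fn` stands in for one NCA iteration, and the pool size, batch size and number of rounds are hypothetical; only the 32 to 64 iteration range comes from this article.

```python
import numpy as np

def train_with_pool(step_fn, seed_state, rng, pool_size=64, batch_size=4,
                    rounds=10, min_steps=32, max_steps=64):
    """Sketch of pool-based sampling; the weight update itself is omitted."""
    pool = np.repeat(seed_state[None], pool_size, axis=0)
    for _ in range(rounds):
        idx = rng.choice(pool_size, size=batch_size, replace=False)
        batch = pool[idx]                # fancy indexing returns a copy
        batch[0] = seed_state            # always keep one fresh start in the batch
        n_steps = rng.integers(min_steps, max_steps + 1)
        for _ in range(n_steps):
            batch = step_fn(batch)
        # ... compute the loss on `batch` and update the NCA weights here ...
        pool[idx] = batch                # write evolved states back for later rounds
    return pool

rng = np.random.default_rng(0)
seed = np.zeros((8, 8, 4))
count_steps = lambda batch: batch + 1    # toy step: each state counts its own age
pool = train_with_pool(count_steps, seed, rng)
```

With the toy step, each pool entry records how many iterations that state has lived through. States drawn in several rounds accumulate far more iterations than a single training rollout, which is how the pool approximates applying the loss after long time periods.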

Loss function:

We use a well-known deep convolutional network for image recognition, VGG (Visual Geometry Group Net), as our differentiable discriminator of textures, for the same reasons outlined in Differentiable Parametrizations. We start with a template image, $\vec{x}$, which we feed into VGG. Then we collect statistics from certain layers (block[$1…5$]_conv1) in the form of the raw activation values of the neurons in these layers. Finally, we run our NCA forward for between 32 and 64 iterations, feeding the resulting RGB image into VGG. Our loss is the $L_2$ distance between the Gram matrix of the activations of these neurons with the NCA output as input and the Gram matrix of their activations with the template image as input. We keep the weights of VGG frozen and use Adam to update the weights of the NCA.
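A minimal sketch of a Gram-matrix style loss, assuming NumPy and assuming the layer activations have already been extracted as arrays of shape (H, W, C). The normalisation by the number of positions and the use of a summed squared difference are assumptions; the reference implementations may scale or combine layers differently.

```python
import numpy as np

def gram_matrix(features):
    """Channel-by-channel correlations of a feature map of shape (H, W, C)."""
    h, w, c = features.shape
    flat = features.reshape(h * w, c)
    # Averaging over positions is one common normalisation (an assumption here).
    return flat.T @ flat / (h * w)

def style_loss(generated_layers, template_layers):
    """Sum of squared Gram-matrix differences over the chosen layers."""
    return sum(
        np.sum((gram_matrix(g) - gram_matrix(t)) ** 2)
        for g, t in zip(generated_layers, template_layers)
    )

rng = np.random.default_rng(0)
feats = rng.normal(size=(5, 5, 3))
# Same activations, but with their spatial positions scrambled.
shuffled = rng.permutation(feats.reshape(25, 3)).reshape(5, 5, 3)
loss_shuffled = style_loss([feats], [shuffled])
loss_scaled = style_loss([feats], [2 * feats])
```

Because the Gram matrix sums over positions, rearranging where activations occur leaves it unchanged, so `loss_shuffled` is zero up to rounding while `loss_scaled` is not. This is the sense in which the loss tolerates ambiguity: it constrains which features co-occur, not where they appear.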

Dataset:

The template images for this dataset are from the Oxford Describable Textures Dataset. The aim of this dataset is to provide a benchmark for measuring the ability of vision models to recognize, categorize and describe textures using words. The textures were collected to match 47 “attributes” such as “bumpy” or “polka-dotted”. These 47 attributes were in turn distilled from a set of common words used to describe textures identified by Bhushan, Rao and Lohse.

Results:

After a few iterations of training, we see the NCA converge to a solution that at first glance looks similar to the input template, but not pixel-wise identical. The very first thing to notice is that the solution learned by the NCA is not time-invariant if we continue to iterate the CA. In other words it is constantly changing!

This is not completely unexpected. In Differentiable Parametrizations, the authors noted that the images produced when backpropagating into image space would end up different each time the algorithm was run due to the stochastic nature of the parametrizations. To work around this, they introduced some tricks to maintain alignment between different visualizations. In our model, we find that we attain such alignment along the temporal dimension without optimizing for it; a welcome surprise. We believe the reason is threefold. First, reaching and maintaining a static state in an NCA appears to be non-trivial in comparison to a dynamic one. In Growing CA, achieving static stability required maintaining a pool of NCA states at various iteration times and sampling from it for starting states, to simulate the loss being applied after a time period longer than the NCA’s iteration period. We employ the same sampling mechanism here to prevent the pattern from decaying, but in this case the loss doesn’t enforce a static fixed target; rather it guides the NCA towards any one of a number of states that minimizes the style loss. Second, we apply our loss after a random number of iterations of the NCA. This means that, at any given time step, the pattern must be in a state that minimizes the loss. Third, the stochastic updates, local communication, and quantization all limit and regularize the magnitude of updates at each iteration. This encourages changes to be small between one iteration and the next. We hypothesize that these properties combined encourage the NCA to find a solution where each iteration is aligned with the previous iteration. We perceive this alignment through time as motion, and as we iterate the NCA we observe it traversing a manifold of locally aligned solutions.

We now posit that finding temporally aligned solutions is equivalent to finding an algorithm, or process, that generates the template pattern, based on the aforementioned findings and qualitative observation of the NCA. We proceed to demonstrate some exciting behaviours of NCA trained on different template images.

An NCA trained to create a pattern in the style of chequered_0121.jpg.

Here, we see that the NCA is trained using a template image of a simple black and white grid.

We notice that:

Initially, a non-aligned grid of black and white quadrilaterals is formed.
As time progresses, the quadrilaterals seemingly grow or shrink in both $\vec{x}$ and $\vec{y}$ to more closely approximate squares. Quadrilaterals of both colours either emerge or disappear. Both of these behaviours seem to be an attempt to find local consistency.
After a longer time, the grid tends to achieve perfect consistency.

Such behaviour is not entirely unlike what one would expect in a hand-engineered algorithm to produce a consistent grid with local communication. For instance, one potential hand-engineered approach would be to have cells first try and achieve local consistency, by choosing the most common colour from the cells surrounding them, then attempting to form a diamond of correct size by measuring distance to the four edges of this patch of consistent colour, and moving this boundary if it were incorrect. Distance could be measured by using a hidden channel to encode a gradient in each direction of interest, with each cell decreasing the magnitude of this channel as compared to its neighbour in that direction. A cell could then localize itself within a diamond by measuring the value of two such gradient channels. The appearance of such an algorithm would bear resemblance to the above - with patches of cells becoming either black, or white, diamonds then resizing themselves to achieve consistency.
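The distance-measuring idea in this hypothetical hand-engineered approach can be sketched in one dimension, assuming NumPy. This is an illustration of the thought experiment only, not a rule extracted from the trained NCA. For simplicity the hidden channel counts up from the patch edge rather than decreasing, which carries the same information.

```python
import numpy as np

def distance_to_left_edge(colours, n_iters):
    """Each cell estimates its distance to the left edge of its colour patch,
    reading only its own colour and its left neighbour (toroidal row)."""
    dist = np.zeros(len(colours), dtype=int)
    for _ in range(n_iters):
        left_colour = np.roll(colours, 1)
        left_dist = np.roll(dist, 1)
        # At a patch boundary restart at 0, otherwise extend the neighbour's count.
        dist = np.where(left_colour != colours, 0, left_dist + 1)
    return dist

row = np.array([0, 0, 0, 1, 1, 1, 1, 0])
left = distance_to_left_edge(row, n_iters=len(row))
# The distance to the right edge is the same rule run on the mirrored row.
right = distance_to_left_edge(row[::-1], n_iters=len(row))[::-1]
width = left + right + 1
```

Once the two counts have settled, every cell knows both its position within its patch and the patch width, using nothing but nearest-neighbour communication. A cell could compare `width` against a target size and decide whether the boundary next to it should move.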

An NCA trained to create a pattern in the style of bubbly_0101.jpg.

In this video, the NCA has learned to reproduce a texture based on a template of clear bubbles on a blue background. One of the most interesting behaviours we observe is that the density of the bubbles remains fairly constant. If we re-initialize the grid states, or interactively destroy states, we see a multitude of bubbles re-forming. However, as soon as two bubbles get too close to each other, one of them spontaneously collapses and disappears, ensuring a constant density of bubbles throughout the entire image. We regard these bubbles as “solitons” in the solution space of our NCA. This is a concept we will discuss and investigate at length below.

If we speed the animation up, we see that different bubbles move at different speeds, yet they never collide or touch each other. Bubbles also maintain their structure by self-correcting; a damaged bubble can re-grow.

This behaviour is remarkable because it arises spontaneously, without any external or auxiliary losses. All of these properties are learned from a combination of the template image, the information stored in the layers of VGG, and the inductive bias of the NCA. The NCA learned a rule that effectively approximates many of the properties of the bubbles in the original image. Moreover, it has learned a process that generates this pattern in a way that is robust to damage and looks realistic to humans.

An NCA trained to create a pattern in the style of interlaced_0172.jpg.

Here we see one of our favourite patterns: a simple geometric “weave”. Again, we notice the NCA seems to have learned an algorithm for producing this pattern. Each “thread” alternately joins or detaches from other threads in order to produce the final pattern. This is strikingly similar to what one would attempt to implement, were one asked to programmatically generate the above pattern. One would try to design some sort of stochastic algorithm for weaving individual threads together with other nearby threads.

An NCA trained to create a pattern in the style of banded_0037.jpg.

Here, misaligned stripe fragments travel up or down the stripe until either they merge to form a single straight stripe or a stripe shrinks and disappears. Were this to be implemented algorithmically with local communication, it is not infeasible that a similar algorithm for finding consistency among the stripes would be used.

This foray into pattern generation is by no means the first. There has been extensive work predating deep learning, in particular suggesting deep connections between spatial patterning of anatomical structure and temporal patterning of cognitive and computational processes. Hans Spemann, one of the heroes of classical developmental biology, said “Again and again terms have been used which point not to physical but to psychical analogies. It was meant to be more than a poetical metaphor. It was meant to my conviction that the suitable reaction of a germ fragment, endowed with diverse potencies, in an embryonic ‘field’… is not a common chemical reaction, like all vital processes, are comparable, to nothing we know in such degree as to vital processes of which we have the most intimate knowledge.” More recently, Grossberg quantitatively laid out important similarities between developmental patterning and computational neuroscience. As briefly touched upon, the inspiration for much of the work came from Turing’s work on pattern generation through local interaction, and later papers based on this principle. However, we also wish to acknowledge some works that we feel have a particular kinship with ours.

Patch sampling

Early work in pattern generation focused on texture sampling. Patches were often sampled from the original image and reconstructed or rejoined in different ways to obtain an approximation of the texture. This method has also seen recent success with the work of Gumin.

Deep learning

Gatys et al.’s work, referenced throughout, has been seminal with regard to the idea that statistics of certain layers in a pre-trained network can capture textures or styles in an image. There has been extensive work building on this idea, including playing with other parametrisations for image generation and optimizing the generation process.

Other work has focused on using a convolutional generator combined with patch sampling and trained using an adversarial loss to produce textures of similar quality.

Interactive Evolution of Camouflage

Perhaps the most unconventional approach, with which we find kinship, is laid out in Interactive Evolution of Camouflage. Craig Reynolds uses a texture description language, consisting of generators and operators, to parametrize a texture patch, which is presented to human viewers who have to decide which patches are the worst at “camouflaging” themselves against a chosen background texture. The population is updated in an evolutionary fashion to maximize “camouflage”, resulting in a texture exhibiting the most camouflage (to human eyes) after a number of iterations. We see strong parallels with our work: instead of a texture generation language, we have an NCA parametrize the texture, and instead of human reviewers we use VGG as an evaluator of the quality of a generated pattern. We believe a fundamental difference lies in the solution space of an NCA. A texture generation language comes with a number of inductive biases and learns a deterministic mapping from coordinates to colours. Our method appears to learn more general algorithms and behaviours giving rise to the target pattern.

Two other noteworthy examples of similar work are Portilla et al.’s work with the wavelet transform, and work by Chen et al. with reaction diffusion.

Feature visualization

A butterfly with an “eye-spot” on the wings.

We have now explored some of the fascinating behaviours learned by the NCA when presented with a template image. What if we want to see them learn even more “unconstrained” behaviour?

Some butterflies have remarkably lifelike eyes on their wings. It’s unlikely the butterflies are even aware of this incredible artwork on their own bodies. Evolution placed these there to trigger a response of fear in potential predators or to deflect attacks from them. It is likely that neither the predator nor the butterfly has a concept of what an eye is or what an eye does, let alone any theory of mind regarding the consciousness of the other, but evolution has identified a region of morphospace for this organism that exploits pattern-identifying features of predators to trick them into fearing a harmless bug instead of consuming it.

Even more remarkable is the fact that the individual cells composing the butterfly’s wings can self-assemble into coherent, beautiful shapes far larger than an individual cell: a cell is on the order of $10^{-5}$ m, while the features on the wings grow to as large as $10^{-3}$ m. Because cell-to-cell communication is local, the coordination required to produce these features implies self-organization over hundreds or thousands of cells, all to generate a coherent image of an eye that evolved simply to act as a visual stimulus for an entirely different species. Of course, this pales in comparison to the morphogenesis that occurs in animal and plant bodies, where structures consisting of millions of cells will specialize and coordinate to generate the target morphology.

A common approach to investigating neural networks is to look at what inhibits or excites individual neurons in a network. Just as neuroscientists and biologists have often treated cells, cell structures and neurons as black-box models to be investigated, measured and reverse-engineered, there is a large contemporary body of work on doing the same with neural networks. One example is the work by Boettiger.

We can explore this idea with minimal effort by taking our pattern-generating NCA and exploring what happens if we task it to enter a state that excites a given neuron in Inception. Among the resulting NCA, we commonly notice eyes and eye-related shapes, such as in the video below. This is likely a result of Inception having to detect various animals in ImageNet. In the same way that cells form eye patterns on the wings of butterflies to excite neurons in the brains of predators, our NCA’s population of cells has learned to collaborate to produce a pattern that excites certain neurons in an external neural network.

An NCA trained to excite mixed4a_472 in Inception.