A family of early-vision neurons reacting to directional transitions from high to low spatial frequency.
Some of the neurons in vision models are features that we aren’t particularly surprised to find. Curve detectors, for example, are a pretty natural feature for a vision system to have. In fact, they had already been discovered in the animal visual cortex.
High-low frequency detectors, on the other hand, seem more surprising. They are not a feature that we would have expected a priori to find. Yet, when systematically characterizing mixed3a, we found neurons that appear to detect a high frequency pattern on one side, and a low frequency pattern on the other.
One worry we might have about the circuits approach
How can we be sure that “high-low frequency detectors” are actually detecting directional transitions from high to low spatial frequency? We will rely on three methods:
Later on in the article, we dive into the mechanistic details of how they are both implemented and used. We will be able to understand the algorithm that implements them, confirming that they detect high to low frequency transitions.
A feature visualization
From their feature visualizations, we observe that all of these high-low frequency detectors share these same characteristics:
We can use a diversity term in our feature visualizations to jointly optimize for the activation of a neuron while encouraging different activation patterns in a batch of visualizations. We are thus reasonably confident that if high-low frequency detectors were also sensitive to other patterns, we would see signs of them in these feature visualizations. Instead, the frequency contrast remains an invariant aspect of all these visualizations. (Although other patterns form along the boundary, these are likely outside the neuron’s effective receptive field.)
We generate dataset examples by sampling from a natural data distribution (in this case, the training set) and selecting the images that cause the neurons to maximally activate. Checking against these examples helps ensure we’re not misreading the feature visualizations.
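The selection step described above can be sketched in a few lines. This is a minimal illustration, assuming we already have one scalar activation per image (for example, the neuron's maximum activation over spatial positions); the function name `top_activating_examples` and the toy activations are hypothetical and not taken from the original experiments.

```python
import numpy as np

def top_activating_examples(activations, k=5):
    """Return indices of the k images with the highest activation, strongest first.

    activations: 1D array-like with one scalar per dataset image.
    """
    activations = np.asarray(activations, dtype=float)
    k = min(k, activations.shape[0])
    order = np.argsort(-activations)  # sort in descending order of activation
    return order[:k]

# Toy usage: random numbers stand in for real per-image activations.
rng = np.random.default_rng(0)
toy_activations = rng.normal(size=1000)
top_idx = top_activating_examples(toy_activations, k=5)
```

In practice the activations would come from running the network over the dataset; only the ranking step is shown here.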
A wide range of real-world situations can cause high-low frequency detectors to fire. Oftentimes it’s a highly-textured, in-focus foreground object against a blurry background — for example, the foreground might be the microphone’s latticework, the hummingbird’s tiny head feathers, or the small rubber dots on the Lenovo ThinkPad pointing stick — but not always: we also observe that it fires for the MP3 player’s brushed metal finish against its shiny screen, or the text of a watermark.
In all cases, we see one area with high frequency and another area with low frequency. Although they often fire at an object boundary, they can also fire in cases where there is a frequency change without an object boundary. High-low frequency detectors are therefore not the same as boundary detectors.
Tuning curves show us how a neuron’s response changes with respect to a parameter. They are a standard method in neuroscience.
To construct such a curve, we’ll need a set of synthetic stimuli which cause high-low frequency detectors to fire. We generate images with a high-frequency pattern on one side and a low-frequency pattern on the other. Since we’re interested in orientation, we’ll rotate this pattern to create a 1D family of stimuli:
But what frequency should we use for each side? How steep does the difference in frequency need to be? To explore this, we’ll add a second dimension varying the ratio between the two frequencies:
(Adding a second dimension will also help us see whether the results for the first dimension are robust.)
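As a rough sketch of such a two-dimensional stimulus family, the code below builds images with a low-frequency texture on one side and a high-frequency texture on the other, then varies the orientation and the frequency ratio. The image size, the base frequency, and the use of summed orthogonal cosine waves as the texture are assumptions made for illustration; the exact stimuli used in the original experiments may differ.

```python
import numpy as np

def high_low_stimulus(size=64, angle_deg=0.0, base_cycles=2.0, ratio=4.0):
    """Image with low frequency on one side and high frequency on the other.

    base_cycles: cycles per image width on the low-frequency side (assumed value).
    ratio: high frequency divided by low frequency.
    """
    coords = (np.arange(size) + 0.5) / size - 0.5
    yy, xx = np.meshgrid(coords, coords, indexing="ij")
    theta = np.deg2rad(angle_deg)
    # Rotate the coordinate frame so the whole pattern rotates together.
    u = xx * np.cos(theta) + yy * np.sin(theta)
    v = -xx * np.sin(theta) + yy * np.cos(theta)

    def texture(cycles):
        # Sum of two orthogonal cosine waves, scaled to lie in [-1, 1].
        return 0.5 * (np.cos(2 * np.pi * cycles * u) + np.cos(2 * np.pi * cycles * v))

    low = texture(base_cycles)
    high = texture(base_cycles * ratio)
    return np.where(u > 0, high, low)  # high frequency where u > 0

def stimulus_grid(angles_deg, ratios, size=64, base_cycles=2.0):
    """Stack stimuli into an array of shape (n_angles, n_ratios, size, size)."""
    return np.array([[high_low_stimulus(size, a, base_cycles, r) for r in ratios]
                     for a in angles_deg])

grid = stimulus_grid(np.arange(0, 360, 45), [2.0, 4.0, 8.0])
```

Each image in `grid` could then be fed to the network and the activation of a given neuron recorded, giving one response per (orientation, ratio) pair.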
Now that we have these two dimensions, we sample the synthetic stimuli and plot each neuron’s responses to them:
Each high-low frequency detector exhibits a clear preference for a limited range of orientations. As we previously found with curve detectors, high-low frequency detectors are rotationally equivariant: each one selects for a given orientation, and together they span the full 360° space.
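One simple way to read a preferred orientation off such a response grid is to average over the frequency-ratio axis and take the orientation with the largest mean response. The sketch below does this for a made-up unit; the toy response function (a rectified cosine peaking at 90°) is purely hypothetical and stands in for measured activations.

```python
import numpy as np

def preferred_orientation(responses, angles_deg):
    """Average a (n_angles, n_ratios) response grid over ratios and find the peak.

    Returns the preferred angle in degrees and the 1D orientation tuning curve.
    """
    tuning = np.asarray(responses, dtype=float).mean(axis=1)
    best = float(angles_deg[int(np.argmax(tuning))])
    return best, tuning

angles_deg = np.arange(0, 360, 15)
ratios = np.array([2.0, 4.0, 8.0])
# Hypothetical unit: responds most at 90 degrees, more strongly for larger ratios.
toy_responses = (np.maximum(0.0, np.cos(np.deg2rad(angles_deg - 90.0)))[:, None]
                 * np.log2(ratios)[None, :])
best_angle, tuning = preferred_orientation(toy_responses, angles_deg)
```

Repeating this for every detector in the family would show whether their preferred orientations jointly cover the full circle.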
How are high-low frequency detectors built up from lower-level neurons? One could imagine many different circuits which could implement this behavior. To give just one example, it seems like there are at least two different ways that the oriented nature of these units could form.
To resolve this question — and more generally, to understand how these detectors are implemented — we can look at the weights.
Let’s look at a single detector. Glancing at the weights from conv2d2 to mixed3a 110, most of them can be roughly divided into two categories: those that activate on the left and inhibit on the right, and those that do the opposite.
The same also holds for each of the other high-low frequency detectors, though of course with different spatial patterns.
Surprisingly, across all high-low frequency detectors, the two clusters of neurons that we get for each are actually the same two clusters! One cluster appears to detect textures with a generally high frequency, and one cluster appears to detect textures with a generally low frequency.
This is exactly what we would expect to see if the Invariant→Equivariant hypothesis is true: each high-low frequency detector composes the same two components in different spatial arrangements, which then in turn govern the detector’s orientation.
These two different clusters are really striking. In the next section, we’ll investigate them in more detail.
It would be nice if we could confirm that these two clusters of neurons are real. It would also be nice if we could create a simpler way to represent them for circuit analysis later.
Factorizing the connections
Each factor corresponds to a vector over neurons. Feature visualization can also be used to visualize these linear combinations of neurons. Strikingly, one clearly displays a generic high-frequency image, whereas the other does the same with a low-frequency image.
The feature visualizations are suggestive, but how can we be sure that these factors really correspond to high and low frequency in general, rather than specific high or low frequency patterns? One thing we can do is to create synthetic stimuli again, but now plotting the responses of those two NMF factors.
Since our factors don’t correspond to an edge, our synthetic stimuli will only have one frequency region for each stimulus. To add a second dimension and again demonstrate robustness, we also vary the rotation of that region. (The frequency texture is not exactly rotationally invariant because we construct the stimulus out of orthogonal cosine waves.)
Unlike last time, these activations now mostly ignore the image’s orientation, but are sensitive to its frequency. We can average these results over all orientations in order to produce a simple tuning curve of how each factor responds to frequency. As predicted, the HF-factor responds to high frequency and the LF-factor responds to low frequency.
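The averaging step can be sketched as follows. We generate single-frequency textures at several rotations, record a response for each, and average over the orientation axis. Since no trained network is available here, `toy_hf_response` is a hypothetical stand-in (mean local contrast) for an HF-factor activation; the image size and frequency values are likewise assumptions.

```python
import numpy as np

def uniform_texture(size, cycles, angle_deg):
    """Single-frequency texture made of two orthogonal cosine waves, rotated."""
    coords = (np.arange(size) + 0.5) / size - 0.5
    yy, xx = np.meshgrid(coords, coords, indexing="ij")
    theta = np.deg2rad(angle_deg)
    u = xx * np.cos(theta) + yy * np.sin(theta)
    v = -xx * np.sin(theta) + yy * np.cos(theta)
    return 0.5 * (np.cos(2 * np.pi * cycles * u) + np.cos(2 * np.pi * cycles * v))

def toy_hf_response(img):
    """Hypothetical stand-in for a factor's activation: mean local contrast."""
    return float(np.abs(np.diff(img, axis=0)).mean()
                 + np.abs(np.diff(img, axis=1)).mean())

def frequency_tuning_curve(response_fn, frequencies, angles_deg, size=64):
    """Return (curve, grid): grid is (n_freq, n_angles); curve averages over angles."""
    grid = np.array([[response_fn(uniform_texture(size, f, a)) for a in angles_deg]
                     for f in frequencies])
    return grid.mean(axis=1), grid

frequencies = np.array([1.0, 2.0, 4.0, 8.0])
angles_deg = np.arange(0, 90, 15)
curve, grid = frequency_tuning_curve(toy_hf_response, frequencies, angles_deg)
```

With a real factor, `response_fn` would instead project the layer's activations onto the factor's vector over neurons.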
Now that we’ve confirmed what these factors are, let’s look at how they’re combined into high-low frequency detectors.
NMF factors the weights into both a channel factor and a spatial factor. So far, we’ve looked at the two parts of the channel factor. The spatial factor shows the spatial weighting that combines the HF and LF factors into high-low frequency detectors.
Unsurprisingly, these weights basically reproduce the same pattern that we’d previously been seeing in Figure 5 from its two different clusters of neurons: where the HF-factor inhibits, the LF-factor activates — and vice versa.
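To make the channel-factor and spatial-factor split concrete, here is a minimal NMF sketch using the standard multiplicative update rules on a toy weight tensor. Several things are assumptions: the tensor shape, the planted two-group structure, and above all the use of plain NMF on non-negative data. The real weights take both signs, and this section does not specify the exact NMF variant used, so treat this as an illustration of the decomposition's shape rather than the authors' procedure.

```python
import numpy as np

def nmf(V, rank=2, n_iter=2000, eps=1e-9, seed=0):
    """Plain NMF via multiplicative updates: V (n, m) ~= W (n, rank) @ H (rank, m)."""
    rng = np.random.default_rng(seed)
    n, m = V.shape
    W = rng.random((n, rank)) + eps
    H = rng.random((rank, m)) + eps
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Toy weights with made-up sizes: (kernel_h, kernel_w, input_channels, detectors).
rng = np.random.default_rng(0)
kh, kw, n_in, n_det = 5, 5, 12, 3
true_channels = np.zeros((2, n_in))
true_channels[0, :6] = rng.uniform(0.5, 1.5, 6)  # planted first channel group
true_channels[1, 6:] = rng.uniform(0.5, 1.5, 6)  # planted second channel group
true_spatial = rng.random((kh, kw, n_det, 2))
toy_weights = np.einsum("hwdk,kc->hwcd", true_spatial, true_channels)

# Rows index (position, detector); columns index input channels.
V = np.moveaxis(toy_weights, 2, -1).reshape(-1, n_in)
spatial_flat, channel_factor = nmf(V, rank=2)
spatial_factor = spatial_flat.reshape(kh, kw, n_det, 2)
rel_err = np.linalg.norm(V - spatial_flat @ channel_factor) / np.linalg.norm(V)
```

Here `channel_factor` holds two vectors over input channels, and `spatial_factor` gives, for each detector, a spatial map of how strongly each channel vector is used at each position.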
conv2d2 123) also appears to be lightly activated by bright greens and magentas. This might be responsible for the feature visualizations of these high-low frequency detectors showing only greens and magentas on the high-frequency side.
High-low frequency detectors are therefore built up by circuits that arrange high frequency detection on one side and low frequency detection on the other.
There are some exceptions that aren’t fully captured by the NMF factorization perspective. For example, conv2d2 181 is a texture contrast detector that appears to already have spatial structure.
This is the kind of feature that we would expect to be involved through an Equivariant→Equivariant circuit.
If that were the case, however, we would expect its weights to the high-low frequency detector mixed3a 70 to be a solid positive stripe down the middle.
What we instead observe is that it contributes as a component of high frequency detection, though perhaps with a slight positive overall bias.
Although conv2d2 181 has a spatial structure, perhaps it responds more strongly to high frequency patterns.
Now that we understand how they are constructed, how are high-low frequency detectors used by higher-level features?
mixed3b is the next layer immediately after the high-low frequency detectors. Here, high-low frequency detectors contribute to a variety of features. Their most important role seems to be supporting boundary detectors, but they also contribute to bumps and divots, line-like and curve-like shapes, and at least one each of center-surrounds, patterns, and textures.
Oftentimes, downstream features appear to ignore the “polarity” of a high-low frequency detector, responding roughly the same way regardless of which side is high frequency. For example, the vertical boundary detector mixed3b 345 (see above) is strongly excited by high-low frequency detectors that detect frequency change across a vertical line in either direction.
Whereas activation from a high-low frequency detector can help detect boundaries between different objects, inhibition from a high-low frequency detector can also add structure to an object detector by detecting regions that must be contiguous along some direction — essentially, indicating the absence of a boundary.
As we’ve mentioned, by far the primary downstream contribution of high-low frequency detectors is to boundary detectors. Of the 20 neurons in mixed3b with the highest L2-norm of weights across all high-low frequency detectors, eight participate in boundary detection of some sort: double boundary detectors, miscellaneous boundary detectors, and especially object boundary detectors.
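The ranking described above can be sketched as follows, assuming the relevant weights are available as an array of shape (kernel height, kernel width, high-low frequency detectors, downstream neurons). The function name, the array layout, and the toy sizes are assumptions for illustration only.

```python
import numpy as np

def top_downstream_units(weights, k=20):
    """Rank downstream neurons by the L2-norm of their weights from the detectors.

    weights: array of shape (kh, kw, n_detectors, n_downstream).
    Returns (indices, norms), strongest first.
    """
    # Norm over spatial positions and all detectors, one value per downstream unit.
    norms = np.sqrt((weights ** 2).sum(axis=(0, 1, 2)))
    order = np.argsort(-norms)[:k]
    return order, norms[order]

# Toy usage with made-up sizes; unit 7 is given artificially large weights.
rng = np.random.default_rng(0)
toy_weights = rng.normal(size=(3, 3, 4, 50))
toy_weights[..., 7] *= 10.0
order, top_norms = top_downstream_units(toy_weights, k=20)
```

The returned indices would then be inspected one by one (for example with feature visualization) to decide which of them are boundary detectors.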
Object boundary detectors are neurons which detect boundaries between objects, whether that means the boundary between one object and another or the transition from foreground to background. They are different from edge detectors or curve detectors: although they are sensitive to edges (indeed, some of their strongest weights are contributed by lower-level edge detectors!), object boundary detectors are also sensitive to other indicators such as color contrast and high-low frequency detection.
High-low frequency detectors contribute to these object boundary detectors by providing one piece of evidence that an object has ended and something else has begun. Some examples of object boundary detectors are shown below, along with their weights to a selection of high-low frequency detectors, grouped by orientation (ignoring polarity).
In particular, note how similar the weights are within each grouping! This shows us again that the later layers ignore the high-low frequency detectors’ polarity. Furthermore, the arrangement of excitatory and inhibitory weights contributes to each boundary detector’s overall shape, following the principles outlined above.
Beyond mixed3b, high-low frequency detectors ultimately play a role in detecting more sophisticated object shapes in mixed4a and beyond, by continuing to contribute to the detection of boundaries and contiguity.
So far, the scope of our investigation has been limited to InceptionV1. How common are high-low frequency detectors in convolutional neural networks generally?
It’s always good to ask if what we see is the rule or an interesting exception.
Notice that these detectors are found at very similar depths within the different networks, between 29% and 33% network depth!
Even though these families are from three completely different networks, we also discover that their high-low frequency detectors are built up from high and low frequency components.
As we did with InceptionV1, we can again perform NMF on the weights of the high-low frequency detectors in each network in order to extract the strongest two factors.
The feature visualizations of the two factors reveal one clear HF-factor and one clear LF-factor, just like what we found in InceptionV1. Furthermore, the weights on the two factors are again very close to symmetric.
Our earlier conclusions therefore also hold across these different networks: high-low frequency detectors are built up from the specific spatial arrangement of a high frequency component and a low frequency component.
Although high-low frequency detectors represent a feature that we didn’t necessarily expect to find in a neural network, we find that we can still explore and understand them using the interpretability tools we’ve built up for exploring circuits: NMF, feature visualization, synthetic stimuli, and more.
We’ve also learned that high-low frequency detectors are built up from comprehensible lower-level parts, and we’ve shown how they contribute to later, higher-level features. Finally, we’ve seen that high-low frequency detectors are common across multiple network architectures.
Given the universality observations, we might wonder whether the existence of high-low frequency detectors isn’t so unnatural after all. We even find approximate high-low frequency detectors in AlexNet Places, with its substantially different training data. Beyond neural networks, the aesthetic quality imparted by the blurriness of an out-of-focus region of an image is already known to photographers as bokeh. And in VR, visual blur can either provide an effective depth-of-field cue or, conversely, can induce nausea in the user when implemented in a dissonant way. Frequency detection might well be commonplace in both natural and artificial vision systems as yet another type of informational cue.
Nevertheless, whether their existence is natural or not, we find that high-low frequency detectors are possible to characterize and understand.
As with many scientific collaborations, the contributions to this paper are difficult to separate because it was a collaborative effort.
Conceptual Contributions. Christopher Olah originally noted the high-low frequency detectors as a research direction.
Experiments. Ludwig Schubert wrote the code for generating and measuring synthetic tuning curves for the high-low frequency detectors, for performing NMF on the high-low frequency detectors to extract the HF-factor and the LF-factor, and for performing NMF on the high-low frequency detectors from other networks. Chelsea Voss wrote the code for generating and measuring synthetic stimuli for the HF-factor and LF-factor, with help from Chris for extracting and using the NMF components. This investigation was done in the context of and informed by collaborative research into circuits by Nick Cammarata, Gabe Goh, Chelsea, Ludwig, and Chris.
Figures. Ludwig designed the visualization of the high-low frequency detector synthetic tuning curves, the visualization of the HF-factor and LF-factor NMF vectors, the visualization of the HF-factor and LF-factor NMF weights from conv2d1 and conv2d2, and the figures demonstrating high-low frequency detectors from other networks shown in the Universality section. Chelsea designed the visualization of the HF-factor and LF-factor synthetic stimuli results and the figures articulating the downstream use of high-low frequency detectors, and edited some of the final figures. Two figures were borrowed from Zoom In and Early Vision.
Writing. Chelsea and Ludwig wrote the paper, with feedback from Chris.
We are also grateful to Jennifer Lin, Stefan Sietzen, and Vincent Tjeng for comments on a draft of this paper.
If you see mistakes or want to suggest changes, please create an issue on GitHub.
Diagrams and text are licensed under Creative Commons Attribution CC-BY 4.0 with the source available on GitHub, unless noted otherwise. The figures that have been reused from other sources don’t fall under this license and can be recognized by a note in their caption: “Figure from …”.
For attribution in academic contexts, please cite this work as:
Schubert, et al., "High-Low Frequency Detectors", Distill, 2021.
BibTeX citation
@article{schubert2021high-low,
  author = {Schubert, Ludwig and Voss, Chelsea and Cammarata, Nick and Goh, Gabriel and Olah, Chris},
  title = {High-Low Frequency Detectors},
  journal = {Distill},
  year = {2021},
  note = {https://distill.pub/2020/circuits/frequency-edges},
  doi = {10.23915/distill.00024.005}
}