The Contrast Transfer Function (CTF)
The microscope does not image faithfully — it weights every frequency by an oscillating curve and even inverts contrast
The microscope does not record the specimen faithfully. Because of defocus and spherical aberration, it transfers different spatial frequencies with different strengths — some are boosted, some are pushed to zero (lost entirely), and some have their contrast fully inverted (what should be dark comes out bright). The curve that describes this is the Contrast Transfer Function (CTF).
Put differently: decompose an image into stripe patterns of every coarseness (spatial frequencies), and the microscope assigns each stripe a gain between $-1$ and $+1$. A gain of $+1$ passes that stripe through unchanged, $0$ erases it, and $-1$ passes it through with black and white swapped. The CTF is the curve of gain versus stripe coarseness, and it is neither flat nor monotone: it oscillates, crossing zero and flipping sign again and again.
Why does the microscope behave this way? Biological molecules in vitreous ice barely absorb electrons; they mostly add phase to the electron wave passing through (a phase object). But the detector records only intensity — it cannot store phase. Turning that invisible phase difference into visible brightness requires deliberately defocusing the lens: defocus makes scattered waves of different frequencies accumulate different phase delays by the time they reach the detector, and those delays make some waves interfere constructively (bright) and others destructively (dark), translating phase into contrast. The cost is that this translation acts differently on every frequency — hence the oscillating curve.
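The argument above can be checked with a small numerical sketch. It builds a toy one-dimensional phase object (a single stripe pattern), images it once with no phase delay and once with a frequency-dependent delay, and compares the recorded intensity. The names `image_intensity` and `contrast`, the quadratic delay, and the sign convention `exp(-i*chi)` are illustrative assumptions, not taken from the demo.

```python
import numpy as np

n = 256                      # samples across the toy 1-D specimen
cycles = 8                   # stripe pattern: 8 periods across the field
phi0 = 0.05                  # peak phase shift in radians (weak phase object)
x = np.arange(n)
phase = phi0 * np.cos(2 * np.pi * cycles * x / n)

freqs = np.fft.fftfreq(n)    # spatial frequency of each FFT bin (cycles per sample)
k0 = cycles / n              # frequency of the stripe pattern


def image_intensity(phase, chi):
    """Detector intensity for a pure phase object whose Fourier
    components each pick up an extra phase delay chi(k)."""
    wave = np.exp(1j * phase)                        # unit amplitude, phase only
    spectrum = np.fft.fft(wave) * np.exp(-1j * chi)  # per-frequency delay (assumed sign)
    return np.abs(np.fft.ifft(spectrum)) ** 2


def contrast(intensity):
    """Half the peak-to-peak intensity swing."""
    return 0.5 * (intensity.max() - intensity.min())


# In focus: no delay at any frequency, so the phase pattern stays invisible.
in_focus = image_intensity(phase, np.zeros(n))

# Defocused: a hypothetical quadratic delay scaled so that chi(k0) = pi/2.
strength = (np.pi / 2) / k0**2
defocused = image_intensity(phase, strength * freqs**2)

print(f"in-focus contrast:  {contrast(in_focus):.2e}")
print(f"defocused contrast: {contrast(defocused):.4f} "
      f"(weak-phase estimate 2*phi0 = {2 * phi0:.4f})")
```

In focus the intensity is flat to rounding error, because the wave has unit amplitude everywhere. With a delay of $\pi/2$ at the stripe frequency, the stripe appears with a contrast close to $2\varphi_0$, which is the weak-phase estimate $2\varphi_0\,|\sin\chi|$ at $|\sin\chi| = 1$.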
[Interactive figure: the CTF plotted against spatial frequency, with adjustable defocus and amplitude contrast.]
At each zero the CTF wipes out that frequency entirely; in negative bands it inverts contrast. More defocus → faster oscillation and a lower-frequency first zero — which is why no single defocus covers all frequencies and why a defocus series is merged.
The purple bands are frequencies where CTF < 0 and contrast is inverted; the yellow dashed line is the first zero — beyond it the curve repeatedly crosses zero, repeatedly erasing those frequencies. Try increasing defocus: the first zero moves toward lower frequency and the whole curve packs tighter — meaning high defocus gives strong contrast to low frequencies (coarse features) but starts losing high frequencies (fine detail) sooner. Lower defocus does the reverse: weaker low-frequency contrast, but the first zero is pushed out to higher frequency, preserving more detail. This is exactly the “visible versus sharp” trade-off made when collecting data.
The math
The CTF is set by the wave-aberration function $\chi(k)$. For spatial frequency $k$, defocus $\Delta z$, spherical aberration $C_s$, and electron wavelength $\lambda$:

$$\chi(k) = \pi \lambda \, \Delta z \, k^2 - \frac{\pi}{2} C_s \lambda^3 k^4$$

Read the phase term by term: $\chi$ is the extra phase (in radians) that a wave scattered to frequency $k$ accumulates relative to the unscattered wave. $k$ is spatial frequency (units Å$^{-1}$; larger $k$ means finer features); $\Delta z$ is defocus (by convention underfocus is taken positive); $C_s$ is the spherical-aberration coefficient, which captures the extra phase error the lens imposes on widely scattered waves; $\lambda$ is the electron wavelength set by the accelerating voltage. The first term comes from defocus and grows as $k^2$, dominating at low and mid frequencies; the second term comes from aberration and grows as $k^4$, mattering only at high frequency.
Including amplitude contrast $w$:

$$\mathrm{CTF}(k) = -\left[\sqrt{1 - w^2}\,\sin\chi(k) + w\cos\chi(k)\right]$$

The $\sin\chi$ term is phase contrast: every time $\chi$ advances by $\pi$, $\sin\chi$ flips sign, and this is the source of the repeated sign changes. $w$ (typically about $0.07$ to $0.1$) is the fraction of amplitude contrast, the small part of the signal that comes from electrons actually absorbed or scattered out of the beam, which images without relying on defocus; through the $w\cos\chi$ term it gives a little nonzero contrast near zero frequency, so as $k \to 0$ the $\mathrm{CTF} \to -w$ rather than $0$. The leading minus sign is a convention that makes low-frequency contrast negative under underfocus, matching common displays.
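As a minimal sketch, the two formulas can be evaluated directly: $\chi(k) = \pi\lambda\,\Delta z\,k^2 - \tfrac{\pi}{2} C_s \lambda^3 k^4$ and $\mathrm{CTF}(k) = -[\sqrt{1-w^2}\sin\chi + w\cos\chi]$. The function names, the defocus of 1.5 µm, the $C_s$ of 2.7 mm and $w = 0.07$ are assumptions chosen for illustration; all lengths are in ångströms.

```python
import numpy as np


def electron_wavelength_A(voltage_kv):
    """Relativistic electron wavelength in angstroms for a voltage in kV."""
    v = voltage_kv * 1e3  # volts
    return 12.2643 / np.sqrt(v + 0.97845e-6 * v**2)


def chi(k, defocus_A, cs_A, wavelength_A):
    """Wave-aberration phase in radians (underfocus positive)."""
    k = np.asarray(k, dtype=float)
    defocus_term = np.pi * wavelength_A * defocus_A * k**2
    aberration_term = 0.5 * np.pi * cs_A * wavelength_A**3 * k**4
    return defocus_term - aberration_term


def ctf(k, defocus_A, cs_A, wavelength_A, w=0.07):
    """CTF with amplitude-contrast fraction w (0.07 is an assumed value)."""
    phase = chi(k, defocus_A, cs_A, wavelength_A)
    return -(np.sqrt(1.0 - w**2) * np.sin(phase) + w * np.cos(phase))


wavelength = electron_wavelength_A(300.0)   # about 0.0197 angstrom
k = np.linspace(0.0, 0.3, 3001)             # spatial frequency, 1/angstrom
curve = ctf(k, defocus_A=15000.0, cs_A=2.7e7, wavelength_A=wavelength)

# Each sign change between neighbouring samples marks one zero crossing.
sign_changes = int(np.count_nonzero(np.diff(np.sign(curve)) != 0))
print(f"wavelength at 300 kV: {wavelength:.5f} angstrom")
print(f"CTF at k = 0: {curve[0]:+.3f}")
print(f"zero crossings up to 0.3 1/angstrom: {sign_changes}")
```

The value at $k = 0$ equals $-w$, and the count of sign changes shows how often contrast inverts within the plotted band.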
The demo uses 300 kV ($\lambda \approx 0.0197$ Å) and a millimetre-scale $C_s$. The $k^2$ term is defocus-dominated and grows quadratically; the $k^4$ term is aberration-dominated and only matters at high frequency. Together they produce oscillations that pack tighter toward high $k$.
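A useful rule of thumb: a new zero appears each time the phase advances by $\pi$. Solving $\sin\chi = 0$, i.e. $\chi = \pi$, with only the defocus term gives a first zero near $k_1 \approx 1/\sqrt{\lambda\,\Delta z}$, which corresponds to a feature size of about $1/k_1 \approx \sqrt{\lambda\,\Delta z}$. Because $k_1 \propto 1/\sqrt{\Delta z}$, raising the defocus by a factor $m$ shrinks $k_1$ by $\sqrt{m}$ (quadrupling the defocus halves $k_1$ and doubles the feature size at the first zero), which quantifies how high defocus loses detail sooner.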
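To put numbers on the first zero, the sketch below evaluates the defocus-only estimate $k_1 \approx 1/\sqrt{\lambda\,\Delta z}$ and compares it with the first sign change of the full CTF. The defocus values of 1 µm and 4 µm, $C_s = 2.7$ mm and $w = 0.07$ are illustrative assumptions, not necessarily the values used in the demo.

```python
import numpy as np

WAVELENGTH_A = 0.0197   # electron wavelength at 300 kV, angstrom
CS_A = 2.7e7            # assumed spherical aberration: 2.7 mm in angstrom
W = 0.07                # assumed amplitude-contrast fraction


def first_zero_estimate(defocus_A):
    """Defocus-only estimate of the first zero, in 1/angstrom."""
    return 1.0 / np.sqrt(WAVELENGTH_A * defocus_A)


def ctf(k, defocus_A):
    phase = (np.pi * WAVELENGTH_A * defocus_A * k**2
             - 0.5 * np.pi * CS_A * WAVELENGTH_A**3 * k**4)
    return -(np.sqrt(1.0 - W**2) * np.sin(phase) + W * np.cos(phase))


def first_zero_numeric(defocus_A, k_max=0.5, samples=200001):
    """First sign change of the full CTF on a fine frequency grid."""
    k = np.linspace(0.0, k_max, samples)
    values = ctf(k, defocus_A)
    crossing = np.nonzero(np.diff(np.sign(values)) != 0)[0][0]
    return 0.5 * (k[crossing] + k[crossing + 1])


for defocus_um in (1.0, 4.0):
    defocus_A = defocus_um * 1e4            # 1 micrometre = 10^4 angstrom
    k_est = first_zero_estimate(defocus_A)
    k_num = first_zero_numeric(defocus_A)
    print(f"defocus {defocus_um:.1f} um: estimate {k_est:.4f} 1/A "
          f"(feature size {1.0 / k_est:.1f} A), full CTF {k_num:.4f} 1/A")
```

Under these assumptions the estimate puts the first zero at a feature size of roughly 14 Å for 1 µm of defocus and roughly 28 Å for 4 µm. The full CTF places the zero slightly lower in frequency, mainly because amplitude contrast adds a small phase offset.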
So translating phase into contrast carries a double cost. First, the oscillation above: past some frequency the CTF swings positive and negative, so coarse and fine features in the same image disagree on which way is dark. Second, the detector measures only intensity and discards the wave's phase, and the CTF is the per-frequency fingerprint that this phase-to-intensity translation leaves behind. Understanding both makes it clear why correction is unavoidable.
Why it matters
- Zeros = information holes: at every CTF zero that frequency is simply not recorded and cannot be recovered from a single image. Correction can flip signs and reweight, but it cannot conjure information that was never captured at a zero.
- Contrast inversion: features in negative bands have flipped polarity, so CTF correction must flip the sign back before reconstruction. The crudest correction, "phase flipping," multiplies the negative bands by $-1$ so all bands share one polarity; a fuller scheme (Wiener filtering) also reweights each frequency by $|\mathrm{CTF}|$ and the noise level, but still cannot fill the zeros.
- One defocus is not enough: the first zero’s position depends on defocus, and no single defocus covers all frequencies — so experiments often collect a defocus series and merge it. Where one image has a zero, another transfers that frequency well; the staggered zeros cover for each other, and only the merge has signal across a wide band.
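The three points above can be sketched on a toy problem that works directly in the frequency domain. A random real-valued array stands in for the specimen spectrum, each "image" is that spectrum multiplied by a CTF plus noise, and the merge is a Wiener-style weighted sum over a defocus series. The defocus values, $C_s$, $w$, the noise level, the 0.05 coverage threshold, and the helper names are all assumptions made for illustration.

```python
import numpy as np

WAVELENGTH_A = 0.0197   # 300 kV, angstrom
CS_A = 2.7e7            # assumed spherical aberration: 2.7 mm in angstrom
W = 0.07                # assumed amplitude-contrast fraction


def ctf(k, defocus_A):
    phase = (np.pi * WAVELENGTH_A * defocus_A * k**2
             - 0.5 * np.pi * CS_A * WAVELENGTH_A**3 * k**4)
    return -(np.sqrt(1.0 - W**2) * np.sin(phase) + W * np.cos(phase))


rng = np.random.default_rng(0)
k = np.linspace(0.0, 0.25, 2001)          # spatial frequency, 1/angstrom
truth = rng.normal(size=k.size)           # toy stand-in for the specimen spectrum


def observe(defocus_A, noise_sigma=0.1):
    """One simulated image spectrum: CTF-weighted truth plus noise."""
    return ctf(k, defocus_A) * truth + noise_sigma * rng.normal(size=k.size)


def phase_flip(observed, defocus_A):
    """Crudest correction: multiply negative CTF bands by -1."""
    return np.sign(ctf(k, defocus_A)) * observed


def wiener_merge(observations, defoci_A, snr=100.0):
    """Wiener-style merge: CTF-weighted sum over total CTF^2 plus 1/SNR."""
    numerator = sum(ctf(k, d) * obs for obs, d in zip(observations, defoci_A))
    denominator = sum(ctf(k, d) ** 2 for d in defoci_A) + 1.0 / snr
    return numerator / denominator


def low_coverage_fraction(defoci_A, threshold=0.05):
    """Fraction of frequencies whose summed CTF^2 falls below threshold."""
    coverage = sum(ctf(k, d) ** 2 for d in defoci_A)
    return float(np.mean(coverage < threshold))


defoci = (10000.0, 17000.0, 25000.0)       # assumed series: 1.0, 1.7, 2.5 um
observations = [observe(d) for d in defoci]

single = wiener_merge(observations[:1], defoci[:1])
merged = wiener_merge(observations, defoci)
mse_single = float(np.mean((single - truth) ** 2))
mse_series = float(np.mean((merged - truth) ** 2))
frac_single = low_coverage_fraction(defoci[:1])
frac_series = low_coverage_fraction(defoci)

print(f"poorly covered frequencies: single {frac_single:.1%}, series {frac_series:.1%}")
print(f"mean squared error:         single {mse_single:.3f}, series {mse_series:.3f}")
```

Phase flipping only restores a common polarity, so the amplitude still dips to zero at each CTF zero. The merged series has far fewer poorly covered frequencies because the zeros of the three curves fall at different places.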
One practical detail: real specimens have thickness and often slight astigmatism, so the effective defocus varies with position and direction within a single image. The pipeline must first estimate the defocus and astigmatism for each image (or each region), fitting that CTF curve before it can correct — usually by reading the concentric rings (Thon rings) in the image power spectrum, whose spacing and ellipticity report defocus and astigmatism directly.
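As a rough illustration of defocus estimation, the sketch below simulates a one-dimensional, rotationally averaged power spectrum and recovers the defocus by a grid search over candidate values, scoring each candidate by its correlation with the observed ring pattern. This is a simplified stand-in: it ignores astigmatism and background fitting, and the envelope, noise level, parameter values and function names are assumptions rather than any particular program's method.

```python
import numpy as np

WAVELENGTH_A = 0.0197   # 300 kV, angstrom
CS_A = 2.7e7            # assumed spherical aberration: 2.7 mm in angstrom
W = 0.07                # assumed amplitude-contrast fraction


def ctf(k, defocus_A):
    phase = (np.pi * WAVELENGTH_A * defocus_A * k**2
             - 0.5 * np.pi * CS_A * WAVELENGTH_A**3 * k**4)
    return -(np.sqrt(1.0 - W**2) * np.sin(phase) + W * np.cos(phase))


def simulate_power_spectrum(k, defocus_A, rng, noise_sigma=0.05):
    """Toy radial power spectrum: CTF^2 rings under a decaying envelope."""
    envelope = np.exp(-(k / 0.2) ** 2)      # assumed high-frequency falloff
    return ctf(k, defocus_A) ** 2 * envelope + noise_sigma * rng.normal(size=k.size)


def estimate_defocus(k, power, candidates_A):
    """Pick the candidate whose CTF^2 correlates best with the spectrum."""
    scores = [np.corrcoef(power, ctf(k, d) ** 2)[0, 1] for d in candidates_A]
    return float(candidates_A[int(np.argmax(scores))])


rng = np.random.default_rng(1)
k = np.linspace(0.02, 0.25, 2000)           # spatial frequency, 1/angstrom
true_defocus_A = 18000.0                    # 1.8 um, chosen for the example
power = simulate_power_spectrum(k, true_defocus_A, rng)

candidates = np.arange(5000.0, 40001.0, 100.0)   # 0.5 to 4.0 um in 100 A steps
estimated_defocus_A = estimate_defocus(k, power, candidates)
print(f"true defocus {true_defocus_A:.0f} A, estimated {estimated_defocus_A:.0f} A")
```

A two-dimensional fit would additionally let the defocus depend on direction, which is how the ellipticity of the rings reports astigmatism.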
The CTF and the missing wedge are the two pillars of Cryo-ET imaging: one discounts along frequency, the other carves out along angle. Together with sampling and Nyquist it decides which frequencies in an image are trustworthy: Nyquist sets the ceiling, the CTF weights frequency by frequency below it. The CTF’s per-frequency falloff is also a major reason image signal-to-noise drops with frequency. CTF correction before reconstruction is a form of filtering — applying a per-frequency weight in the frequency domain. Methods like CryoGEN do not treat correction as an isolated preprocessing step; they write the CTF (frequency-domain degradation) and the missing wedge (angle-domain degradation) into one imaging model, so restoration and reconstruction confront both degradations together.