The Contrast Transfer Function (CTF)

The microscope does not image faithfully — it weights every frequency by an oscillating curve and even inverts contrast

Intuition

The microscope does not record the specimen faithfully. Because of defocus and spherical aberration, it transfers different spatial frequencies with different strengths — some are boosted, some are pushed to zero (lost entirely), and some have their contrast fully inverted (what should be dark comes out bright). The curve that describes this is the Contrast Transfer Function (CTF).

Put differently: decompose an image into stripe patterns of every coarseness (spatial frequencies), and the microscope assigns each stripe a gain between $-1$ and $+1$. A gain of $+1$ passes that stripe through unchanged, $0$ erases it, and $-1$ passes it through with black and white swapped. The CTF is the curve of gain versus stripe coarseness, and it is neither flat nor monotone: it oscillates, crossing zero and flipping sign again and again.

Why does the microscope behave this way? Biological molecules in vitreous ice barely absorb electrons; they mostly add phase to the electron wave passing through (a phase object). But the detector records only intensity — it cannot store phase. Turning that invisible phase difference into visible brightness requires deliberately defocusing the lens: defocus makes scattered waves of different frequencies accumulate different phase delays by the time they reach the detector, and those delays make some waves interfere constructively (bright) and others destructively (dark), translating phase into contrast. The cost is that this translation acts differently on every frequency — hence the oscillating curve.

The CTF’s oscillation with spatial frequency, with defocus and amplitude contrast adjustable:

[Interactive plot: CTF(k) against spatial frequency k (1/Å), vertical axis from −1 to +1, with the first zero and the phase-inverted bands (CTF < 0) marked.]

At each zero the CTF wipes out that frequency entirely; in negative bands it inverts contrast. More defocus → faster oscillation and a lower-frequency first zero — which is why no single defocus covers all frequencies and why a defocus series is merged.

The purple bands are frequencies where CTF < 0 and contrast is inverted; the yellow dashed line is the first zero — beyond it the curve repeatedly crosses zero, repeatedly erasing those frequencies. Try increasing defocus: the first zero moves toward lower frequency and the whole curve packs tighter — meaning high defocus gives strong contrast to low frequencies (coarse features) but starts losing high frequencies (fine detail) sooner. Lower defocus does the reverse: weaker low-frequency contrast, but the first zero is pushed out to higher frequency, preserving more detail. This is exactly the “visible versus sharp” trade-off made when collecting data.

The math


The CTF is set by the wave-aberration function $\chi(k)$. For spatial frequency $k$, defocus $\Delta f$, spherical aberration $C_s$, and electron wavelength $\lambda$:

$$\chi(k) = \pi\,\lambda\,\Delta f\,k^2 \;-\; \frac{\pi}{2}\,C_s\,\lambda^3\,k^4$$

Read the phase term by term: $\chi(k)$ is the extra phase (in radians) that a wave scattered to frequency $k$ accumulates relative to the unscattered wave. $k$ is spatial frequency (units $1/\text{Å}$; larger means finer features); $\Delta f$ is defocus (by convention underfocus is taken positive); $C_s$ is the spherical-aberration coefficient, which captures the extra phase error the lens imposes on widely scattered waves; $\lambda$ is the electron wavelength set by the accelerating voltage. The first term $\pi\lambda\,\Delta f\,k^2$ comes from defocus and grows as $k^2$, dominating at low and mid frequencies; the second term $\tfrac{\pi}{2}C_s\lambda^3 k^4$ comes from aberration and grows as $k^4$, mattering only at high frequency.
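To put a number on "only at high frequency", the two terms of $\chi(k)$ can be compared directly. The sketch below is illustrative only: it assumes the 300 kV wavelength and $C_s = 2.7$ mm quoted for the demo, plus an assumed defocus of 1 µm, and the function names are invented for this example.

```python
import math

WAVELENGTH_A = 0.0197      # electron wavelength at 300 kV, in Å (value quoted in the text)
CS_A = 2.7e7               # spherical aberration of 2.7 mm, expressed in Å
DEFOCUS_A = 1.0e4          # assumed defocus of 1 µm (underfocus positive), in Å


def defocus_term(k, defocus=DEFOCUS_A, wavelength=WAVELENGTH_A):
    """First term of chi(k): pi * lambda * defocus * k^2, in radians."""
    return math.pi * wavelength * defocus * k**2


def aberration_term(k, cs=CS_A, wavelength=WAVELENGTH_A):
    """Second term of chi(k): (pi/2) * Cs * lambda^3 * k^4, in radians."""
    return 0.5 * math.pi * cs * wavelength**3 * k**4


def crossover_frequency(defocus=DEFOCUS_A, cs=CS_A, wavelength=WAVELENGTH_A):
    """Frequency where both terms are equal: k^2 = 2 * defocus / (Cs * lambda^2)."""
    return math.sqrt(2.0 * defocus / (cs * wavelength**2))


for resolution in (20.0, 10.0, 4.0, 2.0):          # feature size in Å
    k = 1.0 / resolution
    ratio = aberration_term(k) / defocus_term(k)
    print(f"{resolution:5.1f} Å: aberration/defocus = {ratio:.4f}")

print(f"terms are equal near {1.0 / crossover_frequency():.2f} Å")
```

With these assumed values the aberration term is only a few percent of the defocus term at 4 Å, and the two become equal only below 1 Å. The ratio scales as $C_s\lambda^2 k^2 / (2\,\Delta f)$, so lower defocus makes the aberration term matter sooner.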

Including amplitude contrast $w$:

$$\mathrm{CTF}(k) = -\Big[\sqrt{1-w^2}\,\sin\chi(k) + w\,\cos\chi(k)\Big]$$

The $\sin\chi$ term is phase contrast: every time $\chi$ advances by $\pi$, $\sin\chi$ flips sign, and this is the source of the repeated sign changes. $w$ (typically about $0.07\text{–}0.1$) is the fraction of amplitude contrast, the small part of the signal that comes from electrons actually absorbed or scattered out of the beam, which images without relying on defocus; through the $\cos\chi$ term it gives a little nonzero contrast near zero frequency, so as $k\to 0$ the $\mathrm{CTF}\to -w$ rather than $0$. The leading minus sign is a convention that makes low-frequency contrast negative under underfocus, matching common displays.
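The two formulas translate almost line for line into code. The following is a minimal sketch, not a reference implementation: the function names `chi` and `ctf` are chosen here, all lengths are in Å, and the default parameters (300 kV wavelength, $C_s = 2.7$ mm, $w = 0.07$, 1 µm defocus) are assumed example values.

```python
import math


def chi(k, defocus, cs=2.7e7, wavelength=0.0197):
    """Wave aberration in radians; k in 1/Å, defocus and cs in Å (underfocus positive)."""
    return (math.pi * wavelength * defocus * k**2
            - 0.5 * math.pi * cs * wavelength**3 * k**4)


def ctf(k, defocus, w=0.07, cs=2.7e7, wavelength=0.0197):
    """CTF(k) = -[sqrt(1 - w^2) * sin(chi) + w * cos(chi)]."""
    x = chi(k, defocus, cs, wavelength)
    return -(math.sqrt(1.0 - w**2) * math.sin(x) + w * math.cos(x))


defocus = 1.0e4  # 1 µm in Å, an assumed example value
for k in (0.0, 0.02, 0.05, 0.07, 0.09, 0.11):
    print(f"k = {k:.2f} 1/Å   CTF = {ctf(k, defocus):+.3f}")
```

Because $\sqrt{1-w^2}\,\sin\chi + w\cos\chi = \sin(\chi + \arcsin w)$, the curve always stays within $[-1, +1]$, and amplitude contrast shifts every zero slightly toward lower frequency compared with the pure $\sin\chi$ case.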

The demo uses 300 kV ($\lambda \approx 0.0197$ Å) and $C_s = 2.7$ mm. The $k^2$ term is defocus-dominated and grows quadratically; the $k^4$ term is aberration-dominated and only matters at high frequency. Together they produce oscillations that pack tighter toward high $k$.
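The quoted wavelength follows from the accelerating voltage through the relativistic de Broglie relation, $\lambda = h / \sqrt{2 m_0 eV\,(1 + eV / 2 m_0 c^2)}$. The short sketch below evaluates it for a few common voltages; the function name is chosen for this example and the physical constants are the standard SI values.

```python
import math

# Physical constants in SI units
H = 6.62607015e-34          # Planck constant, J s
M0 = 9.1093837015e-31       # electron rest mass, kg
E = 1.602176634e-19         # elementary charge, C
C = 299792458.0             # speed of light, m/s


def electron_wavelength_angstrom(voltage_volts):
    """Relativistic de Broglie wavelength (Å) of an electron accelerated through voltage_volts."""
    energy = E * voltage_volts                        # kinetic energy, J
    momentum = math.sqrt(2.0 * M0 * energy * (1.0 + energy / (2.0 * M0 * C**2)))
    return H / momentum * 1e10                        # metres -> Å


for kv in (100, 200, 300):
    print(f"{kv} kV: {electron_wavelength_angstrom(kv * 1e3):.5f} Å")
```

At 300 kV this gives a value that rounds to the 0.0197 Å used in the demo. The relativistic correction is not optional at these voltages: dropping it would overestimate the wavelength by more than 10%.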

A useful concrete number: a new zero appears each time the phase advances by $\pi$. Solving $\sin\chi=0$, i.e. $\chi(k_n)=n\pi$, with only the defocus term gives a first zero near $k_1\approx\sqrt{1/(\lambda\,\Delta f)}$. With $\Delta f=1\ \mu\text{m}=10^4\ \text{Å}$ and $\lambda\approx 0.0197\ \text{Å}$, this is $k_1\approx 1/14\ \text{Å}^{-1}$: the first zero lands at a feature size of about $14\ \text{Å}$. Raise defocus to $2\ \mu\text{m}$ and $k_1$ shrinks by $\sqrt 2$, retreating to about $20\ \text{Å}$, which quantifies how high defocus loses detail sooner.
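The arithmetic above can be checked in a few lines, and the same calculation can be repeated with the $C_s$ term included to see how little it moves the first zero. This is a sketch under the section's assumptions ($\lambda \approx 0.0197$ Å, $C_s = 2.7$ mm, amplitude contrast neglected); the function names are invented for the example.

```python
import math


def first_zero_defocus_only(defocus, wavelength=0.0197):
    """k_1 from pi * lambda * defocus * k^2 = pi (no Cs, no amplitude contrast)."""
    return math.sqrt(1.0 / (wavelength * defocus))


def zero_with_cs(n, defocus, cs=2.7e7, wavelength=0.0197):
    """n-th solution of chi(k) = n*pi with the Cs term included.

    With u = k^2 this is the quadratic
        (Cs * lambda^3 / 2) * u^2 - lambda * defocus * u + n = 0.
    The smaller root is the crossing on the rising branch of chi.
    Raises ValueError if chi never reaches n*pi.
    """
    a = 0.5 * cs * wavelength**3
    b = wavelength * defocus
    disc = b * b - 4.0 * a * n
    if disc < 0:
        raise ValueError("chi(k) never reaches n*pi for these parameters")
    u = (b - math.sqrt(disc)) / (2.0 * a)
    return math.sqrt(u)


for defocus_um in (1.0, 2.0):
    defocus = defocus_um * 1e4                       # µm -> Å
    k_simple = first_zero_defocus_only(defocus)
    k_full = zero_with_cs(1, defocus)
    print(f"{defocus_um} µm: first zero at {1 / k_simple:.1f} Å (defocus only), "
          f"{1 / k_full:.1f} Å (with Cs)")
```

For micrometre-scale defocus the two estimates differ by well under 1 Å, which is why the defocus-only formula is a reasonable rule of thumb for the first zero.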

So translating phase into contrast carries a double cost. First the oscillation above: past some frequency $\sin\chi$ swings positive and negative, so coarse and fine features in the same image disagree on which way is dark. Second, the detector measures only intensity $|\,\cdot\,|^2$ and discards the wave’s phase, and the CTF is the per-frequency fingerprint that this phase-to-intensity translation leaves behind. Understanding both makes it clear why correction is unavoidable.

Why it matters

One practical detail: real specimens have thickness and often slight astigmatism, so the effective defocus varies with position and direction within a single image. The pipeline must first estimate the defocus and astigmatism for each image (or each region), fitting that CTF curve before it can correct — usually by reading the concentric rings (Thon rings) in the image power spectrum, whose spacing and ellipticity report defocus and astigmatism directly.
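To see why astigmatism makes the Thon rings elliptical, the defocus can be made a function of direction. The sketch below assumes one common parameterization, $\Delta f(\theta) = \tfrac12\big[\Delta f_1 + \Delta f_2 + (\Delta f_1 - \Delta f_2)\cos 2(\theta - \theta_a)\big]$, which the text itself does not specify; the function names and the example defocus values are hypothetical. Ring intensity in the power spectrum follows $\mathrm{CTF}^2$, so the dark rings sit at the CTF zeros.

```python
import math


def astigmatic_defocus(theta, df_major, df_minor, theta_ast):
    """Direction-dependent defocus in Å; angles in radians (assumed parameterization)."""
    return 0.5 * (df_major + df_minor
                  + (df_major - df_minor) * math.cos(2.0 * (theta - theta_ast)))


def ctf_2d(kx, ky, df_major, df_minor, theta_ast,
           w=0.07, cs=2.7e7, wavelength=0.0197):
    """CTF at the 2D frequency (kx, ky), using the section's chi and CTF formulas."""
    k = math.hypot(kx, ky)
    theta = math.atan2(ky, kx)
    df = astigmatic_defocus(theta, df_major, df_minor, theta_ast)
    x = math.pi * wavelength * df * k**2 - 0.5 * math.pi * cs * wavelength**3 * k**4
    return -(math.sqrt(1.0 - w**2) * math.sin(x) + w * math.cos(x))


def first_ring_radius(theta, df_major, df_minor, theta_ast, wavelength=0.0197):
    """Defocus-only estimate of the first zero along direction theta (1/Å)."""
    df = astigmatic_defocus(theta, df_major, df_minor, theta_ast)
    return math.sqrt(1.0 / (wavelength * df))


# Hypothetical micrograph: 1.1 µm and 0.9 µm defocus along the two principal axes.
df1, df2, angle = 1.1e4, 0.9e4, math.radians(30.0)
along_major = first_ring_radius(angle, df1, df2, angle)
along_minor = first_ring_radius(angle + math.pi / 2, df1, df2, angle)
print(f"first ring: {along_major:.4f} 1/Å along the major axis, "
      f"{along_minor:.4f} 1/Å along the minor axis")

# CTF^2 at the estimated ring position is small, i.e. close to a dark ring.
kx, ky = along_major * math.cos(angle), along_major * math.sin(angle)
print(f"CTF^2 at that point: {ctf_2d(kx, ky, df1, df2, angle) ** 2:.4f}")
```

The first ring sits at a smaller radius along the high-defocus axis than along the low-defocus axis, so the ring is an ellipse whose axes encode the two defocus values and whose orientation encodes the astigmatism angle. Fitting real power spectra involves more than this (background subtraction, envelope, noise), which the sketch leaves out.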

The CTF and the missing wedge are the two central degradations in cryo-ET imaging: the CTF discounts along frequency, the missing wedge carves out along angle. Together with sampling and Nyquist, the CTF decides which frequencies in an image are trustworthy: Nyquist sets the ceiling, and the CTF weights frequency by frequency below it. The CTF’s per-frequency falloff is also a major reason image signal-to-noise drops with frequency. CTF correction before reconstruction is a form of filtering: it applies a per-frequency weight in the frequency domain. Methods like CryoGEN do not treat correction as an isolated preprocessing step; they write the CTF (frequency-domain degradation) and the missing wedge (angle-domain degradation) into one imaging model, so restoration and reconstruction confront both degradations together.
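As one concrete instance of "a per-frequency weight in the frequency domain", the sketch below applies phase flipping, a simple and widely used correction that multiplies each Fourier coefficient by the sign of the CTF. It is an illustration with assumed parameters (2 Å pixels, 1 µm defocus, a random test image) and invented function names, not the procedure of any particular pipeline and not how CryoGEN works. It restores the sign of each frequency but not its amplitude, and it ignores astigmatism.

```python
import numpy as np


def ctf_grid(shape, pixel_size, defocus, w=0.07, cs=2.7e7, wavelength=0.0197):
    """Radially symmetric CTF sampled on the FFT frequency grid of an image.

    pixel_size, defocus and cs are in Å; frequencies are in 1/Å.
    """
    ky = np.fft.fftfreq(shape[0], d=pixel_size)[:, None]
    kx = np.fft.fftfreq(shape[1], d=pixel_size)[None, :]
    k2 = kx**2 + ky**2
    chi = np.pi * wavelength * defocus * k2 - 0.5 * np.pi * cs * wavelength**3 * k2**2
    return -(np.sqrt(1.0 - w**2) * np.sin(chi) + w * np.cos(chi))


def phase_flip(image, ctf):
    """Multiply each Fourier coefficient by sign(CTF) and transform back."""
    return np.fft.ifft2(np.fft.fft2(image) * np.sign(ctf)).real


def correlation(a, b):
    """Pearson correlation between two images."""
    a = a - a.mean()
    b = b - b.mean()
    return float((a * b).sum() / np.sqrt((a**2).sum() * (b**2).sum()))


# Synthetic check: degrade a random "specimen" with the CTF, then phase-flip it.
rng = np.random.default_rng(0)
specimen = rng.standard_normal((128, 128))
ctf = ctf_grid(specimen.shape, pixel_size=2.0, defocus=1.0e4)   # assumed values
observed = np.fft.ifft2(np.fft.fft2(specimen) * ctf).real
corrected = phase_flip(observed, ctf)

print(f"correlation with specimen before: {correlation(observed, specimen):+.3f}")
print(f"correlation with specimen after:  {correlation(corrected, specimen):+.3f}")
```

After flipping, the effective transfer is $|\mathrm{CTF}|$, which is non-negative everywhere, so all frequencies agree on which way is dark and the corrected image correlates more strongly with the specimen. Frequencies near the zeros are still lost, which is the motivation for combining images at different defocus or for modelling the CTF inside the reconstruction.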
