Convolution & the point-spread function

Every linear shift-invariant imaging system blurs by convolving the object with a point-spread function, which becomes a transfer function in frequency space.

Convolution is the operation that describes how a linear, shift-invariant system transforms an input. For a one-dimensional signal $f$ and a kernel $h$,

$$(f * h)(x) = \int_{-\infty}^{\infty} f(x')\,h(x - x')\,dx',$$

with the obvious extension to two and three dimensions. If a system is linear (outputs add) and shift-invariant (the response does not depend on position), then its entire behavior is captured by its response to a single point source. That response is the point-spread function (PSF), and the output of the system is the object convolved with the PSF.

Read the integral term by term: $x$ is the output location being computed; $x'$ is the integration variable that sweeps over the whole object; $f(x')$ is the object’s intensity at $x'$; and $h(x-x')$ is the weight with which that point contributes to location $x$. That weight depends only on the difference $x-x'$, which is exactly what “shift-invariant” means. The minus sign flips $h$ before summing; that flip is what distinguishes convolution from correlation. It makes no visible difference for a symmetric PSF, but it cannot be ignored in general.
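The term-by-term reading above translates almost directly into a discrete sum. The following is a minimal sketch in plain Python, not a production routine: the function names `convolve` and `correlate` and the small test kernels are illustrative choices, and `correlate` is defined here only up to an index offset. It shows that a point source is replaced by a copy of the kernel under convolution and by a mirrored copy under correlation, and that the two agree when the kernel is symmetric.

```python
def convolve(f, h):
    """Full discrete convolution: out[x] = sum over x' of f[x'] * h[x - x']."""
    out = [0.0] * (len(f) + len(h) - 1)
    for xp, f_val in enumerate(f):        # x' sweeps over the whole object
        for j, h_val in enumerate(h):     # j = x - x' is the offset into the kernel
            out[xp + j] += f_val * h_val  # the point at x' contributes to x = x' + j
    return out


def correlate(f, h):
    """The same sum without the flip, i.e. convolution with the reversed kernel
    (up to an index offset, which does not matter for this comparison)."""
    return convolve(f, h[::-1])


# A single point source, one asymmetric kernel and one symmetric kernel
# (all values are made up for illustration).
point = [0.0, 0.0, 1.0, 0.0, 0.0]
asym_kernel = [3.0, 2.0, 1.0]
sym_kernel = [1.0, 2.0, 1.0]

conv_asym = convolve(point, asym_kernel)   # kernel stamped as-is at the point
corr_asym = correlate(point, asym_kernel)  # kernel stamped mirror-reversed
conv_sym = convolve(point, sym_kernel)
corr_sym = correlate(point, sym_kernel)

print("asymmetric kernel:", conv_asym, "vs", corr_asym)
print("symmetric kernel: ", conv_sym, "vs", corr_sym)
```

The double loop is deliberately naive so that each index maps onto a symbol in the integral; a real pipeline would normally use an FFT-based routine instead.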

Intuition

Picture the object as a night sky of glowing points, and the instrument as something that cannot image any point as a true point: each one is drawn as a small blob of fixed shape, and that blob is the PSF. The whole image is every point’s blob, scaled by its brightness, laid down on top of one another. The object is not erased; it is stamped over, again and again, by the same shape, although detail finer than the blob can be washed out in the process. Restoring the image means asking the reverse question: which true sky of points, stamped with this blob, would produce exactly the image I see?

Figure: input (two spikes), Gaussian PSF kernel, and convolved output (blurred), shown at PSF width σ = 6.0 px. Each spike is replaced by a scaled copy of the PSF and the copies are summed. A wider σ blurs more, eventually merging the two peaks. By the convolution theorem, real-space convolution equals multiplication in frequency space, so a broad PSF attenuates high frequencies and detail is lost.

The PSF therefore plays the role of an idealized blur stamp: each point of the object is replaced by a scaled copy of the PSF, and the copies are summed. A narrow PSF preserves fine detail; a broad one smears it. Imaging in cryo-ET is, to a good approximation, such a system, so the recorded image is the projected potential convolved with the instrument’s PSF. The microscope’s stamp is not a benign low-pass blur, however: its contrast transfer function oscillates and flips sign with frequency, inverting contrast outright over some frequency bands, so its PSF carries ringing positive-and-negative lobes rather than the monotone spread of a Gaussian bell.

A common analytic model is a Gaussian PSF whose width is set by a standard deviation $\sigma$; a larger $\sigma$ merges two nearby sharp features into a single indistinguishable blob. Whether two point sources at a fixed separation remain resolvable after convolution is a direct expression of resolution.

Putting numbers on this makes it concrete. Take the PSF to be a Gaussian $h(x)=\exp(-x^2/2\sigma^2)$ and two equal-strength point sources separated by $d$. After convolution the image is a sum of two Gaussians whose centers are $d$ apart. When $d$ is much larger than $\sigma$ there is a clear dip between the peaks and the eye separates them; as $d$ shrinks toward about $2\sigma$ the central dip fills in, the peaks merge into one bump, and you can no longer tell it was “two.” So $\sigma$ reads directly as the scale of the resolvable distance: with $\sigma$ at $2$ pixels, features closer than roughly $4$ pixels blur together. In frequency space this is an equally Gaussian transfer function, $H(k)=\exp(-2\pi^2\sigma^2 k^2)$, once $h$ is normalized to unit area (divided by $\sigma\sqrt{2\pi}$) and $k$ is measured in cycles per unit length. The wider the real-space blur (larger $\sigma$), the narrower the frequency response and the more high frequencies are suppressed, which is precisely where the lost detail goes.
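The merging of the two peaks can be checked numerically. The sketch below is illustrative only: the helper names `two_source_profile` and `dip_depth`, the sampling density, and the chosen separations are assumptions made for this example. It evaluates the blurred two-source profile on a fine grid and reports how deep the central dip is relative to the highest peak. For two equal Gaussians the dip exists only when $d > 2\sigma$, so the reported depth should be zero at and below $d = 2\sigma$ and grow as the sources move apart.

```python
import math


def two_source_profile(x, d, sigma):
    """Image of two equal point sources at -d/2 and +d/2 after Gaussian blurring."""
    a = d / 2
    return (math.exp(-((x - a) ** 2) / (2 * sigma**2))
            + math.exp(-((x + a) ** 2) / (2 * sigma**2)))


def dip_depth(d, sigma, samples_per_sigma=100):
    """Relative depth of the central dip; 0 means the peaks have merged into one bump."""
    step = sigma / samples_per_sigma
    half_width = int((d / 2 + 3 * sigma) / step)
    xs = [i * step for i in range(-half_width, half_width + 1)]  # grid includes x = 0
    peak = max(two_source_profile(x, d, sigma) for x in xs)
    midpoint = two_source_profile(0.0, d, sigma)
    return 1.0 - midpoint / peak


sigma = 2.0  # PSF width in pixels, matching the example in the text
depths = {d: dip_depth(d, sigma) for d in (3.0, 4.0, 6.0, 12.0)}

for d, depth in depths.items():
    print(f"d = {d:4.1f} px ({d / sigma:.1f} sigma): dip depth = {depth:.3f}")
```

A dip depth just above zero is a mathematical statement, not a practical one: with noise in the image, the separation needed to tell two sources apart with confidence is larger than $2\sigma$.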

Depth

By the convolution theorem, convolution in real space is multiplication in frequency space:

$$\mathcal{F}\{f * h\}(k) = F(k)\,H(k).$$

Here $F(k)$ and $H(k)$ are the Fourier transforms of $f$ and $h$, and $k$ is spatial frequency. The transform $H(k)$ of the PSF is the system’s transfer function, which states how strongly each spatial frequency is passed. One useful special case is $k=0$: $H(0)=\int h(x)\,dx$ is the total “mass” of the PSF, and if it equals $1$ the convolution leaves overall brightness unchanged and only redistributes detail. For the microscope this transfer function is the contrast transfer function; its oscillations and zeros are a per-frequency weighting that is hard to read off from the real-space PSF directly. Working in frequency space also makes deconvolution conceptually simple: writing $Y(k)=F(k)\,H(k)$ for the transform of the recorded image, divide by $H(k)$ to get $F(k)=Y(k)/H(k)$. Zeros of $H$ make exact inversion impossible, though. Where $H(k)=0$ the quotient becomes $0/0$: that frequency’s information was discarded outright by the instrument and cannot be recovered. Where $H(k)$ is merely small, the division amplifies whatever noise is in that band, so naive inversion is impractical.
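The convolution theorem and the role of $H(0)$ can be verified on a small discrete example. This is a sketch under stated assumptions rather than a reference implementation: it uses a naive $O(N^2)$ discrete Fourier transform written out by hand, a periodic (circular) convolution, which is the form for which the discrete theorem holds exactly, and made-up values for the grid size, $\sigma$, and source positions. It also compares the discrete transfer function of a unit-sum Gaussian PSF against the closed form $\exp(-2\pi^2\sigma^2 k^2)$ with $k = m/N$ cycles per pixel.

```python
import cmath
import math


def dft(signal):
    """Naive O(N^2) discrete Fourier transform (adequate for a short demo)."""
    n = len(signal)
    return [
        sum(signal[x] * cmath.exp(-2j * math.pi * k * x / n) for x in range(n))
        for k in range(n)
    ]


def circular_convolve(f, h):
    """Periodic convolution: out[x] = sum over x' of f[x'] * h[(x - x') mod N]."""
    n = len(f)
    return [sum(f[xp] * h[(x - xp) % n] for xp in range(n)) for x in range(n)]


n = 32
sigma = 2.0  # PSF width in pixels (illustrative value)

# Gaussian PSF centred on index 0 with wrap-around, normalised to unit sum.
raw = [math.exp(-min(x, n - x) ** 2 / (2 * sigma**2)) for x in range(n)]
total = sum(raw)
psf = [v / total for v in raw]

# Object: two point sources of different brightness.
obj = [0.0] * n
obj[10] = 1.0
obj[20] = 0.5

image = circular_convolve(obj, psf)
F, H, Y = dft(obj), dft(psf), dft(image)

# Convolution theorem: Y(k) should equal F(k) * H(k) at every frequency.
max_error = max(abs(Y[k] - F[k] * H[k]) for k in range(n))

# H(0) is the total mass of the PSF; unit mass preserves total brightness.
h_zero = H[0]
brightness_in, brightness_out = sum(obj), sum(image)

# Discrete transfer function versus the closed-form Gaussian, up to Nyquist.
gauss_error = max(
    abs(H[m] - math.exp(-2 * math.pi**2 * sigma**2 * (m / n) ** 2))
    for m in range(n // 2 + 1)
)

print("max |Y - F*H|:", max_error)
print("H(0):", h_zero, " brightness in/out:", brightness_in, brightness_out)
print("max deviation from Gaussian H(k):", gauss_error)
```

All three printed discrepancies should be at the level of floating-point round-off or, for the Gaussian comparison, the small aliasing error of a sampled PSF.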

Two consequences follow. Restoring an image means undoing a convolution, a deconvolution problem that is ill-posed wherever the transfer function is small or zero. And because convolution is multiplication in frequency space, applying a corrective weight there — sharpening, smoothing, or CTF correction — is itself a convolution back in real space. This unifies blurring, the instrument response, and the filtering operations used throughout reconstruction under a single linear-systems framework.
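The ill-posedness can be made visible with a toy experiment. The sketch below assumes a narrow Gaussian PSF (so that $H(k)$ never reaches zero), additive Gaussian white noise with a fixed seed, and a hand-written naive DFT; none of these choices come from a real microscope model. It deconvolves by plain division by $H(k)$, once without noise and once with a small amount of noise, and compares the resulting errors.

```python
import cmath
import math
import random


def dft(signal, inverse=False):
    """Naive O(N^2) DFT; inverse=True applies the 1/N-normalised inverse transform."""
    n = len(signal)
    sign = 1 if inverse else -1
    out = [
        sum(signal[x] * cmath.exp(sign * 2j * math.pi * k * x / n) for x in range(n))
        for k in range(n)
    ]
    return [v / n for v in out] if inverse else out


def rms(values):
    """Root-mean-square magnitude of a sequence."""
    return math.sqrt(sum(abs(v) ** 2 for v in values) / len(values))


n = 64
sigma = 1.0         # narrow PSF so H(k) stays nonzero (illustrative value)
noise_level = 1e-3  # standard deviation of the added noise (illustrative value)

# Unit-sum Gaussian PSF centred on index 0 with wrap-around.
raw = [math.exp(-min(x, n - x) ** 2 / (2 * sigma**2)) for x in range(n)]
total = sum(raw)
psf = [v / total for v in raw]

# Object: two point sources.
obj = [0.0] * n
obj[20] = 1.0
obj[28] = 0.7

F, H = dft(obj), dft(psf)
Y_clean = [F[k] * H[k] for k in range(n)]  # forward model in frequency space

rng = random.Random(0)  # fixed seed so the run is repeatable
noise = [rng.gauss(0.0, noise_level) for _ in range(n)]
noise_ft = dft(noise)
Y_noisy = [Y_clean[k] + noise_ft[k] for k in range(n)]


def naive_deconvolve(Y):
    """Divide by H(k) at every frequency and transform back to real space."""
    return [v.real for v in dft([Y[k] / H[k] for k in range(n)], inverse=True)]


rec_clean = naive_deconvolve(Y_clean)
rec_noisy = naive_deconvolve(Y_noisy)

err_clean = rms([r - o for r, o in zip(rec_clean, obj)])
err_noisy = rms([r - o for r, o in zip(rec_noisy, obj)])
amplification = err_noisy / rms(noise)  # output error relative to input noise
smallest_gain = min(abs(h) for h in H)

print("smallest |H(k)|:", smallest_gain)
print("error without noise:", err_clean)
print("error with noise:   ", err_noisy)
print("noise amplification:", amplification)
```

Without noise the object comes back essentially exactly, while with noise the error is many times the noise that was added, dominated by the frequencies where $|H(k)|$ is smallest. A broader PSF, or a transfer function with true zeros, makes this worse still.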

This logic runs through the whole cryo-ET reconstruction pipeline. Forward imaging is modeled as “object $*$ PSF,” so reconstruction is deconvolution. Because that problem is ill-posed near the CTF zeros and sensitive to noise, dividing a single frame by $H(k)$ is almost unusable. In practice one either merges images taken at several defoci so their zeros fall at different frequencies, or embeds the PSF as a known forward operator inside an iterative or generative solver and lets a prior fill in the bands the instrument lost. Understanding convolution and the PSF is understanding what reconstruction is trying to undo, and why a prior is required to undo it at all.
