Data-processing overview: from raw frames to 3D structure

An end-to-end roadmap — from a pile of raw movie frames to a 3-D structure you can view and segment, with what each step uses and what it produces

Intuition

What you start with: a batch of tilt series. Each series is a set of images of the same specimen in the same patch of ice, taken across a range of tilt angles — and at each angle what you record is not a single picture but a movie, dozens of low-dose frames. What you want at the end: a denoised, missing-wedge-corrected 3-D volume you can drag straight into ChimeraX, see clearly, and trace membranes, organelles, and macromolecules out of.

This page is the roadmap connecting those two ends. In between runs a fixed pipeline: raw movie frames → motion correction → CTF and dose handling → tilt-series alignment → back-projection reconstruction → even/odd splitting → self-supervised training → visualization and segmentation. Below, each step is broken out: what goes in, what comes out, which tool does it, and why it exists, with a link to the article that develops it.

What the pipeline looks like

Strung together in one line:

tilt series (a movie at each tilt) → motion correction → CTF / dose / deconvolution → tilt-series alignment → weighted back-projection reconstruction → even/odd splitting → self-supervised training (fill the missing wedge + denoise) → ChimeraX / IMOD visualization and segmentation.

Each step takes the previous step’s product as input and emits an intermediate that is one notch closer to usable. Here is each item in turn.

1 · Acquisition: get the tilt series

In: a specimen in the microscope (a thin layer of vitrified ice with your molecules or cells embedded in it).
Out: a batch of tilt series, each a set of exposures of the same field at several tilt angles (typically around ±60°, one every few degrees), each exposure a movie.
Tool / source: the microscope’s automated acquisition software (e.g. SerialEM). For the principle, see tilt series.
Why: any single view gives only one flattened 2-D projection, and because the beam damages the specimen, each exposure has to stay low-dose. Recording several tilt angles is what makes it possible to infer the 3-D structure later, which is exactly what the reconstruction step does. The tilt cannot reach a full ±90°, and the angles you miss are the root of the missing wedge.
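To put rough numbers on the tilt scheme and the missing wedge, here is a minimal sketch. It assumes a single-axis tilt series with evenly spaced angles and ignores the gaps between neighbouring tilts; the function names are illustrative and do not come from any acquisition package.

```python
import numpy as np


def tilt_angles(max_tilt_deg=60.0, step_deg=3.0):
    """Nominal tilt angles from -max_tilt to +max_tilt, inclusive."""
    return np.arange(-max_tilt_deg, max_tilt_deg + step_deg / 2, step_deg)


def missing_wedge_fraction(max_tilt_deg=60.0):
    """Fraction of Fourier space left unsampled by a single-axis tilt series.

    Each projection fills one central slice, so a +/-max_tilt range leaves
    a wedge of (180 - 2 * max_tilt) degrees out of every 180 uncovered.
    """
    return 1.0 - max_tilt_deg / 90.0


angles = tilt_angles(60.0, 3.0)
print(len(angles), "tilts from", angles[0], "to", angles[-1])
print("missing fraction:", round(missing_wedge_fraction(60.0), 3))
```

Under these assumptions a ±60° series in 3° steps has 41 tilts and leaves about a third of Fourier space unsampled.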

2 · Motion correction: align and sum each movie

In: the movie at each tilt angle (a stack of frames).
Out: each movie → one sharp, aligned image (a micrograph).
Tool: MotionCor2, IMOD’s alignframes, or the motion correction built into RELION.
Why: the specimen moves during the exposure (beam-induced motion plus stage drift). Summing the frames directly bakes that motion into the image permanently. Recording a movie instead confines the motion to shifts between frames, which can be estimated and undone by aligning the frames before summing them. See movie frames and motion correction.
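The "estimate the shifts, align, then sum" idea can be sketched in a few lines of NumPy. This is a toy under strong assumptions (whole-frame, integer-pixel shifts found by FFT cross-correlation against the first frame); it is not how MotionCor2, alignframes, or RELION implement it, since those also handle sub-pixel and local motion.

```python
import numpy as np


def estimate_shift(reference, frame):
    """Integer (dy, dx) to apply to `frame` with np.roll so it matches `reference`."""
    cross_power = np.fft.fft2(reference) * np.conj(np.fft.fft2(frame))
    correlation = np.fft.ifft2(cross_power).real
    peak = np.unravel_index(np.argmax(correlation), correlation.shape)
    # Peaks past the midpoint correspond to negative shifts (FFT wrap-around).
    return tuple(int(p) if p <= n // 2 else int(p) - n
                 for p, n in zip(peak, correlation.shape))


def align_and_sum(frames):
    """Shift every frame onto the first one, then sum."""
    reference = frames[0]
    total = np.zeros_like(reference, dtype=float)
    for frame in frames:
        dy, dx = estimate_shift(reference, frame)
        total += np.roll(frame, (dy, dx), axis=(0, 1))
    return total


# Synthetic movie: one random "specimen" drifting by known amounts, plus noise.
rng = np.random.default_rng(0)
truth = rng.normal(size=(64, 64))
drifts = [(0, 0), (1, -2), (3, 1), (-2, 4)]
frames = np.stack([np.roll(truth, d, axis=(0, 1)) + 0.1 * rng.normal(size=truth.shape)
                   for d in drifts])

recovered = [estimate_shift(frames[0], f) for f in frames]
aligned = align_and_sum(frames)
naive = frames.sum(axis=0)
```

In this toy the recovered shifts are the negatives of the simulated drifts, and the aligned sum matches the underlying image far better than the naive sum does.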

3 · CTF estimation / correction, dose weighting, deconvolution

In: the motion-corrected images.
Out: images with phase flips corrected and weighting re-applied by dose and frequency (sometimes a deconvolution too, to make the contrast easier to read by eye).
Tool: CTFFIND4 (estimate the CTF), Warp / RELION / IMOD (correct and weight).
Why: the microscope’s contrast transfer function (CTF) attenuates the signal, and at some spatial frequencies even flips its sign, as a function of spatial frequency; left uncorrected, the fine detail is simply wrong. Dose weighting then down-weights the high frequencies in late frames, where radiation damage has already destroyed the signal. For the principle see the CTF, for the how-to see CTF and dose handling.
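As a hedged illustration of what "flips its phase" means, the sketch below evaluates a textbook 1-D CTF and applies phase flipping. Sign and defocus conventions differ between packages; the convention here (positive defocus means underfocus), the default parameter values, and the function names are assumptions for illustration, not the behaviour of CTFFIND4 or any other tool.

```python
import numpy as np


def electron_wavelength_angstrom(voltage_volts):
    """Relativistic electron wavelength in angstroms."""
    return 12.2643 / np.sqrt(voltage_volts + 0.97845e-6 * voltage_volts**2)


def ctf_1d(freq, defocus_angstrom, voltage_volts=300e3, cs_mm=2.7, amp_contrast=0.07):
    """CTF versus spatial frequency (1/angstrom); positive defocus = underfocus."""
    lam = electron_wavelength_angstrom(voltage_volts)
    cs = cs_mm * 1e7  # mm -> angstrom
    chi = (np.pi * lam * defocus_angstrom * freq**2
           - 0.5 * np.pi * cs * lam**3 * freq**4)
    return -(np.sqrt(1 - amp_contrast**2) * np.sin(chi) + amp_contrast * np.cos(chi))


def phase_flip(fourier_values, ctf_values):
    """Undo the sign reversals only; amplitudes stay attenuated."""
    return fourier_values * np.sign(ctf_values)


freq = np.linspace(0.0, 0.25, 501)            # out to 4 angstrom resolution
ctf = ctf_1d(freq, defocus_angstrom=30000.0)  # 3 micrometres underfocus
flipped = phase_flip(ctf, ctf)                # what an ideal signal looks like after flipping
```

Phase flipping only restores the sign of each frequency band; the zeros of the CTF still carry no signal, which is one reason other weighting schemes exist.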

4 · Tilt-series alignment

In: all the tilt images of one series.
Out: precise geometry for each image (which tilt angle it really corresponds to, how it is shifted and rotated relative to the rest) — i.e. all projections placed back into one self-consistent common coordinate frame.
Tool: IMOD’s etomo / tilt workflow, using gold fiducials or patch tracking; AreTomo works on fiducial-less data.
Why: the stage does not rotate cleanly about one fixed axis, so every image carries an unknown drift and a small rotation. Without aligning first, back-projection lays the projections down out of register and smears them into mush. See fiducial alignment.

5 · Reconstruction: weighted back-projection to a tomogram

In: the full set of aligned projections plus their geometry.
Out: a 3-D volume — the tomogram, usually an .mrc / .rec file.
Tool: IMOD’s tilt (WBP or SIRT), or AreTomo.
Why: smearing each 2-D projection back along the direction it was shot, and summing over all directions, makes the 3-D object emerge. This is the key leap from 2-D to 3-D. For the principle and the frequency-domain weighting filter (weighted back-projection, WBP, is also known as filtered back-projection, FBP), see from tilt series to tomogram. The volume this step produces still carries the missing wedge left by the sampling geometry.
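The smear-and-sum idea can be tried on a single 2-D slice perpendicular to the tilt axis. The following is a minimal sketch assuming parallel-beam geometry, nearest-neighbour binning, and a plain ramp weighting; the helper names are made up for this example and the code is not a stand-in for IMOD's tilt or AreTomo.

```python
import numpy as np


def detector_bins(size, angle_deg):
    """Detector bin that each pixel of a size x size x-z slice projects onto."""
    centre = (size - 1) / 2
    z, x = np.indices((size, size)) - centre
    theta = np.deg2rad(angle_deg)
    s = x * np.cos(theta) + z * np.sin(theta)  # coordinate along the detector
    return np.rint(s + size).astype(int)       # offset keeps bins in [0, 2 * size)


def project(slice_xz, angle_deg):
    """Sum the slice along the beam direction for one tilt angle."""
    size = slice_xz.shape[0]
    bins = detector_bins(size, angle_deg)
    return np.bincount(bins.ravel(), weights=slice_xz.ravel(), minlength=2 * size)


def ramp_filter(projection):
    """Weight a projection by |frequency| (the weighting in WBP)."""
    freqs = np.fft.rfftfreq(projection.size)
    return np.fft.irfft(np.fft.rfft(projection) * freqs, n=projection.size)


def back_project(projections, angles_deg, size):
    """Smear each projection back along its rays and average over angles."""
    recon = np.zeros((size, size))
    for proj, angle in zip(projections, angles_deg):
        recon += proj[detector_bins(size, angle)]
    return recon / len(angles_deg)


size = 65
phantom = np.zeros((size, size))
phantom[40, 25] = 1.0                              # a single point object
angles = np.arange(-60, 61, 3)                     # limited tilt range
projections = [ramp_filter(project(phantom, a)) for a in angles]
recon = back_project(projections, angles, size)
```

The reconstructed point reappears at its original position, but plotting `recon` shows streaks along the unsampled directions: a small-scale picture of the missing wedge.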

6 · Even/odd splitting: make two noise-independent copies

In: the raw data (split by frame, or by tilt angle).
Out: two half-maps of the same acquisition — same signal, independent noise.
Tool: set the split in the motion-correction / reconstruction scripts (IMOD, Warp, and cryoCARE’s preprocessing all support it).
Why: only with a “same-signal, different-noise” pair can you measure resolution objectively with FSC, and only then can you train a denoiser with no clean ground truth. The rule is to split early and keep the two processing chains from ever touching. See even/odd splitting.
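A small sketch of both ideas in this step: splitting frames into even and odd halves, then correlating the halves frequency by frequency. It works on 2-D images, so the curve is strictly a Fourier ring correlation, the 2-D analogue of FSC. The synthetic data, ring count, and function names are assumptions for illustration only.

```python
import numpy as np


def split_even_odd(frames):
    """Sum even- and odd-indexed frames into two half-images."""
    return frames[0::2].sum(axis=0), frames[1::2].sum(axis=0)


def fourier_ring_correlation(half_a, half_b, n_rings=16):
    """Correlation between two half-images in rings of spatial frequency."""
    fa, fb = np.fft.fft2(half_a), np.fft.fft2(half_b)
    fy = np.fft.fftfreq(half_a.shape[0])[:, None]
    fx = np.fft.fftfreq(half_a.shape[1])[None, :]
    radius = np.hypot(fy, fx)
    # Ring index up to Nyquist (0.5 cycles/pixel); corners go to a spare bin.
    ring = np.minimum((radius / 0.5 * n_rings).astype(int), n_rings)

    def ring_sum(values):
        sums = np.bincount(ring.ravel(), weights=values.ravel(), minlength=n_rings + 1)
        return sums[:n_rings]

    numerator = ring_sum((fa * np.conj(fb)).real)
    denominator = np.sqrt(ring_sum(np.abs(fa) ** 2) * ring_sum(np.abs(fb) ** 2))
    return numerator / denominator


# Synthetic movie: a band-limited "signal" plus independent noise in every frame.
rng = np.random.default_rng(0)
size = 64
freq = np.hypot(np.fft.fftfreq(size)[:, None], np.fft.fftfreq(size)[None, :])
signal = np.fft.ifft2(np.fft.fft2(rng.normal(size=(size, size))) * (freq < 0.1)).real
frames = signal + 0.5 * rng.normal(size=(20, size, size))

even, odd = split_even_odd(frames)
frc = fourier_ring_correlation(even, odd)
```

Because the two halves share signal but not noise, the correlation is high where signal dominates (low frequencies here) and falls toward zero where only noise remains. If the halves were allowed to share noise, the curve would stay artificially high.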

7 · Self-supervised training: fill the missing wedge, denoise

In: the noise-independent half-maps (or half-volumes) from step 6.
Out: a denoised, missing-wedge-filled 3-D volume — cleaner than the raw reconstruction and closer to isotropic.
Tool: cryoCARE, IsoNet, DeepDeWedge, and this site’s research methods CryoGEN / CryoWGEN.
Why: the missing wedge and noise are the two worst ailments of a raw tomogram, and acquisition alone cannot fix them. These methods let a network learn a structural prior from “same-signal, different-noise” pairs, then use it to fill in the missing angular wedge and suppress the noise. See run the training.
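Why a noisy target is enough for the denoising half of this step can be shown with a deliberately tiny model. The sketch below replaces the network with a single scale factor and fits it twice: once against the other noisy half, once against the clean signal. It illustrates the Noise2Noise-style argument only. It does not fill a missing wedge and is not the training procedure of cryoCARE, IsoNet, DeepDeWedge, or CryoGEN / CryoWGEN.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000
signal = rng.normal(size=n)            # unknown clean values
even = signal + rng.normal(size=n)     # half 1: signal + noise
odd = signal + rng.normal(size=n)      # half 2: same signal, independent noise


def best_scale(inputs, targets):
    """Least-squares scale a minimising ||a * inputs - targets||^2."""
    return float(inputs @ targets / (inputs @ inputs))


a_noisy_target = best_scale(even, odd)      # trained without ground truth
a_clean_target = best_scale(even, signal)   # trained with ground truth
print(round(a_noisy_target, 2), round(a_clean_target, 2))
```

Both fits converge on roughly the same value, because the independent noise in the target averages out of the loss. If the two halves shared noise, the first fit would learn to reproduce it instead, which is why the two processing chains must stay separate.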

8 · Visualization and segmentation: ChimeraX and IMOD

In: the 3-D volume from step 5 or step 7 (.mrc / .rec).
Out: a 3-D structure you can see, rotate, and turn into a figure, plus segmented membranes / organelles / particles.
Tool: UCSF ChimeraX (render, make figures), IMOD’s 3dmod (browse, draw models, segment).
Why: the volume itself is just a grid of density numbers you can’t see by eye; only with the right isosurface threshold and viewpoint does the structure read out. Downstream this connects to subtomogram averaging and segmentation to raise SNR and localize structure. For the step-by-step tutorial see IMOD and ChimeraX.

Intuition

The concepts live in the electron-tomography base: what the missing wedge is, why back-projection smears, how the CTF works. That base explains the principles; this one is the hands-on how-to that tells you which tool to use, what to type, and what file comes out. The two mirror each other: if you can’t see why a step exists, go back to the concept base; if you want to actually run it, stay here.

Which steps run once, and which run per tomogram

Telling these apart saves a lot of effort:

Once per movie: motion correction (step 2). It runs once for every tilt of every series, but always with the same settings.
Calibrate once per dataset: the CTF and dose-weighting parameters, the pixel size, and the splitting strategy (the settings in steps 3 and 6) are usually fixed once and applied to the whole batch.
Once per tilt series: alignment and reconstruction (steps 4–5) — one series produces one tomogram.
Once per tomogram (or per batch): even/odd splitting and training (steps 6–7). A single denoising / wedge-filling model can be shared across many tomograms, then applied to each.
Any time, on demand: visualization and segmentation (step 8) — open whichever volume you want to look at.
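To see how these granularities multiply out for a dataset, here is a small bookkeeping sketch. The function and key names are hypothetical, and it assumes one shared model by default, as the list above allows.

```python
def job_counts(n_series, tilts_per_series, shared_model=True):
    """Rough number of runs per stage for a dataset of tilt series."""
    return {
        "motion_correction": n_series * tilts_per_series,  # once per movie
        "alignment_and_reconstruction": n_series,          # once per tilt series
        "training": 1 if shared_model else n_series,       # shared or per tomogram
        "apply_model": n_series,                           # once per tomogram
    }


counts = job_counts(n_series=10, tilts_per_series=41)
for stage, n_runs in counts.items():
    print(f"{stage}: {n_runs}")
```

For ten series of 41 tilts each, motion correction dominates the run count, which is why it is worth fixing its settings once and batching it.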

Tip

Steps 6–7 (even/odd splitting + self-supervised training) are exactly where this site’s research methods land. The first five steps are the standard preprocessing every cryo-ET pipeline shares; what CryoGEN and CryoWGEN replace or strengthen is the “how to fill the missing wedge, how to denoise” of step 7 — and what they feed on is precisely the noise-independent pair produced in step 6.

This is the roadmap that starts the pipeline. Next: movie frames and motion correction.
