Running training: self-supervised missing-wedge restoration
With public tools, hand a noisy, missing-wedge tomogram to a network that needs no ground truth — prepare, train, predict, step by step
By now you have a 3D volume (.mrc): the reconstructed tomogram. It has two chronic problems. It is noisy: the dose is low, and shot noise buries the weak signal. And it has a missing wedge: the stage can only tilt to about ±60°, so a wedge-shaped region of Fourier space is empty and structures get stretched and smeared along z (see the missing wedge). This page teaches you to run a self-supervised network that addresses both: the input is your noisy, missing-wedge volume, and the output is a denoised, more isotropic corrected volume. The key word is “self-supervised”: there is no clean ground truth anywhere, and the network learns only from the dirty data you already have.
The goal, stated plainly: map a “noisy + missing-wedge observation” to a “corrected volume closer to the true structure.” With no clean reference to learn from, training relies on two constraints that the data itself carries: rotation / missing-wedge consistency, and the independent-noise pairing from the even/odd split. This page first introduces the tools, then walks through the common “prepare → train → predict” workflow.
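To make the empty region of Fourier space concrete, here is a minimal NumPy sketch that builds a missing-wedge mask for a single-axis tilt series. It assumes the tilt axis is y, the volume axes are ordered (z, y, x), and the tilt range is symmetric; the function names are illustrative and do not come from any of the tools on this page.

```python
import numpy as np

def wedge_mask(n, max_tilt_deg=60.0):
    """Boolean (n, n, n) array over Fourier space, axes (kz, ky, kx).

    True marks frequencies covered by a tilt series about the y axis
    spanning -max_tilt_deg..+max_tilt_deg; False marks the missing wedge.
    """
    k = np.fft.fftfreq(n)  # cycles per voxel, in FFT ordering
    kz, _, kx = np.meshgrid(k, k, k, indexing="ij")
    # A projection at tilt t fills the central plane tilted by t from kz = 0,
    # so the covered region is |kz| <= |kx| * tan(max tilt).
    return np.abs(kz) <= np.abs(kx) * np.tan(np.deg2rad(max_tilt_deg))

def apply_wedge(volume, mask):
    """Zero the Fourier coefficients outside the mask and return a real volume."""
    return np.fft.ifftn(np.fft.fftn(volume) * mask).real

mask = wedge_mask(32)
missing_fraction = 1.0 - mask.mean()
print(f"missing fraction of Fourier voxels: {missing_fraction:.3f}")
```

By angle, a ±60° tilt range leaves 30° out of every 90° unmeasured, that is one third. Counted on a cubic grid the fraction comes out somewhat lower (close to 0.29 here), because the corners of the cube lie mostly in the covered region.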
The public tools you can actually run
These are public, documented, standard tools. Learn what each is best at:
- IsoNet — aimed at the missing wedge. Its idea: the true structure should look statistically similar in every direction (isotropic), while the missing wedge only corrupts some directions. So it rotates the volume, artificially carves out another wedge, and forces the network to recover the “complete other view” from the “rotated, mutilated view.” Through this rotation + missing-wedge consistency it fills the missing slab self-supervised, and denoises along the way.
- cryoCARE — aimed at denoising, the Noise2Noise way. It wants exactly the even/odd half-maps from the previous page: feed one half as input, the other as target. Because the two halves share signal but have independent noise, the network can only learn the shared clean structure (derivation on the even/odd page).
- DeepDeWedge — does both at once: it uses the even/odd independent-noise pairing to denoise, and adds a missing-wedge consistency term to fill the wedge. Effectively cryoCARE’s denoising plus IsoNet’s wedge-filling.
The three prepare data slightly differently (IsoNet from a single volume + rotations, cryoCARE / DeepDeWedge from even/odd pairs), but the “prepare → train → predict” backbone is identical. That backbone is what the rest of this page describes.
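The “rotate, carve out another wedge, recover the complete view” idea can be sketched in a few lines. This is only an illustration of how one such training pair could be built, assuming a cubic subtomogram, a tilt axis along y and a single 90° rotation; it is not IsoNet's code, and IsoNet's actual rotation set and training schedule are not reproduced here.

```python
import numpy as np

def wedge_mask(n, max_tilt_deg=60.0):
    """True where Fourier space (axes kz, ky, kx) is covered by the tilt series."""
    k = np.fft.fftfreq(n)
    kz, _, kx = np.meshgrid(k, k, k, indexing="ij")
    return np.abs(kz) <= np.abs(kx) * np.tan(np.deg2rad(max_tilt_deg))

def make_rotation_pair(cube, mask):
    """Return (network_input, target) for one cubic subtomogram.

    The target is the cube rotated by 90 degrees in the z-x plane, so its
    original wedge now lies along kx. The input is that rotated cube with a
    fresh wedge carved out along kz.
    """
    target = np.rot90(cube, k=1, axes=(0, 2))
    degraded = np.fft.ifftn(np.fft.fftn(target) * mask).real
    return degraded, target

rng = np.random.default_rng(0)
cube = rng.normal(size=(16, 16, 16))  # stand-in for a real subtomogram
network_input, target = make_rotation_pair(cube, wedge_mask(16))
```

The network sees `network_input` and is asked to reproduce `target`, which still contains the frequencies that the artificial wedge removed.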
Ask yourself one question first: do you have the even/odd pair of independent halves? Yes → cryoCARE / DeepDeWedge both work, and give the cleanest denoising. Only a single volume, and the missing wedge is your main worry → start with IsoNet. The steps below overlap almost entirely either way.
Step 1: prepare metadata
What. Give the tool a manifest listing which tomograms to process and, most importantly, the pixel size (Å/pixel). This is usually a star file (the metadata contract from data processing), with one row per volume giving its path, its pixel size and, where needed, its defocus.
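As a rough picture of what such a manifest looks like, the sketch below assembles a STAR-style table as text. The column labels (`_tomoPath`, `_pixelSizeAngstrom`, `_defocusAngstrom`) and the numbers are placeholders invented for illustration; each real tool defines its own labels, so generate the file with the tool's own command rather than with this snippet.

```python
def star_text(rows):
    """Build a minimal STAR-style table: a data block, a loop header, one row per volume.

    The column labels below are hypothetical, not those of any specific tool.
    """
    labels = ["_tomoPath", "_pixelSizeAngstrom", "_defocusAngstrom"]
    lines = ["data_", "", "loop_"]
    lines += [f"{label} #{index}" for index, label in enumerate(labels, start=1)]
    for row in rows:
        lines.append(f"{row['path']} {row['pixel_size']:.3f} {row['defocus']:.1f}")
    return "\n".join(lines) + "\n"

# Placeholder values, for illustration only
rows = [
    {"path": "tomo1.mrc", "pixel_size": 10.0, "defocus": 30000.0},
    {"path": "tomo2.mrc", "pixel_size": 10.0, "defocus": 32000.0},
]
print(star_text(rows))
```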
Why. Pixel size sets the “frequency ↔ resolution” conversion and decides how large a subtomogram is reasonable. Getting this wrong — pixel size especially — silently throws everything off downstream without raising an error. It is the most common pitfall.
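The arithmetic behind “frequency ↔ resolution” is short enough to write down. The sketch below uses the standard relations (Nyquist limit = 2 × pixel size; Fourier shell k of an N-voxel box corresponds to N × pixel size / k); the function names and the example pixel sizes are illustrative assumptions.

```python
def nyquist_resolution(pixel_size_A):
    """Finest resolvable period in Å: two pixels."""
    return 2.0 * pixel_size_A

def shell_resolution(shell_index, box_size, pixel_size_A):
    """Resolution in Å of Fourier shell `shell_index` for a cube of `box_size` voxels."""
    if shell_index <= 0:
        raise ValueError("shell_index must be positive")
    return box_size * pixel_size_A / shell_index

def cube_extent(box_size, pixel_size_A):
    """Physical side length in Å covered by a cube of `box_size` voxels."""
    return box_size * pixel_size_A

# The same 64-voxel cube under a correct and a halved (wrong) pixel size
for pixel_size in (10.0, 5.0):
    print(pixel_size, nyquist_resolution(pixel_size), cube_extent(64, pixel_size))
```

Halving the pixel size by mistake halves every physical length the tool computes, which is why the error propagates quietly: nothing in the file itself looks wrong.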
# Command shape (not real flags): make a star file listing volumes and pixel size
<tool> prepare_star --tomos tomo1.mrc tomo2.mrc --pixel-size <Å/px> --out tomograms.star
Step 2: CTF-deconvolve for contrast
What. Before training, run a CTF deconvolution on the tomogram to lift the low-frequency contrast (IsoNet ships this step; cryoCARE / DeepDeWedge also recommend training on deconvolved volumes). The mechanism is in data processing: deconvolution adds no new information, it only re-weights existing frequencies so that large structures — membranes, organelles — pop out.
Why. With contrast lifted, the network (and your eye) can grab the real structure instead of flailing in a gray haze.
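“Re-weights existing frequencies” can be shown with a generic Wiener-style filter. The sketch below is a toy under stated assumptions: `toy_transfer` is an invented radial curve that merely suppresses low frequencies, not a real CTF model, and the filter is the textbook Wiener form rather than the exact one any of the tools ships.

```python
import numpy as np

def wiener_weights(transfer, snr):
    """Per-frequency weight transfer / (transfer**2 + 1/snr)."""
    return transfer / (transfer**2 + 1.0 / snr)

def reweight(volume, weights):
    """Multiply each Fourier coefficient by its weight and return a real volume."""
    return np.fft.ifftn(np.fft.fftn(volume) * weights).real

def toy_transfer(n, k0=0.1):
    """Hypothetical radial transfer curve: near 0 at low frequency, near 1 at high."""
    k = np.fft.fftfreq(n)
    kz, ky, kx = np.meshgrid(k, k, k, indexing="ij")
    radius = np.sqrt(kx**2 + ky**2 + kz**2)
    return 1.0 - np.exp(-(radius / k0) ** 2)

rng = np.random.default_rng(0)
volume = rng.normal(size=(16, 16, 16))  # stand-in for a tomogram
weights = wiener_weights(toy_transfer(16), snr=10.0)
deconvolved = reweight(volume, weights)
```

Because the operation is a multiplication in Fourier space, any coefficient that was zero going in (for example inside the missing wedge) is still zero coming out: contrast changes, information does not.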
Repeat the iron rule from data processing: the training domain and the inference domain must match. If you train on deconvolved volumes, you may only feed deconvolved volumes at prediction time; the same goes for dose weighting, filtering, and pixel size. Deconvolution changes the spectral distribution of the data, and the model learns the statistics of that distribution — switch to a different preprocessing and you have switched the input to another domain, so results quietly degrade without erroring. So from this step on, record exactly what preprocessing your data chain did, and replicate it precisely at prediction time.
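One lightweight way to keep that record is a small dictionary saved next to the model and compared before prediction. The keys and values below are illustrative assumptions; record whatever your own chain actually does.

```python
def domain_mismatches(train_record, predict_record):
    """Return the preprocessing keys whose values differ between training and prediction."""
    keys = sorted(set(train_record) | set(predict_record))
    return [key for key in keys if train_record.get(key) != predict_record.get(key)]

# Hypothetical records; the field names are not prescribed by any tool
train_record = {"deconvolved": True, "dose_weighted": True, "pixel_size_A": 10.0, "lowpass_A": None}
predict_record = {"deconvolved": False, "dose_weighted": True, "pixel_size_A": 10.0, "lowpass_A": None}

problems = domain_mismatches(train_record, predict_record)
if problems:
    print("preprocessing differs from training:", ", ".join(problems))
```

Here the check reports `deconvolved`, which is exactly the kind of mismatch that would otherwise degrade results without raising an error.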
Step 3: generate a sampling mask
What. Give the volume a mask marking where the real stuff is (cell, membranes, particles) versus where it is empty ice. Training samples are drawn only from inside the mask.
Why. A large fraction of any tomogram is empty ice — pure noise, no structure. Let the network sample from there and it burns its training budget learning “what the noise should look like.” Restricting sampling to regions with signal spends the network’s capacity on the structure you actually want to fix.
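A simple stand-in for such a mask is to keep the blocks with the highest local standard deviation, on the assumption that structure adds contrast on top of the noise. This is a sketch only: the `block` and `keep_fraction` parameters are hypothetical, and the tools use their own criteria for what counts as signal.

```python
import numpy as np

def block_std_mask(volume, block=8, keep_fraction=0.5):
    """Boolean mask keeping the `keep_fraction` of blocks with the highest standard deviation."""
    nz, ny, nx = (size // block for size in volume.shape)
    trimmed = volume[: nz * block, : ny * block, : nx * block]
    blocks = trimmed.reshape(nz, block, ny, block, nx, block)
    local_std = blocks.std(axis=(1, 3, 5))  # one value per block
    threshold = np.quantile(local_std, 1.0 - keep_fraction)
    coarse = local_std >= threshold
    # Expand the per-block decision back to voxel resolution
    fine = np.repeat(np.repeat(np.repeat(coarse, block, 0), block, 1), block, 2)
    mask = np.zeros(volume.shape, dtype=bool)
    mask[: nz * block, : ny * block, : nx * block] = fine
    return mask

rng = np.random.default_rng(0)
volume = rng.normal(size=(32, 32, 32))                 # "empty ice": unit noise
volume[:16, :16, :16] += 5.0 * rng.normal(size=(16, 16, 16))  # high-contrast corner
mask = block_std_mask(volume, block=8, keep_fraction=0.125)
```

Always look at the resulting mask over a few slices before training; an empty or inverted mask is one of the failure modes that shows up later only as a loss that will not fall.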
Step 4: extract subtomograms (small cubes)
What. Cut many fixed-size small cubes (subtomograms / patches) out of the masked regions — say a few dozen voxels on a side — as training samples.
Why. A whole volume is too big to fit in GPU memory, and you do not need it — the network learns a local “dirty → clean” mapping, so small cubes suffice and also yield thousands of samples to feed training. On the even/odd route the cubes are cut in pairs: at the same location, one cube from the even half and one from the odd half form an (input, target) pair.
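The pairing on the even/odd route comes down to using the same coordinates in both halves. A minimal NumPy sketch, assuming the two half-volumes and the mask are already loaded as arrays of identical shape (the function name and parameters are illustrative):

```python
import numpy as np

def extract_pairs(even, odd, mask, cube=32, n_pairs=100, seed=0):
    """Cut `n_pairs` cubes at identical positions from the even and odd volumes.

    Cube centres are drawn at random from voxels inside the mask that are far
    enough from the border for a full cube to fit.
    """
    if not (even.shape == odd.shape == mask.shape):
        raise ValueError("even, odd and mask must have the same shape")
    rng = np.random.default_rng(seed)
    half = cube // 2
    inner = tuple(slice(half, size - (cube - half) + 1) for size in mask.shape)
    valid = np.zeros(mask.shape, dtype=bool)
    valid[inner] = mask[inner] > 0
    centres = np.argwhere(valid)
    if len(centres) == 0:
        raise ValueError("mask leaves no room for a cube of this size")
    picks = centres[rng.integers(0, len(centres), size=n_pairs)]
    inputs, targets = [], []
    for z, y, x in picks:
        window = (
            slice(z - half, z - half + cube),
            slice(y - half, y - half + cube),
            slice(x - half, x - half + cube),
        )
        inputs.append(even[window])   # network input
        targets.append(odd[window])   # training target, same location
    return np.stack(inputs), np.stack(targets)
```

If the two halves are shifted relative to each other, the pairs no longer share signal voxel for voxel, which is what “misaligned even/odd pair” means in practice.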
# Command shape: extract fixed-size subtomograms inside the mask
<tool> extract --star tomograms.star --mask mask.mrc --cube-size <N> --out subtomos/
Step 5: train self-supervised
What. Train the network on the cubes above. It learns the “dirty → clean” mapping from one of two self-supervision signals:
- IsoNet route: from rotation + missing-wedge consistency (“looking from another direction should be self-consistent, and the missing slab is filled from the statistics of the other directions”);
- cryoCARE / DeepDeWedge route: from the even/odd independent-noise pairing (one half predicts the other; the optimum is the shared clean structure).
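Both routes end in the same kind of output: an estimate of the clean structure with the independent noise averaged out and, for the tools that model the wedge (IsoNet, DeepDeWedge), the missing region filled in by the learned prior. cryoCARE on its own denoises but does not fill the wedge.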
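Why a noisy target is good enough can be checked numerically with a deliberately tiny “network”: a single gain factor fitted by least squares. The setup below is a toy assumption (Gaussian signal, unit-variance independent noise in each half), not a model of real tomogram statistics.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000
signal = rng.normal(size=n)          # shared clean structure
even = signal + rng.normal(size=n)   # half 1: signal + independent noise
odd = signal + rng.normal(size=n)    # half 2: same signal, different noise

def best_gain(inputs, targets):
    """Least-squares gain g minimising mean((g * inputs - targets)**2)."""
    return float(inputs @ targets / (inputs @ inputs))

gain_noisy_target = best_gain(even, odd)      # what even/odd training can compute
gain_clean_target = best_gain(even, signal)   # what supervised training would compute
print(gain_noisy_target, gain_clean_target)
```

The two gains agree up to sampling error: the noise in the target is independent of the input, so it averages out of the fit and leaves the same optimum a clean target would give.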
Why hardware matters. This step requires a CUDA-capable NVIDIA GPU and takes hours (depending on the number of volumes, the number of subtomograms, and network size). This is not a laptop job — run it on a CPU or integrated graphics and it will either error out or be unusably slow. Put it on a machine with a GPU.
# Command shape: self-supervised training; the loss should fall over time
<tool> train --data subtomos/ --epochs <N> --gpu 0 --out model/
Watch three things during training: (1) the training loss should go down — if it stays flat or blows up, something in data prep is wrong (empty mask, wrong pixel size, misaligned even/odd pair). (2) Training finishes and produces a model-weights file. (3) Keep firmly in mind which preprocessing the model was trained on, because the next step has to match it.
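If the tool writes its loss values to a log, the first check can be automated. The helper below is a rough sketch with arbitrary, assumed thresholds (`window`, `min_rel_drop`); it compares the mean of the first few values with the mean of the last few.

```python
import math

def diagnose_loss(losses, window=5, min_rel_drop=0.01):
    """Classify a loss history as 'falling', 'flat', 'rising' or 'diverged'."""
    if any(not math.isfinite(value) for value in losses):
        return "diverged"
    if len(losses) < 2 * window:
        return "too short to judge"
    first = sum(losses[:window]) / window
    last = sum(losses[-window:]) / window
    if last > first:
        return "rising"
    if (first - last) / max(abs(first), 1e-12) < min_rel_drop:
        return "flat"
    return "falling"

history = [1.0 / (step + 1) for step in range(20)]  # made-up, steadily decreasing values
print(diagnose_loss(history))
```

Anything other than "falling" is a prompt to go back to data prep before spending more GPU hours.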
Step 6: predict (apply to the whole volume)
What. Apply the trained model to the whole tomogram to get a denoised, more isotropic corrected .mrc. At prediction time the network slides over the full volume, restores it patch by patch, and stitches it back together.
Why. Training only learns the mapping on small cubes; what you actually want is the whole volume fixed. This step spreads the learned mapping over the entire map.
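The “slide, restore, stitch” loop can be sketched as follows. It assumes `model` is any callable that maps a cube to a cube of the same shape, and it blends overlapping windows by plain averaging; real tools handle overlaps and borders in their own ways.

```python
import numpy as np

def patch_starts(size, patch, stride):
    """Start indices that tile [0, size) with windows of length `patch`."""
    starts = list(range(0, size - patch, stride))
    starts.append(size - patch)  # final window sits flush with the far edge
    return starts

def predict_volume(volume, model, patch=32, stride=16):
    """Apply `model` to overlapping cubes and average the overlapping predictions."""
    if min(volume.shape) < patch:
        raise ValueError("volume is smaller than one patch")
    if not 0 < stride <= patch:
        raise ValueError("stride must be between 1 and the patch size")
    total = np.zeros(volume.shape, dtype=np.float64)
    count = np.zeros(volume.shape, dtype=np.float64)
    for z in patch_starts(volume.shape[0], patch, stride):
        for y in patch_starts(volume.shape[1], patch, stride):
            for x in patch_starts(volume.shape[2], patch, stride):
                window = (slice(z, z + patch), slice(y, y + patch), slice(x, x + patch))
                total[window] += model(volume[window])
                count[window] += 1.0
    return total / count  # every voxel is covered at least once
```

Overlapping windows cost more compute but reduce visible seams at patch borders, which is why a stride smaller than the patch size is the usual choice.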
# Command shape: apply the trained model to the whole volume, output a corrected .mrc
<tool> predict --model model/ --tomo tomo1.mrc --out tomo1_corrected.mrc
- Preview a slice: open the original and the corrected volume side by side in 3dmod or ChimeraX — membranes should be more continuous, noise lower, the z-stretching reduced.
- Train and predict on the same data type: the iron rule from Step 2 closes here. If the model was trained on deconvolved volumes, feed it only deconvolved volumes at prediction; pixel size must match too.
- Do not over-trust filled-in detail: information inside the missing wedge is inferred by the network, not measured. Treat it as an educated guess, and verify key conclusions against the raw data or independent evidence.
Same shape, pushed further: this site’s methods
The public tools above all share the backbone prepare → train → predict. The two method families researched on this site push that backbone further, but the shape is unchanged — you already know how to run it:
- CryoGEN — equips “what counts as a plausible clean structure” with an explicit energy prior, then uses optimal transport to push a degraded observation back to its most probable clean volume. Still self-supervised, still no ground truth, and a point estimate: one corrected volume per degraded volume.
- CryoWGEN — adds an entropy term to the transport cost, yielding a posterior distribution instead of a single point: for one degraded volume it returns a family of plausible reconstructions, making the uncertainty about “what is really inside the missing wedge” explicit and quantifiable.
In other words: IsoNet / cryoCARE / DeepDeWedge get this pipeline running; CryoGEN / CryoWGEN keep the same prepare→train→predict shape while swapping in a stronger prior and reporting uncertainty. Get the public tools working once, and the method pages read much more easily.
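For readers who have not met entropy-regularised transport before, the toy below shows the general idea on a small discrete problem, solved with the standard Sinkhorn iteration. It is emphatically not CryoWGEN: the point sets, cost, and `epsilon` are invented for illustration, and the method pages remain the reference for how the site's methods actually work.

```python
import numpy as np

def sinkhorn(a, b, cost, epsilon=0.1, n_iter=2000):
    """Entropy-regularised transport plan between histograms a and b.

    Approximately minimises sum(plan * cost) - epsilon * entropy(plan)
    subject to row sums a and column sums b.
    """
    kernel = np.exp(-cost / epsilon)
    u = np.ones_like(a)
    for _ in range(n_iter):
        v = b / (kernel.T @ u)  # rescale to match the column marginals
        u = a / (kernel @ v)    # rescale to match the row marginals
    return u[:, None] * kernel * v[None, :]

# Two small uniform histograms on points in [0, 1] (toy data)
x = np.linspace(0.0, 1.0, 5)
y = np.linspace(0.0, 1.0, 6)
cost = (x[:, None] - y[None, :]) ** 2
a = np.full(5, 1.0 / 5)
b = np.full(6, 1.0 / 6)
plan = sinkhorn(a, b, cost)
```

With the entropy term, each source point spreads its mass over several plausible destinations instead of committing to one, which is the intuition behind returning a family of reconstructions rather than a single point estimate.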
Once you have the corrected volume in hand, the next step is to see it and segment it — see IMOD and ChimeraX.