Processing pipeline

A cryo-EM experiment produces thousands of movie files. To turn those raw frames into a set of picked, extracted particles ready for 3D reconstruction, the data must pass through a well-defined sequence of processing steps. Magellon automates the early, compute-heavy stages and lets you track, inspect, and re-run any step from the Jobs panel.

A cryo-EM dataset consists of many 2D images of the same molecule, frozen in random orientations. Each particle image is one projection of the 3D molecular density. The pipeline’s job is to clean and characterise those images so the downstream reconstruction can determine each molecule’s orientation and back-project everything into a 3D map.

Everything else is making that work in the presence of noise, motion, and imaging artefacts.

| # | Step | What it does | Magellon today |
|---|------|--------------|----------------|
| 1 | Motion correction | Aligns the frames in each movie to remove beam-induced specimen motion | Automated via motioncor plugin (MotionCor2/3) |
| 2 | CTF estimation | Fits the microscope’s contrast transfer function per micrograph | Automated via ctf plugin (CTFFIND4) |
| 3 | Square detection | Locates grid squares in low-magnification overview images | Automated via ptolemy plugin (ONNX model) |
| 4 | Hole detection | Locates ice holes in medium-magnification images | Automated via ptolemy plugin (ONNX model) |
| 5 | FFT | Computes the power-spectrum FFT of each micrograph | Automated; always-on reference plugin |
| 6 | Particle picking | Finds particle coordinates in each micrograph | Automated via topaz (CNN-based) or template-picker |
| 7 | Micrograph denoising | Denoises micrographs using a trained CNN | Automated via Topaz denoising backend |
| 8 | Particle extraction | Cuts and normalises a stack of particle boxes | Automated via stack-maker plugin |
| 9 | 2D classification | Clusters extracted particles by appearance; removes junk | Automated via can-classifier (CAN + MRA) |
| 10 | 3D reconstruction onwards | Initial model → refinement → polishing → postprocess | Not yet automated; run externally (RELION, cryoSPARC) |

Steps 1–9 are what Magellon orchestrates today. Step 10 and beyond remain in external tools; Magellon stores their outputs as session artifacts for browsing.

A single Krios session (12–48 hours) typically contains:

| Item | Typical size | Description |
|------|--------------|-------------|
| Movies | 50–500 frames, 50 MB–several GB each | Raw detector output; one movie per acquisition position |
| Micrograph count | 1 k–10 k | Each micrograph captures ~100–1 000 particles |
| Gain reference | 1 file per session | Per-pixel sensitivity correction applied during import |

When Magellon imports a session it reads these files from the configured data path (MAGELLON_GPFS_PATH) and creates a session record. Large payloads (movies, micrographs) stay on the shared filesystem; only metadata and task results travel over the message bus.
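The split between metadata on the bus and payloads on disk can be illustrated with a minimal sketch. The `TaskMessage` class, its field names, and the `<root>/<session>/<file>` layout are hypothetical and are not Magellon's actual message schema; the point is only that the message carries a path reference rather than image bytes.

```python
import json
from dataclasses import asdict, dataclass
from pathlib import PurePosixPath


@dataclass
class TaskMessage:
    """Hypothetical task envelope: a path reference plus small metadata."""
    session: str
    step: str
    input_path: str  # location on the shared filesystem, not the file contents


def build_message(data_root: str, session: str, step: str, filename: str) -> str:
    """Serialise a task that points at a file under the shared data root."""
    path = PurePosixPath(data_root) / session / filename
    return json.dumps(asdict(TaskMessage(session, step, str(path))))


# In a deployment the root would come from MAGELLON_GPFS_PATH; "/gpfs" is a placeholder.
message = build_message("/gpfs", "session_001", "motioncor", "movie_0001.tif")
```

The serialised message stays around a hundred bytes regardless of how large the movie is, which is why every consumer must be able to resolve the same path.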

Each movie’s frames suffer from beam-induced specimen motion — the specimen drifts 5–50 Å during the exposure. Summing the frames naively blurs the image. Motion correction aligns the frames first, producing a single, sharper micrograph.

Inputs: movie stack + gain reference
Outputs: one aligned .mrc micrograph per movie
Plugin: motioncor — wraps MotionCor2/3; GPU-accelerated
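To make the idea of "align, then sum" concrete, here is a minimal NumPy sketch of whole-frame alignment by FFT cross-correlation. It is not how MotionCor2/3 works internally (those tools use patch-based, sub-pixel alignment and dose weighting); it assumes integer, rigid, whole-frame shifts and uses the first frame as the reference.

```python
import numpy as np


def estimate_shift(reference, frame):
    """Return the integer (dy, dx) roll that best aligns `frame` to `reference`."""
    # Circular cross-correlation via FFT; the peak position encodes the shift.
    xcorr = np.fft.ifft2(np.fft.fft2(reference) * np.conj(np.fft.fft2(frame))).real
    dy, dx = np.unravel_index(np.argmax(xcorr), xcorr.shape)
    # Indices past the midpoint correspond to negative shifts (wrap-around).
    height, width = xcorr.shape
    if dy > height // 2:
        dy -= height
    if dx > width // 2:
        dx -= width
    return int(dy), int(dx)


def align_and_sum(frames):
    """Align every frame to the first one, then sum the aligned frames."""
    reference = frames[0]
    total = np.zeros_like(reference, dtype=np.float64)
    for frame in frames:
        shift = estimate_shift(reference, frame)
        total += np.roll(frame, shift, axis=(0, 1))
    return total
```

Summing without the `np.roll` step would average features that sit at different pixel positions in each frame, which is the blurring described above.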

Micrographs are deliberately recorded out of focus to increase contrast, which introduces an oscillating modulation in Fourier space (the contrast transfer function, or CTF). Every downstream step needs to know each micrograph’s CTF to correctly weight and combine signal.

Inputs: aligned micrograph
Outputs: defocus, astigmatism, and CTF goodness-of-fit per micrograph
Plugin: ctf — wraps CTFFIND4; multiple backends (fast, GPU, external)
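The oscillation that CTF estimation fits can be sketched with a simplified, astigmatism-free 1D model. This is an illustration under one common sign convention (positive defocus meaning underfocus), not the CTFFIND4 fitting procedure; the default voltage, spherical aberration, and amplitude contrast values are typical assumptions rather than Magellon settings.

```python
import numpy as np


def electron_wavelength_angstrom(kv):
    """Relativistic electron wavelength in Å for an accelerating voltage in kV."""
    volts = kv * 1e3
    return 12.2643 / np.sqrt(volts + 0.97845e-6 * volts**2)


def ctf_1d(k, defocus_angstrom, kv=300.0, cs_mm=2.7, amp_contrast=0.07):
    """Simplified CTF at spatial frequency k (1/Å), ignoring astigmatism and envelopes."""
    lam = electron_wavelength_angstrom(kv)
    cs = cs_mm * 1e7  # mm -> Å
    # Phase shift from defocus (k^2 term) and spherical aberration (k^4 term).
    chi = np.pi * lam * defocus_angstrom * k**2 - 0.5 * np.pi * cs * lam**3 * k**4
    phase_part = np.sqrt(1.0 - amp_contrast**2) * np.sin(chi)
    amplitude_part = amp_contrast * np.cos(chi)
    return -(phase_part + amplitude_part)


frequencies = np.linspace(0.0, 0.5, 500)       # 1/Å
curve = ctf_1d(frequencies, defocus_angstrom=20000.0)  # 2 µm underfocus
```

Plotting `curve` against `frequencies` shows the zero crossings whose positions depend on defocus; those crossings are what appear as dark rings in a power spectrum.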

CTF quality filtering

The CTF fit quality score is stored as micrograph metadata. You can filter out poor micrographs (high astigmatism, low fit confidence) in the session view before dispatching particle picking, which reduces junk picks downstream.
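As a rough illustration of this kind of filtering, the sketch below applies two thresholds to per-micrograph CTF results. The `CtfResult` fields and the threshold values are hypothetical; Magellon's metadata schema and sensible cut-offs for a given dataset may differ.

```python
from dataclasses import dataclass


@dataclass
class CtfResult:
    """Hypothetical per-micrograph CTF summary (field names are assumptions)."""
    name: str
    defocus_u: float       # Å, defocus along the major axis
    defocus_v: float       # Å, defocus along the minor axis
    fit_resolution: float  # Å, resolution to which the fit is trusted (lower is better)


def passes_ctf_filter(ctf, max_astigmatism=500.0, max_fit_resolution=5.0):
    """Keep micrographs with low astigmatism and a fit that extends to high resolution."""
    astigmatism = abs(ctf.defocus_u - ctf.defocus_v)
    return astigmatism <= max_astigmatism and ctf.fit_resolution <= max_fit_resolution


results = [
    CtfResult("mic_0001", 18000.0, 17800.0, 3.8),
    CtfResult("mic_0002", 21000.0, 19000.0, 4.1),  # astigmatism too high
    CtfResult("mic_0003", 15000.0, 14900.0, 8.5),  # fit too poor
]
kept = [r.name for r in results if passes_ctf_filter(r)]
```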

At low magnification the microscope acquires overview images showing the grid squares. At medium magnification it captures the individual ice holes within each square. Magellon’s ptolemy plugin uses ONNX-based computer vision to locate both automatically, driving the acquisition target selection pipeline.

Plugin: ptolemy — one plugin, two categories (square_detection and hole_detection)

A fast Fourier transform (FFT) of each aligned micrograph produces a power-spectrum thumbnail: the classic “Thon ring” image used to visually verify CTF quality. The FFT plugin is always on, and its output appears immediately in the image viewer for every micrograph.
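A power spectrum of this kind can be computed in a few lines of NumPy. This is a generic sketch, not the FFT plugin's code; the log scaling is an assumption made so that the result is viewable as an image.

```python
import numpy as np


def power_spectrum(image):
    """Return a log-scaled, centred power spectrum of a 2D image."""
    centred = image - image.mean()                      # suppress the zero-frequency spike
    spectrum = np.fft.fftshift(np.fft.fft2(centred))    # move zero frequency to the middle
    return np.log1p(np.abs(spectrum) ** 2)
```

For a real micrograph the result shows concentric rings; for a pure cosine test pattern it shows two symmetric peaks at the pattern's frequency.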

Particle picking scans each aligned micrograph for blob-shaped signals that match the expected particle size and produces a coordinate file (x, y) per micrograph. Magellon ships two pickers:

| Backend | Method | Best for |
|---------|--------|----------|
| topaz | Trained CNN (Topaz) | General-purpose; works without a reference template |
| template-picker | Cross-correlation template matching | When you already have a good 2D template |
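Cross-correlation template matching can be sketched as follows. This is a bare-bones illustration that finds only the single best match for one unrotated template; it is not the template-picker backend, which would also need rotated templates, thresholding, and peak exclusion to pick many particles.

```python
import numpy as np


def correlation_map(micrograph, template):
    """Circular cross-correlation of a micrograph with a zero-padded template."""
    mic = micrograph - micrograph.mean()
    padded = np.zeros_like(mic, dtype=np.float64)
    t_height, t_width = template.shape
    padded[:t_height, :t_width] = template - template.mean()
    product = np.fft.fft2(mic) * np.conj(np.fft.fft2(padded))
    return np.fft.ifft2(product).real


def pick_best(micrograph, template):
    """Return the (x, y) centre of the single best template match."""
    cc = correlation_map(micrograph, template)
    # The peak marks where the template's top-left corner fits best.
    y, x = np.unravel_index(np.argmax(cc), cc.shape)
    t_height, t_width = template.shape
    return int(x) + t_width // 2, int(y) + t_height // 2
```

A real picker would take every local maximum above a threshold rather than only the global one, and would write the resulting coordinates per micrograph.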

The Topaz backend also supports micrograph denoising as a companion step: picking can then be re-run on the denoised micrographs, which may yield better coordinates.

Extraction cuts a square box (e.g. 256 × 256 px) around each picked coordinate, normalises the contrast, and writes all boxes to a single .mrcs particle stack. The stack is the input to 2D classification.

Plugin: stack-maker — thin wrapper around the vendored extraction algorithm from the Magellon algorithm library
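The box-cutting step can be sketched with NumPy slicing. This is not the vendored extraction algorithm; it assumes (x, y) pixel coordinates for box centres, skips picks whose box would cross the micrograph edge, and normalises each whole box to zero mean and unit variance (production tools often normalise against the background outside a circular mask instead).

```python
import numpy as np


def extract_particles(micrograph, coords, box=256):
    """Cut a normalised (N, box, box) stack around each (x, y) coordinate."""
    half = box // 2
    height, width = micrograph.shape
    boxes = []
    for x, y in coords:
        top, left = y - half, x - half
        # Drop picks whose box would extend past the micrograph border.
        if top < 0 or left < 0 or top + box > height or left + box > width:
            continue
        patch = micrograph[top:top + box, left:left + box].astype(np.float32)
        patch = patch - patch.mean()
        std = patch.std()
        boxes.append(patch / std if std > 0 else patch)
    if not boxes:
        return np.empty((0, box, box), dtype=np.float32)
    return np.stack(boxes)
```

Writing the resulting array to an .mrcs file would require an MRC library and is left out of the sketch.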

Picked particles always include some junk: ice contamination, broken molecules, neighbouring molecules accidentally cropped. 2D classification clusters all particles into K groups by appearance. Bad groups (featureless blobs, ice rings, edge artefacts) are dropped, leaving a clean particle set. Typical retention: 30–70 % of initial picks.

Plugin: can-classifier — Convolutional Autoencoder + Multi-Reference Alignment (CAN+MRA)
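The "cluster by appearance, then drop bad groups" idea can be illustrated with plain k-means on pixel values. This is a deliberately swapped-in stand-in, not the CAN+MRA method used by can-classifier: it performs no rotational or translational alignment, and the farthest-point initialisation is just a choice made to keep the sketch deterministic.

```python
import numpy as np


def kmeans_classify(stack, k, iterations=20):
    """Cluster an (N, H, W) particle stack into k classes; return labels and class averages."""
    flat = stack.reshape(len(stack), -1).astype(np.float64)
    # Farthest-point initialisation: start from the first image, then repeatedly
    # add the image that is farthest from all centres chosen so far.
    centres = [flat[0]]
    for _ in range(1, k):
        nearest = np.min([((flat - c) ** 2).sum(axis=1) for c in centres], axis=0)
        centres.append(flat[nearest.argmax()])
    centres = np.array(centres)
    labels = np.zeros(len(flat), dtype=int)
    for _ in range(iterations):
        # Assign each particle to its nearest class average.
        dists = ((flat[:, None, :] - centres[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        # Recompute each class average from its current members.
        for j in range(k):
            members = flat[labels == j]
            if len(members):
                centres[j] = members.mean(axis=0)
    return labels, centres.reshape(k, *stack.shape[1:])


def keep_classes(labels, good_classes):
    """Return indices of particles that belong to the classes judged to be good."""
    return np.flatnonzero(np.isin(labels, list(good_classes)))
```

After inspecting the class averages, a user would pass the indices of the good classes to `keep_classes` to obtain the cleaned particle subset.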

Beyond 2D classification

Steps beyond 2D classification — initial 3D model generation, 3D refinement, CTF refinement, Bayesian polishing, and postprocessing — are typically run in RELION or cryoSPARC. Magellon stores and displays the results but does not yet automate these later stages.

Each automated step dispatches work as tasks visible in the Jobs panel. One import creates one job containing one task per micrograph per step. The Jobs panel shows:

  • Per-step progress bars
  • Individual task status (pending / running / completed / failed)
  • Live log output from the plugin processing each task
  • Output file locations on the shared filesystem

Failed tasks can be individually retried from the Jobs panel without re-running the whole import.
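The job and task model described above can be sketched as a small data structure. The class and function names are hypothetical and do not reflect Magellon's internal API; the sketch only shows how one task per micrograph per step, per-step progress, and selective retry fit together.

```python
from dataclasses import dataclass
from enum import Enum


class Status(str, Enum):
    PENDING = "pending"
    RUNNING = "running"
    COMPLETED = "completed"
    FAILED = "failed"


@dataclass
class Task:
    micrograph: str
    step: str
    status: Status = Status.PENDING


def build_job(micrographs, steps):
    """One task per micrograph per step."""
    return [Task(m, s) for m in micrographs for s in steps]


def step_progress(tasks, step):
    """Fraction of a step's tasks that have completed."""
    step_tasks = [t for t in tasks if t.step == step]
    if not step_tasks:
        return 0.0
    return sum(t.status is Status.COMPLETED for t in step_tasks) / len(step_tasks)


def retry_failed(tasks):
    """Reset only the failed tasks to pending; everything else is left untouched."""
    failed = [t for t in tasks if t.status is Status.FAILED]
    for task in failed:
        task.status = Status.PENDING
    return failed
```

With this model, 1 000 micrographs and 3 steps give 3 000 tasks, and a retry touches only the ones that failed.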

Metadata travels over the message bus; large files (movies, micrographs, particle stacks) travel over the shared filesystem. CoreService and every plugin container must mount the same path. In the default Docker Compose setup this is a bind mount; on HPC clusters it is typically GPFS, Lustre, or BeeGFS.

All containers must see the same path

If a plugin container can write /magellon/home/<session>/motioncor/file.mrc but CoreService cannot read that path, results will silently disappear. Verify the shared mount before running your first import — see Directory Structure for the expected layout.
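One way to check the mount is to write a probe file from one container and read it from another. The sketch below assumes a hypothetical probe file name (`.mount_probe`) and is not a Magellon utility: call `write_probe` inside a plugin container, then pass the returned token to `probe_visible` inside CoreService with the same path.

```python
import uuid
from pathlib import Path

PROBE_NAME = ".mount_probe"  # hypothetical file name used only for this check


def write_probe(shared_root):
    """Write a random token under the shared root and return it."""
    token = uuid.uuid4().hex
    (Path(shared_root) / PROBE_NAME).write_text(token)
    return token


def probe_visible(shared_root, token):
    """Return True if this process sees the probe file with the expected token."""
    probe = Path(shared_root) / PROBE_NAME
    return probe.is_file() and probe.read_text() == token
```

If `probe_visible` returns False in CoreService, the two containers are not looking at the same directory, even if the path string is identical.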

  • Plugins — architectural overview of the plugin system
  • Plugin categories — full reference of every category and its current backends
  • Managing plugins — start/stop/scale plugins, inspect logs, troubleshoot
  • Data import — how to bring a session into Magellon