Data Import

Importing a session is how a fresh dataset enters Magellon — once it’s in, every plugin can read from it and write results back against the same session record.

Two things must be in place:

  1. Your data is on the shared filesystem. Magellon reads raw movies and gain references directly from disk — it does not copy them. The path must be reachable by both CoreService and every plugin container. In the default Docker Compose setup this is the host directory bound to MAGELLON_GPFS_PATH (/magellon/gpfs inside containers).

  2. The session directory has a session.json. This JSON file describes acquisition parameters (microscope voltage, magnification, pixel size, defocus range). Export it from your acquisition software (Leginon, EPU, or SerialEM with the companion export tool) and place it at the root of your session directory.
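Both prerequisites can be checked from a shell on the host before opening the UI. The following is a minimal sketch, not part of Magellon: the function name check_session_dir is hypothetical, and it only confirms that the directory is visible and that session.json exists and parses as JSON. It does not validate the acquisition fields, whose schema is not described here.

```python
import json
from pathlib import Path


def check_session_dir(session_dir):
    """Return a list of problems found; an empty list means the checks passed.

    Hypothetical helper for illustration; not a Magellon API.
    """
    root = Path(session_dir)
    if not root.is_dir():
        return [f"{root} is not a directory visible from this machine"]

    problems = []
    manifest = root / "session.json"
    if not manifest.is_file():
        problems.append("session.json is missing from the session root")
    else:
        try:
            # Only checks that the file is well-formed JSON, not its contents.
            json.loads(manifest.read_text(encoding="utf-8"))
        except json.JSONDecodeError as exc:
            problems.append(f"session.json is not valid JSON: {exc}")
    return problems
```

Running the same check inside a plugin container, against the container-side path, is one way to confirm that both sides see the same files.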

Shared filesystem is required

CoreService and all plugin containers must mount the same data path at the same filesystem location. If a motion-correction plugin writes /magellon/home/<session>/motioncor/file.mrc but CoreService is looking at a different mount, results will silently fail to be recorded. See Directory Structure for the expected layout, and Environment Settings for the path variables.

  1. Drop (or symlink) your session directory into the configured GPFS path — e.g. /magellon/gpfs/<your-session>/.

  2. Open the UI at http://localhost:8080 and log in.

  3. Click the menu icon and choose Import.

  4. In the Magellon import tab, navigate to your session directory and double-click its session.json.

  5. Click IMPORT DATA.

Import can take 5–60 minutes depending on dataset size and whether frame alignment is running on a GPU. Progress streams into the Jobs panel — leave the tab open or check back later; the state persists.

Under the hood, clicking IMPORT DATA triggers the following:

  1. Reads session.json — creates a msession record in the database with the acquisition metadata (voltage, pixel size, etc.).

  2. Scans for movies — walks the session directory for .tif, .mrc, and .eer files matching the acquisition pattern. Creates one image database record per movie, pointing at its path on disk. No data is copied.

  3. Dispatches the processing pipeline — for each image, tasks are enqueued for the enabled plugins in pipeline order (motion correction → CTF → FFT → square/hole detection). Tasks appear in the Jobs panel immediately.

  4. Streams progress — as each plugin finishes, it writes result metadata back to the image record and emits a progress event over the message bus. The Jobs panel updates in real time via Socket.IO.
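Steps 2 and 3 above can be sketched as follows. This is an illustration under stated assumptions, not Magellon's implementation: the names find_movies, plan_tasks, and the plugin labels in PIPELINE_ORDER are hypothetical, and the sketch filters on file extension only, so it does not model the acquisition-pattern match (a gain reference stored as .mrc would be picked up too).

```python
from pathlib import Path

# Extensions named in step 2.
MOVIE_EXTENSIONS = {".tif", ".mrc", ".eer"}

# Pipeline order from step 3; the labels themselves are hypothetical.
PIPELINE_ORDER = ["motioncor", "ctf", "fft", "square_hole_detection"]


def find_movies(session_dir):
    """Walk the session directory and return movie paths in sorted order."""
    root = Path(session_dir)
    return sorted(
        p for p in root.rglob("*")
        if p.is_file() and p.suffix.lower() in MOVIE_EXTENSIONS
    )


def plan_tasks(movies, enabled):
    """Return (plugin, path) pairs: per image, enabled plugins in pipeline order."""
    return [
        (plugin, str(movie))
        for movie in movies
        for plugin in PIPELINE_ORDER
        if plugin in enabled
    ]
```

Only paths are collected here, which mirrors the point that no movie data is copied during import.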

Magellon uses a convention of <home>/<session>/<category>/<filename> for all output files. After a complete import the layout looks like:

/magellon/home/<session-name>/
├── frames/                      # raw movies (read-only, your input data)
│   ├── image_001.tif
│   └── ...
├── motioncor/                   # written by the motioncor plugin
│   ├── image_001_aligned.mrc
│   └── ...
├── ctf/                         # written by the ctf plugin
│   ├── image_001.star
│   └── ...
├── fft/                         # written by the fft plugin
│   ├── image_001_fft.png
│   └── ...
├── thumbnails/                  # JPEG thumbnails for the image browser
│   ├── image_001.jpg
│   └── ...
└── particles/                   # written after picking + extraction
    ├── coords/
    └── stacks/

Each subdirectory maps directly to the plugin category that created it. If you re-run a step (e.g. with a different CTF plugin backend), the new outputs overwrite the previous ones in the same directory.
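The <home>/<session>/<category>/<filename> convention amounts to a simple path join. A minimal sketch, assuming a hypothetical helper named output_path (not a Magellon function) and an example session name:

```python
from pathlib import PurePosixPath


def output_path(home, session, category, filename):
    """Build <home>/<session>/<category>/<filename> as a container-side path."""
    return PurePosixPath(home) / session / category / filename


# "my-session" is an example name, not a real dataset.
print(output_path("/magellon/home", "my-session", "ctf", "image_001.star"))
# /magellon/home/my-session/ctf/image_001.star
```

Because the path depends only on the session, category, and filename, a second run in the same category resolves to the same location, which is why re-running a step overwrites the earlier output.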

Magellon is built around a clear separation between two communication paths:

| Plane | What travels here | Examples |
|---|---|---|
| Control plane (message bus) | Task requests, results, progress events, plugin heartbeats | TaskMessage, TaskResultMessage, step progress |
| Data plane (shared filesystem) | Large binary files: movies, micrographs, particle stacks | .mrc, .mrcs, .tif, .eer, .star |

A micrograph is typically 10–100 MB; a movie can be several GB. Payloads of that size are impractical to route through the message broker, so plugins receive only a path on the message bus and read or write the actual bytes directly on the shared filesystem. This is why the shared mount requirement is non-negotiable.
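The split can be illustrated with a message that carries a path rather than the file contents. This is a sketch only: PathMessage and its fields are hypothetical stand-ins, not the schema of Magellon's TaskMessage, and the JSON encoding is an assumption made for the example.

```python
import json
from dataclasses import asdict, dataclass
from pathlib import Path


@dataclass
class PathMessage:
    """Hypothetical stand-in for a task message; not Magellon's actual schema."""
    task_id: str
    input_path: str  # location on the shared filesystem


def encode(msg):
    """Control plane: serialize the small message that travels over the bus."""
    return json.dumps(asdict(msg)).encode("utf-8")


def handle(raw):
    """Data plane: decode the message, then read the bytes from shared storage."""
    msg = PathMessage(**json.loads(raw))
    return Path(msg.input_path).read_bytes()
```

The encoded message stays at a few dozen bytes however large the file is. The read in handle succeeds only if input_path resolves to the same file inside the plugin container as it does for the sender.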

| Format | Extension | Produced by |
|---|---|---|
| MRC movie stack | .mrc | Most detectors |
| TIFF (16-bit) | .tif / .tiff | Gatan K3 / K2 |
| EER (electron event representation) | .eer | Thermo Fisher Falcon 4 / 4i |

Gain references (.dm4, .mrc) are read at import time and applied during motion correction — you do not need to convert them first.