Data Import
Importing a session is how a fresh dataset enters Magellon — once it’s in, every plugin can read from it and write results back against the same session record.
Before you import
Two things must be in place:
- Your data is on the shared filesystem. Magellon reads raw movies and gain references directly from disk; it does not copy them. The path must be reachable by both CoreService and every plugin container. In the default Docker Compose setup this is the host directory bound to MAGELLON_GPFS_PATH (/magellon/gpfs inside containers).
- The session directory has a session.json. This JSON file describes acquisition parameters (microscope voltage, magnification, pixel size, defocus range). Export it from your acquisition software (Leginon, EPU, or SerialEM with the companion export tool) and place it at the root of your session directory.
CoreService and all plugin containers must mount the same data path at
the same filesystem location. If a motion-correction plugin writes
/magellon/home/<session>/motioncor/file.mrc but CoreService is
looking at a different mount, results will silently fail to be recorded.
See Directory Structure for
the expected layout, and Environment Settings
for the path variables.
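The two prerequisites above can be checked before you open the UI. The following is a minimal sketch, not part of Magellon: the function name `preflight` is hypothetical, and it only verifies that the session directory is visible and that `session.json` exists and parses as a JSON object. It does not validate any particular keys, since the schema is not described here.

```python
import json
from pathlib import Path


def preflight(session_dir):
    """Return a list of problems found; an empty list means the checks passed.

    Hypothetical helper: checks visibility of the directory and that
    session.json parses, nothing more.
    """
    session_dir = Path(session_dir)
    problems = []
    if not session_dir.is_dir():
        problems.append(f"session directory not found: {session_dir}")
        return problems
    meta = session_dir / "session.json"
    if not meta.is_file():
        problems.append(f"missing {meta}")
        return problems
    try:
        data = json.loads(meta.read_text(encoding="utf-8"))
    except (OSError, ValueError) as exc:  # JSONDecodeError is a ValueError
        problems.append(f"cannot parse {meta}: {exc}")
        return problems
    if not isinstance(data, dict):
        problems.append(f"{meta} should contain a JSON object")
    return problems
```

Running the same check with the same path inside CoreService and inside a plugin container is one way to confirm that both see the data at the same filesystem location.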
Running an import
- Drop (or symlink) your session directory into the configured GPFS path, e.g. /magellon/gpfs/<your-session>/.
- Open the UI at http://localhost:8080 and log in.
- Click the menu icon and choose Import.
- In the Magellon import tab, navigate to your session directory and double-click its session.json.
- Click IMPORT DATA.
Import can take 5–60 minutes depending on dataset size and whether frame alignment is running on a GPU. Progress streams into the Jobs panel — leave the tab open or check back later; the state persists.
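If you prefer to symlink the session directory rather than move it, the first step could be scripted roughly as follows. This is a sketch under stated assumptions: `link_session` is a hypothetical helper, and both paths are whatever applies on your host. Keep in mind that a symlink only helps if its target is also visible inside the containers at the same path; a link that points outside the mounted directory will dangle there.

```python
import os
from pathlib import Path


def link_session(source_dir, gpfs_root):
    """Symlink source_dir into gpfs_root and return the path of the new link.

    Hypothetical helper; refuses to overwrite an existing entry.
    """
    source = Path(source_dir).resolve()
    if not source.is_dir():
        raise NotADirectoryError(str(source))
    link = Path(gpfs_root) / source.name
    if link.exists() or link.is_symlink():
        raise FileExistsError(str(link))
    os.symlink(source, link, target_is_directory=True)
    return link
```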
What the import does
Under the hood, clicking IMPORT DATA triggers the following:
- Reads session.json: creates a msession record in the database with the acquisition metadata (voltage, pixel size, etc.).
- Scans for movies: walks the session directory for .tif, .mrc, and .eer files matching the acquisition pattern. Creates one image database record per movie, pointing at its path on disk. No data is copied.
- Dispatches the processing pipeline: for each image, tasks are enqueued for the enabled plugins in pipeline order (motion correction → CTF → FFT → square/hole detection). Tasks appear in the Jobs panel immediately.
- Streams progress: as each plugin finishes, it writes result metadata back to the image record and emits a progress event over the message bus. The Jobs panel updates in real time via Socket.IO.
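The scan and dispatch steps above can be pictured with a short sketch. This is not Magellon's importer: `find_movies`, `plan_tasks`, and the step names in `PIPELINE` are illustrative assumptions, and the sketch matches on file extension only rather than on the acquisition pattern the real import uses.

```python
from pathlib import Path

# Extensions named in the scan step above.
MOVIE_EXTENSIONS = {".tif", ".mrc", ".eer"}

# Hypothetical step identifiers, in the pipeline order described above.
PIPELINE = ["motioncor", "ctf", "fft", "square_hole_detection"]


def find_movies(session_dir):
    """Walk session_dir and return movie paths in sorted order (nothing is copied)."""
    root = Path(session_dir)
    return sorted(
        p for p in root.rglob("*")
        if p.is_file() and p.suffix.lower() in MOVIE_EXTENSIONS
    )


def plan_tasks(session_dir):
    """Return (step, movie_path) pairs: every pipeline step for each movie, in order."""
    return [
        (step, str(movie))
        for movie in find_movies(session_dir)
        for step in PIPELINE
    ]
```

Each pair stands in for one enqueued task. Note that only the path travels with the task; the movie itself stays on disk.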
Session directory layout
Magellon uses a convention of <home>/<session>/<category>/<filename>
for all output files. After a complete import the layout looks like:
    /magellon/home/<session-name>/
    ├── frames/            # raw movies (read-only, your input data)
    │   ├── image_001.tif
    │   └── ...
    ├── motioncor/         # written by the motioncor plugin
    │   ├── image_001_aligned.mrc
    │   └── ...
    ├── ctf/               # written by the ctf plugin
    │   ├── image_001.star
    │   └── ...
    ├── fft/               # written by the fft plugin
    │   ├── image_001_fft.png
    │   └── ...
    ├── thumbnails/        # JPEG thumbnails for the image browser
    │   ├── image_001.jpg
    │   └── ...
    └── particles/         # written after picking + extraction
        ├── coords/
        └── stacks/

Each subdirectory maps directly to the plugin category that created it. If you re-run a step (e.g. with a different CTF plugin backend), the new outputs overwrite the previous ones in the same directory.
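The `<home>/<session>/<category>/<filename>` convention is simple enough to express as a small helper. The following is a sketch, assuming a POSIX-style path inside the containers; `output_path` is a hypothetical name, not a Magellon function.

```python
from pathlib import PurePosixPath


def output_path(home, session, category, filename):
    """Build <home>/<session>/<category>/<filename> as a POSIX path.

    Hypothetical helper: rejects empty, absolute, or '..' components so the
    result always stays under <home>.
    """
    parts = [session, category, filename]
    for part in parts:
        if not part or part.startswith("/") or ".." in PurePosixPath(part).parts:
            raise ValueError(f"invalid path component: {part!r}")
    return PurePosixPath(home).joinpath(*parts)
```

For example, `output_path("/magellon/home", "my-session", "ctf", "image_001.star")` yields `/magellon/home/my-session/ctf/image_001.star`. Because the path depends only on session, category, and filename, re-running a step produces the same path, which is why new outputs replace the old ones.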
Data plane vs control plane
Magellon is built around a clear separation between two communication paths:
| Plane | What travels here | Examples |
|---|---|---|
| Control plane (message bus) | Task requests, results, progress events, plugin heartbeats | TaskMessage, TaskResultMessage, step progress |
| Data plane (shared filesystem) | Large binary files — movies, micrographs, particle stacks | .mrc, .mrcs, .tif, .eer, .star |
A micrograph is typically 10–100 MB; a movie can be several GB. These payloads cannot travel through the message broker. Instead, plugins receive a path on the message bus and read or write the actual bytes on the shared filesystem directly. This is why the shared mount requirement is non-negotiable.
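To make the split concrete, here is a minimal sketch of a control-plane message that carries a path instead of the file contents. The class `PathTask` and its field names are illustrative assumptions standing in for a task message; they are not Magellon's actual `TaskMessage` schema.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class PathTask:
    """Illustrative stand-in for a task message; all fields are assumptions."""
    task_id: str
    plugin: str
    input_path: str   # where the plugin reads the movie on the shared mount
    output_dir: str   # where the plugin writes its results


def encode(task):
    """Serialize the task to the bytes that would travel over the message bus."""
    return json.dumps(asdict(task)).encode("utf-8")


def decode(raw):
    """Rebuild the task on the receiving side."""
    return PathTask(**json.loads(raw.decode("utf-8")))
```

The encoded message stays at a few hundred bytes no matter how large the movie is. The receiving plugin opens `input_path` itself, which only works if the path resolves to the same file in its container.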
Supported acquisition formats
| Format | Extension | Produced by |
|---|---|---|
| MRC movie stack | .mrc | Most detectors |
| TIFF (16-bit) | .tif / .tiff | Gatan K3 / K2 |
| EER (electron event representation) | .eer | Thermo Fisher Falcon 4 / 4i |
Gain references (.dm4, .mrc) are read at import time and applied
during motion correction — you do not need to convert them first.
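The table above amounts to a lookup from file extension to format. A minimal sketch of that lookup follows; `acquisition_format` is a hypothetical helper, and it works from the extension alone, so it cannot tell an `.mrc` movie from an `.mrc` gain reference.

```python
from pathlib import Path

# Extension-to-format mapping taken from the table above.
FORMATS = {
    ".mrc": "MRC movie stack",
    ".tif": "TIFF (16-bit)",
    ".tiff": "TIFF (16-bit)",
    ".eer": "EER (electron event representation)",
}


def acquisition_format(path):
    """Return the format name for a movie path, or None if the extension is not listed."""
    return FORMATS.get(Path(path).suffix.lower())
```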
See also
- Processing pipeline: what each step does and which plugins run it
- Directory Structure: full filesystem layout reference
- Environment Settings: path variables (MAGELLON_GPFS_PATH, MAGELLON_HOME_PATH)
- Plugins: installing and managing the plugin fleet