fw-file
Unified interface for reading medical file types, exposing parsed fields as dict
keys as well as attributes, and for saving any modifications to disk or a buffer.
DICOM support, built on top of pydicom, is the primary goal of the library.
fw-file also provides helpers for parsing DICOMs containing non-standard tags
and utilities for organizing datasets and extracting metadata.
Additional file types supported:
- NIfTI1 and NIfTI2 (.nii.gz)
- Bruker ParaVision (subject/acqp/method)
- GE MR RAW / PFile (P_NNNNN_.7)
- Philips MR PAR/REC header (.par)
- Philips MR PAR/REC zipfile (.parrec.zip) (read-only)
- Siemens MR RAW (.dat)
- Siemens MR Spectroscopy (.rda)
- Siemens PET RAW (.ptd)
- PNG (.png)
- JPEG/JPG (.jpeg/.jpg)
- BrainVision EEG (.vhdr/.vmrk/.eeg)
- EEGLAB EEG (.set/.fdt)
- European Data Format EEG (.edf)
- BioSemi Data Format EEG (.bdf)
Installation
To install the package with all the optional dependencies:
pip install "fw-file[all]"
Alternatively, add it as a poetry dependency to your project:
poetry add fw-file --extras all
Usage
Opening
from fw_file.dicom import DICOM
dcm = DICOM("dataset.dcm")
Fields
Attribute access on DICOMs works similarly to that in pydicom:
dcm.PatientAge == "060Y"
dcm.patientage == "060Y"
dcm.patient_age == "060Y"
Key access also returns values instead of pydicom.DataElement:
dcm["PatientAge"] == "060Y"
dcm["patientage"] == "060Y"
dcm["patient_age"] == "060Y"
dcm["00101010"] == "060Y"
dcm["0010", "1010"] == "060Y"
dcm[0x00101010] == "060Y"
dcm[0x0010, 0x1010] == "060Y"
Private tags can be accessed as keys when including the creator:
dcm["AGFA", "Zoom factor"] == 2
dcm["AGFA", "0019xx82"] == 2
Assignment and deletion work with attributes and keys alike:
dcm.PatientAge = "065Y"
del dcm["PatientAge"]
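The public-tag key forms above all identify the same (group, element) pair. The following is a minimal sketch of how such keys could be normalized. It is not fw-file's implementation: the function normalize_key and the two-entry KEYWORDS table are hypothetical, and private-tag keys such as ("AGFA", "Zoom factor") are deliberately out of scope.

```python
import re
from typing import Tuple, Union

# Hypothetical two-entry keyword table; a real DICOM dictionary has thousands.
KEYWORDS = {
    "patientage": (0x0010, 0x1010),
    "patientid": (0x0010, 0x0020),
}

HEX_TAG = re.compile(r"[0-9A-Fa-f]{8}")


def _to_int(part: Union[str, int]) -> int:
    """Interpret a string as hexadecimal, pass integers through unchanged."""
    return int(part, 16) if isinstance(part, str) else part


def normalize_key(key) -> Tuple[int, int]:
    """Resolve a public-tag key to a (group, element) pair."""
    if isinstance(key, int):
        # 0x00101010 -> (0x0010, 0x1010)
        return key >> 16, key & 0xFFFF
    if isinstance(key, tuple):
        # Accepts ("0010", "1010") as well as (0x0010, 0x1010)
        group, element = key
        return _to_int(group), _to_int(element)
    if HEX_TAG.fullmatch(key):
        number = int(key, 16)
        return number >> 16, number & 0xFFFF
    # Keywords match case-insensitively, ignoring underscores
    return KEYWORDS[key.replace("_", "").lower()]
```

With this sketch, normalize_key("patient_age") and normalize_key(0x00101010) both return (0x0010, 0x1010); an unknown keyword raises KeyError.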
Metadata
Flywheel metadata can be extracted using the get_meta() method:
from fw_file.dicom import DICOM
dcm = DICOM("dataset.dcm")
dcm.get_meta() == {
"subject.label": "PatientID",
"session.label": "StudyDescription",
"session.uid": "1.2.3",
"acquisition.label": "SeriesDescription",
"acquisition.uid": "4.5.6",
}
Saving
import io
dcm.save()                # save in place
dcm.save("edited.dcm")    # save to a new path
dcm.save(io.BytesIO())    # save to an in-memory buffer
Collections and series
Handling multiple DICOM files together is a common use case, where the tags of
more than one file need to be inspected in tandem for QA/validation or even
modified for de-identification. DICOMCollection facilitates that and provides
convenience methods for loading from a list of files, a directory or a zip
archive.
from fw_file.dicom import DICOMCollection
coll_dcm = DICOMCollection("001.dcm", "002.dcm")
coll_dir = DICOMCollection.from_dir(".")
coll_zip = DICOMCollection.from_zip("dicom.zip")
coll = DICOMCollection()
coll.append("001.dcm")
To interact with the underlying DICOMs:
coll[0].SOPInstanceUID == "1.2.3"
coll.bulk_get("SOPInstanceUID") == ["1.2.3", "1.2.4"]
coll.get("SeriesInstanceUID") == "1.2"
coll.get("SOPInstanceUID")
coll.set("PatientAge", "060Y")
coll.delete("PatientID")
Finally, a DICOMCollection can be saved in place, exported to a directory or
packed as a zip archive:
coll.save()
coll.to_dir("/tmp/dicom")
coll.to_zip("/tmp/dicom.zip")
DICOMSeries is a subclass of DICOMCollection, intended to be used on files
that belong to the same DICOM series. The instances normally have the same
SeriesInstanceUID attribute and are uploaded together (zipped) into a Flywheel
acquisition. In addition to the collection methods, DICOMSeries can be used to
pack the instances into an appropriately named ZIP archive and extract Flywheel
metadata from multiple files while also validating the values, checking for any
discrepancies among the instances along the way.
from fw_file.dicom import DICOMSeries
series = DICOMSeries("001.dcm", "002.dcm")
filepath, metadata = series.to_upload()
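To illustrate what checking for discrepancies among instances can mean, here is a small sketch that picks the most common value of a tag across a series and reports every value that disagrees with it. This is an assumption about the general idea only, not how DICOMSeries is implemented; the function consensus and the plain dicts standing in for DICOM instances are hypothetical.

```python
from collections import Counter
from typing import Any, Dict, List, Tuple


def consensus(instances: List[Dict[str, Any]], tag: str) -> Tuple[Any, Dict[Any, int]]:
    """Return the most common value of `tag` and a count of each other value."""
    counts = Counter(instance.get(tag) for instance in instances)
    value, _ = counts.most_common(1)[0]
    outliers = {val: n for val, n in counts.items() if val != value}
    return value, outliers


# Plain dicts standing in for three instances of one series
instances = [
    {"SeriesInstanceUID": "1.2", "SeriesDescription": "T1"},
    {"SeriesInstanceUID": "1.2", "SeriesDescription": "T1"},
    {"SeriesInstanceUID": "1.2", "SeriesDescription": "T1 repeat"},
]

uid, uid_outliers = consensus(instances, "SeriesInstanceUID")
label, label_outliers = consensus(instances, "SeriesDescription")
```

An empty outlier dict means all instances agree on the tag; a non-empty one lists the disagreeing values with the number of instances carrying each.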
DICOM Standard Editions
As the DICOM Standard is typically revised multiple times throughout the year,
fw-file provides the option to choose which edition is being utilized via an
environment variable. The default is "2024e", which utilizes the locally-saved
2024e edition. Additional options are "current" and any valid 5-character edition
(e.g. "2022d"). Specifying "current" will fetch the most recent edition at runtime.
FW_DCM_STANDARD_REV=current
FW_DCM_STANDARD_REV=2022d
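The variable can also be set from Python. The sketch below validates the value first, assuming that an edition is a four-digit year followed by a letter from a to e, and assuming that fw-file reads the variable when it is imported, so the call would need to happen before the import. The helper select_edition is hypothetical.

```python
import os
import re

# Assumed edition format: four-digit year plus a revision letter a-e
EDITION_RE = re.compile(r"[0-9]{4}[a-e]")


def select_edition(value: str) -> str:
    """Validate an edition string and export it as FW_DCM_STANDARD_REV."""
    if value != "current" and not EDITION_RE.fullmatch(value):
        raise ValueError(f"not a valid DICOM Standard edition: {value!r}")
    os.environ["FW_DCM_STANDARD_REV"] = value
    return value


select_edition("2022d")
```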
Private dictionary
In addition to the private tags included in pydicom, fw-file ships with an
extended dictionary to make accessing even more private tags that much simpler.
The private dictionary can be further extended by creating a DCMTK-style
data dict file and setting the DCMDICTPATH environment variable to its path.
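As a rough sketch, a DCMTK-style dictionary file lists one entry per line with tab-separated fields: tag, VR, name, VM and version. The snippet below writes such a file and points DCMDICTPATH at it. The creator "ACME", the tag name ExampleComment and the helper write_private_dict are hypothetical, and whether fw-file accepts every DCMTK dictionary feature is not confirmed here.

```python
import os
import tempfile
from pathlib import Path

# Tab-separated fields: tag, VR, name, VM, version.
# The creator "ACME" and the name ExampleComment are made up for illustration.
ENTRY = '(0019,"ACME",10)\tLO\tExampleComment\t1\tPrivateTag\n'


def write_private_dict(directory: str) -> Path:
    """Write a one-entry dictionary file and export its path as DCMDICTPATH."""
    path = Path(directory) / "private.dic"
    path.write_text("# hypothetical private dictionary\n" + ENTRY, encoding="utf-8")
    os.environ["DCMDICTPATH"] = str(path)
    return path


dict_path = write_private_dict(tempfile.mkdtemp())
```

As with other environment variables, setting it before fw-file is imported is the safer assumption.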
DataElement decoding
DICOMs are often saved with non-standard and/or corrupt data elements. To enable
loading these datasets, fw-file provides fixes for some common problems:
- Fix VM=1 strings that contain \ by replacing it with _ (default: enabled)
- Fix VR for known data elements encoded as explicit UN (default: enabled)
- Extend/improve handling of data elements with a VR mismatch (default: disabled)
These fixes can also be enabled/disabled via environment variables:
FW_DCM_REPLACE_UN_WITH_KNOWN_VR=false
FW_DCM_FIX_VM1_STRINGS=false
FW_DCM_FIX_VR_MISMATCH=true
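The VM=1 fix matters because the backslash is the multi-value delimiter in DICOM string values, so a single-valued string that contains one would otherwise be read as several values. A minimal sketch of the replacement described above follows; the function fix_vm1_string is hypothetical and stands in for whatever fw-file does internally.

```python
def fix_vm1_string(value: str) -> str:
    """Replace the DICOM multi-value delimiter so the string stays one value."""
    return value.replace("\\", "_")


raw = "T1\\AXIAL"                 # a single description containing a backslash
as_multi_value = raw.split("\\")   # how a naive parser would split it
fixed = fix_vm1_string(raw)
```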
To extract as much information from a DICOM as possible, fw-file can be run in
read-only mode. When enabled, invalid values are retained and the VR is set to OB.
As it is not safe to write the DICOM back in this state, saving is disabled. This
mode is disabled by default and can be enabled via an environment variable:
FW_DCM_READ_ONLY=true
Additionally, the validation mode can be set via environment variables. The
default is 1 (WARN); the other options are 2 (RAISE) and 0 (IGNORE).
FW_DCM_READING_VALIDATION_MODE=1
FW_DCM_WRITING_VALIDATION_MODE=1
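The numeric modes can be modelled as a small enum. The sketch below reads one of the two variables with the documented default of 1 (WARN). The ValidationMode class and the validation_mode helper are hypothetical and only mirror the values listed above.

```python
import os
from enum import IntEnum
from typing import Mapping


class ValidationMode(IntEnum):
    IGNORE = 0
    WARN = 1
    RAISE = 2


def validation_mode(var: str, environ: Mapping[str, str] = os.environ) -> ValidationMode:
    """Read a validation mode variable, falling back to WARN when unset."""
    raw = environ.get(var)
    if raw is None:
        return ValidationMode.WARN
    # int() and the enum constructor both raise ValueError on bad input
    return ValidationMode(int(raw))


reading = validation_mode("FW_DCM_READING_VALIDATION_MODE", {})
writing = validation_mode(
    "FW_DCM_WRITING_VALIDATION_MODE", {"FW_DCM_WRITING_VALIDATION_MODE": "2"}
)
```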
EEG
Multiple EEG filetypes are supported including BrainVision, EEGLAB, EDF, and BDF files.
These files are parsed using the MNE-Python library.
BrainVision data must contain both the header file (.vhdr) and the marker file (.vmrk)
in the same directory.
If EEGLAB data is made up of two files (.set and .fdt), these files must be
in the same directory.
A zip archive can also be used to instantiate a fw-file BrainVision or EEGLAB object.
from fw_file.eeg import BrainVision, EEGLAB
bv = BrainVision.from_zip("brainvision.zip")
e = EEGLAB.from_zip("eeglab.zip")
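Before handing an archive to from_zip, it can help to check that the required BrainVision files sit side by side. The sketch below uses only the standard library and assumes the same-directory rule above also applies inside the archive; the helpers make_zip and missing_brainvision_parts are hypothetical.

```python
import io
import zipfile
from pathlib import PurePosixPath

REQUIRED = {".vhdr", ".vmrk"}


def make_zip(names):
    """Build an in-memory zip holding empty files with the given names."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for name in names:
            archive.writestr(name, b"")
    buffer.seek(0)
    return buffer


def missing_brainvision_parts(zip_file) -> set:
    """Return the required suffixes that are absent next to the first header file."""
    with zipfile.ZipFile(zip_file) as archive:
        names = [PurePosixPath(n) for n in archive.namelist() if not n.endswith("/")]
    headers = [n for n in names if n.suffix == ".vhdr"]
    if not headers:
        return set(REQUIRED)
    # Only files in the same directory as the header count
    siblings = {n.suffix for n in names if n.parent == headers[0].parent}
    return REQUIRED - siblings


complete = missing_brainvision_parts(make_zip(["rec/a.vhdr", "rec/a.vmrk", "rec/a.eeg"]))
incomplete = missing_brainvision_parts(make_zip(["rec/a.vhdr", "other/a.vmrk"]))
```

An empty result means both the header and the marker file were found in the same directory of the archive.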
Development
Install the project using poetry and enable pre-commit:
poetry install --extras "all"
pre-commit install
License
