
D-MemFS
An in-process virtual filesystem with hard quota enforcement for Python.
| Metric | Details |
|---|---|
| 🧪 Robustness | 436 tests with 97% code coverage |
| 🔒 Verified Safety | 98, 100×4 — top scores across all security categories (Socket.dev) |
| 🌟 Community | Discussed on r/Python with highly positive reception |
MemoryFileSystem gives you a fully isolated filesystem-like workspace inside a Python process.
- Hard quota (MFSQuotaExceededError) to reject oversized writes before OOM
- Tree operations (import_tree, copy_tree, move)
- Free-threaded Python support (PYTHON_GIL=0), stress-tested under 50-thread contention
- Async wrapper (AsyncMemoryFileSystem) powered by asyncio.to_thread

This is useful when io.BytesIO is too primitive (single buffer) and OS-level RAM disks/tmpfs are impractical (permissions, container policy, Windows driver friction). It is well suited to CI pipeline acceleration: eliminate disk I/O from test suites and data processing without any infrastructure changes.
Note on Architectural Boundary: This is strictly an in-process tool. External subprocesses (CLI tools) cannot access these files via standard OS paths. If your pipeline relies heavily on passing files to external binaries, an OS-level RAM disk (tmpfs) is the correct tool. D-MemFS shines when accelerating Python-native test suites or internal data pipelines.
Extract ZIP/TAR archives directly into D-MemFS using the built-in expand_archive() (atomic, all-or-nothing) or expand_archive_streaming() (low-memory, incremental). Custom archive formats are supported via the pluggable ArchiveAdapter interface. A low-level manual extraction example using open()/write() is also included as a reference for advanced use cases.
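The difference between all-or-nothing and incremental extraction can be illustrated with the standard library alone. The sketch below does not call expand_archive() or expand_archive_streaming(); a plain dict stands in for the filesystem, and `QuotaError`, `extract_atomic`, and `extract_streaming` are hypothetical names used only for this illustration.

```python
import io
import zipfile


class QuotaError(Exception):
    """Hypothetical stand-in for a quota failure."""


def build_zip(entries: dict) -> bytes:
    """Create a small in-memory ZIP archive for the demo."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for name, data in entries.items():
            zf.writestr(name, data)
    return buf.getvalue()


def extract_atomic(archive: bytes, store: dict, quota: int) -> int:
    """Stage every member first; touch `store` only if all of them fit."""
    staged = {}
    used = sum(len(v) for v in store.values())
    with zipfile.ZipFile(io.BytesIO(archive)) as zf:
        for info in zf.infolist():
            data = zf.read(info)
            used += len(data)
            if used > quota:
                raise QuotaError(f"quota exceeded at {info.filename}")
            staged["/" + info.filename] = data
    store.update(staged)  # single commit step
    return len(staged)


def extract_streaming(archive: bytes, store: dict, quota: int) -> int:
    """Write members one by one; earlier members remain if a later one fails."""
    used = sum(len(v) for v in store.values())
    written = 0
    with zipfile.ZipFile(io.BytesIO(archive)) as zf:
        for info in zf.infolist():
            data = zf.read(info)
            if used + len(data) > quota:
                raise QuotaError(f"quota exceeded at {info.filename}")
            store["/" + info.filename] = data
            used += len(data)
            written += 1
    return written


archive = build_zip({"a.txt": b"x" * 10, "b.txt": b"y" * 10, "c.txt": b"z" * 10})

atomic_store = {}
try:
    extract_atomic(archive, atomic_store, quota=25)
except QuotaError:
    pass

streaming_store = {}
try:
    extract_streaming(archive, streaming_store, quota=25)
except QuotaError:
    pass

print(sorted(atomic_store))     # []
print(sorted(streaming_store))  # ['/a.txt', '/b.txt']
```

The atomic variant holds every member in memory until the commit step, which is the price of the all-or-nothing guarantee; the streaming variant keeps peak memory low but can leave a partial tree behind.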
See examples/archive_extraction.md

Speed up your pipeline by running heavy file I/O tests entirely in memory. If a test fails, export the complete virtual filesystem state to a physical directory (export_tree) for easy post-mortem debugging.
See examples/ci_debug_export.md

Eliminate disk I/O bottlenecks in your database test suites. Generate a master SQLite database state once, store it in D-MemFS, and load it instantly for each individual test. Ensure perfect test isolation with zero disk wear and zero cleanup.
See examples/sqlite_test_fixtures.md

Use D-MemFS as a volatile, high-speed staging area for ETL pipelines. It features built-in, thread-safe file locking, ensuring safe concurrent data processing.
See examples/etl_staging_multithread.md

Process massive files chunk-by-chunk using the Memory Guard. It can raise an exception before the host OS hits an Out-Of-Memory (OOM) crash, which is crucial for environments without OS-level RAM disks.
See examples/memory_guard_streaming.md

pip install D-MemFS
Requirements: Python 3.11+
from dmemfs import MemoryFileSystem, MFSQuotaExceededError

mfs = MemoryFileSystem(max_quota=64 * 1024 * 1024)
mfs.mkdir("/data")

with mfs.open("/data/hello.bin", "wb") as f:
    f.write(b"hello")

with mfs.open("/data/hello.bin", "rb") as f:
    print(f.read())  # b"hello"

print(mfs.listdir("/data"))
print(mfs.is_file("/data/hello.bin"))  # True

# Writing past the quota is rejected with MFSQuotaExceededError
try:
    with mfs.open("/huge.bin", "wb") as f:
        f.write(bytes(512 * 1024 * 1024))
except MFSQuotaExceededError as e:
    print(e)
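To make the "reject before storing" behaviour concrete, here is a standard-library sketch of a quota-checked buffer. It is not D-MemFS's implementation: `QuotaBuffer` and `QuotaExceeded` are hypothetical names, and the accounting ignores the per-chunk overhead that the real quota includes.

```python
class QuotaExceeded(Exception):
    """Hypothetical stand-in for a quota error."""


class QuotaBuffer:
    """Append-only buffer that checks the quota before storing anything."""

    def __init__(self, max_quota: int) -> None:
        self.max_quota = max_quota
        self.used = 0
        self._chunks = []

    def write(self, data: bytes) -> int:
        needed = len(data)
        free = self.max_quota - self.used
        if needed > free:
            # Reject first: the buffer is left exactly as it was.
            raise QuotaExceeded(f"need {needed} bytes, {free} free")
        self._chunks.append(bytes(data))
        self.used += needed
        return needed

    def getvalue(self) -> bytes:
        return b"".join(self._chunks)


buf = QuotaBuffer(max_quota=16)
buf.write(b"hello")

try:
    buf.write(b"x" * 32)
except QuotaExceeded as exc:
    print(exc)  # need 32 bytes, 11 free

print(buf.used, buf.getvalue())  # 5 b'hello'
```

Because the check runs before the append, a failed write leaves both the byte counter and the stored data untouched.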
MemoryFileSystem
- open(path, mode, *, preallocate=0, lock_timeout=None)
- mkdir, remove, rmtree, rename, move, copy, copy_tree
- listdir, exists, is_dir, is_file, walk, glob
- stat, stats, get_size
- export_as_bytesio, export_tree, iter_export_tree, import_tree
- expand_archive(mfs, source, dest, *, on_conflict, adapter, adapters): atomic extraction via import_tree()
- expand_archive_streaming(mfs, source, dest, *, on_conflict, adapter, adapters): streaming extraction, returns write count
- ArchiveAdapter: base class for pluggable archive format support (built-in: ZipAdapter, TarAdapter)

Constructor parameters:
- max_quota (default 256 MiB): byte quota for file data
- max_nodes (default None): optional cap on total node count (files + directories). Raises MFSNodeLimitExceededError when exceeded.
- default_storage (default "auto"): storage backend for new files: "auto" / "sequential" / "random_access"
- promotion_hard_limit (default None): byte threshold above which Sequential→RandomAccess auto-promotion is suppressed (None uses the built-in 512 MiB limit)
- chunk_overhead_override (default None): override the per-chunk overhead estimate used for quota accounting
- default_lock_timeout (default 30.0): default timeout in seconds for file-lock acquisition during open(). Use None to wait indefinitely.
- memory_guard (default "none"): physical memory protection mode: "none" / "init" / "per_write"
- memory_guard_action (default "warn"): action when the guard triggers: "warn" (ResourceWarning) / "raise" (MemoryError)
- memory_guard_interval (default 1.0): minimum seconds between OS memory queries ("per_write" only)

Note: The BytesIO returned by export_as_bytesio() is outside quota management. Exporting large files may consume significant process memory beyond the configured quota limit.
Note (quota and free-threaded Python): The per-chunk overhead estimate used for quota accounting is calibrated at import time via sys.getsizeof(). Free-threaded Python (3.13t, PYTHON_GIL=0) has larger object headers than the standard build, so CHUNK_OVERHEAD_ESTIMATE is higher (~117 bytes vs ~93 bytes on CPython 3.13). This means the same max_quota yields slightly less effective storage capacity on free-threaded builds, especially for workloads with many small files or small appends. This is not a bug; it reflects real memory consumption. To ensure consistent behaviour across builds, use chunk_overhead_override to pin the value, or inspect stats()["overhead_per_chunk_estimate"] at runtime.
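A rough back-of-the-envelope calculation shows how the overhead difference plays out for small appends. It assumes, as a simplification that may not match the library's exact accounting, that every append becomes one chunk charged at payload size plus the overhead estimate; `effective_payload` is a hypothetical helper.

```python
def effective_payload(max_quota: int, chunk_size: int, overhead: int) -> int:
    """Payload bytes that fit when each chunk costs chunk_size + overhead."""
    chunks = max_quota // (chunk_size + overhead)
    return chunks * chunk_size


QUOTA = 1024 * 1024   # 1 MiB quota
APPEND = 64           # many small 64-byte appends

standard = effective_payload(QUOTA, APPEND, overhead=93)        # ~93 bytes (CPython 3.13)
free_threaded = effective_payload(QUOTA, APPEND, overhead=117)  # ~117 bytes (3.13t)

print(standard, free_threaded)
print(f"capacity ratio: {free_threaded / standard:.2f}")
```

Under this assumption the gap shrinks as appends grow larger, because the fixed overhead becomes a smaller share of each chunk.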
Supported binary modes: rb, wb, ab, r+b, xb
MFS enforces a logical quota, but that quota can still be configured larger than the
currently available physical RAM. memory_guard provides an optional safety net.
from dmemfs import MemoryFileSystem

# Warn if max_quota exceeds available RAM
mfs = MemoryFileSystem(max_quota=8 * 1024**3, memory_guard="init")

# Raise MemoryError before writes when RAM is insufficient
mfs = MemoryFileSystem(
    max_quota=8 * 1024**3,
    memory_guard="per_write",
    memory_guard_action="raise",
)
| Mode | Initialization | Each Write | Overhead |
|---|---|---|---|
| "none" | - | - | Zero |
| "init" | Check once | - | Negligible |
| "per_write" | Check once | Cached check | About 1 OS call/sec |
When memory_guard_action="warn", the guard emits ResourceWarning and allows the operation to continue.
When memory_guard_action="raise", the guard rejects the operation with MemoryError before the actual allocation path.
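The "cached check" idea behind the per-write mode can be sketched as a rate-limited probe. This is an illustration under stated assumptions, not the library's code: `CachedMemoryGuard` is a hypothetical class, and the `probe` callable stands in for whatever OS query reports available memory.

```python
import time
import warnings
from typing import Callable, Optional


class CachedMemoryGuard:
    """Query available memory at most once per `interval` seconds."""

    def __init__(
        self,
        probe: Callable[[], int],
        interval: float = 1.0,
        action: str = "warn",
    ) -> None:
        self._probe = probe
        self._interval = interval
        self._action = action
        self._last_checked: Optional[float] = None
        self._available = 0
        self.probe_calls = 0  # exposed only to make the caching visible

    def _available_bytes(self) -> int:
        now = time.monotonic()
        stale = (
            self._last_checked is None
            or now - self._last_checked >= self._interval
        )
        if stale:
            self._available = self._probe()
            self._last_checked = now
            self.probe_calls += 1
        return self._available

    def check(self, needed: int) -> None:
        """Warn or raise when `needed` bytes exceed the cached availability."""
        available = self._available_bytes()
        if needed <= available:
            return
        message = f"need {needed} bytes, about {available} available"
        if self._action == "raise":
            raise MemoryError(message)
        warnings.warn(message, ResourceWarning, stacklevel=2)


# A fake probe that always reports 1 KiB free.
guard = CachedMemoryGuard(probe=lambda: 1024, interval=60.0, action="raise")

for _ in range(1000):
    guard.check(100)       # 1000 writes, but only one probe call

print(guard.probe_calls)   # 1

try:
    guard.check(4096)
    rejected = False
except MemoryError:
    rejected = True
print(rejected)            # True
```

The trade-off is staleness: between probes, the guard acts on a value that may be up to `interval` seconds old.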
AsyncMemoryFileSystem accepts the same constructor parameters and forwards them to the synchronous implementation.
MemoryFileHandle
- io.RawIOBase-compatible binary handle
- read, write, seek, tell, truncate, flush, close
- readinto
- readable, writable, seekable

flush() is intentionally a no-op (compatibility API for file-like integrations).

stat() return (MFSStatResult)
- size, created_at, modified_at, generation, is_dir
- For directories: size=0, generation=0, is_dir=True

D-MemFS natively operates in binary mode. For text I/O, use MFSTextHandle:
from dmemfs import MemoryFileSystem, MFSTextHandle

mfs = MemoryFileSystem()
mfs.mkdir("/data")

# Write text
with mfs.open("/data/hello.bin", "wb") as f:
    th = MFSTextHandle(f, encoding="utf-8")
    th.write("こんにちは世界\n")
    th.write("Hello, World!\n")

# Read text line by line
with mfs.open("/data/hello.bin", "rb") as f:
    th = MFSTextHandle(f, encoding="utf-8")
    for line in th:
        print(line, end="")
MFSTextHandle is a thin, bufferless wrapper. It encodes on write() and decodes on read() / readline(). read(size) counts characters, not bytes, so multibyte text can be read safely without splitting code points. Unlike io.TextIOWrapper, it introduces no buffering issues when used with MemoryFileHandle.
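Why counting characters rather than bytes matters can be shown with the standard library's incremental decoder. The helper below is a hypothetical illustration of the concept, not how MFSTextHandle is implemented; it reads one byte at a time for clarity rather than speed.

```python
import codecs
import io


def read_chars(stream: io.BytesIO, size: int, encoding: str = "utf-8") -> str:
    """Read up to `size` characters without splitting a code point."""
    decoder = codecs.getincrementaldecoder(encoding)()
    out = ""
    while len(out) < size:
        byte = stream.read(1)
        if not byte:
            # End of stream: flush the decoder (raises on a truncated sequence).
            out += decoder.decode(b"", final=True)
            break
        # Returns "" until the current code point is complete.
        out += decoder.decode(byte)
    return out


payload = "こんにちは世界\n".encode("utf-8")

raw = io.BytesIO(payload)
first = read_chars(raw, 3)
print(first)       # こんに
print(raw.tell())  # 9 (three characters, three bytes each)

# A plain 3-byte read covers only one of these characters.
print(io.BytesIO(payload).read(3).decode("utf-8"))  # こ
```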
import asyncio

from dmemfs import AsyncMemoryFileSystem


async def run() -> None:
    mfs = AsyncMemoryFileSystem(max_quota=64 * 1024 * 1024)
    await mfs.mkdir("/a")
    async with await mfs.open("/a/f.bin", "wb") as f:
        await f.write(b"data")
    async with await mfs.open("/a/f.bin", "rb") as f:
        print(await f.read())


asyncio.run(run())
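The general pattern of exposing a synchronous, thread-safe object through asyncio.to_thread looks roughly like this. `SyncStore` and `AsyncStore` are hypothetical toy classes written for this sketch; they only illustrate the delegation pattern and say nothing about the internals of AsyncMemoryFileSystem.

```python
import asyncio
import threading


class SyncStore:
    """Tiny thread-safe, blocking key/value store."""

    def __init__(self) -> None:
        self._data = {}
        self._lock = threading.Lock()

    def write(self, path: str, data: bytes) -> int:
        with self._lock:
            self._data[path] = bytes(data)
            return len(data)

    def read(self, path: str) -> bytes:
        with self._lock:
            return self._data[path]


class AsyncStore:
    """Async facade: each call runs the blocking method in a worker thread."""

    def __init__(self) -> None:
        self._sync = SyncStore()

    async def write(self, path: str, data: bytes) -> int:
        return await asyncio.to_thread(self._sync.write, path, data)

    async def read(self, path: str) -> bytes:
        return await asyncio.to_thread(self._sync.read, path)


async def main() -> bytes:
    store = AsyncStore()
    # Several writes run concurrently without blocking the event loop.
    await asyncio.gather(
        *(store.write(f"/f{i}.bin", bytes([i])) for i in range(4))
    )
    return await store.read("/f3.bin")


result = asyncio.run(main())
print(result)  # b'\x03'
```

The event loop stays responsive because the blocking work happens in the default thread pool; the synchronous object still has to be thread-safe on its own.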
- Structural (path and tree) operations are guarded by _global_lock.
- Access to each file is guarded by a ReadWriteLock.
- lock_timeout behavior:
  - None: block indefinitely
  - 0.0: try-lock (fail immediately with BlockingIOError)
  - > 0: timeout in seconds, then BlockingIOError
- ReadWriteLock is non-fair: under sustained read load, writers can starve.

While the core MemoryFileSystem is thread-safe, individual file handles (MemoryFileHandle, MFSTextHandle, AsyncMemoryFileHandle) are not thread-safe when shared concurrently.
- Open a separate handle in each thread (call mfs.open() inside your worker function). Do not pass open handles across thread boundaries.
- Set an explicit lock_timeout in latency-sensitive code paths.
- walk() and glob() provide weak consistency: each directory level is snapshotted under _global_lock, but the overall traversal is NOT atomic. Concurrent structural changes may produce inconsistent results.

Minimal benchmark tooling is included:
- D-MemFS vs io.BytesIO vs PyFilesystem2 (MemoryFS) vs tempfile(RAMDisk) / tempfile(SSD)
- Results: benchmarks/results/

Note: As of setuptools 82 (February 2026), pyfilesystem2 fails to import due to a known upstream issue (#597). Benchmark results including PyFilesystem2 were measured with setuptools ≤ 81 and are valid as historical comparison data.
Run:
# With explicit RAM disk and SSD directories for tempfile comparison:
uvx --with-requirements requirements.txt --with-editable . python benchmarks/compare_backends.py --ramdisk-dir R:\Temp --ssd-dir C:\TempX --save-md auto --save-json auto
See BENCHMARK.md for details.
Latest benchmark snapshot:
Test execution and dev flow are documented in TESTING.md.
Typical local run:
uv pip compile requirements.in -o requirements.txt
uvx --with-requirements requirements.txt --with-editable . pytest tests/ -v --timeout=30 --cov=dmemfs --cov-report=xml --cov-report=term-missing
CI (.github/workflows/test.yml) runs tests with coverage XML generation.
API docs can be generated as Markdown (viewable on GitHub) using pydoc-markdown:
uvx --with pydoc-markdown --with-editable . pydoc-markdown '{
loaders: [{type: python, search_path: [.]}],
processors: [{type: filter, expression: "default()"}],
renderer: {type: markdown, filename: docs/api_md/index.md}
}'
Or as HTML using pdoc (local browsing only):
uvx --with-requirements requirements.txt pdoc dmemfs -o docs/api
- open() is binary-only (rb, wb, ab, r+b, xb). Text I/O is available via the MFSTextHandle wrapper.
- pathlib.PurePath).
- No pathlib.Path / os.PathLike API: MFS paths are virtual and must not be confused with host filesystem paths. Accepting os.PathLike would allow third-party libraries or a plain open() call to silently treat an MFS virtual path as a real OS path, potentially issuing unintended syscalls against the host filesystem. All paths must be plain str with POSIX-style absolute notation (e.g. "/data/file.txt").
- See examples/archive_extraction.md for details.
- Use asyncio.to_thread() in async code.

Auto-promotion behavior:
- By default (default_storage="auto"), new files start as SequentialMemoryFile and auto-promote to RandomAccessMemoryFile when random writes are detected.
- Use default_storage="sequential" or "random_access" to fix the backend at construction; use promotion_hard_limit to suppress auto-promotion above a byte threshold.

Security note: In-memory data may be written to physical disk via OS swap or core dumps. MFS does not provide memory-locking (e.g., mlock) or secure erasure. Do not rely on MFS alone for sensitive data isolation.
| Exception | Typical cause |
|---|---|
| MFSQuotaExceededError | write/import/copy would exceed quota |
| MFSNodeLimitExceededError | node count would exceed max_nodes (subclass of MFSQuotaExceededError) |
| FileNotFoundError | path missing |
| FileExistsError | creation target already exists |
| IsADirectoryError | file operation on directory |
| NotADirectoryError | directory operation on file |
| BlockingIOError | lock timeout or open-file conflict |
| io.UnsupportedOperation | mode mismatch / unsupported operation |
| ValueError | invalid mode/path/seek/truncate arguments |
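The BlockingIOError row ties back to the documented lock_timeout values (None blocks indefinitely, 0.0 is a try-lock, a positive value is a timeout in seconds). A minimal sketch of that mapping follows, using a plain threading.Lock as a stand-in for the library's ReadWriteLock; `acquire_or_raise` is a hypothetical helper.

```python
import threading
from typing import Optional


def acquire_or_raise(lock, lock_timeout: Optional[float]) -> None:
    """Map the three lock_timeout cases onto threading.Lock.acquire()."""
    if lock_timeout is None:
        acquired = lock.acquire()                      # block indefinitely
    elif lock_timeout == 0.0:
        acquired = lock.acquire(blocking=False)        # try-lock
    else:
        acquired = lock.acquire(timeout=lock_timeout)  # bounded wait
    if not acquired:
        raise BlockingIOError(
            f"could not acquire lock (lock_timeout={lock_timeout})"
        )


file_lock = threading.Lock()
file_lock.acquire()  # simulate another handle holding the file

outcomes = []
for timeout in (0.0, 0.05):
    try:
        acquire_or_raise(file_lock, timeout)
        outcomes.append("acquired")
    except BlockingIOError:
        outcomes.append("BlockingIOError")

file_lock.release()
acquire_or_raise(file_lock, None)  # free now, so this returns at once
outcomes.append("acquired")
file_lock.release()

print(outcomes)  # ['BlockingIOError', 'BlockingIOError', 'acquired']
```

Callers that treat BlockingIOError as a retryable condition can pick a short timeout and back off, instead of waiting on the default.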
D-MemFS ships a pytest plugin that provides an mfs fixture:
# conftest.py — register the plugin explicitly
pytest_plugins = ["dmemfs._pytest_plugin"]
Note: The plugin is not auto-discovered. Users must declare it in conftest.py to opt in.

# test_example.py
def test_write_read(mfs):
    mfs.mkdir("/tmp")
    with mfs.open("/tmp/hello.txt", "wb") as f:
        f.write(b"hello")
    with mfs.open("/tmp/hello.txt", "rb") as f:
        assert f.read() == b"hello"
Design documents (Japanese):
These documents are written in Japanese and serve as internal design references.
Key results from the included benchmark (300 small files × 4 KiB, 16 MiB stream, 512 MiB large stream):
| Case | D-MemFS (ms) | BytesIO (ms) | tempfile(RAMDisk) (ms) | tempfile(SSD) (ms) |
|---|---|---|---|---|
| small_files_rw | 51 | 6 | 207 | 267 |
| stream_write_read | 81 | 62 | 20 | 21 |
| random_access_rw | 34 | 82 | 37 | 35 |
| large_stream_write_read | 529 | 2 258 | 514 | 541 |
| many_files_random_read | 1 280 | 212 | 6 310 | 8 601 |
| deep_tree_read | 224 | 3 | 346 | 361 |
D-MemFS is slower than BytesIO on tiny-file workloads (51 ms vs 6 ms for small_files_rw, 224 ms vs 3 ms for deep_tree_read) but considerably faster on large streams (529 ms vs 2 258 ms) and random-access patterns (34 ms vs 82 ms). See BENCHMARK.md and benchmark_current_result.md for full data.
Note: tempfile(RAMDisk) results were measured with the temp directory on a RAM disk; tempfile(SSD) results use a physical SSD. Use the --ramdisk-dir and --ssd-dir options to reproduce both variants in a single run.
If you find D-MemFS useful, consider sponsoring the project.
MIT License