slideflow

Deep learning tools for digital histology

Version 3.0.2 (PyPI)

ArXiv | Docs | Slideflow Studio | Cite | ✨ What's New in 3.0 ✨

Slideflow Studio: a visualization tool for interacting with models and whole-slide images.

Slideflow is a deep learning library for digital pathology, offering a user-friendly interface for model development.

Designed for both medical researchers and AI enthusiasts, the goal of Slideflow is to provide an accessible, easy-to-use interface for developing state-of-the-art pathology models. Slideflow has been built with the future in mind, offering a scalable platform for digital biomarker development that bridges the gap between ever-evolving, sophisticated methods and the needs of a clinical researcher. For developers, Slideflow provides multiple endpoints for integration with other packages and external training paradigms, allowing you to leverage highly optimized, pathology-specific processes with the latest ML methodologies.

🚀 Features

Easy-to-use, highly customizable training pipelines
Robust slide processing and stain normalization toolkit
Support for training with weakly-supervised or strongly-supervised labels
Built-in, state-of-the-art foundation models
Multiple-instance learning (MIL)
Self-supervised learning (SSL)
Generative adversarial networks (GANs)
Explainability tools: Heatmaps, mosaic maps, saliency maps, synthetic histology
Robust layer activation analysis tools
Uncertainty quantification
Interactive user interface for model deployment
... and more!

Full documentation with example tutorials can be found at slideflow.dev.

Requirements

Python >= 3.7 (<3.10 if using cuCIM)
PyTorch >= 1.9 or TensorFlow 2.5-2.11

Optional

Libvips >= 8.9 (alternative slide reader, adds support for *.scn, *.mrxs, *.ndpi, *.vms, and *.vmu files).
Linear solver (for preserved-site cross-validation)
- CPLEX 20.1.0 with Python API
- or Pyomo with Bonmin solver

📥 Installation

Slideflow can be installed from PyPI, run as a Docker container, or run from source.

Method 1: Install via pip

pip3 install --upgrade setuptools pip wheel
pip3 install slideflow[cucim] cupy-cuda11x

The cupy package name depends on the installed CUDA version; see the CuPy installation instructions for details. cupy is not required if using Libvips.
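
As a rough illustration of how the package name follows from the CUDA version, the sketch below maps a CUDA version string to a CuPy wheel name. The helper cupy_package_for is hypothetical (it is not part of Slideflow or CuPy) and covers only the cupy-cuda11x and cupy-cuda12x wheel families. Early 11.x toolkits may need a different wheel, so treat the result as a starting point and confirm it against the CuPy installation guide.

```python
def cupy_package_for(cuda_version: str) -> str:
    """Return a CuPy wheel name for a CUDA version string such as '11.8'.

    Hypothetical helper for illustration; only the 11.x and 12.x wheel
    families are handled.
    """
    major = int(cuda_version.split(".")[0])
    if major == 11:
        return "cupy-cuda11x"
    if major == 12:
        return "cupy-cuda12x"
    raise ValueError(f"No wheel mapping known for CUDA {cuda_version}")


# Print the pip command for an assumed CUDA 11.8 installation.
print(f"pip3 install slideflow[cucim] {cupy_package_for('11.8')}")
```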

Method 2: Docker image

Alternatively, pre-configured Docker images are available with OpenSlide/Libvips and the latest version of either TensorFlow or PyTorch. To install with the TensorFlow backend:

docker pull jamesdolezal/slideflow:latest-tf
docker run -it --gpus all jamesdolezal/slideflow:latest-tf

To install with the PyTorch backend:

docker pull jamesdolezal/slideflow:latest-torch
docker run -it --shm-size=2g --gpus all jamesdolezal/slideflow:latest-torch

Method 3: From source

To run from source, clone this repository, create the conda development environment, and install the package in editable mode:

git clone https://github.com/slideflow/slideflow
conda env create -f slideflow/environment.yml
conda activate slideflow
pip install -e slideflow/ cupy-cuda11x

Non-Commercial Add-ons

To add additional tools and pretrained models available under a non-commercial license, install slideflow-gpl and slideflow-noncommercial:

pip install slideflow-gpl slideflow-noncommercial

This will provide integrated access to 6 additional pretrained foundation models (UNI, HistoSSL, GigaPath, PLIP, RetCCL, and CTransPath), the MIL architecture CLAM, the UQ algorithm BISCUIT, and the GAN framework StyleGAN3.

⚙️ Configuration

Deep learning (PyTorch vs. TensorFlow)

Slideflow supports both PyTorch and TensorFlow, defaulting to PyTorch if both are available. You can specify the backend with the environment variable SF_BACKEND. For example:

export SF_BACKEND=tensorflow

Slide reading (cuCIM vs. Libvips)

By default, Slideflow reads whole-slide images using cuCIM. Although cuCIM is much faster than OpenSlide-based frameworks, it supports fewer slide scanner formats. Slideflow also includes a Libvips backend, which adds support for *.scn, *.mrxs, *.ndpi, *.vms, and *.vmu files. You can set the active slide backend with the environment variable SF_SLIDE_BACKEND:

export SF_SLIDE_BACKEND=libvips
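
Both settings can also be applied from Python. The sketch below assumes the variables are read when slideflow is imported, so it sets them first. The helper configure_slideflow_env and the accepted value sets (torch/tensorflow and cucim/libvips) are assumptions for illustration, not part of Slideflow's API.

```python
import os

# Assumed valid values, based on the backends named in this README.
DL_BACKENDS = {"torch", "tensorflow"}
SLIDE_BACKENDS = {"cucim", "libvips"}


def configure_slideflow_env(backend: str, slide_backend: str) -> None:
    """Set SF_BACKEND and SF_SLIDE_BACKEND for the current process."""
    if backend not in DL_BACKENDS:
        raise ValueError(f"Unknown deep learning backend: {backend!r}")
    if slide_backend not in SLIDE_BACKENDS:
        raise ValueError(f"Unknown slide backend: {slide_backend!r}")
    os.environ["SF_BACKEND"] = backend
    os.environ["SF_SLIDE_BACKEND"] = slide_backend


configure_slideflow_env("tensorflow", "libvips")
# import slideflow as sf  # import only after the variables are set
```

Validating the values up front means a typo raises an error immediately rather than being passed on to the library.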

Getting started

Slideflow experiments are organized into Projects, which supervise storage of whole-slide images, extracted tiles, and patient-level annotations. The fastest way to get started is to use one of our preconfigured projects, which will automatically download slides from the Genomic Data Commons:

import slideflow as sf

P = sf.create_project(
    root='/project/destination',
    cfg=sf.project.LungAdenoSquam(),
    download=True
)

After the slides have been downloaded and verified, you can skip to Extract tiles from slides.

Alternatively, to create a new custom project, supply the location of patient-level annotations (CSV), slides, and a destination for TFRecords to be saved:

import slideflow as sf
P = sf.create_project(
  '/project/path',
  annotations="/patient/annotations.csv",
  slides="/slides/directory",
  tfrecords="/tfrecords/directory"
)

Ensure that the annotations file has a slide column for each annotation entry with the filename (without extension) of the corresponding slide.
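
A minimal sketch of such an annotations file, written with Python's csv module. The patient column name, the example slide paths, and the labels are assumptions for illustration. Only the slide column (filename without extension) and the category1 outcome come from this README.

```python
import csv
import io
from pathlib import Path

# Hypothetical slides mapped to (patient, label), for illustration only.
slides = {
    "/slides/directory/patient1_a.svs": ("patient1", "A"),
    "/slides/directory/patient2_a.svs": ("patient2", "B"),
}

buffer = io.StringIO()
writer = csv.writer(buffer, lineterminator="\n")
writer.writerow(["patient", "slide", "category1"])
for path, (patient, label) in slides.items():
    # The slide column holds the filename without its extension.
    writer.writerow([patient, Path(path).stem, label])

annotations_csv = buffer.getvalue()
print(annotations_csv)
```

Writing the same rows to a file on disk instead of a StringIO buffer produces a CSV that could be passed as the annotations argument.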

Extract tiles from slides

Next, whole-slide images are segmented into smaller image tiles and saved in *.tfrecords format. Extract tiles from slides at a given magnification (tile width in microns) and resolution (tile width in pixels) using sf.Project.extract_tiles():

P.extract_tiles(
  tile_px=299,  # Tile size, in pixels
  tile_um=302   # Tile size, in microns
)
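
Together, the two arguments fix the effective resolution of each tile. As a worked example, a 299 px tile spanning 302 microns corresponds to about 1 micron per pixel, or roughly 10x. This is plain arithmetic rather than a Slideflow function, and the 10 / mpp magnification estimate is a rule of thumb assumed here.

```python
def microns_per_pixel(tile_px: int, tile_um: float) -> float:
    """Effective resolution of an extracted tile, in microns per pixel."""
    return tile_um / tile_px


def approx_magnification(mpp: float) -> float:
    """Rule-of-thumb objective power, assuming 10x is about 1 micron/pixel."""
    return 10.0 / mpp


mpp = microns_per_pixel(tile_px=299, tile_um=302)
print(f"{mpp:.2f} um/px, ~{approx_magnification(mpp):.0f}x")
```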

If slides are on a network drive or a spinning HDD, tile extraction can be accelerated by buffering slides to an SSD or ramdisk:

P.extract_tiles(
  ...,
  buffer="/mnt/ramdisk"
)

Training models

Once tiles are extracted, models can be trained. Start by configuring a set of hyperparameters:

params = sf.ModelParams(
  tile_px=299,
  tile_um=302,
  batch_size=32,
  model='xception',
  learning_rate=0.0001,
  # ... additional hyperparameters
)

Models can then be trained using these parameters to predict categorical, multi-categorical, continuous, or time-series outcomes, and the training process is highly configurable. For example, to train models in cross-validation to predict the outcome 'category1' as stored in the project annotations file:

P.train(
  'category1',
  params=params,
  save_predictions=True,
  multi_gpu=True
)
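
Cross-validation for histology models is usually split at the patient level, so that slides from one patient never appear in both the training and validation sets. The sketch below illustrates that idea with the standard library only. It is not Slideflow's implementation, and the function name, slide names, and fold count are hypothetical.

```python
import random
from collections import defaultdict


def patient_level_folds(slide_to_patient, k=3, seed=0):
    """Assign slides to k folds so all slides of a patient share one fold."""
    patients = sorted(set(slide_to_patient.values()))
    random.Random(seed).shuffle(patients)  # reproducible shuffle
    fold_of_patient = {p: i % k for i, p in enumerate(patients)}
    folds = defaultdict(list)
    for slide, patient in sorted(slide_to_patient.items()):
        folds[fold_of_patient[patient]].append(slide)
    return [folds[i] for i in range(k)]


# Hypothetical slide-to-patient mapping, for illustration only.
example = {
    "slide1": "patientA",
    "slide2": "patientA",
    "slide3": "patientB",
    "slide4": "patientC",
}
for i, fold in enumerate(patient_level_folds(example, k=3)):
    print(f"fold {i}: {fold}")
```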

Evaluation, heatmaps, mosaic maps, and more

Slideflow includes a host of additional tools, including model evaluation and prediction, heatmaps, analysis of layer activations, mosaic maps, and more. See our full documentation for more details and tutorials.

📚 Publications

Slideflow has been used by:

Dolezal et al., Modern Pathology, 2020
Rosenberg et al., Journal of Clinical Oncology [abstract], 2020
Howard et al., Nature Communications, 2021
Dolezal et al., Nature Communications, 2022
Storozuk et al., Modern Pathology [abstract], 2022
Partin et al., Front Med, 2022
Dolezal et al., Journal of Clinical Oncology [abstract], 2022
Dolezal et al., Mediastinum [abstract], 2022
Howard et al., npj Breast Cancer, 2023
Dolezal et al., npj Precision Oncology, 2023
Hieromnimon et al., bioRxiv, 2023
Carrillo-Perez et al., Cancer Imaging, 2023

🔓 License

This code is made available under the Apache-2.0 license.

🔗 Reference

If you find our work useful for your research, or if you use parts of this code, please consider citing as follows:

Dolezal, J.M., Kochanny, S., Dyer, E. et al. Slideflow: deep learning for digital histopathology with real-time whole-slide visualization. BMC Bioinformatics 25, 134 (2024). https://doi.org/10.1186/s12859-024-05758-x

@Article{Dolezal2024,
    author={Dolezal, James M. and Kochanny, Sara and Dyer, Emma and Ramesh, Siddhi and Srisuwananukorn, Andrew and Sacco, Matteo and Howard, Frederick M. and Li, Anran and Mohan, Prajval and Pearson, Alexander T.},
    title={Slideflow: deep learning for digital histopathology with real-time whole-slide visualization},
    journal={BMC Bioinformatics},
    year={2024},
    month={Mar},
    day={27},
    volume={25},
    number={1},
    pages={134},
    doi={10.1186/s12859-024-05758-x},
    url={https://doi.org/10.1186/s12859-024-05758-x}
}
