Grain - Feeding JAX Models

Installation | Quickstart | Reference docs | Change logs

Grain is a Python library for reading and processing data for training and
evaluating JAX models. It is flexible, fast and deterministic.
Grain lets you define data processing steps in a simple, declarative way:
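import grain

dataset = (
    grain.MapDataset.source([0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
    .shuffle(seed=42)  # Shuffle all elements with a fixed seed.
    .map(lambda x: x + 1)  # Apply a function to each element.
    .batch(batch_size=2)  # Group consecutive elements into batches.
)

for batch in dataset:
    print(batch)  # Replace with a training step.
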
Grain is designed to work with JAX models but it does not require JAX to run
and can be used with other frameworks as well.
Installation
Grain is available on PyPI and can be
installed with pip install grain.
Supported platforms
Grain does not directly use GPUs or TPUs in its transformations. By default,
all processing within Grain runs on the CPU.
|         | Linux | Mac | Windows |
|---------|-------|-----|---------|
| x86_64  | yes   | no  | yes     |
| aarch64 | yes   | yes | n/a     |
Quickstart
Citing Grain
To cite this repository:
@software{grain2023github,
  author = {Marvin Ritter and Ihor Indyk and Aayush Singh and Andrew Audibert and Anoosha Seelam and Camelia Hanes and Eric Lau and Jacek Olesiak and Jiyang Kang and Xihui Wu},
  title = {{Grain} - Feeding JAX Models},
  url = {http://github.com/google/grain},
  version = {0.2.12},
  year = {2023},
}
The version number is intended to be that from pyproject.toml, and the year corresponds to the project's open-source release.
Existing users
Grain is used by MaxText, Gemma, kauldron, maxdiffusion and multiple internal
Google projects.