
spacy-transformers: Use pretrained transformers like BERT, XLNet and GPT-2 in spaCy
This package provides spaCy components and architectures to use transformer models via Hugging Face's `transformers` in spaCy. The result is convenient access to state-of-the-art transformer architectures, such as BERT, GPT-2, XLNet, etc.
This release requires spaCy v3. For the previous version of this library, see the `v0.6.x` branch.

Features
- Use pretrained transformer models like BERT, RoBERTa and XLNet to power your spaCy pipeline (see the sketch after this list).
- Easy multi-task learning: backprop to one transformer model from several
pipeline components.
- Train using spaCy v3's powerful and extensible config system.
- Automatic alignment of transformer output to spaCy's tokenization.
- Easily customize what transformer data is saved in the `Doc` object.
- Easily customize how long documents are processed.
- Out-of-the-box serialization and model packaging.
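To get a feel for how these pieces fit together, here is a minimal sketch that runs a transformer-powered pipeline and inspects the data stored on the `Doc`. It assumes a pipeline trained with this package, such as `en_core_web_trf`, has already been installed (e.g. via `python -m spacy download en_core_web_trf`); the example text and printed fields are illustrative.

```python
import spacy

# Assumes a spacy-transformers-based pipeline is installed, e.g.:
#   python -m spacy download en_core_web_trf
nlp = spacy.load("en_core_web_trf")
doc = nlp("Apple is opening a new office in London.")

# Downstream components such as ner are powered by the shared transformer.
print([(ent.text, ent.label_) for ent in doc.ents])

# By default, the transformer component stores its output on a Doc extension.
trf_data = doc._.trf_data
print(type(trf_data).__name__)              # TransformerData
print([t.shape for t in trf_data.tensors])  # raw transformer output arrays
```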
🚀 Installation
Installing the package from pip will automatically install all dependencies,
including PyTorch and spaCy. Make sure you install this package before you
install the models. Also note that this package requires Python 3.6+,
PyTorch v1.5+ and spaCy v3.0+.
```bash
pip install 'spacy[transformers]'
```
For GPU installation, find your CUDA version using `nvcc --version` and add the version in brackets, e.g. `spacy[transformers,cuda92]` for CUDA 9.2 or `spacy[transformers,cuda100]` for CUDA 10.0.
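Once the GPU extras are installed, you can ask spaCy to allocate models on the GPU before loading a pipeline. A minimal sketch (the printed message is illustrative):

```python
import spacy

# Call this before loading any pipeline. It returns True and routes model
# allocation to the GPU if one is available, otherwise it falls back to CPU.
is_using_gpu = spacy.prefer_gpu()
print("Using GPU:", is_using_gpu)
```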
If you are having trouble installing PyTorch, follow the
instructions on the official website
for your specific operating system and requirements.
📖 Documentation
⚠️ Important note: This package has been extensively refactored to take
advantage of spaCy v3.0. Previous versions that were built
for spaCy v2.x worked considerably differently. Please
see previous tagged versions of this README for documentation on prior
versions.
Applying pretrained text and token classification models
Note that the `transformer` component from `spacy-transformers` does not support task-specific heads like token or text classification. A task-specific transformer model can be used as a source of features to train spaCy components like `ner` or `textcat`, but the `transformer` component does not provide access to task-specific heads for training or inference.
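To illustrate the feature-source pattern, here is a minimal sketch that wires an `ner` component to a shared `transformer` component via the listener architecture. This is normally expressed in a spaCy v3 training config and trained with `spacy train`; the architecture names and settings below mirror the documented defaults, and assembling the pipeline by hand like this is only meant to show how the pieces connect.

```python
import spacy

nlp = spacy.blank("en")

# One shared, feature-producing transformer component (no task-specific head).
# With the default settings this uses a pretrained model from Hugging Face.
nlp.add_pipe("transformer")

# The ner component sources its token representations from the transformer via
# a listener layer; the override is merged with the factory's default config.
nlp.add_pipe(
    "ner",
    config={
        "model": {
            "tok2vec": {
                "@architectures": "spacy-transformers.TransformerListener.v1",
                "grad_factor": 1.0,
                "upstream": "*",  # listen to any transformer component
                "pooling": {"@layers": "reduce_mean.v1"},
            }
        }
    },
)

# Training would then be done with `spacy train` and a config file.
```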
Alternatively, if you only want to use the predictions from an existing Hugging Face text or token classification model, you can use the wrappers from spacy-huggingface-pipelines to incorporate task-specific transformer models into your spaCy pipelines.
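For example, a sketch of wrapping an existing Hugging Face text classification model with spacy-huggingface-pipelines (the `hf_text_pipe` component name and `model` setting follow that package's documentation; the model name is only an illustrative choice):

```python
import spacy

nlp = spacy.blank("en")

# Wrap a pretrained Hugging Face text classification model as a spaCy component.
# The factory is registered once spacy-huggingface-pipelines is installed.
nlp.add_pipe(
    "hf_text_pipe",
    config={"model": "distilbert-base-uncased-finetuned-sst-2-english"},
)

doc = nlp("This package makes pipelines easy to build!")
print(doc.cats)  # e.g. {"POSITIVE": 0.99, "NEGATIVE": 0.01}
```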
Bug reports and other issues
Please use spaCy's issue tracker to
report a bug, or open a new thread on the
discussion board for any other
issue.