🚀 Speed Benchmark
- Llama2 70B Training Speed
🎉 News
📖 Introduction
XTuner is an efficient, flexible and full-featured toolkit for fine-tuning large models.
Efficient
- Support LLM, VLM pre-training / fine-tuning on almost all GPUs. XTuner is capable of fine-tuning 7B LLM on a single 8GB GPU, as well as multi-node fine-tuning of models exceeding 70B.
- Automatically dispatch high-performance operators such as FlashAttention and Triton kernels to increase training throughput.
- Compatible with DeepSpeed 🚀, easily utilizing a variety of ZeRO optimization techniques.
Flexible
- Support various LLMs (InternLM, Mixtral-8x7B, Llama 2, ChatGLM, Qwen, Baichuan, ...).
- Support VLM (LLaVA). The performance of LLaVA-InternLM2-20B is outstanding.
- Well-designed data pipeline, accommodating datasets in any format, including but not limited to open-source and custom formats.
- Support various training algorithms (QLoRA, LoRA, full-parameter fune-tune), allowing users to choose the most suitable solution for their requirements.
Full-featured
- Support continuous pre-training, instruction fine-tuning, and agent fine-tuning.
- Support chatting with large models with pre-defined templates.
- The output models can seamlessly integrate with deployment and server toolkit (LMDeploy), and large-scale evaluation toolkit (OpenCompass, VLMEvalKit).
🔥 Supports
Models
|
SFT Datasets
|
Data Pipelines
|
Algorithms
|
|
|
|
|
🛠️ Quick Start
Installation
-
It is recommended to build a Python-3.10 virtual environment using conda
conda create --name xtuner-env python=3.10 -y
conda activate xtuner-env
-
Install XTuner via pip
pip install -U xtuner
or with DeepSpeed integration
pip install -U 'xtuner[deepspeed]'
-
Install XTuner from source
git clone https://github.com/InternLM/xtuner.git
cd xtuner
pip install -e '.[all]'
Fine-tune
XTuner supports the efficient fine-tune (e.g., QLoRA) for LLMs. Dataset prepare guides can be found on dataset_prepare.md.
-
Step 0, prepare the config. XTuner provides many ready-to-use configs and we can view all configs by
xtuner list-cfg
Or, if the provided configs cannot meet the requirements, please copy the provided config to the specified directory and make specific modifications by
xtuner copy-cfg ${CONFIG_NAME} ${SAVE_PATH}
vi ${SAVE_PATH}/${CONFIG_NAME}_copy.py
-
Step 1, start fine-tuning.
xtuner train ${CONFIG_NAME_OR_PATH}
For example, we can start the QLoRA fine-tuning of InternLM2.5-Chat-7B with oasst1 dataset by
# On a single GPU
xtuner train internlm2_5_chat_7b_qlora_oasst1_e3 --deepspeed deepspeed_zero2
# On multiple GPUs
(DIST) NPROC_PER_NODE=${GPU_NUM} xtuner train internlm2_5_chat_7b_qlora_oasst1_e3 --deepspeed deepspeed_zero2
(SLURM) srun ${SRUN_ARGS} xtuner train internlm2_5_chat_7b_qlora_oasst1_e3 --launcher slurm --deepspeed deepspeed_zero2
-
--deepspeed
means using DeepSpeed 🚀 to optimize the training. XTuner comes with several integrated strategies including ZeRO-1, ZeRO-2, and ZeRO-3. If you wish to disable this feature, simply remove this argument.
-
For more examples, please see finetune.md.
-
Step 2, convert the saved PTH model (if using DeepSpeed, it will be a directory) to Hugging Face model, by
xtuner convert pth_to_hf ${CONFIG_NAME_OR_PATH} ${PTH} ${SAVE_PATH}
Chat
XTuner provides tools to chat with pretrained / fine-tuned LLMs.
xtuner chat ${NAME_OR_PATH_TO_LLM} --adapter {NAME_OR_PATH_TO_ADAPTER} [optional arguments]
For example, we can start the chat with InternLM2.5-Chat-7B :
xtuner chat internlm/internlm2_5-chat-7b --prompt-template internlm2_chat
For more examples, please see chat.md.
Deployment
-
Step 0, merge the Hugging Face adapter to pretrained LLM, by
xtuner convert merge \
${NAME_OR_PATH_TO_LLM} \
${NAME_OR_PATH_TO_ADAPTER} \
${SAVE_PATH} \
--max-shard-size 2GB
-
Step 1, deploy fine-tuned LLM with any other framework, such as LMDeploy 🚀.
pip install lmdeploy
python -m lmdeploy.pytorch.chat ${NAME_OR_PATH_TO_LLM} \
--max_new_tokens 256 \
--temperture 0.8 \
--top_p 0.95 \
--seed 0
🔥 Seeking efficient inference with less GPU memory? Try 4-bit quantization from LMDeploy! For more details, see here.
Evaluation
- We recommend using OpenCompass, a comprehensive and systematic LLM evaluation library, which currently supports 50+ datasets with about 300,000 questions.
🤝 Contributing
We appreciate all contributions to XTuner. Please refer to CONTRIBUTING.md for the contributing guideline.
🎖️ Acknowledgement
🖊️ Citation
@misc{2023xtuner,
title={XTuner: A Toolkit for Efficiently Fine-tuning LLM},
author={XTuner Contributors},
howpublished = {\url{https://github.com/InternLM/xtuner}},
year={2023}
}
License
This project is released under the Apache License 2.0. Please also adhere to the Licenses of models and datasets being used.