[!IMPORTANT]
The lm-buddy repo is being archived and its functionality is being folded into Lumigator. For more on the context and decisions behind this, please read here.
LM Buddy is a collection of jobs for finetuning and evaluating open-source (large) language models. The library makes use of YAML-based configuration files as inputs to CLI commands for each job, and tracks input/output artifacts on Weights & Biases.
The package currently exposes two types of jobs: finetuning and evaluation.
LM Buddy is available on PyPI and can be installed as follows:
pip install lm-buddy
LM Buddy is intended to be used in production on a Ray cluster (see the section below on Ray job submission). Currently, we are utilizing Ray clusters running Python 3.11.9. To avoid dependency or syntax errors when executing LM Buddy on Ray, installation of this package requires a Python version in the range [3.11, 3.12).
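The version constraint above can be checked before installing the package or submitting a job. This is a minimal sketch; the function name `python_version_supported` is illustrative and not part of LM Buddy.

```python
import sys

# Supported range stated above: 3.11 <= version < 3.12
MIN_VERSION = (3, 11)
MAX_VERSION_EXCLUSIVE = (3, 12)


def python_version_supported(version=None):
    """Return True if the (major, minor) version falls in [3.11, 3.12)."""
    if version is None:
        version = sys.version_info
    major_minor = tuple(version[:2])
    return MIN_VERSION <= major_minor < MAX_VERSION_EXCLUSIVE


if __name__ == "__main__":
    current = ".".join(str(part) for part in sys.version_info[:3])
    if python_version_supported():
        print(f"Python {current} is within the supported range.")
    else:
        print(f"Python {current} is outside [3.11, 3.12); installation may fail.")
```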
LM Buddy exposes a CLI with a few commands, one for each type of job.
You can explore the CLI options by running lm-buddy --help.
Once LM Buddy is installed in your local Python environment, usage is as follows:
# LLM finetuning
lm_buddy finetune --config finetuning_config.yaml
# LLM evaluation
lm_buddy evaluate lm-harness --config lm_harness_config.yaml
lm_buddy evaluate prometheus --config prometheus_config.yaml
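When these commands are launched from a script (for example, as a Ray entrypoint string), it can help to assemble them programmatically. The sketch below covers only the three invocations shown above; `build_lm_buddy_command` is a hypothetical helper, not part of the LM Buddy API.

```python
import shlex

# Evaluation subcommands that appear in the usage examples above.
KNOWN_EVALUATORS = ("lm-harness", "prometheus")


def build_lm_buddy_command(job, config_path, evaluator=None):
    """Return the CLI invocation as a single shell-safe string."""
    if job == "finetune":
        if evaluator is not None:
            raise ValueError("finetune takes no evaluator argument")
        args = ["lm_buddy", "finetune"]
    elif job == "evaluate":
        if evaluator not in KNOWN_EVALUATORS:
            raise ValueError(f"evaluator must be one of {KNOWN_EVALUATORS}")
        args = ["lm_buddy", "evaluate", evaluator]
    else:
        raise ValueError(f"unknown job type: {job!r}")
    args += ["--config", config_path]
    # shlex.join quotes any argument containing spaces or shell metacharacters.
    return shlex.join(args)


if __name__ == "__main__":
    print(build_lm_buddy_command("finetune", "finetuning_config.yaml"))
    print(build_lm_buddy_command("evaluate", "lm_harness_config.yaml", evaluator="lm-harness"))
```

Returning a single string rather than a list matches the form that the Ray `entrypoint` argument expects.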
See the examples/configs folder for examples of the job configuration structure.
For a full end-to-end interactive workflow for using the package, see the example notebooks.
Although the LM Buddy CLI can be used as a standalone tool, its commands are intended to be used as the entrypoints for jobs on a Ray compute cluster. The suggested method for submitting an LM Buddy job to Ray is by using the Ray Python SDK within a local Python driver script. This requires you to specify a Ray runtime environment containing:
- working_dir: the local directory containing your job config YAML file
- pip: a dependency on your desired version of lm-buddy

Additionally, if your job requires GPU resources on the Ray entrypoint worker (e.g., for loading large/quantized models), you should specify the entrypoint_num_gpus parameter upon submission.
An example of the submission process is as follows:
from ray.job_submission import JobSubmissionClient

# If using a remote cluster, replace 127.0.0.1 with the head node's IP address.
client = JobSubmissionClient("http://127.0.0.1:8265")

runtime_env = {
    "working_dir": "/path/to/working/directory",
    "pip": ["lm-buddy==X.X.X"],
}

# Assuming 'config.yaml' is present in the working directory
client.submit_job(
    entrypoint="lm_buddy finetune <job-name> --config config.yaml",
    runtime_env=runtime_env,
    entrypoint_num_gpus=1,
)
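`submit_job` returns a job ID, which can be used to wait for the job to finish. The sketch below polls `get_job_status` (a method of Ray's `JobSubmissionClient`) until a terminal state is reached. `wait_for_job` and its parameters are hypothetical names, and the status strings are assumed to match the values of Ray's `JobStatus` enum.

```python
import time

# Terminal states, assumed to match the values of Ray's JobStatus enum.
TERMINAL_STATES = {"SUCCEEDED", "FAILED", "STOPPED"}


def wait_for_job(client, job_id, poll_seconds=5.0, timeout_seconds=3600.0):
    """Poll client.get_job_status(job_id) until the job reaches a terminal state.

    `client` is expected to be a ray.job_submission.JobSubmissionClient,
    but any object with a compatible get_job_status method works.
    """
    deadline = time.monotonic() + timeout_seconds
    while True:
        status = client.get_job_status(job_id)
        # JobStatus is an enum; plain strings are accepted as well.
        name = str(getattr(status, "value", status))
        if name in TERMINAL_STATES:
            return name
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job {job_id} still {name} after {timeout_seconds}s")
        time.sleep(poll_seconds)
```

For example, capture the ID with `job_id = client.submit_job(...)` and then call `wait_for_job(client, job_id)`; the job's output can afterwards be fetched with `client.get_job_logs(job_id)`.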
See the examples/ folder for more examples of submitting Ray jobs.
See the contributing guide for more information on development workflows and/or building locally.