icalfa

A fork of the InterCode benchmark used to evaluate natural language to Bash command translation.

  • 0.3.5
  • PyPI
InterCode-ALFA

Description

A fork of the InterCode benchmark used to evaluate natural language to Bash command translation.
Dataset
PyPI Package

InterCode-ALFA Diagram

Installation

  • Install Docker Engine (follow the official installation instructions)
  • Configure Docker for non-sudo users (follow Docker's post-installation instructions)
  • Create python virtual environment
sudo apt install python3.12-venv
python3 -m venv icalfa-venv
source icalfa-venv/bin/activate
  • Install InterCode-ALFA
pip install icalfa
  • [Optional] If you want to use a local LLM, install Ollama
curl -fsSL https://ollama.com/install.sh | sh
ollama pull llama3.1:70b
  • [Optional] If you want to use the embedding comparison method, install mxbai-embed-large
ollama pull mxbai-embed-large

Usage

  • Run the benchmark
import os
from icalfa import submit_command
from datasets import load_dataset

# Store OpenAI key as environment variable 
os.environ['ICALFA_OPENAI_API_KEY'] = '...'

# Load dataset
dataset = load_dataset("westenfelder/InterCode-ALFA-Data")['train']

# Iterate through the dataset
score = 0
for index, row in enumerate(dataset):

    # Retrieve natural language prompt
    prompt = row['query']

    # Convert natural language prompt to Bash command here

    # Submit Bash command for benchmark scoring. 0 = incorrect, 1 = correct
    score += submit_command(index=index, command="...")

    # Retrieve ground truth commands
    ground_truth_command = row['gold']
    ground_truth_command2 = row['gold2']

# Print the benchmark result
print(score/len(dataset))
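The loop above leaves the translation step open ("Convert natural language prompt to Bash command here"). One minimal sketch of that step, assuming the optional local Ollama install from the installation section; the prompt wording and helper names (`build_prompt`, `extract_command`, `translate`) are illustrative and not part of icalfa:

```python
import json
import urllib.request

def build_prompt(query: str) -> str:
    """Wrap the natural language query in a translation instruction (illustrative wording)."""
    return (
        "Translate the following request into a single Bash command. "
        "Reply with only the command, no explanation.\n"
        f"Request: {query}"
    )

def extract_command(reply: str) -> str:
    """Return the first non-empty line of the model's reply, skipping markdown fences."""
    for line in reply.splitlines():
        line = line.strip()
        if not line or line.startswith("```"):
            continue
        return line.strip("`")
    return ""

def translate(query: str, model: str = "llama3.1:70b") -> str:
    """Query a local Ollama server's /api/generate endpoint for a Bash command."""
    payload = json.dumps(
        {"model": model, "prompt": build_prompt(query), "stream": False}
    ).encode()
    req = urllib.request.Request(
        "http://localhost:11434/api/generate",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return extract_command(json.loads(resp.read())["response"])
```

Inside the benchmark loop this would replace the placeholder: `score += submit_command(index=index, command=translate(prompt))`.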
  • submit_command parameters
# By default icalfa uses OpenAI's GPT-4 model and expects an API key
submit_command(index, command, eval_mode="openai", eval_param="gpt-4-0613")

# A local model can be used via Ollama
submit_command(index, command, eval_mode="ollama", eval_param="llama3.1:70b")

# You can also test the original method used in Princeton's InterCode benchmark
submit_command(index, command, eval_mode="tfidf")

# An embedding based comparison method is also available
# This uses the mxbai-embed-large model via Ollama, with the eval_param specifying the similarity threshold
submit_command(index, command, eval_mode="embed", eval_param=0.75)
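In embed mode, `eval_param` is a similarity threshold, so the choice of threshold trades false positives for false negatives. A small sketch for sweeping thresholds over a set of predicted commands; the scorer is passed in as a callable so any of the `submit_command` variants above can be wired in (the wrapper names here are illustrative):

```python
from typing import Callable, Iterable

def sweep_thresholds(
    score_fn: Callable[[int, str, float], int],
    commands: Iterable[str],
    thresholds: Iterable[float],
) -> dict[float, float]:
    """Return the mean benchmark score for each similarity threshold.

    score_fn mirrors submit_command(index, command, eval_param=threshold)
    in embed mode and returns 0 (incorrect) or 1 (correct).
    """
    commands = list(commands)
    results = {}
    for t in thresholds:
        total = sum(score_fn(i, cmd, t) for i, cmd in enumerate(commands))
        results[t] = total / len(commands)
    return results

# Hypothetical wiring to icalfa:
# from icalfa import submit_command
# scorer = lambda i, c, t: submit_command(index=i, command=c,
#                                         eval_mode="embed", eval_param=t)
# print(sweep_thresholds(scorer, predicted_commands, [0.6, 0.7, 0.75, 0.8]))
```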
  • Manage Docker containers
# Stop containers
docker stop $(docker ps -a --filter "name=intercode*" -q)

# Delete containers
docker rm $(docker ps -a --filter "name=intercode*" -q)
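The same cleanup can be scripted from Python, which is convenient at the end of a benchmark run. A sketch that shells out to the Docker CLI with the same name filter as the commands above (the helper names are illustrative):

```python
import subprocess

def parse_ids(stdout: str) -> list[str]:
    """Split `docker ps -q` output into a list of container IDs."""
    return stdout.split()

def cleanup(name_filter: str = "intercode*") -> None:
    """Stop and remove all containers matching the name filter (requires the Docker CLI)."""
    out = subprocess.run(
        ["docker", "ps", "-a", "--filter", f"name={name_filter}", "-q"],
        capture_output=True, text=True, check=True,
    )
    ids = parse_ids(out.stdout)
    if ids:
        subprocess.run(["docker", "stop", *ids], check=True)
        subprocess.run(["docker", "rm", *ids], check=True)
```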

Building

# update version in pyproject.toml and __init__.py
rm -rf dist
python3 -m build
python3 -m twine upload --repository pypi dist/*
pip install --upgrade icalfa

Credits

InterCode-ALFA is a fork of the InterCode benchmark developed by the Princeton NLP group.
InterCode Website
InterCode PyPI Package
