icalfa

A fork of the InterCode benchmark used to evaluate natural language to Bash command translation.

  • 0.3.5
  • PyPI
InterCode-ALFA

Description

A fork of the InterCode benchmark used to evaluate natural language to Bash command translation.
Dataset
PyPI Package

InterCode-ALFA Diagram

Installation

  • Install Docker Engine (follow the official installation instructions)
  • Configure Docker for non-sudo users (follow Docker's post-installation instructions)
  • Create python virtual environment
sudo apt install python3.12-venv
python3 -m venv icalfa-venv
source icalfa-venv/bin/activate
  • Install InterCode-ALFA
pip install icalfa
  • [Optional] If you want to use a local LLM, install Ollama
curl -fsSL https://ollama.com/install.sh | sh
ollama pull llama3.1:70b
  • [Optional] If you want to use the embedding comparison method, install mxbai-embed-large
ollama pull mxbai-embed-large

Usage

  • Run the benchmark
import os
from icalfa import submit_command
from datasets import load_dataset

# Store OpenAI key as environment variable 
os.environ['ICALFA_OPENAI_API_KEY'] = '...'

# Load dataset
dataset = load_dataset("westenfelder/InterCode-ALFA-Data")['train']

# Iterate through the dataset
score = 0
for index, row in enumerate(dataset):

    # Retrieve natural language prompt
    prompt = row['query']

    # Convert natural language prompt to Bash command here

    # Submit Bash command for benchmark scoring. 0 = incorrect, 1 = correct
    score += submit_command(index=index, command="...")

    # Retrieve ground truth commands
    ground_truth_command = row['gold']
    ground_truth_command2 = row['gold2']

# Print the benchmark result
print(score/len(dataset))
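The loop above leaves the translation step open ("Convert natural language prompt to Bash command here"). One minimal sketch of that step, assuming the optional local Ollama install from the installation section; the prompt wording and helper names (`build_prompt`, `extract_command`, `translate`) are illustrative and not part of icalfa:

```python
import json
import urllib.request

def build_prompt(query: str) -> str:
    """Wrap the natural language query in a translation instruction (illustrative wording)."""
    return (
        "Translate the following request into a single Bash command. "
        "Reply with only the command, no explanation.\n"
        f"Request: {query}"
    )

def extract_command(reply: str) -> str:
    """Return the first non-empty line of the model's reply, skipping markdown fences."""
    for line in reply.splitlines():
        line = line.strip()
        if not line or line.startswith("```"):
            continue
        return line.strip("`")
    return ""

def translate(query: str, model: str = "llama3.1:70b") -> str:
    """Query a local Ollama server's /api/generate endpoint for a Bash command."""
    payload = json.dumps(
        {"model": model, "prompt": build_prompt(query), "stream": False}
    ).encode()
    req = urllib.request.Request(
        "http://localhost:11434/api/generate",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return extract_command(json.loads(resp.read())["response"])
```

Inside the benchmark loop this would replace the placeholder: `score += submit_command(index=index, command=translate(prompt))`.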
  • submit_command parameters
# By default icalfa uses OpenAI's GPT-4 model and expects an API key
submit_command(index, command, eval_mode="openai", eval_param="gpt-4-0613")

# A local model can be used via Ollama
submit_command(index, command, eval_mode="ollama", eval_param="llama3.1:70b")

# You can also test the original method used in Princeton's InterCode benchmark
submit_command(index, command, eval_mode="tfidf")

# An embedding based comparison method is also available
# This uses the mxbai-embed-large model via Ollama, with the eval_param specifying the similarity threshold
submit_command(index, command, eval_mode="embed", eval_param=0.75)
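In embed mode, `eval_param` is a similarity threshold, so the choice of threshold trades false positives for false negatives. A small sketch for sweeping thresholds over a set of predicted commands; the scorer is passed in as a callable so any of the `submit_command` variants above can be wired in (the wrapper names here are illustrative):

```python
from typing import Callable, Iterable

def sweep_thresholds(
    score_fn: Callable[[int, str, float], int],
    commands: Iterable[str],
    thresholds: Iterable[float],
) -> dict[float, float]:
    """Return the mean benchmark score for each similarity threshold.

    score_fn mirrors submit_command(index, command, eval_param=threshold)
    in embed mode and returns 0 (incorrect) or 1 (correct).
    """
    commands = list(commands)
    results = {}
    for t in thresholds:
        total = sum(score_fn(i, cmd, t) for i, cmd in enumerate(commands))
        results[t] = total / len(commands)
    return results

# Hypothetical wiring to icalfa:
# from icalfa import submit_command
# scorer = lambda i, c, t: submit_command(index=i, command=c,
#                                         eval_mode="embed", eval_param=t)
# print(sweep_thresholds(scorer, predicted_commands, [0.6, 0.7, 0.75, 0.8]))
```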
  • Manage Docker containers
# Stop containers
docker stop $(docker ps -a --filter "name=intercode*" -q)

# Delete containers
docker rm $(docker ps -a --filter "name=intercode*" -q)
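The same cleanup can be scripted from Python, which is convenient at the end of a benchmark run. A sketch that shells out to the Docker CLI with the same name filter as the commands above (the helper names are illustrative):

```python
import subprocess

def parse_ids(stdout: str) -> list[str]:
    """Split `docker ps -q` output into a list of container IDs."""
    return stdout.split()

def cleanup(name_filter: str = "intercode*") -> None:
    """Stop and remove all containers matching the name filter (requires the Docker CLI)."""
    out = subprocess.run(
        ["docker", "ps", "-a", "--filter", f"name={name_filter}", "-q"],
        capture_output=True, text=True, check=True,
    )
    ids = parse_ids(out.stdout)
    if ids:
        subprocess.run(["docker", "stop", *ids], check=True)
        subprocess.run(["docker", "rm", *ids], check=True)
```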

Building

# update version in pyproject.toml and __init__.py
rm -rf dist
python3 -m build
python3 -m twine upload --repository pypi dist/*
pip install --upgrade icalfa

Credits

InterCode-ALFA is a fork of the InterCode benchmark developed by the Princeton NLP group.
InterCode Website
InterCode PyPI Package
