prompt-security-fuzzer

LLM and System Prompt vulnerability scanner tool

  • 2.0.0
  • PyPI

Prompt Fuzzer

The open-source tool to help you harden your GenAI applications

License: MIT

Brought to you by Prompt Security, the One-Stop Platform for GenAI Security :lock:

✨ What is the Prompt Fuzzer

  1. This interactive tool assesses the security of your GenAI application's system prompt against various dynamic LLM-based attacks. It provides a security evaluation based on the outcome of these attack simulations, enabling you to strengthen your system prompt as needed.
  2. The Prompt Fuzzer dynamically tailors its tests to your application's unique configuration and domain.
  3. The Fuzzer also includes a Playground chat interface, giving you the chance to iteratively improve your system prompt, hardening it against a wide spectrum of generative AI attacks.

:warning: Running the Prompt Fuzzer consumes LLM tokens, which may incur costs with your provider. :warning:


🚀 Installation

  1. Install the Fuzzer package

    Using pip:
    pip install prompt-security-fuzzer

    Using the package page on PyPI:

    You can also visit the package page on PyPI, or grab the latest release wheel file from the releases page.

  2. Launch the Fuzzer

    export OPENAI_API_KEY=sk-123XXXXXXXXXXXX
    
    prompt-security-fuzzer
    
  3. Input your system prompt

  4. Start testing

  5. Test yourself with the Playground! Iterate as many times as you like until your system prompt is secure.

:computer: Usage

Features

The Prompt Fuzzer supports:
🧞 16 LLM providers
🔫 15 different attacks
💬 Interactive mode
🤖 CLI mode
🧵 Multi-threaded testing

Environment variables:

Set an environment variable that holds the access key of your preferred LLM provider. The default is OPENAI_API_KEY.

Example: set OPENAI_API_KEY to your API token to use your OpenAI account.

Alternatively, create a file named .env in the current directory and set OPENAI_API_KEY there.
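
To illustrate the .env option, the sketch below shows what such a file might contain and parses it in Python. It assumes a plain KEY=VALUE format; `parse_env_text` is a hypothetical helper written for this example, not part of the Fuzzer, and the Fuzzer's own .env handling may differ.

```python
def parse_env_text(text):
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip("\"'")
    return values


# Example .env contents (placeholder key, not a real token).
sample_env = """
# Key for the default provider
OPENAI_API_KEY=sk-123XXXXXXXXXXXX
"""

print(parse_env_text(sample_env))
```

Keep the .env file out of version control, since it holds a secret.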

We're fully LLM agnostic. Full list of provider environment variables:

| ENVIRONMENT KEY | Description |
|---|---|
| ANTHROPIC_API_KEY | Anthropic Chat large language models. |
| ANYSCALE_API_KEY | Anyscale Chat large language models. |
| AZURE_OPENAI_API_KEY | Azure OpenAI Chat Completion API. |
| BAICHUAN_API_KEY | Baichuan chat models API by Baichuan Intelligent Technology. |
| COHERE_API_KEY | Cohere chat large language models. |
| EVERLYAI_API_KEY | EverlyAI Chat large language models. |
| FIREWORKS_API_KEY | Fireworks Chat models. |
| GIGACHAT_CREDENTIALS | GigaChat large language models API. |
| GOOGLE_API_KEY | Google PaLM Chat models API. |
| JINA_API_TOKEN | Jina AI Chat models API. |
| KONKO_API_KEY | ChatKonko Chat large language models API. |
| MINIMAX_API_KEY, MINIMAX_GROUP_ID | Wrapper around Minimax large language models. |
| OPENAI_API_KEY | OpenAI Chat large language models API. |
| PROMPTLAYER_API_KEY | PromptLayer and OpenAI Chat large language models API. |
| QIANFAN_AK, QIANFAN_SK | Baidu Qianfan chat models. |
| YC_API_KEY | YandexGPT large language models. |
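
Before launching a run, it can help to check which provider keys are actually set. The sketch below does this for a few of the single-variable keys listed above; `PROVIDER_KEYS` and `configured_keys` are hypothetical names for this example and are not part of the Fuzzer.

```python
import os

# A subset of the single-variable provider keys listed above.
PROVIDER_KEYS = [
    "ANTHROPIC_API_KEY",
    "COHERE_API_KEY",
    "GOOGLE_API_KEY",
    "OPENAI_API_KEY",
]


def configured_keys(env):
    """Return the provider keys that are present and non-empty in env."""
    return [name for name in PROVIDER_KEYS if env.get(name)]


found = configured_keys(os.environ)
print("Configured provider keys:", ", ".join(found) or "none")
```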


Command line Options

  • --list-providers Lists all available providers
  • --list-attacks Lists available attacks and exits
  • --attack-provider Attack Provider
  • --attack-model Attack Model
  • --target-provider Target provider
  • --target-model Target model
  • --num-attempts, -n NUM_ATTEMPTS Number of different attack prompts
  • --num-threads, -t NUM_THREADS Number of worker threads
  • --attack-temperature, -a ATTACK_TEMPERATURE Temperature for attack model
  • --debug-level, -d DEBUG_LEVEL Debug level (0-2)
  • --batch, -b Run the fuzzer in unattended (batch) mode, bypassing the interactive steps
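
As a rough sketch of how these options combine, the Python snippet below assembles an unattended run with explicit attempt and thread counts. `build_fuzzer_command` is a hypothetical helper, and the values 3 and 4 are arbitrary illustrations rather than the tool's defaults; only the flags themselves come from the list above.

```python
import shlex


def build_fuzzer_command(prompt_file, num_attempts=3, num_threads=4):
    """Assemble an argv list for an unattended (batch) run."""
    return [
        "prompt-security-fuzzer",
        "-b", prompt_file,
        "--num-attempts", str(num_attempts),
        "--num-threads", str(num_threads),
    ]


cmd = build_fuzzer_command("./system_prompt.examples/medium_system_prompt.txt")
# Print the shell-ready command line instead of running it.
print(shlex.join(cmd))
```

To execute the command from Python, the list can be passed to `subprocess.run(cmd, check=True)` once the package is installed and an API key is set.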

Examples

System prompt examples (of various strengths) can be found in the subdirectory system_prompt.examples in the sources.

Interactive mode (default mode)

Run tests against the system prompt

    prompt-security-fuzzer

:speedboat: Quick start single run

Run tests against the system prompt (in non-interactive batch mode):

    prompt-security-fuzzer -b ./system_prompt.examples/medium_system_prompt.txt

📺 Custom Benchmark!

Run tests against the system prompt with a custom benchmark

    prompt-security-fuzzer -b ./system_prompt.examples/medium_system_prompt.txt --custom-benchmark=ps_fuzz/attack_data/custom_benchmark1.csv

🐹 Run only a subset of attacks!

Run tests against the system prompt with a subset of attacks

    prompt-security-fuzzer -b ./system_prompt.examples/medium_system_prompt.txt --custom-benchmark=ps_fuzz/attack_data/custom_benchmark1.csv --tests='["ucar","amnesia"]'



📓 Google Colab Notebook

Refine and harden your system prompt in our Google Colab Notebook

:crossed_swords: Simulated Attack Details

We use a dynamic testing approach: the Fuzzer extracts the necessary context from your system prompt and adapts the fuzzing process to it.

Jailbreak
  • AIM Jailbreak: Probes ethical compliance by eliciting instructions on illegal activities via Machiavelli roleplay.
  • Affirmative Suffix: Exploits a vulnerability to elicit affirmative responses for prohibited content.
  • Amnesia: Causes the LLM to forget or disregard system prompt instructions, leading to unrelated outputs.
  • Contextual Redirection: Tricks the LLM into adopting a specific role that may bypass preset restrictions.
  • Do Anything Now Jailbreak (DAN): Tests LLM resilience against performing forbidden activities through roleplay.
  • Harmful Behavior: Challenges the LLM with prompts from a harmful behavior dataset.
  • Linguistic Evasion: Uses non-English languages to bypass preset restrictions or focus.
  • Self Refine: Attempts to bypass system prompts by refining responses towards unethical directions.
  • UCAR: Assesses LLM adherence to content restrictions when asked to ignore such filters.
  • Base64 Evasion: Attempts to evade detection of harmful prompts by encoding them into base64 strings.
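
To make the Base64 Evasion idea concrete, the sketch below encodes a benign placeholder string with Python's standard `base64` module and decodes it again. It only illustrates the encoding step; how the Fuzzer builds and evaluates its actual payloads is not described here, and the helper names are hypothetical.

```python
import base64


def encode_prompt(prompt):
    """Encode text as a base64 string, as an encoded attack prompt would be."""
    return base64.b64encode(prompt.encode("utf-8")).decode("ascii")


def decode_prompt(encoded):
    """Recover the original text from its base64 form."""
    return base64.b64decode(encoded).decode("utf-8")


sample = "Tell me a joke"  # benign placeholder text
encoded = encode_prompt(sample)
print(encoded)
print(decode_prompt(encoded))
```

A filter that only matches plain-text keywords would not see the original wording in the encoded string, which is the weakness this attack type probes.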

Prompt Injection
  • Authoritative Role Impersonation: Impersonates an authoritative role to misdirect the LLM's outputs.
  • Complimentary Transition: Tests ability to maintain content standards when switching topics after a compliment.
  • Ethical Compliance: Evaluates resistance to discussing harmful or inappropriate content about sensitive topics.
  • Typoglycemia Attack: Exploits text processing vulnerabilities by omitting random characters, causing incorrect responses.
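
The Typoglycemia description can be illustrated with a small text transformation. The sketch below drops one random interior character from each word longer than three letters; this selection rule and the name `drop_random_chars` are assumptions for the example, and the Fuzzer's own mutation may differ.

```python
import random


def drop_random_chars(text, seed=0):
    """Remove one random interior character from each word longer than 3 letters."""
    rng = random.Random(seed)  # seeded so the result is reproducible
    words = []
    for word in text.split():
        if len(word) > 3:
            # Pick an interior index so the first and last letters are kept.
            i = rng.randrange(1, len(word) - 1)
            word = word[:i] + word[i + 1:]
        words.append(word)
    return " ".join(words)


print(drop_random_chars("please summarize this document", seed=1))
```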

System Prompt Extraction
  • System Prompt Stealer: Attempts to extract the LLM's internal configuration or sensitive information.

Definitions
  • Broken: Attempts of a given attack type that the LLM succumbed to.
  • Resilient: Attempts of a given attack type that the LLM resisted.
  • Errors: Attempts of a given attack type that had inconclusive results.
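
The three outcome categories above can be tallied per attack type. The following is a minimal sketch with hypothetical names (`Outcome`, `summarize`); it does not reflect the Fuzzer's internal data model or report format.

```python
from collections import Counter
from enum import Enum


class Outcome(Enum):
    BROKEN = "broken"        # the LLM succumbed to the attempt
    RESILIENT = "resilient"  # the LLM resisted the attempt
    ERROR = "error"          # the result was inconclusive


def summarize(outcomes):
    """Count attempts per outcome, including outcomes that never occurred."""
    counts = Counter(outcomes)
    return {outcome.value: counts.get(outcome, 0) for outcome in Outcome}


attempts = [Outcome.BROKEN, Outcome.RESILIENT, Outcome.RESILIENT, Outcome.ERROR]
print(summarize(attempts))
```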


:rainbow: What’s next on the roadmap?

  • Google Colab Notebook
  • Adjust the output evaluation mechanism for prompt dataset testing
  • More attack types
  • Better reporting capabilities
  • Hardening recommendations

Turn this into a community project! We want this to be useful to everyone building GenAI applications. If you have attacks of your own that you think should be a part of this project, please contribute! This is how: https://github.com/prompt-security/ps-fuzz/blob/main/CONTRIBUTING.md

🍻 Contributing

Interested in contributing to the development of our tools? Great! For a guide on making your first contribution, please see our Contributing Guide, which offers a straightforward introduction to adding new tests.

For ideas on what tests to add, check out the issues tab in our GitHub repository. Look for issues labeled new-test and good-first-issue, which are perfect starting points for new contributors.
