LLMDet is a text detection tool that identifies the source of a given text (e.g. a specific large language model or a human writer). The core idea of the detection algorithm is to use n-gram probabilities sampled from a specified language model to compute a proxy perplexity for that model, and then to use the proxy perplexities as features for training a text classifier.
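To make the idea concrete, here is a minimal sketch of a proxy perplexity computed from an n-gram table. It is not LLMDet's implementation: the function `proxy_perplexity`, the table layout, the `floor` fallback for unseen n-grams, and the toy probabilities are all assumptions made for illustration.

```python
import math


def proxy_perplexity(tokens, ngram_probs, n=2, floor=1e-6):
    """Perplexity of `tokens` under a toy n-gram table (hypothetical helper).

    `ngram_probs` maps a context tuple of n-1 tokens to a dict of
    {next_token: probability}. Unseen n-grams fall back to `floor`.
    """
    log_prob_sum = 0.0
    count = 0
    for i in range(n - 1, len(tokens)):
        context = tuple(tokens[i - n + 1:i])
        p = ngram_probs.get(context, {}).get(tokens[i], floor)
        log_prob_sum += math.log(p)
        count += 1
    if count == 0:
        raise ValueError("need at least n tokens")
    # Perplexity is the exponential of the average negative log-probability.
    return math.exp(-log_prob_sum / count)


# Made-up bigram tables standing in for probabilities sampled from two models.
toy_tables = {
    "model_a": {("the",): {"cat": 0.5, "dog": 0.5}, ("cat",): {"sat": 0.5}},
    "model_b": {("the",): {"cat": 0.125}, ("cat",): {"sat": 0.125}},
}

tokens = ["the", "cat", "sat"]

# One proxy perplexity per candidate model forms the feature vector
# that a downstream classifier would be trained on.
features = [proxy_perplexity(tokens, toy_tables[name]) for name in sorted(toy_tables)]
print(features)  # approximately [2.0, 8.0]
```

A lower proxy perplexity means the text looks more probable under that model's n-gram statistics, which is why the vector of per-model values can serve as classifier input.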
We believe that a practical LLM detection tool needs the following capabilities, which are also the goals of LLMDet.
A package for detecting text generated by large language models.
The code is compatible with Python >= 3.8.
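Install with pip:
pip install llmdet
Or install from a source checkout:
python setup.py install
See requirements.txt for the dependent Python packages.
Currently, LLMDet can determine whether a text comes from GPT-2, OPT, UniLM, or a human writer. The sample output in this section also lists scores for T5, Bloom, GPT-neo, LLaMA, and BART.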
import llmdet
llmdet.load_probability()
text = "The actress was honoured for her role in 'The Reader' at the annual ceremony, which was held at the Royal Albert Hall. The film, which is based on the novel by the same name by Philip Roth, tells the story of a New York Times reporter who returns to his hometown to cover the death of his brother-in-law. Winslet plays his wife, with whom he has been divided since the death of their son.\nIn the film, Winslet plays the mother of the grieving brother-in-law.\nThe actress also won a Golden Globe for her role in the film at the ceremony in November.\nWinslet was also nominated for an Oscar for her role in 'The Reader'.\nThe 63-year-old Winslet was seen accepting her awards at the ceremony, where she was joined by her husband, John Krasinski, who has been nominated for best supporting actor in the film.\nWinslet and Krasinski met while"
# Detect the source; `text` may be a single string or a list of strings
result = llmdet.detect(text)
print(result)
[{
    'OPT': 0.5451331013247862,
    'GPT-2': 0.4393605735865629,
    'UniLM': 0.012642800848279893,
    'T5': 0.0022592730436008556,
    'Bloom': 0.00025873253035729044,
    'GPT-neo': 0.0002520776780109571,
    'LLaMA': 6.0459794454546154e-05,
    'Human_write': 1.9576671778802474e-05,
    'BART': 1.3404522168622544e-05
}]
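The output above is a list with one dictionary of label scores per input text. As a small sketch of how such a result might be consumed, the snippet below picks the highest-scoring label for each text. It assumes the result always has the shape shown in the sample output; `top_source` is a hypothetical helper and not part of llmdet, and the `result` literal is copied from the sample so the snippet runs without the package installed.

```python
def top_source(scores):
    """Return the (label, score) pair with the highest score (hypothetical helper)."""
    return max(scores.items(), key=lambda item: item[1])


# Copied from the sample output; in practice this would come from llmdet.detect(text).
result = [{
    'OPT': 0.5451331013247862,
    'GPT-2': 0.4393605735865629,
    'UniLM': 0.012642800848279893,
    'T5': 0.0022592730436008556,
    'Bloom': 0.00025873253035729044,
    'GPT-neo': 0.0002520776780109571,
    'LLaMA': 6.0459794454546154e-05,
    'Human_write': 1.9576671778802474e-05,
    'BART': 1.3404522168622544e-05,
}]

for scores in result:
    label, score = top_source(scores)
    print(f"{label}: {score:.3f}")  # prints "OPT: 0.545"
```

In this sample the top two labels (OPT and GPT-2) have close scores, so it may be worth inspecting the full dictionary rather than relying on the top label alone.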