Security News
Research
Data Theft Repackaged: A Case Study in Malicious Wrapper Packages on npm
The Socket Research Team breaks down a malicious wrapper package that uses obfuscation to harvest credentials and exfiltrate sensitive data.
Argilla is a collaboration tool for AI engineers and domain experts who need to build high-quality datasets for their projects. If you just want to get started, we recommend our UI demo or our free Hugging Face Spaces deployment integration. Curious, and want to know more? Read our documentation.
Whether you are working on monitoring and improving complex generative tasks involving LLM pipelines with RAG, or you are working on a predictive task for things like AB-testing of span- and text-classification models. Our versatile platform helps you ensure your data work pays off.
Compute is expensive and output quality is important. We help you focus on data, which tackles the root cause of both of these problems at once. Argilla helps you to achieve and keep high-quality standards for your data. This means you can improve the quality of your AI output.
Most AI tools are black boxes. Argilla is different. We believe that you should be the owner of both your data and your models. That's why we provide you with all the tools your team needs to manage your data and models in a way that suits you best.
Gathering data is a time-consuming process. Argilla helps by providing a platform that allows you to interact with your data in a more engaging way. This means you can quickly and easily label your data with filters, AI feedback suggestions and semantic search. So you can focus on training your models and monitoring their performance.
We are an open-source community-driven project and we love to hear from you. Here are some ways to get involved:
Community Meetup: listen in or present during one of our bi-weekly events.
Discord: get direct support from the community in #argilla-distilabel-general and #argilla-distilabel-help.
Roadmap: plans change but we love to discuss those with our community so feel encouraged to participate.
The community uses Argilla to create amazing open-source datasets and models.
AI teams from companies like the Red Cross, Loris.ai and Prolific use Argilla to improve the quality and efficiency of AI projects. They shared their experiences in our AI community meetup.
First things first! You can install the SDK with pip as follows:
pip install argilla
After that, you will need to deploy Argilla Server. The easiest way to do this is through our free Hugging Face Spaces deployment integration.
To use the client, you need to import the Argilla
class and instantiate it with the API URL and API key.
import argilla as rg
client = rg.Argilla(api_url="https://[your-owner-name]-[your_space_name].hf.space", api_key="owner.apikey")
We can now create a dataset with a simple text classification task. First, you need to define the dataset settings.
settings = rg.Settings(
guidelines="Classify the reviews as positive or negative.",
fields=[
rg.TextField(
name="review",
title="Text from the review",
use_markdown=False,
),
],
questions=[
rg.LabelQuestion(
name="my_label",
title="In which category does this article fit?",
labels=["positive", "negative"],
)
],
)
dataset = rg.Dataset(
name=f"my_first_dataset",
settings=settings,
client=client,
)
dataset.create()
Next, we can add records to the dataset.
pip install datasets
from datasets import load_dataset
data = load_dataset("imdb", split="train[:100]").to_list()
dataset.records.log(records=data, mapping={"text": "review"})
🎉 You have successfully created your first dataset with Argilla. You can now access it in the Argilla UI and start annotating the records. Need more info, check out our docs.
To help our community with the creation of contributions, we have created our community docs. Additionally, you can always schedule a meeting with our Developer Advocacy team so they can get you up to speed.
FAQs
The Argilla python server SDK
We found that argilla demonstrated a healthy version release cadence and project activity because the last version was released less than a year ago. It has 1 open source maintainer collaborating on the project.
Did you know?
Socket for GitHub automatically highlights issues in each pull request and monitors the health of all your open source dependencies. Discover the contents of your packages and block harmful activity before you install or update your dependencies.
Security News
Research
The Socket Research Team breaks down a malicious wrapper package that uses obfuscation to harvest credentials and exfiltrate sensitive data.
Research
Security News
Attackers used a malicious npm package typosquatting a popular ESLint plugin to steal sensitive data, execute commands, and exploit developer systems.
Security News
The Ultralytics' PyPI Package was compromised four times in one weekend through GitHub Actions cache poisoning and failure to rotate previously compromised API tokens.