crowd-kit

Computational Quality Control for Crowdsourcing

  • 1.4.1
  • PyPI

Crowd-Kit: Computational Quality Control for Crowdsourcing

Crowd-Kit is a powerful Python library that implements commonly-used aggregation methods for crowdsourced annotation and offers the relevant metrics and datasets. We strive to implement functionality that simplifies working with crowdsourced data.

Currently, Crowd-Kit contains:

  • implementations of commonly-used aggregation methods for categorical, pairwise, textual, and segmentation responses;
  • metrics of uncertainty, consistency, and agreement with aggregate;
  • loaders for popular crowdsourced datasets.

Also, the learning subpackage contains PyTorch implementations of methods for deep learning from crowds and of advanced aggregation algorithms.

Installing

To install Crowd-Kit, run the following command: pip install crowd-kit. If you also want to use the learning subpackage, run pip install 'crowd-kit[learning]' (the quotes keep shells such as zsh from interpreting the square brackets).

If you are interested in contributing to Crowd-Kit, use uv to manage the dependencies:

uv venv
uv pip install -e '.[dev,docs,learning]'
uv tool run pre-commit install

We use pytest for testing and a variety of code-quality tools, including pre-commit, Black, isort, Flake8, pyupgrade, and nbQA, to simplify code maintenance.

Getting Started

This example shows how to use Crowd-Kit for categorical aggregation using the classical Dawid-Skene algorithm.

First, let us do all the necessary imports.

from crowdkit.aggregation import DawidSkene
from crowdkit.datasets import load_dataset

import pandas as pd

Then, read your annotations into a pandas DataFrame with the columns task, worker, and label. Alternatively, you can download an example dataset:

df = pd.read_csv('results.csv')  # should contain columns: task, worker, label
# df, ground_truth = load_dataset('relevance-2')  # or download an example dataset
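
As a rough illustration of the expected layout, the sketch below builds a small long-format DataFrame by hand, with one row per worker response, and checks that the three required columns are present. The task, worker, and label values and the check_annotation_columns helper are made up for this example; they are not part of Crowd-Kit.

```python
import pandas as pd

REQUIRED_COLUMNS = ["task", "worker", "label"]


def check_annotation_columns(df: pd.DataFrame) -> None:
    """Hypothetical helper: raise ValueError if a required column is missing."""
    missing = [column for column in REQUIRED_COLUMNS if column not in df.columns]
    if missing:
        raise ValueError(f"missing columns: {missing}")


# Illustrative data: each row is one worker's answer to one task.
df = pd.DataFrame(
    [
        ("t1", "w1", "relevant"),
        ("t1", "w2", "relevant"),
        ("t1", "w3", "irrelevant"),
        ("t2", "w1", "irrelevant"),
        ("t2", "w2", "irrelevant"),
        ("t2", "w3", "relevant"),
    ],
    columns=REQUIRED_COLUMNS,
)

check_annotation_columns(df)  # passes silently when all columns are present
print(df.groupby("task")["label"].count())  # number of responses per task
```

A file read with pd.read_csv can be passed through the same check before aggregation, which tends to surface a misnamed column earlier than a failure inside the aggregator would.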

Then, you can aggregate the workers' responses using the fit_predict method, which follows the scikit-learn naming convention:

aggregated_labels = DawidSkene(n_iter=100).fit_predict(df)
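
To make the idea of categorical aggregation concrete, here is a minimal pure-Python sketch of the simplest baseline, majority vote, followed by each worker's agreement with the aggregate. This is not the Dawid-Skene algorithm and not Crowd-Kit's implementation; the data, the function names, and the tie-breaking rule are assumptions made for illustration only.

```python
from collections import Counter, defaultdict

# Illustrative responses as (task, worker, label) tuples.
responses = [
    ("t1", "w1", "cat"), ("t1", "w2", "cat"), ("t1", "w3", "dog"),
    ("t2", "w1", "dog"), ("t2", "w2", "dog"), ("t2", "w3", "dog"),
    ("t3", "w1", "cat"), ("t3", "w2", "dog"), ("t3", "w3", "dog"),
]


def majority_vote(responses):
    """Return the most frequent label for each task."""
    counts = defaultdict(Counter)
    for task, _worker, label in responses:
        counts[task][label] += 1
    # Assumed tie-breaking rule: the lexicographically smallest label wins,
    # because max() returns the first maximal item of the sorted labels.
    return {task: max(sorted(c), key=c.get) for task, c in counts.items()}


def agreement_with_aggregate(responses, aggregated):
    """Return, per worker, the share of responses matching the aggregate."""
    hits, totals = Counter(), Counter()
    for task, worker, label in responses:
        totals[worker] += 1
        hits[worker] += int(label == aggregated[task])
    return {worker: hits[worker] / totals[worker] for worker in totals}


aggregated = majority_vote(responses)
agreement = agreement_with_aggregate(responses, aggregated)
print(aggregated)
print(agreement)
```

Majority vote weights every worker equally, whereas methods such as Dawid-Skene also estimate how reliable each worker is, so their results can differ when workers vary in quality.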

More usage examples

Implemented Aggregation Methods

Below is the list of currently implemented methods, each marked as either already available (✅) or in progress (🟡).

Categorical Responses

  • Majority Vote
  • One-coin Dawid-Skene
  • Dawid-Skene
  • Gold Majority Vote
  • M-MSR
  • Wawa
  • Zero-Based Skill
  • GLAD
  • KOS
  • MACE

Multi-Label Responses

  • Binary Relevance

Textual Responses

  • RASA
  • HRRASA
  • ROVER

Image Segmentation

  • Segmentation MV
  • Segmentation RASA
  • Segmentation EM

Pairwise Comparisons

  • Bradley-Terry
  • Noisy Bradley-Terry

Learning from Crowds

  • CrowdLayer
  • CoNAL

Citation

@article{CrowdKit,
  author    = {Ustalov, Dmitry and Pavlichenko, Nikita and Tseitlin, Boris},
  title     = {{Learning from Crowds with Crowd-Kit}},
  year      = {2024},
  journal   = {Journal of Open Source Software},
  volume    = {9},
  number    = {96},
  pages     = {6227},
  publisher = {The Open Journal},
  doi       = {10.21105/joss.06227},
  issn      = {2475-9066},
  eprint    = {2109.08584},
  eprinttype = {arxiv},
  eprintclass = {cs.HC},
  language  = {english},
}

Support and Contributions

Please use GitHub Issues to seek support and submit feature requests. We accept contributions to Crowd-Kit via GitHub, following the guidelines in CONTRIBUTING.md.

License

© Crowd-Kit team authors, 2020–2024. Licensed under the Apache License, Version 2.0. See LICENSE file for more details.
