dataset-with-logits

PyTorch datasets with pre-computed model logits for efficient research

0.2.9

PyPI

Maintainers: 1

Dataset with Logits

A PyTorch package for loading computer vision datasets paired with pre-computed model logits. Perfect for knowledge distillation, model analysis, and efficient research workflows.

🚀 Quick Start

pip install dataset-with-logits

import torchvision.transforms as transforms
from dataset_with_logits import ImageNet

# Define transforms
transform = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
])

# Create dataset (auto-downloads predictions)
dataset = ImageNet(
    root='/path/to/imagenet/val',
    model='resnet18',
    transform=transform,
    auto_download=True
)

# Use with DataLoader
from torch.utils.data import DataLoader
loader = DataLoader(dataset, batch_size=32, shuffle=True)

for images, labels, logits in loader:
    # images: [batch_size, 3, 224, 224] 
    # labels: [batch_size] - ground truth
    # logits: [batch_size, 1000] - model predictions
    break

📊 Available Models

ImageNet-1K

resnet18 - ResNet-18 (11.7M parameters)
resnet50 - ResNet-50 (25.6M parameters)
resnet152 - ResNet-152 (60.2M parameters)
vit_l_16 - Vision Transformer Large (304M parameters)
mobilenet_v3_small - MobileNet V3 Small (2.5M parameters)
mobilenet_v3_large - MobileNet V3 Large (5.5M parameters)

More models and datasets coming soon!

🎯 Use Cases

Knowledge Distillation

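import torch.nn.functional as F

def knowledge_distillation_loss(student_logits, teacher_logits, labels,
                                temperature=3.0, alpha=0.5):
    # Soft-target term: KL divergence between temperature-softened distributions,
    # scaled by T^2 so its gradients stay comparable across temperatures
    student_soft = F.log_softmax(student_logits / temperature, dim=1)
    teacher_soft = F.softmax(teacher_logits / temperature, dim=1)
    soft_loss = F.kl_div(student_soft, teacher_soft, reduction='batchmean') * temperature ** 2
    # Hard-target term: cross-entropy against the ground-truth labels
    hard_loss = F.cross_entropy(student_logits, labels)
    # alpha weights the soft-target term against the hard-target term
    return alpha * soft_loss + (1 - alpha) * hard_loss

# In your training loop (student_model and dataloader are defined elsewhere)
for images, labels, teacher_logits in dataloader:
    student_logits = student_model(images)
    loss = knowledge_distillation_loss(student_logits, teacher_logits, labels)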
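
The `temperature` argument divides the logits before the softmax, which flattens the teacher distribution so that the student also learns from the relative scores of the non-top classes. The following is a minimal, dependency-free sketch of that effect; the three-class logits are made-up values for illustration, not real model outputs.

```python
import math

def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of floats, with temperature scaling."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max to avoid overflow in exp()
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Made-up teacher logits for a three-class problem (illustration only)
teacher_logits = [4.0, 1.0, 0.5]

sharp = softmax(teacher_logits, temperature=1.0)  # close to one-hot
soft = softmax(teacher_logits, temperature=3.0)   # flatter distribution

print([round(p, 3) for p in sharp])
print([round(p, 3) for p in soft])
```

With a higher temperature the top class keeps less of the probability mass and the remaining classes receive more, which is the extra signal distillation relies on.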

Model Analysis

from dataset_with_logits import ImageNet

imagenet_path = '/path/to/imagenet/val'

# Load the same images paired with logits from several models
models = ['resnet18', 'resnet152', 'vit_l_16']
datasets = {}

for model in models:
    datasets[model] = ImageNet(root=imagenet_path, model=model)

# Analyze prediction differences, calibration, etc.
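
As one possible starting point for such an analysis, the sketch below computes top-1 accuracy per model and the rate at which two models agree on their top-1 prediction. It is a dependency-free illustration on made-up three-class logits; the helper names (`top1`, `accuracy`, `agreement`) are hypothetical and not part of this package. With PyTorch tensors the top-1 step would typically be `logits.argmax(dim=1)`.

```python
def top1(logits):
    """Index of the largest logit, i.e. the predicted class."""
    return max(range(len(logits)), key=logits.__getitem__)

def accuracy(all_logits, labels):
    """Fraction of samples whose top-1 prediction matches the label."""
    correct = sum(top1(row) == label for row, label in zip(all_logits, labels))
    return correct / len(labels)

def agreement(logits_a, logits_b):
    """Fraction of samples on which two models predict the same class."""
    same = sum(top1(a) == top1(b) for a, b in zip(logits_a, logits_b))
    return same / len(logits_a)

# Made-up data: 4 samples, 3 classes (illustration only)
labels = [0, 2, 1, 1]
model_a = [[2.0, 0.1, -1.0], [0.2, 0.1, 1.5], [0.3, 0.9, 0.1], [1.2, 0.4, 0.0]]
model_b = [[1.1, 0.3, 0.2], [0.9, 0.1, 0.5], [0.0, 2.0, 0.1], [0.1, 0.8, 0.3]]

print(accuracy(model_a, labels))    # 0.75
print(accuracy(model_b, labels))    # 0.75
print(agreement(model_a, model_b))  # 0.5
```

Two models can reach the same accuracy while disagreeing on individual samples, as in this toy data, which is why agreement is worth measuring separately.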

🔧 Advanced Usage

List Available Models

from dataset_with_logits import list_available_models

models = list_available_models()
print(models)
# {'imagenet1k': {'resnet18': 'ResNet-18 (11.7M parameters)', ...}}

Custom Cache Directory

dataset = ImageNet(
    root='/path/to/imagenet',
    model='resnet18',
    cache_dir='/custom/cache/dir',
    auto_download=True
)

Version Control

dataset = ImageNet(
    root='/path/to/imagenet',
    model='resnet18',
    version='v0.1.0',  # Specific version
    auto_download=True
)

📁 File Format

Prediction files are CSV files with three columns:

id: Image filename (no extension)
label: Ground truth class index
logits: Semicolon-separated model outputs

Example:

id,label,logits
ILSVRC2012_val_00000001,65,-2.3;1.7;0.2;...;0.8
ILSVRC2012_val_00000002,970,0.1;-1.2;3.4;...;-0.5
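
If you want to read a prediction file without going through the dataset class, the format above can be parsed with the standard library. This is a minimal sketch based only on the column description given here; `parse_prediction_rows` is a hypothetical helper, and the sample uses shortened three-value logits and made-up ids rather than real 1000-class rows.

```python
import csv
import io

def parse_prediction_rows(csv_text):
    """Parse prediction CSV text into (id, label, logits) tuples."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = []
    for row in reader:
        # The logits column holds semicolon-separated floats
        logits = [float(value) for value in row["logits"].split(";")]
        rows.append((row["id"], int(row["label"]), logits))
    return rows

# Shortened sample for illustration; real files carry one logit per class
sample = (
    "id,label,logits\n"
    "img_a,2,-2.3;1.7;0.2\n"
    "img_b,0,0.1;-1.2;3.4\n"
)

rows = parse_prediction_rows(sample)
print(rows[0])  # ('img_a', 2, [-2.3, 1.7, 0.2])
```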

🌐 Data Source

Prediction files are automatically downloaded from Hugging Face Hub (primary) with GitHub fallback. Files are cached locally after first download.

Hosting Infrastructure:

🤗 Primary: Hugging Face Datasets - Fast, reliable, academic-friendly
🐙 Fallback: GitHub Releases - For redundancy
📦 Multi-backend: Automatic fallback ensures high availability
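
The cache-then-fallback behavior described above can be pictured with the sketch below. It is not the package's actual implementation: `fetch_with_fallback`, the stand-in fetchers, and the file name `resnet18.csv` are all assumptions made for illustration, and the fetchers return fixed bytes instead of downloading anything.

```python
import tempfile
from pathlib import Path

def fetch_with_fallback(filename, cache_dir, sources):
    """Return a cached file path, trying each (name, fetch) source in order."""
    cache_path = Path(cache_dir) / filename
    if cache_path.exists():
        return cache_path  # already cached: no download needed
    errors = []
    for name, fetch in sources:
        try:
            data = fetch(filename)
        except Exception as exc:  # any failure moves on to the next source
            errors.append(f"{name}: {exc}")
            continue
        cache_path.parent.mkdir(parents=True, exist_ok=True)
        cache_path.write_bytes(data)
        return cache_path
    raise RuntimeError("all sources failed: " + "; ".join(errors))

# Stand-in fetchers (hypothetical); real ones would download from the hosts
calls = []

def failing_primary(filename):
    calls.append("primary")
    raise ConnectionError("primary unavailable")

def working_fallback(filename):
    calls.append("fallback")
    return b"id,label,logits\n"

sources = [("primary", failing_primary), ("fallback", working_fallback)]
cache_dir = tempfile.mkdtemp()

first = fetch_with_fallback("resnet18.csv", cache_dir, sources)
second = fetch_with_fallback("resnet18.csv", cache_dir, sources)  # served from cache
print(calls)  # ['primary', 'fallback']
```

The second call never touches a fetcher because the file is already in the cache directory, so each source is contacted at most once per file.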

🔍 Examples

See the examples/ directory for:

Basic usage
Knowledge distillation
Model comparison
Advanced workflows

📦 Installation

From PyPI (Recommended)

pip install dataset-with-logits

From Source

git clone https://github.com/ViGeng/predictions-on-datasets.git
cd predictions-on-datasets/dataset_with_logits
pip install -e .

🤝 Contributing

Contributions are welcome! See the main repository for contribution guidelines.

📄 License

MIT License - see LICENSE file for details.
