ForgetNet: Differentially Private Block-wise Gradient Shuffle for Deep Learning

ForgetNet introduces a novel privacy-preserving technique for deep learning: Differentially Private Block-wise Gradient Shuffle (DP-BloGS).
Features
- Fast training times, close to non-private training
- Competitive privacy guarantees compared to DP-SGD
- Better privacy-utility trade-off in many scenarios
- Scalable to large models (tested up to 1.1 billion parameters)
- Parameter-wise privacy budget allocation
Installation

    pip install forgetnet
Quick Start: BloGSSFTTrainer
    from forgetnet import BloGSSFTTrainer
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model = AutoModelForCausalLM.from_pretrained("gpt2")
    tokenizer = AutoTokenizer.from_pretrained("gpt2")

    # training_args, train_dataset, and eval_dataset are assumed to be
    # defined beforehand (see the sketch after this block).
    trainer = BloGSSFTTrainer(
        model=model,
        args=training_args,
        train_dataset=train_dataset,
        eval_dataset=eval_dataset,
        tokenizer=tokenizer,
        dataset_text_field="text",
        target_epsilon=1.0,  # total privacy budget (epsilon)
        delta=1e-5,          # target delta
        clip_value=1.0       # gradient clipping threshold
    )
    trainer.train()
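For completeness, here is one way the objects assumed above might be prepared. The dataset name and argument values are illustrative assumptions, not part of the ForgetNet API:

    from datasets import load_dataset
    from transformers import TrainingArguments

    # Hypothetical setup: any dataset with a "text" column will do.
    dataset = load_dataset("imdb")
    train_dataset = dataset["train"]
    eval_dataset = dataset["test"]

    training_args = TrainingArguments(
        output_dir="./results",          # where checkpoints are written
        per_device_train_batch_size=8,
        num_train_epochs=3,
    )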
Quick Start with Privacy Engine and ResNet
Here's how to use the BloGS Privacy Engine with a ResNet model for MNIST classification:
    import torch
    import torch.nn as nn
    import torch.optim as optim
    from torchvision import datasets, transforms
    from torchvision.models import resnet18
    from torch.utils.data import DataLoader
    from forgetnet import BloGSPrivacyEngine

    def mnist_resnet18():
        # Adapt ResNet18 to single-channel MNIST inputs and 10 classes.
        model = resnet18()
        model.conv1 = nn.Conv2d(1, 64, kernel_size=7, stride=2, padding=3, bias=False)
        model.fc = nn.Linear(model.fc.in_features, 10)
        return model

    # Hyperparameters
    batch_size = 64
    learning_rate = 0.01
    epochs = 10
    target_epsilon = 1.0
    delta = 1e-5
    clip_value = 1.0

    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

    transform = transforms.Compose([
        transforms.ToTensor(),
        transforms.Normalize((0.1307,), (0.3081,))  # MNIST mean and std
    ])
    train_dataset = datasets.MNIST('./data', train=True, download=True, transform=transform)
    train_loader = DataLoader(train_dataset, batch_size=batch_size, shuffle=True)

    # Total number of optimizer steps, used for privacy accounting.
    total_iterations = (len(train_dataset) // batch_size) * epochs

    model = mnist_resnet18().to(device)
    optimizer = optim.SGD(model.parameters(), lr=learning_rate)

    privacy_engine = BloGSPrivacyEngine(
        optimizer=optimizer,
        model=model,
        target_epsilon=target_epsilon,
        delta=delta,
        clip_value=clip_value,
        steps=total_iterations,
        batch_size=batch_size
    )

    model.train()
    for epoch in range(epochs):
        for batch_idx, (data, target) in enumerate(train_loader):
            data, target = data.to(device), target.to(device)
            privacy_engine.zero_grad()
            output = model(data)
            loss = nn.functional.cross_entropy(output, target)
            loss.backward()
            # step() applies the private update and reports the budget spent so far.
            epsilon_spent, delta_spent = privacy_engine.step()
            if batch_idx % 100 == 0:
                print(f'Epoch {epoch}, Batch {batch_idx}, Loss: {loss.item():.4f}, Epsilon: {epsilon_spent:.4f}')

    total_epsilon_spent = privacy_engine.get_privacy_spent()
    print(f"Total privacy spent: ε = {total_epsilon_spent:.4f}")
This example demonstrates:
- Setting up a ResNet18 model modified for MNIST
- Loading the MNIST dataset
- Initializing the BloGSPrivacyEngine with the model and optimizer
- Training the model with privacy-preserving gradient updates
- Monitoring privacy budget expenditure during training
Adjust hyperparameters as needed for your specific use case.
How It Works
DP-BloGS introduces a probabilistic approach to gradient noise through block-wise shuffling:
- Divide gradients into blocks
- Shuffle blocks randomly
- Apply layer-specific block sizes
- Use batch layer clipping
- Accumulate gradients
This combination allows for fast training while maintaining strong privacy guarantees!
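As an illustration only (not the library's internal implementation), a minimal PyTorch sketch of the core clip-and-shuffle step might look like the following; the function name and fixed block size are assumptions made for this example:

    import torch

    def blockwise_shuffle(grad: torch.Tensor, block_size: int, clip_value: float) -> torch.Tensor:
        """Clip a gradient tensor, split it into blocks, and shuffle the blocks."""
        # Clip the gradient norm to bound each update's influence.
        norm = grad.norm()
        if norm > clip_value:
            grad = grad * (clip_value / norm)
        flat = grad.flatten()
        n = flat.numel()
        # Shuffle only the full blocks; any leftover tail stays in place.
        n_full = (n // block_size) * block_size
        blocks = flat[:n_full].view(-1, block_size)
        # The random permutation of blocks is the source of privacy noise.
        perm = torch.randperm(blocks.shape[0], device=flat.device)
        shuffled = torch.cat([blocks[perm].flatten(), flat[n_full:]])
        return shuffled.view_as(grad)

    # Example: shuffle a layer's gradient in blocks of 16 values.
    g = torch.randn(8, 32)
    g_shuffled = blockwise_shuffle(g, block_size=16, clip_value=1.0)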
๐ Performance
DP-BloGS has been tested on various model architectures, including:
- GPT-2 (124M)
- BERT (110M)
- OPT (350M)
- BLOOM (560M)
- TinyLlama (1.1B)
Results show competitive or better performance compared to DP-SGD in terms of:
- ๐โโ๏ธ Training speed
- ๐ญ Privacy guarantees
- ๐ Model utility
Membership Inference Attack (MIA)
ForgetNet now includes a powerful Membership Inference Attack tool to assess the privacy risks of your language models:
Quick Start
    from forgetnet import LanguageMIA

    # model, tokenizer, train_dataset, and test_dataset are assumed to be
    # the trained model plus the member/non-member splits to attack.
    mia = LanguageMIA()
    results = mia.attack(train_dataset, test_dataset, model, tokenizer)

    print(f"ROC AUC: {results['roc_auc']:.4f}")
    print(f"Precision-Recall AUC: {results['precision_recall_auc']:.4f}")
    print(f"Best model: {results['best_model']}")
    print(f"Optimal threshold: {results['optimal_threshold']:.4f}")
Comprehensive Evaluation
The LanguageMIA class provides a detailed analysis of your model's vulnerability to membership inference attacks:
- ROC AUC: Measures the overall performance of the attack
- Precision-Recall AUC: Assesses the trade-off between precision and recall
- Best Model: Identifies the most effective attack model
- Optimal Threshold: Determines the best decision threshold for classification
- Standard and Optimal Metrics: Provides accuracy, precision, recall, and F1 score for both the standard (0.5) and optimal thresholds
Integration with Model Evaluation
Easily incorporate MIA into your model evaluation pipeline:
    def evaluate_model(model, train_dataset, test_dataset, tokenizer):
        # Run the membership inference attack.
        mia = LanguageMIA()
        mia_results = mia.attack(train_dataset, test_dataset, model, tokenizer)

        # evaluate_perplexity is a placeholder for your own utility metric.
        perplexity = evaluate_perplexity(model, test_dataset, tokenizer)

        return {
            'perplexity': perplexity,
            'mia_results': mia_results,
        }

    evaluation_results = evaluate_model(model, train_dataset, test_dataset, tokenizer)
    print(f"Model Perplexity: {evaluation_results['perplexity']:.2f}")
    print(f"MIA ROC-AUC: {evaluation_results['mia_results']['roc_auc']:.4f}")
Benefits
- Quantify Privacy Risks: Understand your model's vulnerability to membership inference attacks
- Track Improvements: Monitor how privacy-preserving techniques affect model privacy
- Optimize Trade-offs: Fine-tune the balance between utility and privacy in your models
Use the LanguageMIA tool to ensure your language models are both powerful and privacy-preserving!
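As a hypothetical usage sketch, the same attack can be run against a non-private checkpoint and a DP-BloGS-trained one to quantify the privacy gain; baseline_model and dp_model are assumed to be your own trained models, not ForgetNet objects:

    # baseline_model: trained without privacy; dp_model: trained with DP-BloGS.
    mia = LanguageMIA()
    baseline = mia.attack(train_dataset, test_dataset, baseline_model, tokenizer)
    private = mia.attack(train_dataset, test_dataset, dp_model, tokenizer)

    # An ROC AUC near 0.5 means the attacker cannot distinguish members
    # from non-members, i.e. stronger empirical privacy.
    print(f"Baseline ROC AUC: {baseline['roc_auc']:.4f}")
    print(f"DP-BloGS ROC AUC: {private['roc_auc']:.4f}")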
Citation
If you use ForgetNet in your research, please cite my paper:

    @article{zagardo2024dpblogs,
      title={Differentially Private Block-wise Gradient Shuffle for Deep Learning},
      author={Zagardo, David},
      journal={arXiv preprint arXiv:2407.21347},
      year={2024},
      note={arXiv:2407.21347 [cs.LG]}
    }
Contributing
Contributions are welcome! Please see CONTRIBUTING.md for details on how to get started.
License
This project is licensed under the MIT License - see the LICENSE file for details.
Acknowledgements
We thank the open-source community and the authors of the papers cited in our work for their valuable contributions to the field of privacy-preserving machine learning.
Built by David Zagardo