
Multi-vendor GPU health monitoring supporting old GPUs for e-waste reduction
A comprehensive multi-vendor GPU health monitoring and optimization tool that helps users assess GPU performance and select optimal hardware for their workloads.
🚀 Features
🔥 Comprehensive GPU Health Monitoring: Temperature, power, utilization, and throttling detection
⚡ Advanced Stress Testing: Compute, memory bandwidth, VRAM, and mixed-precision tests
📊 Detailed Health Scoring: 100-point scoring system with actionable recommendations
🖥️ Multi-GPU Support: Test and compare multiple GPUs simultaneously
🧪 Mock Mode: Test on any computer without GPUs (perfect for development)
🔌 Multi-Vendor Support: NVIDIA, AMD, Intel, and Mock mode
☁️ Cloud-Ready: Designed to help select optimal GPUs for cloud deployment (coming soon!)
Basic Installation (works on any system with a GPU)
pip install gpu-benchmark-tool
Installation with Enhanced GPU Support
pip install gpu-benchmark-tool[nvidia]
pip install gpu-benchmark-tool[amd]
pip install gpu-benchmark-tool[intel]
pip install gpu-benchmark-tool[all]
🎯 Quick Start
Check Available GPUs
gpu-benchmark list
Run Benchmark
gpu-benchmark benchmark
gpu-benchmark benchmark --gpu-id 0
gpu-benchmark benchmark --gpu-id 0 --duration 30
gpu-benchmark benchmark --gpu-id 0 --export results.json
gpu-benchmark benchmark --mock --duration 30
📊 Google Colab Quick Start
!pip install gpu-benchmark-tool[nvidia]
!gpu-benchmark benchmark --gpu-id 0 --duration 30
Health Score (0-100 points)
85-100: 🟢 Healthy - Safe for all workloads including AI training
70-84: 🟢 Good - Suitable for most workloads
55-69: 🟡 Degraded - Limit to inference or light compute
40-54: 🟡 Warning - Monitor closely, avoid heavy workloads
0-39: 🔴 Critical - Do not use for production
Each component contributes to the total 100-point score:
Temperature (20 points)
Baseline Temperature (10 points)
Power Efficiency (10 points)
GPU Utilization (10 points)
Throttling (20 points)
Errors (20 points)
Temperature Stability (10 points)
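As a rough illustration, the component points above can be summed and mapped to the status bands from the Health Score scale. The weights and field names below mirror this README's breakdown, but the tool's internal scoring logic is an assumption here, not its actual implementation:

```python
# Hypothetical aggregation of the 100-point health score.
# Component names and point caps follow the breakdown above.
COMPONENT_WEIGHTS = {
    "temperature": 20,
    "baseline_temperature": 10,
    "power_efficiency": 10,
    "gpu_utilization": 10,
    "throttling": 20,
    "errors": 20,
    "temperature_stability": 10,
}

def aggregate_score(component_scores: dict) -> tuple:
    """Sum per-component points (capped at each weight) and map the
    total onto the status bands from the Health Score scale."""
    total = sum(min(component_scores.get(name, 0), cap)
                for name, cap in COMPONENT_WEIGHTS.items())
    if total >= 85:
        status = "Healthy"
    elif total >= 70:
        status = "Good"
    elif total >= 55:
        status = "Degraded"
    elif total >= 40:
        status = "Warning"
    else:
        status = "Critical"
    return total, status
```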
Matrix Multiplication: Raw compute performance (TFLOPS)
Memory Bandwidth: Memory throughput (GB/s)
VRAM Stress: Memory allocation stability
Mixed Precision: FP16/BF16 support for AI workloads
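The TFLOPS figure from a matrix-multiplication test can be sanity-checked by hand: an N x N matmul performs roughly 2 * N^3 floating-point operations, so throughput is total FLOPs divided by elapsed time. This is the standard formula for such benchmarks; the tool's exact kernel and timing method are not shown here:

```python
# Derive TFLOPS from matmul dimensions and wall-clock time.
# An N x N @ N x N multiply costs ~2 * N^3 FLOPs (one multiply and
# one add per inner-product term).
def matmul_tflops(n: int, iterations: int, elapsed_seconds: float) -> float:
    flops = 2 * n**3 * iterations
    return flops / elapsed_seconds / 1e12
```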
Benchmark Command
gpu-benchmark benchmark [OPTIONS]
Options:
  --gpu-id INTEGER    Specific GPU to test (default: all GPUs)
  --duration INTEGER  Test duration in seconds (default: 60)
  --basic             Run basic tests only (faster)
  --export TEXT       Export results to JSON file
  --verbose           Show detailed output
  --mock              Use mock GPU (no hardware required)
gpu-benchmark benchmark --gpu-id 0 --duration 120 --export full_test.json
gpu-benchmark benchmark --gpu-id 0 --duration 30 --basic
gpu-benchmark benchmark --mock --export mock_results.json
gpu-benchmark monitor --gpu-id 0
Basic Usage
import pynvml
from gpu_benchmark import run_full_benchmark

pynvml.nvmlInit()
handle = pynvml.nvmlDeviceGetHandleByIndex(0)

results = run_full_benchmark(
    handle=handle,
    duration=60,
    enhanced=True,
    device_id=0
)

print(f"Health Score: {results['health_score']['score']}/100")
print(f"Status: {results['health_score']['status']}")
Analyzing Results
if results['health_score']['score'] >= 70:
    print("✅ GPU is suitable for production workloads")
else:
    print("⚠️ GPU needs attention")

if 'performance_tests' in results:
    tflops = results['performance_tests']['matrix_multiply']['tflops']
    print(f"Compute Performance: {tflops:.2f} TFLOPS")
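Building on the result structure above, scores from several GPUs could be compared to choose hardware for a workload. `pick_best_gpu` is a hypothetical helper written for this README, not part of the package API; it only assumes the `results['health_score']['score']` layout shown in the earlier snippets:

```python
# Hypothetical selection helper: given run_full_benchmark results
# keyed by GPU id, return the id with the highest health score.
def pick_best_gpu(results_by_id: dict) -> int:
    scored = {gpu_id: r["health_score"]["score"]
              for gpu_id, r in results_by_id.items()}
    return max(scored, key=scored.get)
```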
🔧 Troubleshooting
"No GPUs found"
Use --mock flag for testing without GPUs
Ensure NVIDIA/AMD/Intel drivers are installed
For AMD: Install ROCm drivers and PyTorch with ROCm support
For Intel: Install Intel GPU drivers and Intel Extension for PyTorch
"NVML Error" on Colab
This warning can be ignored - the tool still works correctly
Use --gpu-id 0 for cleaner output
"PyTorch not available"
The base installation now includes PyTorch
If you see this error, try: pip install gpu-benchmark-tool[nvidia]
Check system cooling
Ensure GPU isn't thermal throttling
Close other GPU applications

Multi-GPU JSON Format
Use --gpu-id 0 to test a single GPU (simpler output)
Without --gpu-id, results are nested under the 'results' key
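A sketch of reading a multi-GPU export: the nesting under the 'results' key follows the note above, while the per-entry field names are assumed from the single-GPU examples earlier in this README:

```python
import json

def summarize(path: str) -> dict:
    """Load a JSON export and return {gpu_id: health score}.
    Multi-GPU exports nest per-GPU entries under 'results';
    a single-GPU export is treated as one flat entry."""
    with open(path) as f:
        data = json.load(f)
    per_gpu = data.get("results", {"0": data})
    return {gpu_id: entry["health_score"]["score"]
            for gpu_id, entry in per_gpu.items()}
```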
NVIDIA GPUs (Full Support)
Consumer: RTX 4090, 4080, 4070, 3090, 3080, 3070, 3060
Data Center: A100, V100, T4, P100, K80
Workstation: RTX A6000, A5000, A4000

AMD GPUs (ROCm Required)
MI250X, MI210, MI100
Radeon RX 7900 XTX, RX 6900 XT

Intel GPUs (Limited Support)
Arc A770, A750
Intel Xe integrated graphics
Python 3.8 or higher
For NVIDIA: CUDA drivers
For AMD: ROCm drivers
For Intel: Intel GPU drivers
📄 License
MIT License - see LICENSE file for details.
🙏 Acknowledgments
Built to solve real-world GPU selection challenges and reduce cloud computing costs through better hardware decisions.
📧 Contact
PyPI: https://pypi.org/project/gpu-benchmark-tool/
Email: ywrajput@gmail.com