
DataGradients is an open-source, Python-based library designed for computer vision dataset analysis.
Extract valuable insights from your datasets and get comprehensive reports effortlessly.
Non-exhaustive list of supported features.
Deep Dive into Data Profiling
Puzzled by some dataset challenges while using DataGradients? We've got you covered.
Enrich your understanding with this free online course. Dive into dataset profiling, confront its complexities, and harness the full potential of DataGradients.
Check out the pre-computed dataset analysis for a deeper dive into reports.
You can install DataGradients with pip.
pip install data-gradients
class_id -> class_name.
Please ensure all the points above are checked before you proceed with DataGradients.
Example
from torchvision.datasets import CocoDetection
train_data = CocoDetection(...)
val_data = CocoDetection(...)
class_names = ["person", "bicycle", "car", "motorcycle", ...]
# OR
# class_names = {0: "person", 1:"bicycle", 2:"car", 3: "motorcycle", ...}
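The two forms of class_names shown above carry the same information. As a minimal sketch, assuming a list is read as a mapping from position to name, the hypothetical helper below (normalize_class_names is not part of DataGradients) converts either form into a class_id -> class_name dictionary:

```python
from typing import Dict, List, Union


def normalize_class_names(
    class_names: Union[List[str], Dict[int, str]]
) -> Dict[int, str]:
    """Return a class_id -> class_name mapping from either accepted form.

    Hypothetical helper for illustration only; it is not a DataGradients API.
    Assumes that, for a list, the position of a name is its class id.
    """
    if isinstance(class_names, dict):
        # Already a mapping: return a shallow copy so the input is not mutated.
        return dict(class_names)
    return {class_id: name for class_id, name in enumerate(class_names)}


as_list = ["person", "bicycle", "car", "motorcycle"]
as_dict = {0: "person", 1: "bicycle", 2: "car", 3: "motorcycle"}

# Both forms describe the same classes.
assert normalize_class_names(as_list) == normalize_class_names(as_dict)
```

The dictionary form is the more convenient one when class ids are not contiguous, since a list cannot skip an index.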
Good to Know - DataGradients will try to find out how the dataset returns images and labels.
- If something cannot be automatically determined, you will be asked to provide some extra information through a text input.
- In some extreme cases, the process will crash and invite you to implement a custom dataset extractor.
Heads up - DataGradients provides a few out-of-the-box dataset/dataloader implementations. You can find more dataset implementations in PyTorch or SuperGradients.
You are now ready to go. Choose the relevant analyzer for your task and run it over your datasets!
Image Classification
from data_gradients.managers.classification_manager import ClassificationAnalysisManager
train_data = ... # Your dataset iterable (torch dataset/dataloader/...)
val_data = ... # Your dataset iterable (torch dataset/dataloader/...)
class_names = ... # [<class-1>, <class-2>, ...]
analyzer = ClassificationAnalysisManager(
report_title="Testing Data-Gradients Classification",
train_data=train_data,
val_data=val_data,
class_names=class_names,
)
analyzer.run()
Object Detection
from data_gradients.managers.detection_manager import DetectionAnalysisManager
train_data = ... # Your dataset iterable (torch dataset/dataloader/...)
val_data = ... # Your dataset iterable (torch dataset/dataloader/...)
class_names = ... # [<class-1>, <class-2>, ...]
analyzer = DetectionAnalysisManager(
report_title="Testing Data-Gradients Object Detection",
train_data=train_data,
val_data=val_data,
class_names=class_names,
)
analyzer.run()
Semantic Segmentation
from data_gradients.managers.segmentation_manager import SegmentationAnalysisManager
train_data = ... # Your dataset iterable (torch dataset/dataloader/...)
val_data = ... # Your dataset iterable (torch dataset/dataloader/...)
class_names = ... # [<class-1>, <class-2>, ...]
analyzer = SegmentationAnalysisManager(
report_title="Testing Data-Gradients Segmentation",
train_data=train_data,
val_data=val_data,
class_names=class_names,
)
analyzer.run()
Example
You can test the segmentation analysis tool in the following example, which does not require you to download any additional data.
Once the analysis is done, the path to your PDF report will be printed. Examples of pre-computed dataset analysis reports are also available.
The feature configuration allows you to run the analysis on a subset of features or adjust the parameters of existing features. If you are interested in customizing this configuration, you can check out the documentation on that topic.
Ensuring Comprehensive Dataset Compatibility
DataGradients is adept at automatic dataset inference; however, certain specificities, such as nested annotation structures or unique annotation formats, may necessitate a tailored approach.
To address this, DataGradients offers extractors tailored for enhancing compatibility with diverse dataset formats.
For an in-depth understanding and implementation details, we encourage a thorough review of the Dataset Extractors Documentation.
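To give a rough idea of what such an extractor does, here is a minimal sketch. It assumes an extractor is a plain callable that receives one dataset item and returns its labels in a flat form. The function name coco_labels_extractor, the (class_id, x, y, w, h) row layout, and the list return type are illustrative assumptions; the Dataset Extractors Documentation describes the interface and return type DataGradients actually expects.

```python
from typing import Any, Dict, List, Tuple

# One flattened detection label: (class_id, x, y, w, h).
LabelRow = Tuple[int, float, float, float, float]


def coco_labels_extractor(
    sample: Tuple[Any, List[Dict[str, Any]]]
) -> List[LabelRow]:
    """Flatten COCO-style annotations into (class_id, x, y, w, h) rows.

    Hypothetical example: assumes each dataset item is an
    (image, annotations) pair, where every annotation is a dict
    holding a "category_id" and a "bbox" given as [x, y, w, h].
    """
    _image, annotations = sample
    rows: List[LabelRow] = []
    for annotation in annotations:
        x, y, w, h = annotation["bbox"]
        rows.append((annotation["category_id"], x, y, w, h))
    return rows


# A made-up item in the nested format described above.
sample = (
    None,  # the image is irrelevant when extracting labels
    [
        {"category_id": 2, "bbox": [10.0, 20.0, 30.0, 40.0], "iscrowd": 0},
        {"category_id": 0, "bbox": [5.0, 5.0, 15.0, 25.0], "iscrowd": 0},
    ],
)
print(coco_labels_extractor(sample))
```

The point of the sketch is the shape of the transformation: nested, per-object dictionaries go in, and one uniform row per object comes out, which is what makes an unusual annotation format analyzable.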
Example notebook on Colab
Click here to join our Discord Community
This project is released under the Apache 2.0 license.