
pynonymizer
pynonymizer is a tool for anonymizing sensitive production database dumps, allowing you to create realistic test datasets while maintaining GDPR/Data Protection compliance. It replaces personally identifiable information (PII) in your database with random, yet realistic data, using the Faker library and other functions.
Key features:
With pynonymizer, you can safely share production database copies with developers and testers, enabling better staging environments, integration tests, and database migration simulations, without compromising user privacy.
pynonymizer replaces personally identifiable data in your database with realistic pseudorandom data, from the Faker library or from other functions.
A wide variety of data types is available to suit the column in question, for example:
unique_email
company
file_path
[...]
Pynonymizer's main data replacement mechanism, fake_update, is a random selection from a small pool of data (--seed-rows controls the available Faker data). This process is chosen for compatibility and speed of operation, but does not guarantee uniqueness.
This may or may not suit your exact use-case. For a full list of data generation strategies, see the docs on strategyfiles.
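To illustrate why selecting from a small seed pool cannot guarantee uniqueness, here is a minimal sketch in plain Python. It is not pynonymizer's implementation: the function names build_seed_pool and fake_update, the pool size of 150, and the generated email format are all assumptions made for illustration, and the standard library random module stands in for Faker.

```python
import random


def build_seed_pool(seed_rows, rng):
    """Build a small pool of fake values (a stand-in for Faker-generated seed rows)."""
    return [f"user{rng.randrange(10**6):06d}@example.com" for _ in range(seed_rows)]


def fake_update(row_count, pool, rng):
    """Give every row a value picked at random from the pool, with replacement."""
    return [rng.choice(pool) for _ in range(row_count)]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
pool = build_seed_pool(seed_rows=150, rng=rng)  # hypothetical pool size
emails = fake_update(row_count=1000, pool=pool, rng=rng)

# 1000 rows draw from at most 150 distinct values, so repeats are unavoidable.
print(len(emails))                     # 1000
print(len(set(emails)) <= len(pool))   # True
```

Because the number of rows exceeds the pool size, some values must repeat. In this sketch a larger pool reduces repetition but does not remove it, so a column that needs distinct values calls for a strategy that guarantees uniqueness.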
You can see strategyfile examples for existing databases in the examples folder.
If this workflow doesn't work for you, see process control to see if it can be adjusted to suit your needs.
mysql
mysql/mysqldump must be in $PATH.
Supported inputs: .sql, .gz
Supported outputs: .sql, .gz, .xz
mssql
Requires the package extra pynonymizer[mssql].
For RESTORE_DB/DUMP_DB operations, the database server must be running locally with pynonymizer. This is because MSSQL RESTORE and BACKUP instructions are received by the database, so piping a local backup to a remote server is not possible.
postgres
psql/pg_dump must be in $PATH.
Supported inputs: .sql, .gz
Supported outputs: .sql, .gz, .xz
pynonymizer --help
pynonymizer is available as a Docker image so that you don't have to install the client tools for your database.
See https://hub.docker.com/repository/docker/rwnxt/pynonymizer
# As pynonymizer depends on strategyfiles, you'll need to create a file mount so the file can be read.
docker run --mount type=bind,source=./strategyfile.yml,target=/tmp/strategyfile.yml rwnxt/pynonymizer -s /tmp/strategyfile.yml --db-host [...]
Pynonymizer can also be invoked programmatically from other Python code. See the module entrypoint pynonymizer or pynonymizer/pynonymize.py.
import pynonymizer
pynonymizer.run(input_path="./backup.sql", strategyfile_path="./strategy.yml" [...] )