redis-bulk-cleaner

Deletes keys from Redis database in bulk.
Features
- Built for very large scale databases (10Gb+, 100 000 000+ keys)
- Uses SCAN operation so it is safe to run it without downtime
- Could process multiple patterns using only one SCAN operation
- Has option to search for given patterns, but not delete them
Usage
Install it via PyPi
pip install redis_bulk_cleaner
Firstly check what are you going to delete with --dry-run
option:
$ redis_bulk_cleaner 'test:unsubscribe_token:*' 'test:session:*' --dry-run
Search for keys:
test:unsubscribe_token:*, test:session:*
===========
test:unsubscribe_token:NDCivQ45KQcTghpS
test:session:NDCivQ45KQcTghpS
test:unsubscribe_token:9SBQ1YsDPuTETWBS
test:unsubscribe_token:MckwaZGER1GiVjoX
test:session:9SBQ1YsDPuTETWBS
test:session:MckwaZGER1GiVjoX
You could pass as many patterns as you want. Only overhead will be Regex matching slowdown.
Then to actually delete keys run it without --dry-run
option:
$ redis_bulk_cleaner 'test:unsubscribe_token:*' 'test:session:*'
This tools will delete ANY key that matches any of the following pattern:
test:unsubscribe_token:*, test:session:*
Do you want to continue? (y, n): y
...
6 keys cleaned
Be ware: it uses regex matching so test*
will also match test_other:key
.
However any special symbols like .?
will be escaped.

Note: because of SCAN behaviour it is not possible to calculate accurate estimate/percentage.
Progress bar could go beyond 100%.
-D
, --dry-run
- Do not delete keys just print them
-r
, --use-regex
- By default patterns are redis-patterns (*
equals any characters sequence including :
).
If this flag is enabled then it will parse patterns as python regex expressions (some escaping may still be needed base on your shell type)
-s
, --sleep
- (milliseconds) Sleep between DELETE commands to prevent high memory and cpu usage. Disabled by default.
-h
, --host
- Redis server host
-p
, --port
- Redis server port
--db
- Redis server db
-P
, --password
- Redis server password
-b
, '--batch' - Redis SCAN batch size
- Too small value will cause slowdown, too high value could lead to memory issues/timeouts
- Default:
500
--disable-cursor-backups
- By default current scan position is saving to temporary redis variable (
redis_cleaner:cursor
)
so it will continue from same place after restart (in case of crash/network issues/etc)
- Default: Enabled
Also see redis_bulk_cleaner --help
Credits
This package was created with
Cookiecutter and the
audreyr/cookiecutter-pypackage
project template.
Free software: MIT license