Logging Redactor
Logging Redactor is a Python library that redacts sensitive data in logs based on regex patterns (mask_patterns) or dictionary keys (mask_keys). It supports JSON logging formats and handles nested data in the message itself, in positional arguments, and in the extra keyword argument.
Installation
You can install Logging Redactor via pip:
pip install loggingredactor
Illustrative Examples
Below is a basic example that illustrates how to redact any digits in a logger message:
import re
import logging
import loggingredactor
logger = logging.getLogger()
redact_mask_patterns = [re.compile(r'\d+')]
logger.addFilter(loggingredactor.RedactingFilter(redact_mask_patterns, mask='xx'))
logger.warning("This is a test 123...")
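To see what this pattern matches independently of the logging machinery, the substitution can be reproduced with the standard library alone. This is a minimal sketch that assumes the filter replaces each regex match with the mask; it does not call Logging Redactor itself.

```python
import re

# Same pattern and mask as in the example above.
pattern = re.compile(r'\d+')
mask = 'xx'

message = "This is a test 123..."
# Every run of digits is replaced by the mask.
redacted = pattern.sub(mask, message)
print(redacted)  # This is a test xx...
```

Trying a pattern against sample messages this way is a quick check that it matches exactly what you intend before wiring it into a filter.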
Python applies a logger's filters only to records logged directly through that logger, so loggers used in other files will not have the filter applied. To apply the filter to all loggers, attach it to a handler instead:
import re
import logging
import loggingredactor
from pythonjsonlogger import jsonlogger
redact_mask_patterns = [re.compile(r'(?<=api_key=)[\w-]+')]
class RedactStreamHandler(logging.StreamHandler):
    def __init__(self, *args, **kwargs):
        logging.StreamHandler.__init__(self, *args, **kwargs)
        self.addFilter(loggingredactor.RedactingFilter(redact_mask_patterns))
root_logger = logging.getLogger()
sys_stream = RedactStreamHandler()
sys_stream.setFormatter(jsonlogger.JsonFormatter('%(name)s %(message)s'))
root_logger.addHandler(sys_stream)
logger = logging.getLogger(__name__)
logger.error("Request Failed", extra={'url': 'https://example.com?api_key=my-secret-key'})
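The difference between attaching a filter to a logger and attaching it to a handler can be checked with the standard library alone. The sketch below uses a hypothetical stand-in filter (TagFilter) and an in-memory handler (ListHandler), neither of which is part of Logging Redactor, to show that a record logged through a child logger bypasses its parent's logger-level filters but not the parent's handler-level filters.

```python
import logging


class TagFilter(logging.Filter):
    """Hypothetical stand-in filter that marks every record it sees."""

    def filter(self, record):
        record.seen_by_filter = True
        return True


class ListHandler(logging.Handler):
    """Collects records in memory so they can be inspected."""

    def __init__(self):
        super().__init__()
        self.records = []

    def emit(self, record):
        self.records.append(record)


def run(attach_to_handler):
    """Log through a child logger and report whether the filter ran."""
    parent = logging.getLogger('demo_parent_%s' % attach_to_handler)
    child = logging.getLogger(parent.name + '.child')
    handler = ListHandler()
    parent.addHandler(handler)
    parent.propagate = False  # keep the demo output away from the root logger
    if attach_to_handler:
        handler.addFilter(TagFilter())
    else:
        parent.addFilter(TagFilter())
    child.warning("logged through the child logger")
    return hasattr(handler.records[0], 'seen_by_filter')


logger_level = run(attach_to_handler=False)
handler_level = run(attach_to_handler=True)
print(logger_level)   # False: the parent's logger filter never saw the record
print(handler_level)  # True: the handler's filter saw the record
```

This is why the example above adds the redacting filter inside a handler subclass: every record that reaches the handler passes through its filters, whichever logger created it.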
You can also redact by dictionary key rather than by regex, for cases where certain fields should always be redacted. To do this, provide any iterable of the keys whose values should be redacted. An example is shown below (this time with a mask different from the default):
import re
import logging
import loggingredactor
from pythonjsonlogger import jsonlogger
redact_keys = ['email', 'password']
class RedactStreamHandler(logging.StreamHandler):
    def __init__(self, *args, **kwargs):
        logging.StreamHandler.__init__(self, *args, **kwargs)
        self.addFilter(loggingredactor.RedactingFilter(mask='REDACTED', mask_keys=redact_keys))
root_logger = logging.getLogger()
sys_stream = RedactStreamHandler()
sys_stream.setFormatter(jsonlogger.JsonFormatter('%(name)s %(message)s'))
root_logger.addHandler(sys_stream)
logger = logging.getLogger(__name__)
logger.warning("User %(firstname)s with email: %(email)s and password: %(password)s bought some food!", {'firstname': 'Arman', 'email': 'arman_jasuja@yahoo.com', 'password': '1234567'})
The above example also illustrates the logger redacting positional arguments provided to the message.
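The idea behind key-based redaction can be sketched in plain Python. The helper below (redact_keys, a hypothetical name) is not the library's implementation; it assumes that redaction walks nested mappings and sequences and builds a redacted copy, leaving the caller's data untouched. As a simplification, it always returns plain dicts regardless of the input mapping type.

```python
from collections.abc import Mapping


def redact_keys(data, keys, mask='REDACTED'):
    """Return a redacted copy of data; the input is never modified."""
    if isinstance(data, Mapping):
        return {
            k: mask if k in keys else redact_keys(v, keys, mask)
            for k, v in data.items()
        }
    if isinstance(data, list):
        return [redact_keys(item, keys, mask) for item in data]
    if isinstance(data, tuple):
        return tuple(redact_keys(item, keys, mask) for item in data)
    return data  # leave every other value exactly as it is


args = {'firstname': 'Arman', 'email': 'arman_jasuja@yahoo.com', 'password': '1234567'}
safe = redact_keys(args, {'email', 'password'})

template = "User %(firstname)s with email: %(email)s and password: %(password)s bought some food!"
print(template % safe)
# User Arman with email: REDACTED and password: REDACTED bought some food!

# The original arguments are still intact for use elsewhere in the code.
print(args['password'])  # 1234567
```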
Integrating with already built logger configs
Logging Redactor also integrates with existing logging configurations. For example, say your logging config is set up in the following format:
import re
import logging.config
...
LOGGING = {
    ...
    'filters': {
        ...
        'pii': {
            '()': 'loggingredactor.RedactingFilter',
            'mask_keys': ('password', 'email', 'last_name', 'first_name', 'gender', 'lastname', 'firstname',),
            'mask_patterns': (re.compile(r'\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b'),),
            'mask': 'REDACTED',
        },
        ...
    },
    'handlers': {
        ...
        'stdout': {
            ...
            'filters': ['pii', ...],
        },
        ...
    },
    ...
}
logging.config.dictConfig(LOGGING)
...
In short, define the RedactingFilter in the filters section of your logging config, then list it under filters for each handler that should apply the redaction.
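Because the config above elides most of its entries, a complete, runnable configuration may help show how the '()' factory key and the handler's filters list fit together. The sketch below uses only the standard library: PatternFilter is a hypothetical stand-in for a redacting filter (it only handles regex patterns on the formatted message), and the logger name and in-memory stream are assumptions made so the output can be inspected.

```python
import io
import logging
import logging.config
import re


class PatternFilter(logging.Filter):
    """Hypothetical stand-in for a redacting filter (stdlib only)."""

    def __init__(self, mask_patterns=(), mask='****'):
        super().__init__()
        self.mask_patterns = mask_patterns
        self.mask = mask

    def filter(self, record):
        # Sketch: redact only the fully formatted message text.
        message = record.getMessage()
        for pattern in self.mask_patterns:
            message = pattern.sub(self.mask, message)
        record.msg, record.args = message, None
        return True


stream = io.StringIO()  # stands in for stdout so the result can be checked

LOGGING = {
    'version': 1,
    'disable_existing_loggers': False,
    'filters': {
        'pii': {
            # '()' names the factory; the remaining keys become keyword arguments.
            '()': PatternFilter,
            'mask_patterns': (re.compile(r'(?<=api_key=)[\w-]+'),),
            'mask': 'REDACTED',
        },
    },
    'handlers': {
        'memory': {
            'class': 'logging.StreamHandler',
            'stream': stream,
            'filters': ['pii'],
        },
    },
    'loggers': {
        'dictconfig_demo': {'handlers': ['memory'], 'level': 'INFO', 'propagate': False},
    },
}
logging.config.dictConfig(LOGGING)

logging.getLogger('dictconfig_demo').info("GET /items?api_key=%s", "my-secret-key")
print(stream.getvalue())  # GET /items?api_key=REDACTED
```

With the real library, the '()' entry would be the string 'loggingredactor.RedactingFilter' as in the config above, and the other keys would be passed to it as keyword arguments in the same way.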
Release Notes - v0.0.6:
Improvements and Changes
- Allow redaction of any generic mapping type, including:
  - dict
  - collections.OrderedDict
  - frozendict.frozendict
  - collections.ChainMap
  - types.MappingProxyType
  - collections.UserDict
  - and any other mapping class that inherits from collections.abc.Mapping
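The standard-library types in this list all register as instances of the Mapping abstract base class, which lives in collections.abc in current Python versions. The check below is a sketch that assumes mappings are detected with an isinstance test of this kind; the source does not state how the library does it. frozendict is a third-party package and is left out so the snippet runs without extra installs.

```python
import collections
import types
from collections.abc import Mapping

candidates = [
    {'a': 1},
    collections.OrderedDict(a=1),
    collections.ChainMap({'a': 1}),
    types.MappingProxyType({'a': 1}),
    collections.UserDict(a=1),
]

# Map each type name to whether it passes the Mapping check.
results = {type(c).__name__: isinstance(c, Mapping) for c in candidates}
for name, is_mapping in results.items():
    print(name, is_mapping)
```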
Bug Fixes
- Fix bug that was converting non-string data types to strings. (Reported in issue #7)
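One way this class of bug arises, and one way to avoid it, can be sketched as follows. The helper redact_value is hypothetical and is not taken from the library; it assumes regex masks are applied only to str values and that every other value is returned unchanged.

```python
import re

pattern = re.compile(r'\d+')
mask = '****'

# Naive approach: coercing to str before substituting changes the type.
naive = pattern.sub(mask, str(1234))
print(type(naive).__name__)  # str


def redact_value(value, patterns, mask='****'):
    """Apply regex masks to strings only; return other types untouched."""
    if isinstance(value, str):
        for p in patterns:
            value = p.sub(mask, value)
    return value


kept = redact_value(1234, [pattern])
text = redact_value("order 1234", [pattern])
print(type(kept).__name__, kept)  # int 1234
print(text)                       # order ****
```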
A Note about the Motivation behind Logging Redactor:
Logging Redactor started as a fork of logredactor. However, bugs in the original (specifically, its data mutations) made it unusable in production environments, where the goal is to redact variables only in the logs, not where they are used in the code. This, along with the fact that the original package is no longer maintained, led to the creation of Logging Redactor.