Sanitary is a simple utility that can remove/mask sensitive information, such as PII, from any data structure. It also includes a Structlog-compatible processor to clean up structured log messages.
It automatically masks sensitive information such as PII, login credentials and the like. By default, masked data is replaced by a generic string; the sanitizer can be configured to use a hashing function instead.
Sanitary is installed like any other Python package:
> pip install sanitary
The first step is to instantiate a Sanitizer object:
>>> from sanitary import Sanitizer
>>> sanitizer = Sanitizer(keys={"foo", "bar"})
>>> sanitizer.sanitize({"foo": 123, "bar": "abc", "baz": "boom"})
{'foo': '********', 'bar': '********', 'baz': 'boom'}
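To illustrate what searching for keys recursively means, here is a minimal standalone sketch. It is not Sanitary's implementation: the function name `mask_keys` and the fixed mask string are assumptions made for illustration, and it only handles dicts, lists, and tuples.

```python
from typing import Any

MASK = "********"  # assumed mask string, mirroring the example output above


def mask_keys(data: Any, keys: set) -> Any:
    """Return a copy of data with the value of every listed key masked."""
    if isinstance(data, dict):
        # Mask the value when the key is listed, otherwise descend into it.
        return {
            k: MASK if k in keys else mask_keys(v, keys)
            for k, v in data.items()
        }
    if isinstance(data, (list, tuple)):
        # Rebuild the same container type with each element processed.
        return type(data)(mask_keys(item, keys) for item in data)
    return data  # scalars pass through unchanged


print(mask_keys({"foo": 123, "nested": [{"bar": "abc"}], "baz": "boom"}, {"foo", "bar"}))
```

The sketch returns a new structure rather than mutating its input, so the original data stays available to the caller.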
The Sanitizer class accepts the following arguments:
keys: An iterable of key names that will be searched for recursively. Any of these keys will have its value replaced by the replacement value.
patterns: An iterable of regular expression patterns that will be used to search the textual values. A value that matches any of the patterns will be entirely replaced by the message value.
replacement: The value that replaces the data of a sensitive key. By default this is a generic string; a callable, such as one of the hashlib functions, can be supplied instead to mask the data.
message: The textual message which will replace the value that matches any of the defined patterns.

If the replacement argument is a callable, the value of a corresponding sensitive key will be replaced with the return value of the callable (or its hexdigest). This way, the sanitized data can still be tracked (e.g. an email address will always have the same hash value) without exposing the actual value.
>>> import hashlib
>>> from sanitary import Sanitizer
>>> sanitizer = Sanitizer(keys={"password", "email"}, replacement=hashlib.sha256)
>>> sanitizer.sanitize({"event": "clean password", "password": "blabla", "foo": {"Email": "test@example.com"}})
{
'event': 'clean password',
'password': 'ccadd99b16cd3d200c22d6db45d8b6630ef3d936767127347ec8a76ab992c2ea',
'foo': {'Email': '973dfe463ec85785f5f95af5ba3906eedb2d931c24e69824a89ea65dba4e813b'}
}
>>>
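The tracking property comes from hashing being deterministic. The following sketch uses only the standard hashlib module to show this. How Sanitary converts values to bytes before hashing is not stated here, so the string conversion and UTF-8 encoding below are assumptions, and the digests may differ from the ones in the example above. The helper name `hash_value` is hypothetical.

```python
import hashlib


def hash_value(value: object) -> str:
    """Return the SHA-256 hex digest of the value's string form."""
    # hashlib works on bytes, so the value is stringified and UTF-8 encoded first.
    return hashlib.sha256(str(value).encode("utf-8")).hexdigest()


first = hash_value("test@example.com")
second = hash_value("test@example.com")
other = hash_value("someone.else@example.com")

print(first == second)  # True: equal inputs always produce equal digests
print(first == other)   # False: different inputs produce different digests
print(len(first))       # 64 hex characters for SHA-256
```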
Sanitizer can also clean up any text values that match specific regular expression patterns; any such value is completely replaced with a hardcoded warning message.
>>> from sanitary import Sanitizer
>>> sanitizer = Sanitizer(patterns={r"""'Authentication':"""})
>>> sanitizer.sanitize("'Authentication': 1234")
"#### WARNING: Message replaced due to sensitive pattern: 'Authentication':"
>>> sanitizer.sanitize({"example": "'Authentication': 1234"})
{'example': "#### WARNING: Message replaced due to sensitive pattern: 'Authentication':"}
>>>
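The idea of replacing a whole value when it matches a pattern can be sketched with the standard re module. This is an illustration under assumptions, not Sanitary's code: the function name `replace_if_sensitive` is hypothetical, the message text is modeled on the warning shown above, and using `re.search` (a match anywhere in the text) is a guess about the matching rule.

```python
import re
from typing import Iterable

# Assumed message text, modeled on the warning shown in the example above.
MESSAGE = "#### WARNING: Message replaced due to sensitive pattern"


def replace_if_sensitive(text: str, patterns: Iterable[str]) -> str:
    """Return a warning instead of text if any pattern occurs in it."""
    for pattern in patterns:
        if re.search(pattern, text):
            # The whole value is dropped, not just the matching part.
            return f"{MESSAGE}: {pattern}"
    return text


print(replace_if_sensitive("'Authentication': 1234", [r"'Authentication':"]))
print(replace_if_sensitive("nothing to hide", [r"'Authentication':"]))
```

Replacing the entire value, rather than only the matched substring, avoids leaking whatever secret follows the matched marker.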
The special subclass, StructlogSanitizer, is provided to enable sanitizing the logging context managed by the structlog library. It needs to be instantiated and added to the list of configured processors:
import hashlib

import structlog
from sanitary import StructlogSanitizer

structlog.configure(
    processors=[
        # Sanitize the event dict before it is rendered to JSON.
        StructlogSanitizer(keys={"foo", "bar", "baz"}, replacement=hashlib.sha256),
        structlog.processors.JSONRenderer(),
    ],
    logger_factory=structlog.stdlib.LoggerFactory(),
)
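For context, a structlog processor is a callable that receives the logger, the method name, and the event dictionary, and returns the (possibly modified) event dictionary. The sketch below shows that shape with a deliberately simple, non-recursive mask. It is not how StructlogSanitizer is implemented; the function name `mask_sensitive`, the key names, and the mask string are all hypothetical.

```python
from typing import Any, MutableMapping

SENSITIVE_KEYS = {"password", "email"}  # hypothetical key names
MASK = "********"  # assumed mask string


def mask_sensitive(
    logger: Any, method_name: str, event_dict: MutableMapping[str, Any]
) -> MutableMapping[str, Any]:
    """Structlog-style processor that masks top-level sensitive keys."""
    for key in event_dict:
        # Compare case-insensitively so "Password" is caught as well.
        if key.lower() in SENSITIVE_KEYS:
            event_dict[key] = MASK
    return event_dict


# Processors can be called directly, which makes them easy to test.
print(mask_sensitive(None, "info", {"event": "login", "Password": "hunter2"}))
```

A processor like this would be placed before the renderer in the processors list, so that the rendered output never contains the raw values.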
PII: Personally Identifiable Information
FAQs
Utility to remove or replace sensitive data from complex structures.
We found that sanitary demonstrated a healthy version release cadence and project activity because the last version was released less than a year ago. It has 1 open source maintainer collaborating on the project.