Presidio Anonymizer package - replaces analyzed text with desired values.
The Presidio anonymizer is a Python-based module for anonymizing detected PII text entities with desired values.
The Presidio-Anonymizer package contains both Anonymizers and Deanonymizers.
Presidio anonymizer comes by default with the following anonymizers:
- Replace: Replaces the PII with the desired value.
  - Parameters:
    - new_value: Replaces the existing text with the given value. If new_value is not supplied or is empty, the default behavior is to use <entity_type>, e.g. <PHONE_NUMBER>.
- Redact: Removes the PII completely from the text.
- Hash: Hashes the PII using sha256, sha512 or md5.
  - Parameters:
    - hash_type: Sets the type of hashing. Can be sha256, sha512 or md5. The default hash type is sha256.
- Mask: Replaces the PII with a sequence of a given character.
  - Parameters:
    - chars_to_mask: The number of characters of the PII that should be replaced.
    - masking_char: The character to replace with.
    - from_end: Whether to mask the PII from its end.
- Encrypt: Encrypts the PII entity text and replaces the original with the encrypted string. By default, the anonymizer uses the Advanced Encryption Standard (AES), also known as Rijndael, as the encryption algorithm.
  - Parameters:
    - key: A cryptographic key used for the encryption. The key must be 128, 192 or 256 bits long, in string format.
- Custom: Replaces the PII with the result of a function executed on the PII string.
  - Parameters:
    - lambda: The lambda function to execute on the PII string. The lambda's return type must be a string.

Note: If no default anonymizer is provided, the default anonymizer is "replace" for all entities. The replacement value is the entity type, e.g. <PHONE_NUMBER>.
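To make the parameters of the built-in operators concrete, here is a minimal standard-library sketch of the Replace, Redact, Hash and Mask behaviors as described above. It is not Presidio's implementation: the function names (replace, redact, hash_pii, mask) are hypothetical, the "*" default for masking_char is an assumption, and the real operators may differ in details, so digests and edge cases are not guaranteed to match.

```python
import hashlib

# The three hash types named in the operator description.
SUPPORTED_HASHES = ("sha256", "sha512", "md5")


def replace(pii: str, entity_type: str, new_value: str = "") -> str:
    """Return new_value, or the <entity_type> placeholder if it is missing or empty."""
    return new_value if new_value else f"<{entity_type}>"


def redact(pii: str) -> str:
    """Remove the PII completely."""
    return ""


def hash_pii(pii: str, hash_type: str = "sha256") -> str:
    """Return the hex digest of the PII for one of the supported hash types."""
    if hash_type not in SUPPORTED_HASHES:
        raise ValueError(f"unsupported hash_type: {hash_type}")
    return hashlib.new(hash_type, pii.encode("utf-8")).hexdigest()


def mask(pii: str, chars_to_mask: int, masking_char: str = "*",
         from_end: bool = False) -> str:
    """Replace chars_to_mask characters with masking_char, from the start or the end."""
    # Never mask more characters than the PII contains.
    n = max(0, min(chars_to_mask, len(pii)))
    if from_end:
        return pii[: len(pii) - n] + masking_char * n
    return masking_char * n + pii[n:]


if __name__ == "__main__":
    print(replace("212-555-0199", "PHONE_NUMBER"))       # <PHONE_NUMBER>
    print(mask("212-555-0199", 4, "*", from_end=True))   # 212-555-****
    print(len(hash_pii("212-555-0199")))                 # 64 hex characters for sha256
```

Keeping each operator a pure function of the PII string mirrors the Custom operator's contract that the lambda takes the PII string and returns a string.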
As the input text could potentially have overlapping PII entities, there are different anonymization scenarios. For example, when two entities partially intersect, each one is anonymized individually and the results are concatenated. For the text:
I'm George Washington Square Park.
Assuming one entity is George Washington and the other is Washington Square Park, and assuming the default anonymizer, the result would be:
I'm <PERSON><LOCATION>.
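The following sketch reproduces the result above in plain Python. It is only an illustration of one way to handle partially intersecting spans, not Presidio's conflict-resolution code: the Span class and the clipping strategy (trimming the later span so it starts where the earlier one ends) are assumptions, and scores are ignored.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Span:
    """Hypothetical stand-in for an analyzer result: entity type plus character offsets."""
    entity_type: str
    start: int
    end: int  # exclusive


def anonymize_partial_overlaps(text: str, spans: List[Span]) -> str:
    """Replace each span with <ENTITY_TYPE>, clipping spans that partially intersect."""
    # Earlier spans first; for equal starts, the longer span first.
    ordered = sorted(spans, key=lambda s: (s.start, -s.end))
    clipped: List[Span] = []
    prev_end = 0
    for span in ordered:
        start = max(span.start, prev_end)  # trim the part already covered
        if start < span.end:               # skip spans that are fully covered
            clipped.append(Span(span.entity_type, start, span.end))
            prev_end = span.end
    # Replace from right to left so earlier offsets stay valid.
    result = text
    for span in reversed(clipped):
        result = result[: span.start] + f"<{span.entity_type}>" + result[span.end :]
    return result


if __name__ == "__main__":
    text = "I'm George Washington Square Park."
    spans = [
        Span("PERSON", 4, 21),     # "George Washington"
        Span("LOCATION", 11, 33),  # "Washington Square Park"
    ]
    print(anonymize_partial_overlaps(text, spans))  # I'm <PERSON><LOCATION>.
```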
Additional examples for overlapping PII scenarios:
Text:
My name is Inigo Montoya. You Killed my Father. Prepare to die. BTW my number is: 03-232323.
- When only Inigo is recognized as NAME:
  My name is <NAME> Montoya. You Killed my Father. Prepare to die. BTW my number is: 03-232323.
- When the number is replaced by a single PHONE_NUMBER entity:
  My name is Inigo Montoya. You Killed my Father. Prepare to die. BTW my number is: <PHONE_NUMBER>.
- When the full name Inigo Montoya is replaced as NAME:
  My name is <NAME>. You Killed my Father. Prepare to die. BTW my number is: 03-232323.
- When the number is covered by two partially intersecting entities, PHONE_NUMBER and SSN:
  My name is Inigo Montoya. You Killed my Father. Prepare to die. BTW my number is: <PHONE_NUMBER><SSN>.
Presidio deanonymizer currently contains one operator:
- Decrypt: Replaces the encrypted text with the original text, using the encryption key.
  - Parameters:
    - key: A cryptographic key used for the encryption. The key must be 128, 192 or 256 bits long, in string format.

Note: You can use "DEFAULT" as an operator key to define an operator for all entities.
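Because the key is passed as a string but its length is stated in bits, a quick check before calling the engine can catch a wrongly sized key early. The sketch below assumes the key string is converted to bytes with UTF-8 (so a 16-character ASCII key is 128 bits); that encoding and the helper names are assumptions, not something this README specifies.

```python
VALID_KEY_BITS = (128, 192, 256)


def key_length_bits(key: str) -> int:
    """Length of the key in bits, assuming UTF-8 encoding of the string."""
    return len(key.encode("utf-8")) * 8


def is_valid_key(key: str) -> bool:
    """True if the key is 128, 192 or 256 bits long."""
    return key_length_bits(key) in VALID_KEY_BITS


if __name__ == "__main__":
    key = "WmZq4t7w!z%C&F)J"  # 16 ASCII characters
    print(key_length_bits(key), is_valid_key(key))  # 128 True
```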
To install Presidio Anonymizer, run the following, preferably in a virtual environment:
pip install presidio-anonymizer
from presidio_anonymizer import AnonymizerEngine
from presidio_anonymizer.entities import RecognizerResult, OperatorConfig

# Initialize the engine.
engine = AnonymizerEngine()

# Invoke the anonymize function with the text,
# analyzer results (potentially coming from presidio-analyzer) and
# operators to get the anonymization output:
result = engine.anonymize(
    text="My name is Bond, James Bond",
    analyzer_results=[
        RecognizerResult(entity_type="PERSON", start=11, end=15, score=0.8),
        RecognizerResult(entity_type="PERSON", start=17, end=27, score=0.8),
    ],
    operators={"PERSON": OperatorConfig("replace", {"new_value": "BIP"})},
)

print(result)
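To show how the operators mapping in this example is meant to be read, the sketch below models the lookup in plain Python: an entity-specific operator first, then a "DEFAULT" entry, then "replace" with the entity type as a fallback. It only illustrates the documented behavior and is not the engine's code; the tuple-based Operator and Entity types and both function names are hypothetical, and only the "replace" operator is handled.

```python
from typing import Dict, List, Tuple

Operator = Tuple[str, dict]    # (operator name, params); stand-in for OperatorConfig
Entity = Tuple[str, int, int]  # (entity_type, start, end); stand-in for RecognizerResult


def resolve_operator(entity_type: str, operators: Dict[str, Operator]) -> Operator:
    """Pick the entity-specific operator, else "DEFAULT", else plain "replace"."""
    return operators.get(entity_type) or operators.get("DEFAULT") or ("replace", {})


def anonymize_replace_only(text: str, entities: List[Entity],
                           operators: Dict[str, Operator]) -> str:
    """Apply "replace" operators to non-overlapping entities."""
    result = text
    # Work from the last entity to the first so earlier offsets stay valid.
    for entity_type, start, end in sorted(entities, key=lambda e: e[1], reverse=True):
        name, params = resolve_operator(entity_type, operators)
        if name != "replace":
            raise NotImplementedError(f"operator not covered by this sketch: {name}")
        new_value = params.get("new_value") or f"<{entity_type}>"
        result = result[:start] + new_value + result[end:]
    return result


if __name__ == "__main__":
    text = "My name is Bond, James Bond"
    entities = [("PERSON", 11, 15), ("PERSON", 17, 27)]
    print(anonymize_replace_only(text, entities, {"PERSON": ("replace", {"new_value": "BIP"})}))
    # My name is BIP, BIP
    print(anonymize_replace_only(text, entities, {}))
    # My name is <PERSON>, <PERSON>
```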
This example takes the output of the AnonymizerEngine with encrypted PII entities and decrypts it back to the original text:
from presidio_anonymizer import DeanonymizeEngine
from presidio_anonymizer.entities import OperatorResult, OperatorConfig

# Initialize the engine.
engine = DeanonymizeEngine()

# Invoke the deanonymize function with the text, anonymizer results and
# operators to define the deanonymization type.
result = engine.deanonymize(
    text="My name is S184CMt9Drj7QaKQ21JTrpYzghnboTF9pn/neN8JME0=",
    entities=[
        OperatorResult(start=11, end=55, entity_type="PERSON"),
    ],
    operators={"DEFAULT": OperatorConfig("decrypt", {"key": "WmZq4t7w!z%C&F)J"})},
)

print(result)
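In the example above, start=11 and end=55 appear to be character offsets into the text, with an exclusive end. When the encrypted fragment is known, those offsets can be derived rather than counted by hand. This is a small standard-library sketch; the locate helper is hypothetical and assumes the fragment occurs in the text.

```python
from typing import Tuple


def locate(text: str, fragment: str) -> Tuple[int, int]:
    """Return (start, end) of the first occurrence of fragment, with end exclusive."""
    start = text.find(fragment)
    if start == -1:
        raise ValueError("fragment not found in text")
    return start, start + len(fragment)


if __name__ == "__main__":
    encrypted = "S184CMt9Drj7QaKQ21JTrpYzghnboTF9pn/neN8JME0="
    text = f"My name is {encrypted}"
    print(locate(text, encrypted))  # (11, 55)
```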
To run the anonymizer with Docker, run the following in the presidio/presidio-anonymizer folder:
docker-compose up -d
See the API Spec for the Anonymizer REST API reference details.