Huge News!Announcing our $40M Series B led by Abstract Ventures.Learn More
Socket
Sign inDemoInstall
Socket

anonymization

Package Overview
Dependencies
Maintainers
3
Alerts
File Explorer

Advanced tools

Socket logo

Install Socket

Detect and block malicious and high-risk dependencies

Install

anonymization

Text anonymization using Faker

  • 0.1.9
  • PyPI
  • Socket score

Maintainers
3

Anonymization

Text anonymization in many languages for python3.6+ using Faker.

Install

pip install anonymization

Example

Replace emails and named entities in english

This example use NamedEntitiesAnonymizer which require spacy and a spacy model.

pip install spacy
python -m spacy download en_core_web_lg
>>> from anonymization import Anonymization, AnonymizerChain, EmailAnonymizer, NamedEntitiesAnonymizer

>>> text = "Hi John,\nthanks for you for subscribing to Superprogram, feel free to ask me any question at secret.mail@Superprogram.com \n Superprogram the best program!"
>>> anon = AnonymizerChain(Anonymization('en_US'))
>>> anon.add_anonymizers(EmailAnonymizer, NamedEntitiesAnonymizer('en_core_web_lg'))
>>> anon.anonymize(text)
'Hi Holly,\nthanks for you for subscribing to Ariel, feel free to ask me any question at shanestevenson@gmail.com \n Ariel the best program!'

Or make it reversible with pseudonymize:

>>> from anonymization import Anonymization, AnonymizerChain, EmailAnonymizer, NamedEntitiesAnonymizer

>>> text = "Hi John,\nthanks for you for subscribing to Superprogram, feel free to ask me any question at secret.mail@Superprogram.com \n Superprogram the best program!"
>>> anon = AnonymizerChain(Anonymization('en_US'))
>>> anon.add_anonymizers(EmailAnonymizer, NamedEntitiesAnonymizer('en_core_web_lg'))
>>> clean_text, patch = anon.pseudonymize(text)

>>> print(clean_text)
'Christopher, \nthanks for you for subscribing to Audrey, feel free to ask me any question at colemanwesley@hotmail.com \n Audrey the best program!'

revert_text = anon.revert(clean_text, patch)

>>> print(text == revert_text)
true

Replace a french phone number with a fake one

Our solution supports many languages along with their specific information formats.

For example, we can generate a french phone number:

>>> from anonymization import Anonymization, PhoneNumberAnonymizer
>>>
>>> text = "C'est bien le 0611223344 ton numéro ?"
>>> anon = Anonymization('fr_FR')
>>> phoneAnonymizer = PhoneNumberAnonymizer(anon)
>>> phoneAnonymizer.anonymize(text)
"C'est bien le 0144939332 ton numéro ?"

More examples in /examples

Included anonymizers

Files

namelang
FilePathAnonymizer-

Internet

namelang
EmailAnonymizer-
UriAnonymizer-
MacAddressAnonymizer-
Ipv4Anonymizer-
Ipv6Anonymizer-

Phone numbers

namelang
PhoneNumberAnonymizer47+
msisdnAnonymizer47+

Date

namelang
DateAnonymizer-

Other

namelang
NamedEntitiesAnonymizer7+
DictionaryAnonymizer-
SignatureAnonymizer7+

Custom anonymizers

Custom anonymizers can be easily created to fit your needs:

class CustomAnonymizer():
    def __init__(self, anonymization: Anonymization):
        self.anonymization = anonymization

    def anonymize(self, text: str) -> str:
        return modified_text
        # or replace by regex patterns in text using a faker provider
        return self.anonymization.regex_anonymizer(text, pattern, provider)
        # or replace all occurences using a faker provider
        return self.anonymization.replace_all(text, matchs, provider)

You may also add new faker provider with the helper Anonymization.add_provider(FakerProvider) or access the faker instance directly Anonymization.faker.

Benchmark

This module is benchmarked on synth_dataset from presidio-research and returns accuracy result(0.79) better than Microsoft's solution(0.75)

You can run the benchmark using docker:

docker build . -f ./benchmark/dockerfile -t anonbench
docker run -it --rm --name anonbench anonbench

License

MIT

FAQs


Did you know?

Socket

Socket for GitHub automatically highlights issues in each pull request and monitors the health of all your open source dependencies. Discover the contents of your packages and block harmful activity before you install or update your dependencies.

Install

Related posts

SocketSocket SOC 2 Logo

Product

  • Package Alerts
  • Integrations
  • Docs
  • Pricing
  • FAQ
  • Roadmap
  • Changelog

Packages

npm

Stay in touch

Get open source security insights delivered straight into your inbox.


  • Terms
  • Privacy
  • Security

Made with ⚡️ by Socket Inc