structured-data-transformer


Declarative transformation tool for JSON, CSV, and filenames with stable mappings - ideal for anonymization and data manipulation

Version 0.2.1 (pip, PyPI), 2 maintainers

Structured Data Transformer

Anonymize sensitive fields in JSON, CSV, and filenames — with stable mappings for reversible anonymization.

Installation

pip install .

Or for development:

pip install -e '.[dev]'

Usage (CLI)

sdt --config CONFIG --base-dir BASE_DIR

Usage help:

sdt -h

Output:

usage: sdt [-h] --config CONFIG --base-dir BASE_DIR [--cache-in CACHE_IN] [--cache-out CACHE_OUT] [--reverse-cache]

Apply structured data transforms in place.

options:
  -h, --help            show this help message and exit
  --config CONFIG, -c CONFIG
                        Path to the JSON config file to load.
  --base-dir BASE_DIR, -d BASE_DIR
                        Base directory containing the data to transform in place.
  --cache-in CACHE_IN, -ci CACHE_IN
                        Optional path to input JSON cache file for stable anonymizer.
  --cache-out CACHE_OUT, -co CACHE_OUT
                        Optional path to output JSON cache file for stable anonymizer.
  --reverse-cache, -r   Reverse keys and values in the input cache (for decoding instead of encoding).

Examples

The examples folder contains a simple input, output, config, and key.json.

Anonymize:

sdt --config examples/simple/config.json --base-dir examples/simple/input

Anonymize using an existing key:

sdt -c examples/simple/config.json -d examples/simple/input -ci examples/simple/key.json

Reverse:

sdt -c examples/simple/config.json -d examples/simple/output -r -ci examples/simple/key.json
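The same round trip can be scripted. The sketch below only assembles argument lists from the flags listed in the help output; build_sdt_command, run_sdt, and the work/ paths are hypothetical names chosen for illustration. It assumes that a key written with --cache-out can later be passed to --cache-in together with --reverse-cache.

```python
import subprocess
from typing import List, Optional


def build_sdt_command(config: str, base_dir: str,
                      cache_in: Optional[str] = None,
                      cache_out: Optional[str] = None,
                      reverse: bool = False) -> List[str]:
    """Assemble an sdt argument list using only flags from `sdt -h`."""
    cmd = ["sdt", "--config", config, "--base-dir", base_dir]
    if cache_in:
        cmd += ["--cache-in", cache_in]
    if cache_out:
        cmd += ["--cache-out", cache_out]
    if reverse:
        cmd.append("--reverse-cache")
    return cmd


def run_sdt(cmd: List[str]) -> None:
    """Run sdt; it must be on PATH. Data is transformed in place, so use a copy."""
    subprocess.run(cmd, check=True)


# Hypothetical working copy of the data and key location.
encode = build_sdt_command("examples/simple/config.json", "work/data",
                           cache_out="work/key.json")
decode = build_sdt_command("examples/simple/config.json", "work/data",
                           cache_in="work/key.json", reverse=True)
# run_sdt(encode); run_sdt(decode)  # uncomment once sdt is installed
```

Because sdt rewrites files in place, pointing --base-dir at a copy of the data keeps the originals intact while experimenting.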

What it does

  • Anonymizes fields in JSON, CSV, and filenames with pattern rules.
  • Handles Bitcoin transactions, addresses, company names, emails, etc.
  • Keeps empty values unchanged.
  • Maintains a stable mapping so each value is always replaced the same way.
  • Saves the key for reuse so data can be deanonymized with --reverse-cache.
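The stable-mapping idea can be illustrated with a small sketch. This is not the package's implementation: StableMapper, reverse_cache, and the anon_N replacement scheme are hypothetical, and the cache is assumed to be a flat JSON object mapping each original value to its replacement.

```python
import json
from typing import Dict, Optional


class StableMapper:
    """Hypothetical illustration of a stable mapping backed by a dict cache."""

    def __init__(self, cache: Optional[Dict[str, str]] = None, prefix: str = "anon"):
        self.cache: Dict[str, str] = dict(cache or {})
        self.prefix = prefix

    def map(self, value: str) -> str:
        if value == "":
            return value  # empty values stay unchanged
        if value not in self.cache:
            # First time this value is seen: assign the next replacement.
            self.cache[value] = f"{self.prefix}_{len(self.cache) + 1}"
        return self.cache[value]


def reverse_cache(cache: Dict[str, str]) -> Dict[str, str]:
    """Swap keys and values so replacements map back to the originals."""
    return {replacement: original for original, replacement in cache.items()}


mapper = StableMapper()
first = mapper.map("alice@example.com")
second = mapper.map("bob@example.com")
again = mapper.map("alice@example.com")   # same replacement as `first`

saved_key = json.dumps(mapper.cache)      # what a saved key file might hold
decoder = StableMapper(reverse_cache(json.loads(saved_key)))
restored = decoder.map(first)             # "alice@example.com"
```

Reversing only works when replacements are unique, which is why the sketch derives each one from a counter.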

Customization

Custom transforms

You can add your own transform function. It does not need to anonymize; it can transform a field into any form. A transform function takes and returns Optional[str | int | float | bool] (JSON primitives). The CSV and path adapters only ever pass str values.

To register a transform function, call register_transform(func: callable, name: str).

An example custom transform is uppercase:

from typing import Optional
from structured_data_transformer.transforms import register_transform
from structured_data_transformer.types import JSONPrimitive


def uppercase(value: Optional[JSONPrimitive]) -> Optional[JSONPrimitive]:
    if value:
        return str(value).upper()
    return value


register_transform(uppercase, "uppercase")
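A transform can also leave non-string values untouched, which matters for JSON input where numbers and booleans may appear. The sketch below is a hypothetical mask_tail transform. It defines a local JSONPrimitive alias as a stand-in for the package's type so that it runs without the package, and it registers itself only if structured_data_transformer can be imported.

```python
from typing import Optional, Union

# Local stand-in for structured_data_transformer.types.JSONPrimitive.
JSONPrimitive = Union[str, int, float, bool]


def mask_tail(value: Optional[JSONPrimitive]) -> Optional[JSONPrimitive]:
    """Keep the first two characters of a string and mask the rest."""
    if not isinstance(value, str) or value == "":
        return value  # None, numbers, booleans, and empty strings pass through
    return value[:2] + "*" * (len(value) - 2)


try:
    from structured_data_transformer.transforms import register_transform
except ImportError:
    register_transform = None  # package not installed; skip registration

if register_transform is not None:
    register_transform(mask_tail, "mask_tail")

masked = mask_tail("secret")      # "se****"
untouched = mask_tail(42)         # 42
```

Checking the type explicitly avoids turning numbers or booleans into strings, which a bare str(value) conversion would do.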
