
redact-phi

Assists in the redaction of PHI in structured or semi-structured data, supporting Excel (XLSX), CSV, or JSON.

  • 1.0.1
  • latest

redact-PHI

A command-line utility to remove, redact and fabricate PHI, PII or protected data (CSV, JSON, XLSX).

Online Demo

Browser-based version

Local Installation

Install globally using npm i -g redact-phi

Usage

Removing PII from CSV

redact [options] <infile> [outfile]

Options:
  -V, --version            output the version number
  --delimiter <delimiter>  sets delimiter type (default: ",")
  --override <override>    Provide the location of your redaction specification
  -h, --help               display help for command

Redaction Example

redact-PHI includes an example in ./example/:

  • example.csv - CSV source file (data to be redacted)
  • example.json - JSON redaction strategy (defines how columns will be redacted with: faker, a custom redactor or a constant)
  • example.js - JavaScript custom redactors (this file is optional)

The .csv, .json and .js files must share the same base name (e.g. data.csv, data.json, data.js).
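
The naming convention can be illustrated with a small sketch. The helper name siblingFiles is hypothetical, and how redact-PHI itself locates the two companion files is not described here; this only shows which paths the convention implies for a given CSV.

```javascript
const path = require('node:path');

// Hypothetical helper (not part of redact-phi): given the CSV path, derive
// the spec and custom-redactor paths that share its base name.
function siblingFiles(csvPath) {
  const { dir, name } = path.parse(csvPath);
  return {
    spec: path.join(dir, `${name}.json`),    // redaction strategy
    redactors: path.join(dir, `${name}.js`), // optional custom redactors
  };
}

console.log(siblingFiles(path.join('example', 'example.csv')));
```

A spec stored elsewhere would presumably be passed with the --override option listed under Usage.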


example/example.csv :

id,email,ssn,first name,last name,full address,order date,order amount
783,Nicola42@yahoo.com,706-25-4558,Caroline,Goodwin,82703 Yasmeen Corner Apt. 379,"10/9/2020, 4:48:52 AM",252.94
784,Ned_Bernier13@yahoo.com,282-57-6226,Izabella,Bosco,2930 Alisa Heights Apt. 572,"3/27/2021, 11:51:26 PM",689.40
784,Ned_Bernier13@yahoo.com,282-57-6226,Izabella,Bosco,2930 Alisa Heights Apt. 572,"6/17/2021, 9:44:54 PM",446.29
786,Nathen_Kovacek64@hotmail.com,677-86-2303,Celine,Dicki,68305 Labadie Shoal Suite 608,"4/23/2021, 6:49:34 PM",889.37
...

example/example.json :

{
  "columns": [
    {
      "redactWith": "random.number",
      "columnNum": "",
      "columnKey": "order id",
      "tracked": false
    },
    {
      "redactWith": "customGenerateId",
      "columnNum": 0,
      "columnKey": "",
      "tracked": true
    },
...

The columns array contains objects with properties that describe how each column should be redacted:

  • columnNum - (integer) Identifies the column to be redacted with an integer index (zero-based). (columnNum or columnKey must be set)

  • columnKey - (string) Identifies the column to be redacted using the file's header. (columnNum or columnKey must be set)

  • redactWith - (string) The strategy used to redact a value. There are four options:

    • faker function - Any available faker function, e.g. name.firstName or internet.email
    • faker template - Faker methods in a mustache template, e.g. {{address.streetAddress(true)}} or {{address.city}}, {{address.state}} {{address.zip}}
    • custom JavaScript - For more complicated redacted values, you can call a JavaScript function defined in your .js file (see example.js)
    • constant value - Replaces every value in the column with a constant, e.g. John Doe

  • tracked - (boolean) Tracking preserves the relationships within your data while de-identifying. With tracking enabled, the redaction engine remembers each original value in the column and reuses the same redacted value whenever that original is encountered again. To see this in action, run redact ./example/example.csv and note how the columns with tracking enabled (id, ssn, email) are redacted with the same value in example_redacted.csv.
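
The idea behind tracking can be sketched as a lookup table from original to redacted values. This is a minimal illustration, not redact-PHI's actual implementation; makeTrackedRedactor and the stand-in ID generator are hypothetical names introduced for the example.

```javascript
// Wraps a generator so that a repeated original value always maps to the
// same replacement (an illustration of the "tracked" behaviour).
function makeTrackedRedactor(generate) {
  const seen = new Map(); // original value -> redacted value
  return (original) => {
    if (!seen.has(original)) {
      seen.set(original, generate());
    }
    return seen.get(original);
  };
}

// Stand-in generator for illustration; the tool would use faker, a custom
// redactor, or a constant here.
let nextId = 1000;
const redactId = makeTrackedRedactor(() => String(++nextId));

// The id column from example.csv: 784 appears twice.
const ids = ['783', '784', '784', '786'].map(redactId);
console.log(ids); // both '784' rows receive the same replacement
```

Without tracking, each row would get a fresh value and the two orders for customer 784 could no longer be linked.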


example/example.js :

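// Assumption: the counter lives at module scope; the starting value is illustrative.
let currentId = 0;

module.exports = {
    // Returns the next sequential ID each time it is called.
    customGenerateId: () => {
        return ++currentId;
    },
...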

Custom redactors provide more control over your data. A custom redactor is referenced from your .json file through the redactWith property, for example "redactWith": "customGenerateId".
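
As a further illustration, a custom redactor file might look like the sketch below. The function names and output formats are hypothetical. Because the example above shows a redactor that takes no arguments, and whether the engine passes the original value is not stated here, these functions take none either.

```javascript
// data.js - hypothetical custom redactors for a spec named data.json.
// Names and formats are illustrative and not part of redact-phi.
let currentOrder = 0;

const redactors = {
  // Sequential, zero-padded order reference such as "ORD-000001".
  customOrderRef: () => `ORD-${String(++currentOrder).padStart(6, '0')}`,

  // Fixed placeholder date in ISO format.
  customPlaceholderDate: () => '1970-01-01',
};

module.exports = redactors;
```

A column would then select one of these with "redactWith": "customOrderRef" in data.json.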

De-identification Examples

These redactWith values replace the following PII:

  • First Name: "redactWith": "name.firstName"
  • Last Name: "redactWith": "name.lastName"
  • Full Name: "redactWith": "name.findName"
  • Social Security number: "redactWith": "{{datatype.number({\"min\":100,\"max\":999})}}-{{datatype.number({\"min\":10,\"max\":99})}}-{{datatype.number({\"min\":1000,\"max\":9999})}}"
  • Email Address: "redactWith": "internet.email"
  • Phone / Fax number: "redactWith": "phone.phoneNumber"
  • Street Address: "redactWith": "address.streetAddress"
  • City: "redactWith": "address.city"
  • Zip Code: "redactWith": "address.zipCode"
  • City, State, Zip: "redactWith": "{{address.city()}}, {{address.stateAbbr()}} {{address.zipCode()}}"
  • Full Address: "redactWith": "{{address.streetAddress(true)}}"
  • IP: "redactWith": "internet.ip"
  • IP v6: "redactWith": "internet.ipv6"

Full De-identification example.json:

{
  "columns": [
    {
      "redactWith": "name.firstName",
      "columnNum": "",
      "columnKey": "first name",
      "tracked": false
    },
    {
      "redactWith": "name.lastName",
      "columnNum": "",
      "columnKey": "last name",
      "tracked": false
    },
    {
      "redactWith": "name.findName",
      "columnNum": "",
      "columnKey": "full name",
      "tracked": false
    },
    {
      "redactWith": "{{datatype.number({\"min\":100,\"max\":999})}}-{{datatype.number({\"min\":10,\"max\":99})}}-{{datatype.number({\"min\":1000,\"max\":9999})}}",
      "columnNum": "",
      "columnKey": "social security number",
      "tracked": false
    },
    {
      "redactWith": "internet.email",
      "columnNum": "",
      "columnKey": "email",
      "tracked": false
    },
    {
      "redactWith": "phone.phoneNumber",
      "columnNum": "",
      "columnKey": "phone",
      "tracked": false
    },
    {
      "redactWith": "address.city",
      "columnNum": "",
      "columnKey": "city",
      "tracked": false
    },
    {
      "redactWith": "address.zipCode",
      "columnNum": "",
      "columnKey": "zip",
      "tracked": false
    },
    {
      "redactWith": "address.streetAddress",
      "columnNum": "",
      "columnKey": "street address",
      "tracked": false
    },
    {
      "redactWith": "{{address.city()}}, {{address.stateAbbr()}} {{address.zipCode()}}",
      "columnNum": "",
      "columnKey": "city state zip",
      "tracked": false
    },
    {
      "redactWith": "{{address.streetAddress(true)}}",
      "columnNum": "",
      "columnKey": "full address",
      "tracked": false
    },
    {
      "redactWith": "internet.ip",
      "columnNum": "",
      "columnKey": "ip",
      "tracked": false
    },
    {
      "redactWith": "internet.ipv6",
      "columnNum": "",
      "columnKey": "ipv6",
      "tracked": false
    }
  ]
}
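
Before running a redaction, it can help to check a spec against the rule that every column needs columnNum or columnKey. The sketch below is an assumption-based check written for illustration; it is not the JSON Schema the package provides, and validateSpec is a hypothetical name.

```javascript
// Returns a list of human-readable problems found in a redaction spec.
// Only checks the rules stated in this README.
function validateSpec(spec) {
  if (!spec || !Array.isArray(spec.columns)) {
    return ['spec.columns must be an array'];
  }
  const problems = [];
  spec.columns.forEach((col, i) => {
    if (col === null || typeof col !== 'object') {
      problems.push(`columns[${i}]: must be an object`);
      return;
    }
    // columnNum is a zero-based integer index; columnKey is a header name.
    const hasNum = Number.isInteger(col.columnNum) && col.columnNum >= 0;
    const hasKey = typeof col.columnKey === 'string' && col.columnKey.trim() !== '';
    if (!hasNum && !hasKey) {
      problems.push(`columns[${i}]: set columnNum or columnKey`);
    }
    if (typeof col.redactWith !== 'string') {
      problems.push(`columns[${i}]: redactWith must be a string`);
    }
  });
  return problems;
}

console.log(validateSpec({
  columns: [
    { redactWith: 'name.firstName', columnNum: '', columnKey: 'first name', tracked: false },
    { redactWith: 'internet.email', columnNum: '', columnKey: '', tracked: false },
  ],
}));
```

The second column in this usage example is reported because neither identifier is set.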

Redaction Strategy JSON Schema

JSON Schema for creating a spec


Package last updated on 29 Nov 2021
