redact-PHI
A command-line utility to redact and fabricate PHI, PII or protected data (CSV, JSON, XLSX).
Installation
Install globally using npm i -g redact-phi
Usage
% redact --help
Usage: redact [options] <infile> [outfile]
Arguments:
  infile                   File to redact
  outfile                  Output file, defaults to {input_filename_redacted.csv}
Options:
  -V, --version            output the version number
  --delimiter <delimiter>  sets delimiter type (default: ",")
  --override <override>    Provide the location of your redaction specification
  -h, --help               display help for command
Redaction Example
redact-PHI includes an example (./example/):
- example.csv - CSV source file (data to be redacted)
- example.json - JSON redaction strategy (defines how columns will be redacted with faker, a custom redactor or a constant)
- example.js - JavaScript custom redactors (this file is optional)
The .csv, .json and .js files must all be named the same (e.g. data.csv, data.json, data.js).
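The naming rule above can be illustrated with a small sketch that derives the companion file names from the input file. The helper name companionFiles is hypothetical, and placing the default output next to the input file is an assumption; this is not redact-PHI's own code.

```javascript
const path = require('path');

// Hypothetical helper: derive the file names that share the input file's base name.
function companionFiles(infile) {
  const { dir, name } = path.parse(infile);
  return {
    // Redaction strategy, e.g. data.json for data.csv
    spec: path.join(dir, `${name}.json`),
    // Optional custom redactors, e.g. data.js for data.csv
    redactors: path.join(dir, `${name}.js`),
    // Assumption: the default output is written alongside the input file
    defaultOutput: path.join(dir, `${name}_redacted.csv`),
  };
}

console.log(companionFiles('example/example.csv'));
```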
example/example.csv:
id,email,ssn,first name,last name,full address,order date,order amount
783,Nicola42@yahoo.com,706-25-4558,Caroline,Goodwin,82703 Yasmeen Corner Apt. 379,"10/9/2020, 4:48:52 AM",252.94
784,Ned_Bernier13@yahoo.com,282-57-6226,Izabella,Bosco,2930 Alisa Heights Apt. 572,"3/27/2021, 11:51:26 PM",689.40
784,Ned_Bernier13@yahoo.com,282-57-6226,Izabella,Bosco,2930 Alisa Heights Apt. 572,"6/17/2021, 9:44:54 PM",446.29
786,Nathen_Kovacek64@hotmail.com,677-86-2303,Celine,Dicki,68305 Labadie Shoal Suite 608,"4/23/2021, 6:49:34 PM",889.37
...
example/example.json:
{
  "columns": [
    {
      "redactWith": "random.number",
      "columnNum": "",
      "columnKey": "order id",
      "tracked": false
    },
    {
      "redactWith": "customGenerateId",
      "columnNum": 0,
      "columnKey": "",
      "tracked": true
    },
    ...
The columns array contains objects with properties that describe how each column should be redacted:
- columnNum - (integer) Identifies the column to be redacted by its zero-based index. (columnNum or columnKey must be set)
- columnKey - (string) Identifies the column to be redacted using the file's header. (columnNum or columnKey must be set)
- redactWith - (string) The strategy used to redact a value. There are four options:
  - faker function - e.g. name.firstName or internet.email
  - faker template - e.g. {{address.streetAddress(true)}}
  - custom JavaScript - If you require a more complicated redacted value, you can call a JavaScript function you've defined in your .js file (see example.js)
  - constant value - To replace every value in this column with a constant, simply enter a string, e.g. John Doe
- tracked - Tracking preserves the relationships within your data while de-identifying. With tracking enabled, the redaction engine tracks a column's original value and reuses the redacted value if the original is re-encountered. To see this in action, run redact ./example/example.csv and note how columns with tracking enabled (id, ssn, email) are redacted with the same value in example_redacted.csv.
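The column lookup and tracking behaviour described above can be sketched as follows. This is a minimal illustration, not redact-PHI's actual implementation: resolveColumnIndex and withTracking are hypothetical names, and preferring columnKey over columnNum when both are set is an assumption.

```javascript
// Hypothetical sketch: not redact-PHI's actual implementation.

// Resolve a column spec to a zero-based index.
// Assumption: a non-empty columnKey takes precedence over columnNum.
function resolveColumnIndex(spec, header) {
  if (typeof spec.columnKey === 'string' && spec.columnKey !== '') {
    return header.indexOf(spec.columnKey); // -1 when the header has no such column
  }
  return Number.isInteger(spec.columnNum) ? spec.columnNum : -1;
}

// Wrap a redactor so a repeated original value always maps to the same redacted value.
function withTracking(redactor) {
  const seen = new Map();
  return (original) => {
    if (!seen.has(original)) {
      seen.set(original, redactor(original));
    }
    return seen.get(original);
  };
}

let nextId = 0;
const redactId = withTracking(() => ++nextId);

const header = ['id', 'email'];
const rows = [
  ['783', 'a@example.com'],
  ['784', 'b@example.com'],
  ['784', 'b@example.com'],
];
const idColumn = resolveColumnIndex({ columnNum: 0, columnKey: '' }, header);

// The two rows that share the original id 784 receive the same redacted id.
const redactedIds = rows.map((row) => redactId(row[idColumn]));
console.log(redactedIds); // [ 1, 2, 2 ]
```

Without the tracking wrapper, each call would produce a fresh value and the two rows for the same customer could no longer be joined after redaction.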
example/example.js:
module.exports = {
  customGenerateId: () => {
    return ++currentId;
  },
  ...
Custom redactors provide more control over your data. A custom redactor is referenced by name from your .json file using the redactWith property, for example: "redactWith": "customGenerateId"
JSON schema for creating a redaction strategy
JSON Schema for creating a spec