core-suite
Core suite of tests for Dataproofer. These tests relate to common problems and data checks — namely, making sure data has not been truncated by looking for specific cut-off indicators.
Table of Contents
Tests
columnsContainNothing.js
Calculates the percentage of rows that are empty for each column
Parameters
rows
Array an array of objects representing rows in the spreadsheet
columnHeads
Array an array of strings for column names of the spreadsheet
Returns Object describing the result
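As a rough illustration of the calculation described above, here is a minimal sketch. It is not the code in columnsContainNothing.js: the function name `percentEmptyByColumn`, the plain-object return shape, and the rule that a cell is "empty" when it is undefined, null, or blank after trimming are all assumptions made for this example.

```javascript
// Hypothetical helper for illustration; not the suite's actual implementation.
// Assumption: a cell is "empty" when it is undefined, null, or blank after trimming.
function percentEmptyByColumn(rows, columnHeads) {
  const result = {};
  for (const column of columnHeads) {
    let empty = 0;
    for (const row of rows) {
      const cell = row[column];
      if (cell === undefined || cell === null || String(cell).trim() === "") {
        empty += 1;
      }
    }
    // Guard against division by zero when the spreadsheet has no rows.
    result[column] = rows.length === 0 ? 0 : (empty / rows.length) * 100;
  }
  return result;
}

const sampleRows = [
  { name: "Ada", city: "" },
  { name: "Grace", city: "Arlington" },
  { name: "", city: "  " },
  { name: "Linus", city: null },
];
// name has 1 empty cell out of 4 (25), city has 3 out of 4 (75).
console.log(percentEmptyByColumn(sampleRows, ["name", "city"]));
```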
columnsContainsSpecialChars.js
Calculates the percentage of rows that contain special, non-typical Latin characters for each column
Source: http://www.w3schools.com/charsets/ref_html_utf8.asp
Parameters
rows
Array an array of objects representing rows in the spreadsheet
columnHeads
Array an array of strings for column names of the spreadsheet
Returns Object describing the result
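One way to approximate this check is sketched below. The definition of "special" used here (any character outside the 7-bit ASCII range) is an assumption for the example, as are the function name `percentSpecialByColumn` and the return shape; the actual test may use a different character set.

```javascript
// Hypothetical sketch; the real test may define "special" characters differently.
// Assumption: a special character is anything outside 7-bit ASCII (U+0000 to U+007F).
const NON_ASCII = /[^\x00-\x7F]/;

function percentSpecialByColumn(rows, columnHeads) {
  const result = {};
  for (const column of columnHeads) {
    let special = 0;
    for (const row of rows) {
      const cell = row[column];
      // Treat missing cells as empty text so they never match.
      const text = cell === undefined || cell === null ? "" : String(cell);
      if (NON_ASCII.test(text)) {
        special += 1;
      }
    }
    result[column] = rows.length === 0 ? 0 : (special / rows.length) * 100;
  }
  return result;
}

console.log(
  percentSpecialByColumn([{ name: "José" }, { name: "Jose" }], ["name"])
);
```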
stringsHaveExactly255Characters.js
src/stringsHaveExactly255Characters.js:14-66
Determines which cells have exactly 255 characters (SQL upper limit error). See the Integrity Checks section of the ProPublica Data Bulletproofing Guide for further information:
https://github.com/propublica/guides/blob/master/data-bulletproofing.md#integrity-checks-for-every-data-set
Parameters
rows
Array an array of objects representing rows in the spreadsheet
columnHeads
Array an array of strings for column names of the spreadsheet
Returns Object describing the result
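The check described above could look roughly like the following. This is a sketch under assumptions, not the code in src/stringsHaveExactly255Characters.js: the function name `findCellsWith255Characters` and the list-of-hits return value are made up for the example.

```javascript
// Hypothetical sketch of the 255-character check.
const SUSPECT_LENGTH = 255;

function findCellsWith255Characters(rows, columnHeads) {
  const hits = [];
  rows.forEach((row, rowIndex) => {
    for (const column of columnHeads) {
      const cell = row[column];
      // String length in JavaScript counts UTF-16 code units, which only
      // approximates the character count a database would use.
      if (typeof cell === "string" && cell.length === SUSPECT_LENGTH) {
        hits.push({ rowIndex, column });
      }
    }
  });
  return hits;
}

const longNote = "x".repeat(255);
console.log(
  findCellsWith255Characters(
    [{ id: "1", note: longNote }, { id: "2", note: "short" }],
    ["id", "note"]
  )
);
```

Returning the row index and column name for each hit makes it possible to highlight the suspicious cells rather than only reporting that the test failed.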
maxBigInteger.js
src/maxBigInteger.js:15-71
Indicates a bigint at its upper signed limit (MySQL or PostgreSQL) of 9,223,372,036,854,775,807 or its upper unsigned limit (MySQL) of 18,446,744,073,709,551,615.
Common database programs, like MySQL, have a cap on how large a number they can store.
Please see the MySQL documentation or PostgreSQL documentation for more information.
Parameters
rows
Array an array of objects representing rows in the spreadsheet
columnHeads
Array an array of strings for column names of the spreadsheet
Returns Object describing the result
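Both bigint limits are larger than Number.MAX_SAFE_INTEGER, so a plain numeric comparison in JavaScript loses precision. The sketch below compares with BigInt instead. It is an illustration only: the helper name `isAtBigIntLimit` is hypothetical, and it assumes cells arrive as strings, possibly with thousands separators.

```javascript
// Hypothetical sketch; assumes cells are strings such as "9223372036854775807".
const SIGNED_BIGINT_MAX = 9223372036854775807n; // 2^63 - 1
const UNSIGNED_BIGINT_MAX = 18446744073709551615n; // 2^64 - 1

function isAtBigIntLimit(cell) {
  // Strip surrounding whitespace and thousands separators before parsing.
  const text = String(cell).trim().replace(/,/g, "");
  // Only non-negative whole numbers can equal either limit.
  if (!/^\d+$/.test(text)) {
    return false;
  }
  const value = BigInt(text);
  return value === SIGNED_BIGINT_MAX || value === UNSIGNED_BIGINT_MAX;
}

console.log(isAtBigIntLimit("9223372036854775807"));
console.log(isAtBigIntLimit("18,446,744,073,709,551,615"));
console.log(isAtBigIntLimit("12345"));
```

Parsing the cell with Number first would round 9223372036854775807 up to 9223372036854775808, which is why the comparison is done on the original text.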
maxInteger.js
src/maxInteger.js:15-71
Indicates an integer at its upper signed limit (MySQL or PostgreSQL) of 2,147,483,647 or its upper unsigned limit (MySQL) of 4,294,967,295.
Common database programs, like MySQL, have a cap on how large a number they can store.
Please see the MySQL documentation for more information.
Parameters
rows
Array an array of objects representing rows in the spreadsheet
columnHeads
Array an array of strings for column names of the spreadsheet
Returns Object describing the result
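A limit check of this kind can be sketched as a scan over every cell. The code below is an assumption-laden illustration, not the contents of src/maxInteger.js: `findCellsAtLimit`, its `limits` parameter, and the shape of the returned hits are hypothetical names chosen for the example.

```javascript
// Hypothetical sketch; the limits are passed in so other caps can reuse the scan.
const INT_LIMITS = [2147483647, 4294967295]; // 2^31 - 1 and 2^32 - 1

function findCellsAtLimit(rows, columnHeads, limits) {
  const hits = [];
  rows.forEach((row, rowIndex) => {
    for (const column of columnHeads) {
      const cell = row[column];
      if (cell === undefined || cell === null || cell === "") {
        continue;
      }
      // Remove thousands separators; non-numeric text becomes NaN and never matches.
      const value = Number(String(cell).replace(/,/g, ""));
      if (limits.includes(value)) {
        hits.push({ rowIndex, column, value });
      }
    }
  });
  return hits;
}

console.log(
  findCellsAtLimit(
    [{ id: "2147483647", views: "5" }, { id: "12", views: "4,294,967,295" }],
    ["id", "views"],
    INT_LIMITS
  )
);
```

Both limits fit well within Number.MAX_SAFE_INTEGER, so ordinary number comparison is safe here.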
maxSmallInteger.js
src/maxSmallInteger.js:15-71
Indicates a smallint at its upper signed limit (MySQL or PostgreSQL) of 32,767 or its upper unsigned limit (MySQL) of 65,535.
Common database programs, like MySQL, have a cap on how large a number they can store.
Please see the MySQL documentation or PostgreSQL documentation for more information.
Parameters
rows
Array an array of objects representing rows in the spreadsheet
columnHeads
Array an array of strings for column names of the spreadsheet
Returns Object describing the result
maxSummedInteger.js
src/maxSummedInteger.js:15-71
Indicates a summed integer at its upper limit of 2,097,152.
Please see the Integrity Checks section of the ProPublica Data Bulletproofing Guide for more information.
Parameters
rows
Array an array of objects representing rows in the spreadsheet
columnHeads
Array an array of strings for column names of the spreadsheet
Returns Object describing the result
checkDuplicateRows.js
src/checkDuplicateRows.js:13-73
Checks for any duplicate rows in the spreadsheet. Optionally accepts user input, such as selected columns.
Parameters
rows
Array an array of objects representing rows in the spreadsheet
columnHeads
Array an array of strings for column names of the spreadsheet
input
Object accept user input, such as selected columns
Returns Object describing the result
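Duplicate detection is commonly done by building a key from each row and remembering where that key was first seen. The sketch below follows that approach. It is not the code in src/checkDuplicateRows.js: the function name `findDuplicateRows`, the `input.selectedColumns` field, and the return shape are all hypothetical.

```javascript
// Hypothetical sketch; `input.selectedColumns` is an assumed field name.
function findDuplicateRows(rows, columnHeads, input) {
  // Compare only the selected columns when given, otherwise every column.
  const hasSelection =
    input && Array.isArray(input.selectedColumns) && input.selectedColumns.length > 0;
  const columns = hasSelection ? input.selectedColumns : columnHeads;

  const firstSeen = new Map(); // row key -> index of its first occurrence
  const duplicates = [];
  rows.forEach((row, rowIndex) => {
    // JSON.stringify keeps "1" and 1 distinct; missing values serialize as null.
    const key = JSON.stringify(columns.map((column) => row[column]));
    if (firstSeen.has(key)) {
      duplicates.push({ rowIndex, duplicateOf: firstSeen.get(key) });
    } else {
      firstSeen.set(key, rowIndex);
    }
  });
  return duplicates;
}

const people = [
  { name: "Ada", year: "1815" },
  { name: "Ada", year: "1852" },
  { name: "Ada", year: "1815" },
];
console.log(findDuplicateRows(people, ["name", "year"]));
console.log(findDuplicateRows(people, ["name", "year"], { selectedColumns: ["name"] }));
```

With all columns compared, only the third row is a duplicate of the first. Restricting the comparison to `name` flags the second and third rows.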
numberOfRowsIs65k.js
src/numberOfRowsIs65k.js:12-31
Tests whether the number of rows is exactly 65,536, the row limit of older Excel formats, which suggests the data was cut off by Excel.
Parameters
rows
Array an array of objects representing rows in the spreadsheet
columnHeads
Array an array of strings for column names of the spreadsheet
Returns Object describing the result
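The row-count check reduces to a single comparison. A minimal sketch follows, with the hypothetical name `rowCountLooksTruncated`; whether a header row counts toward the total depends on how the spreadsheet was parsed, and this example assumes `rows` holds every row that should be counted.

```javascript
// Hypothetical sketch of the row-count check.
const EXCEL_ROW_CUTOFF = 65536; // 2^16

function rowCountLooksTruncated(rows) {
  return rows.length === EXCEL_ROW_CUTOFF;
}

console.log(rowCountLooksTruncated(new Array(65536).fill({})));
console.log(rowCountLooksTruncated(new Array(100).fill({})));
```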
Development
Getting Started
git clone git@github.com:dataproofer/core-suite.git
cd core-suite
npm install
Writing Tests
Building Docs
We use documentation.js, but have created a handy script for regenerating documentation.
npm run docs
Then open up and check your docs in DOCUMENTATION.md