A simple library for validating data contained in CSV files or similar row-oriented data sources.
This module provides some simple utilities for validating data contained in CSV files, or other similar data sources.
The source code for this module lives at:
https://github.com/alimanfoo/csvvalidator
Please report any bugs or feature requests via the issue tracker there.
This module is registered with the Python package index, so you can do::
    $ easy_install csvvalidator
... or download from http://pypi.python.org/pypi/csvvalidator and install in the usual way::
    $ python setup.py install
If you want the bleeding edge, clone the source code repository::
    $ git clone https://github.com/alimanfoo/csvvalidator.git
    $ cd csvvalidator
    $ python setup.py install
The CSVValidator class is the foundation for all validator objects that are capable of validating CSV data.
You can use the CSVValidator class to dynamically construct a validator, e.g.::
    import sys
    import csv
    from csvvalidator import *

    field_names = (
                   'study_id',
                   'patient_id',
                   'gender',
                   'age_years',
                   'age_months',
                   'date_inclusion'
                   )

    validator = CSVValidator(field_names)

    # basic header and record length checks
    validator.add_header_check('EX1', 'bad header')
    validator.add_record_length_check('EX2', 'unexpected record length')

    # some simple value checks
    validator.add_value_check('study_id', int,
                              'EX3', 'study id must be an integer')
    validator.add_value_check('patient_id', int,
                              'EX4', 'patient id must be an integer')
    validator.add_value_check('gender', enumeration('M', 'F'),
                              'EX5', 'invalid gender')
    validator.add_value_check('age_years', number_range_inclusive(0, 120, int),
                              'EX6', 'invalid age in years')
    validator.add_value_check('date_inclusion', datetime_string('%Y-%m-%d'),
                              'EX7', 'invalid date')

    # a more complicated record check: age_months must fall within the
    # twelve months covered by age_years (this form also handles age_years == 0)
    def check_age_variables(r):
        age_years = int(r['age_years'])
        age_months = int(r['age_months'])
        valid = age_years * 12 <= age_months < (age_years + 1) * 12
        if not valid:
            raise RecordError('EX8', 'invalid age variables')
    validator.add_record_check(check_age_variables)

    # validate the data and write problems to stdout;
    # csv.reader needs an open file (or other iterable of lines), not a path
    with open('/path/to/data.csv') as f:
        data = csv.reader(f, delimiter='\t')
        problems = validator.validate(data)
    write_problems(problems, sys.stdout)
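The record check in the example above is meant to ensure that the two age fields agree: an age in months is consistent with an age in whole years when it falls inside the twelve-month window starting at age_years * 12. A minimal standalone sketch of that rule, independent of csvvalidator, might look like the following. The function name age_variables_consistent is hypothetical and is not part of the module.

```python
def age_variables_consistent(age_years, age_months):
    """Return True if age_months lies within the year given by age_years."""
    lower = age_years * 12   # first month of that year of life
    upper = lower + 12       # first month of the following year
    return lower <= age_months < upper


# (age_years, age_months) pairs, including the age_years == 0 edge case
examples = [(0, 5), (2, 24), (2, 35), (2, 36), (3, 20)]
results = [age_variables_consistent(y, m) for y, m in examples]
print(results)
```

Writing the rule as a range comparison avoids dividing or taking a modulus by age_years, which would fail for infants whose age in years is 0.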
For more complex use cases you can also sub-class CSVValidator to define re-usable validator classes for specific data sources.
For a complete account of all of the functionality available from this module, see the example.py and tests.py modules in the source code repository.
Note that the csvvalidator module is intended to be used in combination with the standard Python csv module. The csvvalidator module will not validate the syntax of a CSV file. Rather, the csvvalidator module can be used to validate any source of row-oriented data, such as is provided by a csv.reader object.
That is, if you want to validate data from a CSV file, you must first construct a CSV reader using the standard Python csv module, specifying the appropriate dialect, and then pass the CSV reader as the source of data to either the CSVValidator.validate or the CSVValidator.ivalidate method.
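To illustrate what "row-oriented data" means here, the sketch below uses only the standard csv module to turn tab-delimited text into rows. The sample data and the use of io.StringIO in place of a real file are assumptions made for illustration; a reader built this way is the kind of object that would be passed on as the data source.

```python
import csv
import io

# Hypothetical tab-delimited sample standing in for a file on disk
raw = "study_id\tpatient_id\tgender\n1\t100\tM\n1\t101\tF\n"

# The dialect details (here just the delimiter) are chosen when the reader is built
reader = csv.reader(io.StringIO(raw), delimiter='\t')

# Each row comes back as a list of strings; the first row is the header
rows = list(reader)
for row in rows:
    print(row)
```

Every value arrives as a string, which is why a value check such as int doubles as both a conversion and a validity test.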