A simple library for validating data contained in CSV files or similar row-oriented data sources.





This module provides some simple utilities for validating data contained in CSV 
files, or other similar data sources.

The source code for this module lives at:

Please report any bugs or feature requests via the issue tracker there.


This module is registered with the Python package index, so you can do::

    $ easy_install csvvalidator

... or download from and
install in the usual way::

    $ python setup.py install

If you want the bleeding edge, clone the source code repository::

    $ git clone git://
    $ cd csvvalidator
    $ python setup.py install


The `CSVValidator` class is the foundation for all validator objects that are 
capable of validating CSV data. 

You can use the CSVValidator class to dynamically construct a validator, e.g.::

    import sys
    import csv
    from csvvalidator import *

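    # field names as used by the checks below
    field_names = (
        'study_id',
        'patient_id',
        'gender',
        'age_years',
        'age_months',
        'date_inclusion',
    )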

    validator = CSVValidator(field_names)
    # basic header and record length checks
    validator.add_header_check('EX1', 'bad header')
    validator.add_record_length_check('EX2', 'unexpected record length')
    # some simple value checks
    validator.add_value_check('study_id', int, 
                              'EX3', 'study id must be an integer')
    validator.add_value_check('patient_id', int, 
                              'EX4', 'patient id must be an integer')
    validator.add_value_check('gender', enumeration('M', 'F'), 
                              'EX5', 'invalid gender')
    validator.add_value_check('age_years', number_range_inclusive(0, 120, int), 
                              'EX6', 'invalid age in years')
    validator.add_value_check('date_inclusion', datetime_string('%Y-%m-%d'),
                              'EX7', 'invalid date')
    # a more complicated record check
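    def check_age_variables(r):
        age_years = int(r['age_years'])
        age_months = int(r['age_months'])
        # age in months must fall within the year given by age in years;
        # written without a modulo so that age_years == 0 does not divide by zero
        valid = age_years * 12 <= age_months < (age_years + 1) * 12
        if not valid:
            raise RecordError('EX8', 'invalid age variables')
    validator.add_record_check(check_age_variables)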

    # validate the data and write problems to stdout
    # (csv.reader needs an open file or other iterable of lines, not a path)
    with open('/path/to/data.csv') as f:
        data = csv.reader(f, delimiter='\t')
        problems = validator.validate(data)
    write_problems(problems, sys.stdout)

For more complex use cases you can also sub-class `CSVValidator` to define 
re-usable validator classes for specific data sources.

For a complete account of all of the functionality available from this module, 
see the and modules in the source code repository.


Note that the `csvvalidator` module is intended to be used in combination with 
the standard Python `csv` module. The `csvvalidator` module **will not** 
validate the *syntax* of a CSV file. Rather, the `csvvalidator` module can be 
used to validate any source of row-oriented data, such as is provided by a 
`csv.reader` object.

In other words, to validate data from a CSV file, first construct a CSV reader 
using the standard Python `csv` module, specifying the appropriate dialect, and 
then pass that reader as the source of data to either the 
`CSVValidator.validate` or the `CSVValidator.ivalidate` method.
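
As a minimal sketch of the first half of that workflow, the example below uses only the standard `csv` module and an in-memory sample. The sample data and the `read_rows` helper are illustrative assumptions, not part of `csvvalidator`. The reader is built with an explicit tab delimiter, and the rows it yields (plain lists of strings) are the kind of row-oriented data that would then be handed to `CSVValidator.validate`:

```python
import csv
import io

# Hypothetical tab-delimited sample standing in for a file on disk.
SAMPLE = "study_id\tpatient_id\tgender\n1\t100\tM\n2\t101\tF\n"


def read_rows(text, delimiter="\t"):
    """Parse delimited text into a list of rows (each a list of strings)."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    return list(reader)


rows = read_rows(SAMPLE)
# The first row is the header; the remaining rows are data records.
header, records = rows[0], rows[1:]
print(header)
print(records)
```

All syntax handling (delimiter, quoting, dialect) stays with `csv.reader`, so the validator only ever sees rows that have already been parsed.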

