Simple light CSV reader
This CSV reader is implemented in pure Python. It lets you specify a separator, a quote character and column titles (or take the first row as titles). Nothing more, nothing less.
Usage is pretty straightforward:
from lightcsv import LightCSV

for row in LightCSV().read_file("myfile.csv"):
    print(row)
This will open a file named myfile.csv and iterate over it, returning each row as a key-value dictionary. Line endings can be either \n or \r\n. The file will be opened in text mode with utf-8 encoding.
You can also supply your own stream (an open file instead of a filename). This is useful, for example, to open a file with a different encoding:
from lightcsv import LightCSV

with open("myfile.csv") as f:
    for row in LightCSV().read(f):
        print(row)
NOTE: Blank lines at any point in the file are ignored.
LightCSV can be parametrized during initialization to fine-tune its behaviour.
The following example shows initialization with default parameters:
from lightcsv import LightCSV

myCSV_reader = LightCSV(
    separator=",",
    quote_char='"',
    field_names=None,
    strict=True,
    has_headers=False
)
Available settings:
separator: character used as the separator (defaults to ,).
quote_char: character used to quote strings (defaults to "). This char can be escaped by duplicating it.
field_names: can be any iterable or sequence of str (i.e. a list of strings). If set, these will be used as column titles (dictionary keys), and they also set the expected number of columns.
strict: sets whether the parser runs in strict mode or not. In strict mode the parser will raise a ValueError exception if a cell cannot be decoded or the number of columns doesn't match. In non-strict mode, unrecognized cells are returned as strings. If there are more columns than expected, the extra ones are ignored. If there are fewer, the dictionary will contain fewer values.
has_headers: whether the first row should be taken as column titles or not. If set, field_names cannot be specified. If not set, and no field names are specified, the dictionary keys will be just the column positions of the cells.
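The non-strict column handling described above can be pictured with plain Python. This is only a minimal sketch of the documented behaviour, not LightCSV's actual code: the helper name row_to_dict is hypothetical, and numbering column positions from 0 is an assumption.

```python
from typing import Any, Dict, List, Optional, Sequence


def row_to_dict(cells: List[Any], field_names: Optional[Sequence[str]] = None) -> Dict[Any, Any]:
    """Map one row of cells to a dictionary (hypothetical helper, not part of lightcsv)."""
    if field_names is None:
        # No titles available: use column positions as keys (0-based here, an assumption).
        return dict(enumerate(cells))
    # zip() stops at the shorter input, so extra cells are dropped
    # and missing cells simply produce fewer keys.
    return dict(zip(field_names, cells))


titles = ["id", "name", "price"]
print(row_to_dict([1, "apple", 0.5, "extra"], titles))  # extra cell is ignored
print(row_to_dict([2, "pear"], titles))                 # no "price" key at all
print(row_to_dict([3, "plum"]))                         # keys are positions
```

In strict mode the library is documented to raise ValueError on a column count mismatch instead, which this sketch does not model.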
The parser tries to match the following types, in this order:
None (empty values): unlike the standard csv reader, it returns None (null) for empty values. Empty strings ("") are recognized correctly.
str (strings): anything that is quoted with the quote character (" by default). For example, "HELLO ""WORLD""" decodes to the string HELLO "WORLD".
int (integers): an integer with an optional leading sign.
float: any float recognized by Python.
datetime: a datetime in ISO format (with 'T' or whitespace in the middle), like 2022-02-02 22:02:02
date: a date in ISO format, like 2022-02-02
time: a time in ISO format, like 22:02:02
If all these parsing attempts fail, a string will be returned, unless strict mode is enabled (strict=True). In that case, a ValueError exception will be raised.
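To make the matching order concrete, here is a rough standalone sketch that follows the order described above. It is not LightCSV's implementation: the function guess_value and its helpers are hypothetical names, and using the standard library's fromisoformat() parsers for the ISO formats is an assumption.

```python
import re
from datetime import date, datetime, time
from typing import Any

RE_INT = re.compile(r"[+-]?\d+$")  # integer with an optional sign


def _parse_datetime(chunk: str) -> datetime:
    # datetime.fromisoformat() also accepts date-only strings,
    # so leave those to the date parser further down the chain.
    if len(chunk) <= 10:
        raise ValueError("not a full datetime")
    return datetime.fromisoformat(chunk)


def guess_value(chunk: str, quote_char: str = '"', strict: bool = False) -> Any:
    """Decode one raw cell following the documented order (illustrative only)."""
    if chunk == "":
        return None  # empty cell
    if len(chunk) >= 2 and chunk[0] == quote_char and chunk[-1] == quote_char:
        # A doubled quote char inside a quoted string stands for one quote char.
        return chunk[1:-1].replace(quote_char * 2, quote_char)
    if RE_INT.match(chunk):
        return int(chunk)
    for parse in (float, _parse_datetime, date.fromisoformat, time.fromisoformat):
        try:
            return parse(chunk)
        except ValueError:
            continue
    if strict:
        raise ValueError(f"cannot decode cell: {chunk!r}")
    return chunk  # non-strict fallback: return the raw text


for raw in ["", '"HELLO ""WORLD"""', "-42", "3.5", "2022-02-02 22:02:02", "2022-02-02", "22:02:02", "hello"]:
    print(repr(raw), "->", repr(guess_value(raw)))
```

Note the ordering matters: integers are tried before floats so that 42 stays an int, and full datetimes are tried before plain dates.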
You can implement your own deserialization by subclassing LightCSV and overriding the parse_obj() method.
For example, suppose we want to recognize hexadecimal integers in the format 0xNNN... We can implement it this way:
import re
from lightcsv import LightCSV

# matches 0xNNNN (hexadecimals); only hex digits are allowed after the prefix
RE_HEXA = re.compile(r'0[xX][0-9A-Fa-f]+$')

class CSVHexRecognizer(LightCSV):
    def parse_obj(self, lineno: int, chunk: str):
        if RE_HEXA.match(chunk):
            return int(chunk[2:], 16)
        # not a hexadecimal: fall back to the default parsing
        return super().parse_obj(lineno, chunk)
As you can see, you have to override parse_obj(). If your match fails, call the overridden parse_obj() method through super() and return its result.
Python's built-in csv module is a bit over-engineered for simple tasks, and one normally doesn't need all the bells and whistles. With LightCSV you just open a file and iterate over its rows.
Decoding None for empty cells is needed very often, and doing it with the standard csv module can be really cumbersome because it tries hard to cover many corner cases (if that's your case, this tool might not be suitable for you).
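For comparison, the snippet below shows what the standard library does with empty cells in its default configuration, and the kind of post-processing needed to get None instead. It uses only the standard csv module; the sample data is made up for illustration.

```python
import csv
import io

# Made-up sample: one unquoted empty cell and one quoted empty string.
data = 'id,name,comment\n1,apple,\n2,,""\n'

rows = list(csv.DictReader(io.StringIO(data)))
print(rows)  # every value is a str, empty cells come back as ''

# Manual clean-up: turn empty strings into None.
# Values stay strings otherwise, and quoted and unquoted
# empty cells can no longer be told apart at this point.
cleaned = [{key: (value if value != "" else None) for key, value in row.items()} for row in rows]
print(cleaned)
```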