
A module to extend the python json package functionality:
Documentation: https://jsonextended.readthedocs.io
From Conda (recommended):
conda install -c conda-forge jsonextended
From PyPi:
pip install jsonextended
jsonextended has no import dependencies on Python 3.x, and only pathlib2 on 2.7. For full functionality, however, it is advised to install the following packages:
conda install -c conda-forge ijson numpy pint h5py pandas
from jsonextended import edict, plugins, example_mockpaths
Take a directory structure, potentially containing multiple file types:
datadir = example_mockpaths.directory1
print(datadir.to_string(indentlvl=3,file_content=True))
Folder("dir1")
   File("file1.json") Contents:
    {"key2": {"key3": 4, "key4": 5}, "key1": [1, 2, 3]}
   Folder("subdir1")
      File("file1.csv") Contents:
       # a csv file
       header1,header2,header3
       val1,val2,val3
       val4,val5,val6
       val7,val8,val9
      File("file1.literal.csv") Contents:
       # a csv file with numbers
       header1,header2,header3
       1,1.1,string1
       2,2.2,string2
       3,3.3,string3
   Folder("subdir2")
      Folder("subsubdir21")
         File("file1.keypair") Contents:
          # a key-pair file
          key1 val1
          key2 val2
          key3 val3
          key4 val4
Plugins can be defined for parsing each file type (see Creating Plugins section):
plugins.load_builtin_plugins('parsers')
plugins.view_plugins('parsers')
{'csv.basic': 'read *.csv delimited file with headers to {header:[column_values]}',
'csv.literal': 'read *.literal.csv delimited files with headers to {header:column_values}, with number strings converted to int/float',
'hdf5.read': 'read *.hdf5 (in read mode) files using h5py',
'json.basic': 'read *.json files using json.load',
'keypair': "read *.keypair, where each line should be; '<key> <pair>'"}
LazyLoad then takes a path name, path-like object or dict-like object, which will lazily load each file with a compatible plugin.
lazy = edict.LazyLoad(datadir)
lazy
{file1.json:..,subdir1:..,subdir2:..}
LazyLoad can then be treated like a dictionary, or indexed by tab completion:
list(lazy.keys())
['subdir1', 'subdir2', 'file1.json']
lazy[['file1.json','key1']]
[1, 2, 3]
lazy.subdir1.file1_literal_csv.header2
[1.1, 2.2, 3.3]
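The lazy behaviour described above can be pictured with a small stand-alone sketch. It is not how LazyLoad is implemented; LazyDict and load_json are hypothetical names used only for illustration, and the loader function stands in for a parser plugin reading a file.

```python
from collections.abc import Mapping


class LazyDict(Mapping):
    """Hypothetical dict-like object that loads each value on first access."""

    def __init__(self, loaders):
        self._loaders = dict(loaders)  # key -> zero-argument loader function
        self._cache = {}               # key -> value that was already loaded

    def __getitem__(self, key):
        if key not in self._cache:
            # the loader runs only the first time the key is requested
            self._cache[key] = self._loaders[key]()
        return self._cache[key]

    def __iter__(self):
        return iter(self._loaders)

    def __len__(self):
        return len(self._loaders)


calls = []  # records every time the loader actually runs


def load_json():
    calls.append("file1.json")
    return {"key1": [1, 2, 3]}


lazy = LazyDict({"file1.json": load_json})
keys = list(lazy.keys())           # listing keys does not trigger loading
calls_before = list(calls)
value = lazy["file1.json"]["key1"]  # first access runs the loader
lazy["file1.json"]                  # second access is served from the cache
```

The point of the design is that listing keys is cheap, and the cost of parsing a file is paid only when its contents are requested.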
For pretty printing of the dictionary:
edict.pprint(lazy,depth=2)
file1.json:
  key1: [1, 2, 3]
  key2: {...}
subdir1:
  file1.csv: {...}
  file1.literal.csv: {...}
subdir2:
  subsubdir21: {...}
Numerous functions exist to manipulate the nested dictionary:
edict.flatten(lazy.subdir1)
{('file1.csv', 'header1'): ['val1', 'val4', 'val7'],
('file1.csv', 'header2'): ['val2', 'val5', 'val8'],
('file1.csv', 'header3'): ['val3', 'val6', 'val9'],
('file1.literal.csv', 'header1'): [1, 2, 3],
('file1.literal.csv', 'header2'): [1.1, 2.2, 3.3],
('file1.literal.csv', 'header3'): ['string1', 'string2', 'string3']}
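To make the shape of the flattened output concrete, here is a minimal stand-alone sketch of the idea: nested keys are collected into a tuple that records the path to each leaf. The function name flatten_sketch is hypothetical and this is not the library's implementation, which may handle more cases.

```python
def flatten_sketch(d, prefix=()):
    """Flatten nested dicts into one dict keyed by tuples of the key path."""
    flat = {}
    for key, value in d.items():
        path = prefix + (key,)
        if isinstance(value, dict) and value:
            # descend into non-empty sub-dictionaries
            flat.update(flatten_sketch(value, path))
        else:
            # anything else (lists, numbers, strings, empty dicts) is a leaf
            flat[path] = value
    return flat


nested = {
    "file1.csv": {"header1": ["val1", "val4", "val7"]},
    "file1.literal.csv": {"header2": [1.1, 2.2, 3.3]},
}
flat = flatten_sketch(nested)
```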
LazyLoad passes the plugins.decode function to the parser plugin's read_file method (as the 'object_hook' keyword). Bespoke decoder plugins can therefore be set up for specific dictionary key signatures:
print(example_mockpaths.jsonfile2.to_string())
File("file2.json") Contents:
{"key1":{"_python_set_": [1, 2, 3]},"key2":{"_numpy_ndarray_": {"dtype": "int64", "value": [1, 2, 3]}}}
edict.LazyLoad(example_mockpaths.jsonfile2).to_dict()
{u'key1': {u'_python_set_': [1, 2, 3]},
u'key2': {u'_numpy_ndarray_': {u'dtype': u'int64', u'value': [1, 2, 3]}}}
plugins.load_builtin_plugins('decoders')
plugins.view_plugins('decoders')
{'decimal.Decimal': 'encode/decode Decimal type',
'numpy.ndarray': 'encode/decode numpy.ndarray',
'pint.Quantity': 'encode/decode pint.Quantity object',
'python.set': 'decode/encode python set'}
dct = edict.LazyLoad(example_mockpaths.jsonfile2).to_dict()
dct
{u'key1': {1, 2, 3}, u'key2': array([1, 2, 3])}
This process can be reversed, using encoder plugins:
plugins.load_builtin_plugins('encoders')
plugins.view_plugins('encoders')
{'decimal.Decimal': 'encode/decode Decimal type',
'numpy.ndarray': 'encode/decode numpy.ndarray',
'pint.Quantity': 'encode/decode pint.Quantity object',
'python.set': 'decode/encode python set'}
import json
json.dumps(dct,default=plugins.encode)
'{"key2": {"_numpy_ndarray_": {"dtype": "int64", "value": [1, 2, 3]}}, "key1": {"_python_set_": [1, 2, 3]}}'
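The round trip above builds on two hooks of the standard json module: object_hook when loading and default when dumping. The sketch below shows that mechanism for the set signature only, using the standard library alone. decode_set and encode_set are hypothetical helpers written for this illustration; they are not the plugin code shipped with jsonextended.

```python
import json


def decode_set(dct):
    # object_hook is called for every decoded JSON object, innermost first
    if set(dct.keys()) == {"_python_set_"}:
        return set(dct["_python_set_"])
    return dct


def encode_set(obj):
    # default is called only for objects json cannot serialise natively
    if isinstance(obj, set):
        return {"_python_set_": sorted(obj)}
    raise TypeError("cannot serialise {!r}".format(type(obj)))


text = '{"key1": {"_python_set_": [1, 2, 3]}}'
decoded = json.loads(text, object_hook=decode_set)  # key1 becomes a set
encoded = json.dumps(decoded, default=encode_set)   # back to the JSON form
```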
from jsonextended import plugins, utils
Plugins are recognised as classes with a minimal set of attributes matching the plugin category interface:
plugins.view_interfaces()
{'decoders': ['plugin_name', 'plugin_descript', 'dict_signature'],
'encoders': ['plugin_name', 'plugin_descript', 'objclass'],
'parsers': ['plugin_name', 'plugin_descript', 'file_regex', 'read_file']}
plugins.unload_all_plugins()
plugins.view_plugins()
{'decoders': {}, 'encoders': {}, 'parsers': {}}
For example, a simple parser plugin would be:
class ParserPlugin(object):
    plugin_name = 'example'
    plugin_descript = 'a parser for *.example files, that outputs (line_number:line)'
    file_regex = '*.example'
    def read_file(self, file_obj, **kwargs):
        out_dict = {}
        for i, line in enumerate(file_obj):
            out_dict[i] = line.strip()
        return out_dict
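A parser of this kind can be tried on its own before it is registered. The sketch below repeats the class and calls read_file directly on an in-memory file; using io.StringIO as the file object is an assumption made for illustration, on the basis that read_file only iterates over lines.

```python
import io


class ParserPlugin(object):
    plugin_name = 'example'
    plugin_descript = 'a parser for *.example files, that outputs (line_number:line)'
    file_regex = '*.example'

    def read_file(self, file_obj, **kwargs):
        # map each zero-based line number to the stripped line text
        out_dict = {}
        for i, line in enumerate(file_obj):
            out_dict[i] = line.strip()
        return out_dict


# an in-memory stand-in for an open *.example file
file_obj = io.StringIO("first line\nsecond line\n")
result = ParserPlugin().read_file(file_obj)
```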
Plugins can be loaded as a class:
plugins.load_plugin_classes([ParserPlugin],'parsers')
plugins.view_plugins()
{'decoders': {},
'encoders': {},
'parsers': {'example': 'a parser for *.example files, that outputs (line_number:line)'}}
Or by directory (loading all .py files):
fobj = utils.MockPath('example.py',is_file=True,content="""
class ParserPlugin(object):
    plugin_name = 'example.other'
    plugin_descript = 'a parser for *.example.other files, that outputs (line_number:line)'
    file_regex = '*.example.other'
    def read_file(self, file_obj, **kwargs):
        out_dict = {}
        for i, line in enumerate(file_obj):
            out_dict[i] = line.strip()
        return out_dict
""")
dobj = utils.MockPath(structure=[fobj])
plugins.load_plugins_dir(dobj,'parsers')
plugins.view_plugins()
{'decoders': {},
'encoders': {},
'parsers': {'example': 'a parser for *.example files, that outputs (line_number:line)',
'example.other': 'a parser for *.example.other files, that outputs (line_number:line)'}}
For a more complex example of a parser, see jsonextended.complex_parsers
{'a':1,'b':2}
The plugins.decode function will use the method denoted by the intype parameter, e.g. if intype='json', then from_json will be called.
The plugins.encode function will use the method denoted by the outtype parameter, e.g. if outtype='json', then to_json will be called.
For more information, all functions contain docstrings with tested examples.
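The naming convention for decoders can be sketched as a simple lookup of a from_<intype> method. Everything below is an assumption made for illustration: SetDecoder and decode_sketch are hypothetical, dict_signature is assumed to be a tuple of keys, and the real plugins.decode may select and call plugins differently.

```python
class SetDecoder(object):
    # attribute names follow the 'decoders' interface listed earlier
    plugin_name = 'python.set'
    plugin_descript = 'decode a python set'
    dict_signature = ('_python_set_',)  # assumed format: a tuple of keys

    def from_json(self, obj):
        return set(obj['_python_set_'])


def decode_sketch(dct, decoders, intype='json'):
    """Hypothetical dispatcher: call from_<intype> on the first matching plugin."""
    for plugin in decoders:
        if set(dct) == set(plugin.dict_signature):
            return getattr(plugin, 'from_' + intype)(dct)
    return dct  # dictionaries with no matching signature pass through unchanged


result = decode_sketch({'_python_set_': [1, 2, 3]}, [SetDecoder()])
untouched = decode_sketch({'a': 1, 'b': 2}, [SetDecoder()])
```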
from jsonextended import ejson, edict, utils
path = utils.get_test_path()
ejson.jkeys(path)
['dir1', 'dir2', 'dir3']
jdict1 = ejson.to_dict(path)
edict.pprint(jdict1,depth=2)
dir1:
  dir1_1: {...}
  file1: {...}
  file2: {...}
dir2:
  file1: {...}
dir3:
edict.to_html(jdict1,depth=2)
To try the rendered JSON tree, as output in the Jupyter Notebook, go to: https://chrisjsewell.github.io/
jdict2 = ejson.to_dict(path,['dir1','file1'])
edict.pprint(jdict2,depth=1)
initial: {...}
meta: {...}
optimised: {...}
units: {...}
filtered = edict.filter_keys(jdict2,['vol*'],use_wildcards=True)
edict.pprint(filtered)
initial:
  crystallographic:
    volume: 924.62752781
  primitive:
    volume: 462.313764
optimised:
  crystallographic:
    volume: 1063.98960509
  primitive:
    volume: 531.994803
edict.pprint(edict.flatten(filtered))
(initial, crystallographic, volume): 924.62752781
(initial, primitive, volume): 462.313764
(optimised, crystallographic, volume): 1063.98960509
(optimised, primitive, volume): 531.994803
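Wildcard filtering of this kind can be pictured as keeping only the branches that lead to a key matching a glob pattern. The sketch below uses fnmatch from the standard library. The name filter_keys_sketch is hypothetical, the sample data is made up apart from the volume value, and the library's own filter_keys may behave differently in edge cases.

```python
from fnmatch import fnmatch


def filter_keys_sketch(d, patterns):
    """Keep only the branches that end in a key matching a glob pattern."""
    out = {}
    for key, value in d.items():
        if any(fnmatch(str(key), pattern) for pattern in patterns):
            out[key] = value  # matching key: keep it with its whole value
        elif isinstance(value, dict):
            sub = filter_keys_sketch(value, patterns)
            if sub:           # keep a parent only if something matched below it
                out[key] = sub
    return out


# made-up sample data; only the 'volume' entry mirrors the output above
data = {
    "initial": {"crystallographic": {"volume": 924.62752781, "a": 1.0}},
    "meta": {"name": "x"},
}
filtered = filter_keys_sketch(data, ["vol*"])
```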
from jsonextended.units import apply_unitschema, split_quantities
withunits = apply_unitschema(filtered,{'volume':'angstrom^3'})
edict.pprint(withunits)
initial:
  crystallographic:
    volume: 924.62752781 angstrom ** 3
  primitive:
    volume: 462.313764 angstrom ** 3
optimised:
  crystallographic:
    volume: 1063.98960509 angstrom ** 3
  primitive:
    volume: 531.994803 angstrom ** 3
newunits = apply_unitschema(withunits,{'volume':'nm^3'})
edict.pprint(newunits)
initial:
  crystallographic:
    volume: 0.92462752781 nanometer ** 3
  primitive:
    volume: 0.462313764 nanometer ** 3
optimised:
  crystallographic:
    volume: 1.06398960509 nanometer ** 3
  primitive:
    volume: 0.531994803 nanometer ** 3
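The converted values can be checked by hand: 1 angstrom is 0.1 nanometer, so a volume in angstrom ** 3 is multiplied by 0.1 ** 3 = 0.001 to give nanometer ** 3. The sketch below does that arithmetic in plain Python, without a units library; angstrom3_to_nm3 is a hypothetical helper and not part of jsonextended.

```python
import math

ANGSTROM3_IN_NM3 = 1e-3  # (0.1 nm) ** 3, since 1 angstrom = 0.1 nanometer


def angstrom3_to_nm3(volume):
    """Convert a volume from cubic angstroms to cubic nanometers."""
    return volume * ANGSTROM3_IN_NM3


converted = angstrom3_to_nm3(924.62752781)
# compare with the value printed above, allowing for floating-point rounding
matches = math.isclose(converted, 0.92462752781, rel_tol=1e-12)
```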
edict.pprint(split_quantities(newunits),depth=4)
initial:
  crystallographic:
    volume:
      magnitude: 0.92462752781
      units: nanometer ** 3
  primitive:
    volume:
      magnitude: 0.462313764
      units: nanometer ** 3
optimised:
  crystallographic:
    volume:
      magnitude: 1.06398960509
      units: nanometer ** 3
  primitive:
    volume:
      magnitude: 0.531994803
      units: nanometer ** 3