This set of modules provides several benefits, including iteration over very large JSON (mo_json.stream).
Version 6.x.x - The typed encoder no longer encodes to typed multivalues; rather, it encodes to an array of typed values. For example, instead of
{"a": {"~n~": [1, 2]}}
we get
{"a": {"~a~": [{"~n~": 1},{"~n~": 2}]}}
__json__
Add a __json__ method to any class you wish to serialize to JSON. It is incumbent on you to ensure valid JSON is emitted:
class MyClass(object):
    def __init__(self, a, b):
        self.a = a
        self.b = b

    def __json__(self):
        separator = "{"
        for k, v in self.__dict__.items():
            yield separator
            separator = ","
            yield value2json(k)+": "+value2json(v)
        yield "}"
With the __json__ method defined, you may use the value2json function:
from mo_json import value2json
result = value2json(MyClass(a="name", b=42))
__data__
Add a __data__ method that will convert your class into some JSON-serializable data structures. You may find this easier to implement than emitting pure JSON. If both __data__ and __json__ exist, then __json__ is used.
from mo_json import value2json

class MyClass(object):
    def __init__(self, a, b):
        self.a = a
        self.b = b

    def __data__(self):
        return self.__dict__

result = value2json(MyClass(a="name", b=42))
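The precedence between the two hooks can be pictured with a small sketch that uses only the standard library. This is not mo_json's implementation: to_json_sketch, _as_data, OnlyData and Both are hypothetical names, and the sketch only honours __json__ on the top-level value.

```python
import json

def _as_data(obj):
    # json.dumps calls this for any object it cannot encode natively
    if hasattr(obj, "__data__"):
        return obj.__data__()
    raise TypeError(f"{type(obj).__name__} is not JSON serializable")

def to_json_sketch(value):
    """Serialize value, preferring __json__ over __data__ (illustration only)."""
    if hasattr(value, "__json__"):
        # __json__ yields fragments of ready-made JSON text
        return "".join(value.__json__())
    return json.dumps(value, default=_as_data)

class OnlyData:
    def __init__(self, a, b):
        self.a, self.b = a, b

    def __data__(self):
        return self.__dict__

class Both(OnlyData):
    def __json__(self):
        yield '{"from": "__json__"}'

print(to_json_sketch(OnlyData("name", 42)))  # encoded through __data__
print(to_json_sketch(Both("name", 42)))      # __json__ takes precedence
```

With __data__ the encoder stays responsible for producing valid JSON, which is why it is usually the easier hook to implement.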
The json2value function provides a couple of options:
flexible - will be very forgiving of the JSON accepted (see hjson)
leaves - will interpret keys with dots (".") as dot-delimited paths
from mo_json import json2value
result = json2value(
    "http.headers.referer: http://example.com",
    flexible=True,
    leaves=True
)
assert result == {'http': {'headers': {'referer': 'http://example.com'}}}
Notice the lack of quotes in the JSON (hjson) and the deep structure created by the dot-delimited path name.
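The effect of dot-delimited paths can be sketched on an already-parsed dictionary. This is an illustration under assumptions, not the library's code: expand_leaves is a hypothetical helper, and it does not handle conflicting keys such as "a" and "a.b" in the same input.

```python
def expand_leaves(flat):
    """Turn {"a.b.c": v} into {"a": {"b": {"c": v}}}."""
    result = {}
    for dotted_key, value in flat.items():
        *parents, last = dotted_key.split(".")
        node = result
        for name in parents:
            # Create intermediate objects on demand, reusing existing ones
            node = node.setdefault(name, {})
        node[last] = value
    return result

nested = expand_leaves({"http.headers.referer": "http://example.com"})
print(nested)
```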
Running tests
pip install -r tests/requirements.txt
set PYTHONPATH=.
python.exe -m unittest discover .
mo_json.scrub()
Remove, or convert, a number of objects from a structure that are not JSON-izable. It is faster to scrub and use the default (i.e. C-based) Python encoder than it is to use a default serializer, which forces the use of an interpreted Python encoder.
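The scrub-then-encode idea can be sketched with the standard library alone. The conversions below (ISO strings for dates, floats for decimals, lists for tuples and sets) are assumptions chosen for illustration; scrub_sketch is a hypothetical function and may not match what mo_json.scrub() does.

```python
import json
from datetime import date, datetime
from decimal import Decimal

def scrub_sketch(value):
    """Return a copy of value that contains only JSON-native types."""
    if value is None or isinstance(value, (bool, int, float, str)):
        return value
    if isinstance(value, dict):
        return {str(k): scrub_sketch(v) for k, v in value.items()}
    if isinstance(value, (list, tuple, set, frozenset)):
        return [scrub_sketch(v) for v in value]
    if isinstance(value, (date, datetime)):
        return value.isoformat()
    if isinstance(value, Decimal):
        return float(value)
    return str(value)  # last resort: fall back to the text form

raw = {"when": date(2016, 3, 1), "price": Decimal("1.5"), "tags": ("a", "b")}
# After scrubbing, no `default` hook is needed, so the plain encoder can be used
text = json.dumps(scrub_sketch(raw))
print(text)
```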
mo_json.stream
A module that supports queries over very large JSON strings. The overall objective is to make a large JSON document appear like a hierarchical database, where arrays of any depth can be queried like tables.
This is not a generic streaming JSON parser. It is only intended to break down the top-level array, or object, to reduce memory usage.
The variables you expect must be declared up front (expected_vars). The code will raise an exception if it cannot extract all expected variables.
mo_json.stream.parse()
Will return an iterator over all objects found in the JSON stream.
Parameters:
path - use "." if your JSON starts with [ and is a list.
The most common use of parse() is to iterate over all the objects in a large, top-level array:
parse(json, path=".", required_vars=["."])
For example, given the following JSON:
[
    {"a": 1},
    {"a": 2},
    {"a": 3},
    {"a": 4}
]
parse() returns a generator that provides
{"a": 1}
{"a": 2}
{"a": 3}
{"a": 4}
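To illustrate what breaking down a top-level array means, here is a minimal sketch that uses only the standard library. It is not how mo_json.stream is implemented: iter_array_items is a hypothetical name, the input is assumed to be an iterable of text chunks, and the sketch is lenient about stray commas.

```python
import json

def iter_array_items(chunks):
    """Yield the elements of a top-level JSON array, one at a time."""
    decoder = json.JSONDecoder()
    source = iter(chunks)
    buffer = ""
    started = False
    while True:
        buffer = buffer.lstrip()
        if not started and buffer.startswith("["):
            started, buffer = True, buffer[1:]
            continue
        if started and buffer.startswith(","):
            buffer = buffer[1:]
            continue
        if started and buffer.startswith("]"):
            return
        if started and buffer:
            try:
                value, end = decoder.raw_decode(buffer)
            except json.JSONDecodeError:
                end = None  # the element is incomplete; read more text
            if end is not None:
                rest = buffer[end:].lstrip()
                # Accept the value only once its terminator is visible,
                # so a number split across chunks is not cut short
                if rest[:1] in (",", "]"):
                    buffer = rest
                    yield value
                    continue
        chunk = next(source, None)
        if chunk is None:
            raise ValueError("input ended before the top-level array was closed")
        buffer += chunk

chunks = ['[{"a": 1},', ' {"a": ', '2}, {"a": 3},{"a"', ': 4}]']
for item in iter_array_items(chunks):
    print(item)
```

Only one element, plus the unread remainder of the current chunk, is held in memory at a time, which is the point of iterating rather than parsing the whole document.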
Simple Iteration
json = {"b": "done", "a": [1, 2, 3]}
parse(json, path="a", required_vars=["a", "b"])
We will iterate through the array found on property a, and return both the a and b variables. It will return the following values:
{"b": "done", "a": 1}
{"b": "done", "a": 2}
{"b": "done", "a": 3}
Bad - Property follows array
The same query, but different JSON, with b following a:
json = {"a": [1, 2, 3], "b": "done"}
parse(json, path="a", required_vars=["a", "b"])
Since property b follows the array we're iterating over, this will raise an error.
Good - No need for following properties
The same JSON, but a different query, which does not require b:
json = {"a": [1, 2, 3], "b": "done"}
parse(json, path="a", required_vars=["a"])
If we do not require b, then streaming will proceed just fine:
{"a": 1}
{"a": 2}
{"a": 3}
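Why property order matters can be seen in a small sketch that walks an object's properties in document order. It loads the whole document, so it only illustrates the ordering rule and is not a streaming parser; iterate_sketch is a hypothetical name and the top-level value is assumed to be an object.

```python
import json

def iterate_sketch(json_text, path, required_vars):
    """Emit one row per element of the array at `path`, in document order."""
    # object_pairs_hook=list keeps the properties as ordered (name, value) pairs
    pairs = json.loads(json_text, object_pairs_hook=list)
    seen = {}
    for name, value in pairs:
        if name == path:
            # Rows are emitted as soon as the array is reached, so every other
            # required variable must already have been seen
            missing = [v for v in required_vars if v != path and v not in seen]
            if missing:
                raise ValueError(f"{missing} must appear before the array {path!r}")
            for element in value:
                yield {**seen, path: element}
            return
        if name in required_vars:
            seen[name] = value

good = list(iterate_sketch('{"b": "done", "a": [1, 2, 3]}', "a", ["a", "b"]))
print(good)
```

A streaming reader cannot look ahead for b without buffering the whole array, which is why the second example above fails while the third succeeds.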
Complex Objects
This streamer was meant for very long lists of complex objects. Use dot-delimited naming to refer to the full name of the property:
json = [{"a": {"b": 1, "c": 2}}, {"a": {"b": 3, "c": 4}}, ...
parse(json, path=".", required_vars=["a.c"])
The dot (.) can be used to refer to the top-most array. Notice the structure is maintained, but only includes the required variables.
{"a": {"c": 2}}
{"a": {"c": 4}}
...
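The projection onto required variables can be sketched on records that are already parsed. This is an assumption-laden illustration rather than the library's code: project is a hypothetical helper, and variables missing from a record are silently skipped.

```python
def project(record, required_vars):
    """Keep only the dot-delimited variables, preserving the nesting."""
    result = {}
    for dotted in required_vars:
        names = dotted.split(".")
        source = record
        for name in names:
            if not isinstance(source, dict) or name not in source:
                break  # variable absent from this record
            source = source[name]
        else:
            # Rebuild the same path in the output and store the value there
            node = result
            for name in names[:-1]:
                node = node.setdefault(name, {})
            node[names[-1]] = source
    return result

records = [{"a": {"b": 1, "c": 2}}, {"a": {"b": 3, "c": 4}}]
rows = [project(r, ["a.c"]) for r in records]
print(rows)
```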
Nested Arrays
Nested array iteration is meant to mimic a left-join from parent to child table; as such, it includes every record in the parent.
json = [
    {"o": 1, "a": [{"b": 1}, {"b": 2}, {"b": 3}, {"b": 4}]},
    {"o": 2, "a": {"b": 5}},
    {"o": 3}
]
parse(json, path=[".", "a"], required_vars=["o", "a.b"])
The path parameter can be a list, which is used to indicate which properties are expected to have an array, and to iterate over them. Please notice that if no array is found, the value is treated like a singleton array, and missing arrays still produce a result.
{"o": 1, "a": {"b": 1}}
{"o": 1, "a": {"b": 2}}
{"o": 1, "a": {"b": 3}}
{"o": 1, "a": {"b": 4}}
{"o": 2, "a": {"b": 5}}
{"o": 3}
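The left-join behaviour can be sketched on parsed records with one level of nesting. The function name left_join_rows and its parameters are hypothetical; the sketch mirrors the rules described above (a lone object acts as a singleton array, a missing array still yields the parent).

```python
def left_join_rows(parents, child, parent_vars, child_vars):
    """Yield one row per child element, and one row for childless parents."""
    for parent in parents:
        base = {name: parent[name] for name in parent_vars if name in parent}
        children = parent.get(child)
        if children is None or children == []:
            yield base  # left join: the parent appears even without children
            continue
        if not isinstance(children, list):
            children = [children]  # treat a lone object as a singleton array
        for item in children:
            picked = {name: item[name] for name in child_vars if name in item}
            yield {**base, child: picked}

data = [
    {"o": 1, "a": [{"b": 1}, {"b": 2}, {"b": 3}, {"b": 4}]},
    {"o": 2, "a": {"b": 5}},
    {"o": 3},
]
rows = list(left_join_rows(data, "a", ["o"], ["b"]))
for row in rows:
    print(row)
```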
Large top-level objects
Some JSON is a single large object, rather than an array of objects. In these cases, you can use the items operator to iterate through all name/value pairs of an object:
json = {
    "a": "test",
    "b": 2,
    "c": [1, 2]
}
parse(json, {"items": "."}, {"name", "value"})
produces an iterator of
{"name": "a", "value": "test"}
{"name": "b", "value": 2}
{"name": "c", "value": [1,2]}
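The shape of the rows produced by the items operator can be sketched on a parsed object. items_sketch is a hypothetical helper for illustration only; the real operator works on the JSON stream rather than on a dictionary held in memory.

```python
def items_sketch(obj):
    """Yield one {"name", "value"} record per property of obj."""
    for name, value in obj.items():
        yield {"name": name, "value": value}

pairs = list(items_sketch({"a": "test", "b": 2, "c": [1, 2]}))
for pair in pairs:
    print(pair)
```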
typed_encoder
One reason that NoSQL document stores are wonderful is that their schema can automatically expand to accept new properties. Unfortunately, this flexibility is not limitless: a string assigned to a property prevents an object from being assigned to the same property, or vice versa. This flexibility is under attack by the strict-typing zealots who, in their self-righteous delusion, believe explicit types are better. They make the lives of humans worse, as we are forced to toil over endless schema modifications.
This module translates JSON documents into "typed" form, which allows document containers to store both objects and primitives in the same property. This also enables the storage of values with no containing object!
The typed JSON has a different form than the original, and queries into the document store must take this into account. This conversion is intended to be hidden behind a query abstraction layer that can understand this format.
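There are three main conversions:
1. Primitive values are replaced with single-property objects, where the property name indicates the type of the value:
{"a": true} -> {"a": {"~b~": true}}
{"a": 1 } -> {"a": {"~n~": 1 }}
{"a": "1" } -> {"a": {"~s~": "1" }}
2. Objects get an additional property, ~e~, to mark existence. This allows us to query for object existence, and to count the number of objects.
{"a": {}} -> {"a": {"~e~": 1}, "~e~": 1}
3. Arrays are placed in a new object under ~a~, along with ~e~ to count the number of elements in the array:
{"a": [1, 2, 3]} -> {"a": {
    "~e~": 3,
    "~a~": [
        {"~n~": 1},
        {"~n~": 2},
        {"~n~": 3}
    ]
}}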
Note the sum of a.~e~ works for both objects and arrays, letting us interpret sub-objects as single-value nested object arrays.
typed_encode()
Accepts a dict, list, or primitive value, and generates the typed JSON that can be inserted into a document store.
json2typed()
Converts an existing JSON unicode string and returns the typed JSON unicode string for the same.
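The typed form can be sketched in a few lines of standard-library Python. This is a reading of the conversions shown in this document, not the module's implementation: typed_sketch and json2typed_sketch are hypothetical names, null handling is left out, and the sketch always adds ~e~ to objects, which some of the shorter examples in this document do not show.

```python
import json

def typed_sketch(value):
    """Convert parsed JSON into the typed form described in this document."""
    if isinstance(value, bool):  # test bool first: bool is a subclass of int
        return {"~b~": value}
    if isinstance(value, (int, float)):
        return {"~n~": value}
    if isinstance(value, str):
        return {"~s~": value}
    if isinstance(value, list):
        # Arrays are wrapped, with ~e~ holding the element count
        return {"~e~": len(value), "~a~": [typed_sketch(v) for v in value]}
    if isinstance(value, dict):
        typed = {name: typed_sketch(v) for name, v in value.items()}
        typed["~e~"] = 1  # existence marker for the object itself
        return typed
    raise TypeError(f"no typed form defined here for {type(value).__name__}")

def json2typed_sketch(text):
    """String-to-string variant: parse, convert, and encode again."""
    return json.dumps(typed_sketch(json.loads(text)))

print(typed_sketch({"a": {}}))
print(json2typed_sketch("[1, 2, 3]"))
```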
Update Mar2016 - PyPy version 5.x appears to have improved C integration to the point that the C library callbacks are no longer a significant overhead. This pure Python JSON encoder is no longer faster than a compound C/Python solution.
Fast JSON encoder used in convert.value2json() when running in PyPy. Run the speed test to compare with the default implementation and ujson.