:Author: John Millikin
:Copyright: This document has been placed in the public domain.
Overview
========

`JSON <http://json.org/>`_ is a lightweight data-interchange format. It
is often used for exchanging data between a web server and user agent.

This module aims to produce a library for serializing and deserializing
JSON that conforms strictly to RFC 4627.

For the Python 3 version of jsonlib, see `jsonlib-python3 <http://pypi.python.org/pypi/jsonlib-python3/>`_.

Other JSON implementations of interest include `simplejson <http://pypi.python.org/pypi/simplejson>`_
(available in the standard library as of Python 2.6) and
`demjson <http://pypi.python.org/pypi/demjson/>`_.
.. contents::
Usage
=====

jsonlib has two functions of interest, ``read`` and ``write``. It also
defines some exceptions: ``ReadError``, ``WriteError``, and
``UnknownSerializerError``.

For compatibility with the standard library, ``read`` is aliased to
``loads`` and ``write`` is aliased to ``dumps``. They do not have the
same set of advanced parameters, but may be used interchangeably for
simple invocations.
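Because the aliases mirror the standard library's names, simple call sites
can be written against whichever module is available. The following is a
minimal sketch, not part of jsonlib itself; the ``json_impl`` name and the
fallback to the standard ``json`` module are assumptions made for
illustration:

```python
# Prefer jsonlib when it is installed; otherwise fall back to the
# standard library, which exposes the same loads/dumps names.
try:
    import jsonlib as json_impl
except ImportError:
    import json as json_impl

# Only the value is passed, so no implementation-specific parameters are used.
document = json_impl.dumps(["Hello world!"])
round_tripped = json_impl.loads(document)
```

The two modules may return different string types from ``dumps`` (jsonlib
encodes its output by default), so code written this way should rely only on
the round-tripped value, not on the exact serialized form.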
Deserialization
---------------

To deserialize a JSON expression, call the ``jsonlib.read`` function with
an instance of ``unicode`` or ``bytes``. ::

    >>> import jsonlib
    >>> jsonlib.read (b'["Hello world!"]')
    [u'Hello world!']
Floating-point values
~~~~~~~~~~~~~~~~~~~~~

By default, ``jsonlib`` will parse values such as "1.1" into an instance of
``decimal.Decimal``. To use the built-in value type ``float`` instead, set
the ``use_float`` parameter to ``True``. ``float`` values are much faster to
construct, so this flag may substantially increase parser performance.

Please note that using ``float`` will cause a loss of precision when
parsing some values. ::

    >>> jsonlib.read ('[3.14159265358979323846]', use_float = True)
    [3.1415926535897931]
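The precision trade-off can be reproduced without jsonlib. The sketch below
uses the standard library's ``json`` module, whose ``parse_float`` hook plays
a role comparable to choosing between ``decimal.Decimal`` and ``float``. It
illustrates the effect only and is not jsonlib's own API:

```python
import json
from decimal import Decimal

text = '[3.14159265358979323846]'

# Parsing into Decimal keeps every digit of the literal.
exact = json.loads(text, parse_float=Decimal)[0]

# Parsing into float rounds to the nearest double-precision value.
approx = json.loads(text)[0]

# Converting the float back to Decimal exposes the digits that were lost.
lost = exact - Decimal(repr(approx))
```

A non-zero ``lost`` value shows that the ``float`` result no longer matches
the digits written in the JSON text.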
Serialization
-------------
Serialization has more options, but they are set to reasonable defaults.
The simplest use is to call ``jsonlib.write`` with a Python value. ::

    >>> import jsonlib
    >>> jsonlib.write (['Hello world!'])
    '["Hello world!"]'
Pretty-Printing
~~~~~~~~~~~~~~~
To "pretty-print" the output, pass a value for the ``indent`` parameter. ::

    >>> print (jsonlib.write (['Hello world!'], indent = ' ').decode ('utf8'))
    [
     "Hello world!"
    ]
    >>>
Mapping Key Sorting
~~~~~~~~~~~~~~~~~~~
By default, mapping keys are serialized in whatever order they are
stored by Python. To force a consistent ordering (for example, in doctests)
use the ``sort_keys`` parameter. ::

    >>> jsonlib.write ({'e': 'Hello', 'm': 'World!'})
    '{"m":"World!","e":"Hello"}'
    >>> jsonlib.write ({'e': 'Hello', 'm': 'World!'}, sort_keys = True)
    '{"e":"Hello","m":"World!"}'
Encoding and Unicode
~~~~~~~~~~~~~~~~~~~~
By default, the output is encoded in UTF-8. If you require a different
encoding, pass the name of a Python codec as the ``encoding`` parameter. ::

    >>> jsonlib.write (['Hello world!'], encoding = 'utf-16-be')
    '\x00[\x00"\x00H\x00e\x00l\x00l\x00o\x00 \x00w\x00o\x00r\x00l\x00d\x00!\x00"\x00]'

To retrieve an unencoded ``unicode`` instance, pass ``None`` for the
encoding. ::

    >>> jsonlib.write (['Hello world!'], encoding = None)
    u'["Hello world!"]'

By default, non-ASCII codepoints are forbidden in the output. To include
higher codepoints in the output, set ``ascii_only`` to ``False``. ::

    >>> jsonlib.write ([u'Hello \u266a'], encoding = None)
    u'["Hello \\u266a"]'
    >>> jsonlib.write ([u'Hello \u266a'], encoding = None, ascii_only = False)
    u'["Hello \u266a"]'
Mapping Key Coercion
~~~~~~~~~~~~~~~~~~~~
Because JSON objects must have string keys, an exception will be raised when
non-string keys are encountered in a mapping. It can be useful to coerce
mapping keys to strings, so the ``coerce_keys`` parameter is available. ::

    >>> jsonlib.write ({True: 1})
    Traceback (most recent call last):
    WriteError: Only strings may be used as object keys.
    >>> jsonlib.write ({True: 1}, coerce_keys = True)
    '{"True":1}'
Serializing Other Types
~~~~~~~~~~~~~~~~~~~~~~~

If the object implements the iterator or mapping protocol, it will be
handled automatically. If the object is intended for use as a basic value,
it should subclass one of the supported basic values.

String-like objects that do not inherit from ``unicode`` or
``UserString.UserString`` will likely be serialized as a list. This will
not be changed. If iterating them returns an instance of the same type, the
serializer might crash. This (hopefully) will be changed.
To serialize a type not known to jsonlib, use the ``on_unknown`` parameter
to ``write``::

    >>> from datetime import date
    >>> def unknown_handler (value):
    ...     if isinstance (value, date):
    ...         return str (value)
    ...     raise jsonlib.UnknownSerializerError
    >>> jsonlib.write ([date (2000, 1, 1)], on_unknown = unknown_handler)
    '["2000-01-01"]'
Streaming Serializer
~~~~~~~~~~~~~~~~~~~~

When serializing large objects, the use of an in-memory buffer may cause
too much memory to be used. For these situations, use the ``dump`` function
to write objects to a file-like object::

    >>> import sys
    >>> jsonlib.dump (["Written to stdout"], sys.stdout, encoding = None)
    ["Written to stdout"]
    >>> with open ("/dev/null", "wb") as out:
    ...     jsonlib.dump (["Written to a file"], out)
    >>>
Exceptions
-----------
ReadError
~~~~~~~~~
Raised by ``read`` if an error was encountered parsing the expression. The
exception contains the line, column, and character position of the error.

Note that the reported position counts *characters*, not *bytes*, up to the
character that caused the error.
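The distinction matters when the input contains non-ASCII text. As a rough
sketch, a character position can be converted to a byte offset by encoding
the text that precedes it. The ``char_to_byte_offset`` helper is hypothetical
and assumes the input was encoded as UTF-8:

```python
def char_to_byte_offset(text, char_position, encoding="utf-8"):
    """Return the byte offset of the character at ``char_position``.

    Hypothetical helper for illustration; it is not part of jsonlib.
    """
    # Every character before the position contributes one or more bytes.
    return len(text[:char_position].encode(encoding))


document = u'["\u266a", oops]'
# "o" is the 7th character (index 6), but U+266A occupies 3 bytes in UTF-8,
# so the byte offset is larger than the character position.
offset = char_to_byte_offset(document, document.index(u"oops"))
```

This kind of conversion is only needed when the error has to be located in
the original encoded byte stream rather than in the decoded text.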
WriteError
~~~~~~~~~~
Raised by ``write`` or ``dump`` if an error was encountered serializing
the passed value.
UnknownSerializerError
~~~~~~~~~~~~~~~~~~~~~~

A subclass of ``WriteError`` that is raised when a value cannot be
serialized. See the ``on_unknown`` parameter to ``write``.
Change Log
==========

1.6.1
-----

- Fixed error in ``write()`` which could cause output truncation.

1.6
---

- Performance improvements
- ``coerce_keys`` no longer attempts to determine the "JSON" format for
  a coerced value -- it will simply call ``unicode()``.