New Case Study:See how Anthropic automated 95% of dependency reviews with Socket.Learn More →

json-flattener

Package Overview

Dependencies

Advanced tools

Install Socket

Detect and block malicious and high-risk dependencies

Install

json-flattener

Python library for denormalizing nested dicts or json objects to tables and back

0.1.9
PyPI

Maintainers: 2

json-flattener

Python library for denormalizing/flattening lists of complex objects to tables/data frames, with roundtripping

Notebook Example

EXAMPLE.ipynb

Description

Given YAML/JSON/JSON-Lines such as:

- id: S001
  name: Lord of the Rings
  genres:
    - fantasy
  creator:
    name: JRR Tolkein
    from_country: England
  books:
    - id: S001.1
      name: Fellowship of the Ring
      price: 5.99
      summary: Hobbits
    - id: S001.2
      name: The Two Towers
      price: 5.99
      summary: More hobbits
    - id: S001.3
      name: Return of the King
      price: 6.99
      summary: Yet more hobbits
- id: S002
  name: The Culture Series
  genres:
    - scifi
  creator:
    name: Ian M Banks
    from_country: Scotland
  books:
    - id: S002.1
      name: Consider Phlebas
      price: 5.99
    - id: S002.2
      name: Player of Games
      price: 5.99

Denormalize using jfl command:

jfl flatten -C creator=flat -C books=multivalued -i examples/books1.yaml -o examples/books1-flattened.tsv

id	name	genres	creator_name	creator_from_country	books_name	books_summary	books_price	books_id	creator_genres
S001	Lord of the Rings	[fantasy]	JRR Tolkein	England	[Fellowship of the Ring\|The Two Towers\|Return of the King]	[Hobbits\|More hobbits\|Yet more hobbits]	[5.99\|5.99\|6.99]	[S001.1\|S001.2\|S001.3]
S002	The Culture Series	[scifi]	Ian M Banks	Scotland	[Consider Phlebas\|Player of Games]		[5.99\|5.99]	[S002.1\|S002.2]

Convert back to JSON/YAML:

jfl unflatten -C creator=flat -C books=multivalued -i examples/books1.tsv -o examples/books1.yaml

This library also allows complex fields to be directly serialized as json or yaml (the default is to append _json to the key). For example:

jfl flatten -C creator=json -C books=json -i examples/books1.yaml -o examples/books1-jsonified.tsv

id	name	genres	creator_json	books_json
S001	Lord of the Rings	[fantasy]	{"name": "JRR Tolkein", "from_country": "England"}	[{"id": "S001.1", "name": "Fellowship of the Ring", "summary": "Hobbits", "price": 5.99}, {"id": "S001.2", "name": "The Two Towers", "summary": "More hobbits", "price": 5.99}, {"id": "S001.3", "name": "Return of the King", "summary": "Yet more hobbits", "price": 6.99}]
S002	The Culture Series	[scifi]	{"name": "Ian M Banks", "from_country": "Scotland"}	[{"id": "S002.1", "name": "Consider Phlebas", "price": 5.99}, {"id": "S002.2", "name": "Player of Games", "price": 5.99}]
S003	Book of the New Sun	[scifi, fantasy]	{"name": "Gene Wolfe", "genres": ["scifi", "fantasy"], "from_country": "USA"}	[{"id": "S003.1", "name": "Shadow of the Torturer"}, {"id": "S003.2", "name": "Claw of the Conciliator", "price": 6.99}]
S004	Example with single book		{"name": "Ms Writer", "genres": ["romance"], "from_country": "USA"}	[{"id": "S004.1", "name": "Blah"}]
S005	Example with no books		{"name": "Mr Unproductive", "genres": ["romance", "scifi", "fantasy"], "from_country": "USA"}

See

The primary use case is to go from a rich normalized data model (as python objects, JSON, or YAML) to a flatter representation that is amenable to processing with:

Solr/Lucene
Pandas/R Dataframes
Excel/Google sheets
Unix cut/grep/cat/etc
Simple denormalized SQL database representations

The target denormalized format is a list of rows / a data matrix, where each cell is either an atom or a list of atoms.

Method

Each top level key becomes a column
if the key value is a dict/object, then flatten
- by default a '_' is used to separate the parent key from the inner key
- e.g. the composition of creator and from_country becomes creator_from_country
- currently one level of flattening is supported
if the key value is a list of atomic entities, then leave as is
if the key value is a list of dicts/objects, then flatten each key of this inner dict into a list
- e.g. if books is a list of book objects, and name is a key on book, then books_name is a list of names of each book
- order is significant - the first element of books_name is matched to the first element of books_price, etc
Allow any key to be serialized as yaml/json/pickle if configured

Command line usage (TODO)

Usage from Python

Documentation coming soon: see test folder for now

use within LinkML

Comparison

Pandas json_normalize

https://pandas.pydata.org/pandas-docs/version/0.25.0/reference/api/pandas.io.json.json_normalize.html

Java json-flattener

https://github.com/wnameless/json-flattener

Python

csvjson

https://csvjson.com/json2csv

Keywords

linkml data-science denormalization yaml json data-frames

FAQs

What is json-flattener?

Is json-flattener well maintained?

Did you know?

Socket for GitHub automatically highlights issues in each pull request and monitors the health of all your open source dependencies. Discover the contents of your packages and block harmful activity before you install or update your dependencies.

Install