A little package to convert JSON into a table! This project was born out of a need to transform many JSON documents mined from APIs into something that Pandas or a relational database could understand. The difference between this package and JSON path packages is that it's designed to create tables, not just extract single values.
The package is available on PyPI. From your command line, run:
pip install jsontable
You're also welcome to download the code from GitHub and modify it to suit your needs. If you have time, let me know what cool functionality you added so we can improve the project!
It works in a similar manner to JSON parsers.
Here is a quick example to get you going:
import jsontable
# Create a list of paths you want to extract
paths = [{"$.id":"id"}, {"$.name":"name"}, {"$.address.city":"city"}]
# The JSON object you want to explore
sample = {"id":"1","name":"Foo","address":{"city":"Bar"}}
# Create an instance of a converter
converter = jsontable.converter()
# Set the paths you want to extract
converter.set_paths(paths)
# Input a JSON to be interpreted
converter.convert_json(sample)
In this case, you will get a table with three columns and two rows (the header and the first row of data), like this:
[['id', 'name', 'city'], ['1', 'Foo', 'Bar']]
For more examples, refer to the tests folder.
Each path you specify becomes a column in your final table. Each path is expanded according to standard JSON Path functionality. That is, for each path, the converter starts at the root of the JSON object and navigates each step (a.k.a. node) of the path in order. When it reaches the final step in the path (a.k.a. leaf), it outputs the resulting element of the JSON into the cell.
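The navigation described above can be sketched in a few lines. This is only an illustration, not the package's actual code: `walk_path` is a hypothetical helper, it handles plain keys only, and it ignores arrays entirely.

```python
def walk_path(document, path):
    """Follow a simple '$.key.key' path through nested dicts.

    Hypothetical helper for illustration; returns None when a step is missing.
    """
    steps = path.split(".")[1:]  # drop the leading '$' (the root)
    current = document
    for step in steps:
        if not isinstance(current, dict) or step not in current:
            return None  # the path did not find an element
        current = current[step]
    return current


sample = {"id": "1", "name": "Foo", "address": {"city": "Bar"}}
city = walk_path(sample, "$.address.city")    # the leaf value 'Bar'
missing = walk_path(sample, "$.address.zip")  # no such key, so None
```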
The final cell value is converted based on the standard JSON values as follows:
JSON Value | Conversion | Sample Output |
---|---|---|
object | stringified object | '{"city":"Bar"}' |
array | stringified array | '[1,2,3]' |
string | string | 'Foo' |
number | number | 4.7 |
boolean | stringified boolean | 'False' |
null | None | None |
missing value (i.e. the path did not find an element) | None | None |
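
As a rough sketch of the conversion table above (the function name `to_cell` and the use of `json.dumps` with compact separators are assumptions, chosen only because they reproduce the sample outputs shown):

```python
import json


def to_cell(value):
    """Convert a JSON leaf value into a table cell (illustrative only)."""
    # Check bool before anything numeric: bool is a subclass of int in Python.
    if isinstance(value, bool):
        return str(value)
    if isinstance(value, (dict, list)):
        # Compact separators give '{"city":"Bar"}' rather than '{"city": "Bar"}'.
        return json.dumps(value, separators=(",", ":"))
    # Strings, numbers and None (null or missing) pass through unchanged.
    return value


cells = [to_cell(v) for v in ({"city": "Bar"}, [1, 2, 3], "Foo", 4.7, False, None)]
```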
The intention behind stringifying objects, arrays and booleans is to be able to pass the output to other data libraries (e.g. Pandas) or to a relational database.
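For instance, a table in the header-plus-rows shape shown in the quick example can be loaded into SQLite with the standard library. This is a minimal sketch: the table name `records` is made up for the example, and the column names are assumed to come from your own trusted path definitions.

```python
import sqlite3

# A table in the same shape as the quick example's output.
table = [["id", "name", "city"], ["1", "Foo", "Bar"]]
header, rows = table[0], table[1:]

connection = sqlite3.connect(":memory:")
# Column names are quoted but not escaped, so they must not contain double quotes.
columns = ", ".join(f'"{name}"' for name in header)
placeholders = ", ".join("?" for _ in header)
connection.execute(f"CREATE TABLE records ({columns})")
connection.executemany(f"INSERT INTO records VALUES ({placeholders})", rows)
stored = connection.execute("SELECT id, name, city FROM records").fetchall()
connection.close()
```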
With the exception of the final node, array elements are automatically expanded into rows. For example, the path `$.a.b` applied to the JSON `{"a":[{"b":1},{"b":2}]}` results in two rows, `[[1],[2]]`. Array expansion can also be applied to the final node by explicitly using the `*` operator as the final step (e.g. `$.a.*`).
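The expansion rule can be sketched as a small recursive function. `expand` and `rows_for` are hypothetical names, and this is a simplified model of the behaviour described here rather than the package's implementation (for example, it makes no claim about how empty arrays are handled).

```python
def expand(element, steps):
    """Return the leaf values reached by `steps`, expanding arrays met on the way."""
    if not steps:
        return [element]  # final node: kept as-is, even if it is an array
    step, rest = steps[0], steps[1:]
    if step == "*":
        # Explicit expansion of the current element.
        if isinstance(element, list):
            children = element
        elif isinstance(element, dict):
            children = list(element.values())
        else:
            children = [element]
    elif isinstance(element, list):
        # Automatic expansion: apply the same step to every array item.
        return [leaf for item in element for leaf in expand(item, steps)]
    elif isinstance(element, dict) and step in element:
        children = [element[step]]
    else:
        return [None]  # the path did not find an element
    return [leaf for child in children for leaf in expand(child, rest)]


def rows_for(document, path):
    """Turn one path into single-column rows."""
    return [[leaf] for leaf in expand(document, path.split(".")[1:])]


expanded = rows_for({"a": [{"b": 1}, {"b": 2}]}, "$.a.b")  # one row per array item
unexpanded = rows_for({"a": [1, 2, 3]}, "$.a")             # array at the final node
starred = rows_for({"a": [1, 2, 3]}, "$.a.*")              # '*' as the final step
```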
Example:
paths = [{"$.name":"Name"},{"$.telephones.type":"Telephone Type"},{"$.telephones.number":"Telephone Number"}]
sample = {
"name":"Foo",
"telephones":[
{"type":"mobile", "number":"0000"},
{"type":"home", "number":"1111"}
]
}
converter = jsontable.converter()
converter.set_paths(paths)
converter.convert_json(sample)
Result:
[['Name', 'Telephone Number', 'Telephone Type'], ['Foo', '0000', 'mobile'], ['Foo', '1111', 'home']]
The reverse behaviour (leaving arrays unexpanded when they are encountered before the final node) is not implemented, simply because there has been no need for it so far.
Since a path may result in multiple rows, the results of each column need to be combined into the same table. The joining mechanism is similar to an SQL join: each cell (row-column combination) is matched to a row in the result using a "matching value", which is the last common element of the paths.
This is best illustrated with an example. The following table shows the results of applying two different sets of paths to the sample JSON.
sample = {
"contacts":[
{
"name":"Foo",
"telephones":[
{"type":"mobile", "number":"0000"},
{"type":"home", "number":"1111"}
],
"emails":[
{"type":"work", "email":"foo@w.com"},
{"type":"personal", "email":"foo@p.com"}
]
},
{
"name":"Bar",
"telephones":[
{"type":"mobile", "number":"2222"},
{"type":"home", "number":"3333"}
],
"emails":[
{"type":"work", "email":"bar@w.com"},
{"type":"personal", "email":"bar@p.com"}
]
}
]
}
Paths | Result |
---|---|
[ {"$.contacts.name":"Name"}, {"$.contacts.telephones.type":"Type"}, {"$.contacts.telephones.number":"Number"} ] | [ ['Name', 'Type', 'Number'], ['Foo', 'mobile', '0000'], ['Foo', 'home', '1111'], ['Bar', 'mobile', '2222'], ['Bar', 'home', '3333'] ] |
[ {"$.contacts.name":"Name"}, {"$.contacts.telephones.number":"Number"}, {"$.contacts.emails.email":"Email"} ] | [ ['Name', 'Number', 'Email'], ['Foo', '0000', 'foo@w.com'], ['Foo', '1111', 'foo@w.com'], ['Foo', '0000', 'foo@p.com'], ['Foo', '1111', 'foo@p.com'], ['Bar', '2222', 'bar@w.com'], ['Bar', '3333', 'bar@w.com'], ['Bar', '2222', 'bar@p.com'], ['Bar', '3333', 'bar@p.com'] ] |
In the first case, the `type` and `number` paths share the common path `telephones`, so the columns are combined for the same telephone element. The `name` path shares only the common path `contacts` with the rest of the columns, so its value is repeated across the rows.
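One way to picture the "last common element" of two paths is as their longest shared prefix. The helper below is a hypothetical illustration of that idea, not a function the package exposes.

```python
def last_common_element(path_a, path_b):
    """Return the longest shared prefix of two dotted paths (illustrative helper)."""
    common = []
    for step_a, step_b in zip(path_a.split("."), path_b.split(".")):
        if step_a != step_b:
            break
        common.append(step_a)
    return ".".join(common)


# 'type' and 'number' meet at the telephone element, so they share rows.
same_telephone = last_common_element(
    "$.contacts.telephones.type", "$.contacts.telephones.number"
)
# 'name' only meets the telephone columns at the contact, so it repeats per row.
same_contact = last_common_element(
    "$.contacts.name", "$.contacts.telephones.type"
)
```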
In the second case, the `email` and `number` paths only share the common path `contacts`. Since each path results in two rows per contact, the only possible way to match them is to combine all the values, resulting in 4 rows per contact (8 rows in total, since there are 2 contacts).
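The second case behaves like a Cartesian product taken within each contact. The sketch below reproduces the row count with `itertools.product`; it is a hand-written model of that one case, and the order of the rows it produces is not claimed to match the package's output.

```python
from itertools import product

contacts = [
    {
        "name": "Foo",
        "telephones": [{"type": "mobile", "number": "0000"},
                       {"type": "home", "number": "1111"}],
        "emails": [{"type": "work", "email": "foo@w.com"},
                   {"type": "personal", "email": "foo@p.com"}],
    },
    {
        "name": "Bar",
        "telephones": [{"type": "mobile", "number": "2222"},
                       {"type": "home", "number": "3333"}],
        "emails": [{"type": "work", "email": "bar@w.com"},
                   {"type": "personal", "email": "bar@p.com"}],
    },
]

rows = []
for contact in contacts:
    numbers = [telephone["number"] for telephone in contact["telephones"]]
    emails = [entry["email"] for entry in contact["emails"]]
    # Numbers and emails only share the contact, so every pairing is emitted.
    for number, email in product(numbers, emails):
        rows.append([contact["name"], number, email])

rows_per_contact = len(rows) // len(contacts)  # 2 numbers x 2 emails
```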
Currently there are two operators supported: * and ~.
Syntax | Description |
---|---|
* | Returns all values of the current element. If it's an array, it returns one row per array value. If it's an object (a dictionary in Python), it returns one row per value. If it's a value (string, number, boolean, null), it returns the same value. |
~ | Returns all indices of the current element. If it's an array, it returns an ascending numbered sequence starting with 0 (e.g. [1,2] would return [[0],[1]]). If it's an object, it returns the keys (e.g. {"a":1,"b":2} would return [['a'],['b']]). If it's a value, it returns 0. |
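
The two operators can be modelled with a pair of small functions. The names `values_of` and `indices_of` are invented for this sketch, and wrapping the scalar results in a single row is an assumption made to keep the return shape uniform.

```python
def values_of(element):
    """Model of '*': one row per value of the current element."""
    if isinstance(element, list):
        return [[value] for value in element]
    if isinstance(element, dict):
        return [[value] for value in element.values()]
    return [[element]]  # a plain value is returned unchanged


def indices_of(element):
    """Model of '~': one row per index or key of the current element."""
    if isinstance(element, list):
        return [[index] for index in range(len(element))]
    if isinstance(element, dict):
        return [[key] for key in element]
    return [[0]]  # a plain value has the single index 0


array_indices = indices_of([1, 2])
object_keys = indices_of({"a": 1, "b": 2})
object_values = values_of({"a": 1, "b": 2})
```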
More operators will be implemented in later releases.
Both these changes were made possible by changing the search method from depth first to breadth first, as well as recursing through a tree rather than iterating through one column at a time.
In the wishlist we have:
I want to mention that whilst I intend to expand the functionality of this package, at the moment it can only take a simple sequence of keys to navigate a path. That is, the full functionality proposed by Stefan Goessner in his jsonpath is not yet implemented... but we will get there.
If you are looking for a package that simply extracts a single value from a JSON by using more complex paths (and their functions), I recommend you look at jsonpath-rw by Kenn Knowles, jsonpath-ng by Tomas Aparicio, or jsonpath2 by Mark Borkum.
I will continue to look for improvements in the package and hopefully add some useful functionality. Given the current popularity of the package, maintenance is on a best-effort basis. However, if you have issues or bugs to report, let me know here and I will try my best to help.
You can use this package as you wish, but unfortunately, I cannot take responsibility for how this code is used or for the results it provides. It is up to you to test that it does what you want it to!