json_data_extractor

  • 0.1.03
  • Rubygems
JsonDataExtractor

Transform JSON data structures with the help of a simple schema and JsonPath expressions. Use the JsonDataExtractor gem to extract and modify data from complex JSON structures using a straightforward syntax and a range of built-in or custom modifiers.

Another attempt to make for JSON what XSLT is for XML. We transform one JSON into another JSON with the help of a third JSON!!!111!!eleventy!!

Remap one JSON structure into another with some basic rules and jsonpath.

Heavily inspired by xml_data_extractor.

Installation

Add this line to your application's Gemfile:

gem 'json_data_extractor'

And then execute:

$ bundle

Or install it yourself as:

$ gem install json_data_extractor

Usage

JsonDataExtractor allows you to remap one JSON structure into another with some basic rules and JSONPath expressions. The process involves defining a schema that maps the input JSON structure to the desired output structure.

We'll base our examples on the following source:

{
  "store": {
    "book": [
      {
        "category": "reference",
        "author": "Nigel Rees",
        "title": "Sayings of the Century",
        "price": 8.95
      },
      {
        "category": "fiction",
        "author": "Evelyn Waugh",
        "title": "Sword of Honour",
        "price": 12.99
      },
      {
        "category": "fiction",
        "author": "Herman Melville",
        "title": "Moby Dick",
        "isbn": "0-553-21311-3",
        "price": 8.99
      },
      {
        "category": "fiction",
        "author": "J. R. R. Tolkien",
        "title": "The Lord of the Rings",
        "isbn": "0-395-19395-8",
        "price": 22.99
      }
    ],
    "bicycle": {
      "color": "red",
      "price": 19.95
    }
  }
}

Defining a Schema

A schema consists of one or more mappings that specify how to extract data from the input JSON and where to place it in the output JSON.

Each mapping has a path field that specifies the JsonPath expression to use for data extraction, and an optional modifier field that specifies one or more modifiers to apply to the extracted data. Modifiers are used to transform the data in some way before placing it in the output JSON.

Here's an example schema that extracts the authors and categories from the source JSON shown above:

{
  "authors": {
    "path": "$.store.book[*].author",
    "modifier": "downcase"
  },
  "categories": "$..category"
}

The resulting JSON will be:

{
  "authors": [
    "nigel rees",
    "evelyn waugh",
    "herman melville",
    "j. r. r. tolkien"
  ],
  "categories": [
    "reference",
    "fiction",
    "fiction",
    "fiction"
  ]
}
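
To make the two mappings concrete, here is a minimal plain-Ruby sketch of what the schema above computes, written by hand with the standard library only. It does not use the gem or JsonPath; `deep_values` and `extract_by_hand` are hypothetical helpers introduced for illustration, and the source is trimmed to the fields the schema reads.

```ruby
require 'json'

# Trimmed copy of the source document: only the fields the schema reads.
SOURCE = JSON.parse(<<~JSON)
  {
    "store": {
      "book": [
        { "category": "reference", "author": "Nigel Rees" },
        { "category": "fiction",   "author": "Evelyn Waugh" },
        { "category": "fiction",   "author": "Herman Melville" },
        { "category": "fiction",   "author": "J. R. R. Tolkien" }
      ],
      "bicycle": { "color": "red" }
    }
  }
JSON

# Hypothetical helper: collect every value stored under `key` at any depth,
# which is what the recursive-descent path `$..category` asks for.
def deep_values(node, key)
  case node
  when Hash
    node.flat_map { |k, v| (k == key ? [v] : []) + deep_values(v, key) }
  when Array
    node.flat_map { |item| deep_values(item, key) }
  else
    []
  end
end

# Hand-written equivalent of the two mappings in the schema.
def extract_by_hand(source)
  {
    'authors'    => source['store']['book'].map { |book| book['author'].downcase },
    'categories' => deep_values(source, 'category')
  }
end

puts JSON.pretty_generate(extract_by_hand(SOURCE))
```

The schema expresses the same result declaratively: the path selects the values and the modifier replaces the explicit `downcase` call.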

Handling Default Values

With JsonDataExtractor, you can specify default values in your schema for keys that might be absent in the input JSON. Use the path and default keys in the schema for this purpose.

schema = {
  absent_value: { path: nil },
  default: { path: '$.some_real_path', default: 'foo' },
  default_with_lambda: { path: '$.table', default: -> { 'DEFAULT' } },
  absent_with_default: { path: nil, default: 'bar' }
}

  • absent_value: Will be nil in the output as there's no corresponding key in the input JSON and no default is provided.
  • default: Will either take the value from $.some_real_path in the input JSON or 'foo' if the path does not exist.
  • default_with_lambda: Will take the value from $.table in the input JSON or the lambda's return value, 'DEFAULT', if the path does not exist.
  • absent_with_default: Will be 'bar' in the output as there's no corresponding key in the input JSON but a default is provided.

Simplified Syntax for Absent Values

For keys that you expect to be absent in the input JSON but still want to include in the output with a nil value, you can use a simplified syntax by setting the schema value to nil.

schema = {
  absent_value: nil
}
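
The fallback rule described in this section can be sketched in a few lines of plain Ruby. This is an illustration only, not the gem's code: `value_or_default` is a hypothetical helper, and it assumes that "the path does not exist" is represented by a nil lookup result.

```ruby
# Hypothetical helper, not part of the gem: choose between an extracted value
# and a schema default. `found` is nil when the path matched nothing.
def value_or_default(found, default = nil)
  return found unless found.nil?

  # A callable default (lambda or proc) is evaluated only when it is needed.
  default.respond_to?(:call) ? default.call : default
end

value_or_default(nil)                    # no default given => nil
value_or_default(nil, 'bar')             # static default   => "bar"
value_or_default(nil, -> { 'DEFAULT' })  # lambda default   => "DEFAULT"
value_or_default('real value', 'foo')    # value present    => "real value"
```

How the gem treats a key that exists in the input but holds null is not covered here; check the specs if that case matters to you.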

Modifiers

Modifiers can be supplied on object creation and/or added later by calling the #add_modifier method. Modifiers allow you to perform transformations on the extracted data before it is returned. They are useful for cleaning up data, formatting it, or applying any custom logic.

Modifiers can now be defined in several ways:

  1. By providing a symbol: This symbol should correspond to the name of a method (e.g., :to_i) that will be called on each extracted value.
  2. By providing an anonymous lambda or block: Use a lambda or block to define the transformation logic inline.
  3. By providing any callable object: A class or object that implements a call method can be used as a modifier. This makes it flexible to use pre-defined classes, lambdas, or procs.

Here’s an example schema showcasing the use of modifiers:

schema = {
  name:  '$.name', # Extract as-is
  age:   { path: '$.age', modifier: :to_i }, # Apply the `to_i` method
  email: { 
    path: '$.contact.email', 
    modifiers: [ 
      :downcase, 
      ->(email) { email.gsub(/\s/, '') } # Lambda to remove whitespace
    ]
  }
}

  • Name: The value is simply extracted as-is.
  • Age: The extracted value is converted to an integer using the to_i method.
  • Email:
    1. The value is transformed to lowercase using downcase.
    2. Whitespace is removed using an anonymous lambda.

Defining Custom Modifiers

You can define your own custom modifiers using add_modifier. A modifier can be defined using a block, a lambda, or any callable object (such as a class that implements call):

# Using a block
extractor = JsonDataExtractor.new(json_data)
extractor.add_modifier(:remove_newlines) { |value| value.gsub("\n", '') }

# Using a class with a `call` method
class ReverseString
  def call(value)
    value.reverse
  end
end

extractor.add_modifier(:reverse_string, ReverseString.new)

# Lambda example
capitalize = ->(value) { value.capitalize }
extractor.add_modifier(:capitalize, capitalize)

# Apply these modifiers in a schema
schema = {
  name: 'name',
  bio:  { path: 'bio', modifiers: [:remove_newlines, :reverse_string] },
  category: { path: 'category', modifier: :capitalize }
}

# Extract data
results = extractor.extract(schema)

Modifier Order

Modifiers are called in the order in which they are defined. Keep this in mind when chaining multiple modifiers for complex transformations. For example, if you want to first format a string and then clean it up (or vice versa), define the order accordingly.
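
To see why the order matters, here is a small plain-Ruby sketch of a left-to-right modifier chain. It is not the gem's implementation: `apply_modifiers` and `STRIP_SPACES` are hypothetical names, and the sketch assumes symbols are sent to the value as method names while everything else responds to `call`.

```ruby
# Hypothetical helper: apply modifiers left to right. Symbols are sent to the
# value as method names; anything else is assumed to respond to `call`.
def apply_modifiers(value, modifiers)
  modifiers.reduce(value) do |current, modifier|
    modifier.is_a?(Symbol) ? current.public_send(modifier) : modifier.call(current)
  end
end

STRIP_SPACES = ->(text) { text.gsub(/\s/, '') }

apply_modifiers(' John.Doe @Example.com ', [:downcase, STRIP_SPACES])
# => "john.doe@example.com"

# Swapping two modifiers changes the result:
apply_modifiers('moby dick', [:capitalize, :reverse]) # => "kcid yboM"
apply_modifiers('moby dick', [:reverse, :capitalize]) # => "Kcid ybom"
```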

You can also configure the behavior of modifiers. By default, JDE raises an ArgumentError if a modifier cannot be applied to the extracted value. However, this strict behavior can be configured to ignore such errors. See the Configuration section for more details.

Maps

The JsonDataExtractor gem provides a feature called "maps" that allows you to transform extracted data using predefined mappings. Maps are useful when you want to convert specific values from the source data into different values based on predefined rules. The best use case is when you need to traverse a complex tree to get to a value and then just convert it to your own dictionary. E.g.:

data = {
  cars: [
          { make: 'A', fuel: 1 },
          { make: 'B', fuel: 2 },
          { make: 'C', fuel: 3 },
          { make: 'D', fuel: nil },
        ]
}

FUEL_TYPES = { 1 => 'Petrol', 2 => 'Diesel', nil => 'Unknown' }
schema     = {
  fuel: {
    path: '$.cars[*].fuel',
    map:  FUEL_TYPES
  }
}
result     = JsonDataExtractor.new(data).extract(schema) # => {"fuel":["Petrol","Diesel",nil,"Unknown"]}

A map is essentially a dictionary that defines key-value pairs, where the keys represent the source values and the corresponding values represent the transformed values. When extracting data, you can apply one or multiple maps to modify the extracted values.

Syntax

To define a map, you can use the map or maps key in the schema. The map value can be a single hash or an array of hashes, where each hash represents a separate mapping rule. Here's an example:

{
  path: "$.data[*].category",
  map:  {
    "fruit"     => "Fresh Fruit",
    "vegetable" => "Organic Vegetable",
    "meat"      => "Premium Meat"
  },
}

Multiple maps can also be provided. In this case, each map is applied to the result of previous transformation:

{
  path: "$.data[*].category",
  maps: [
          {
            "fruit"     => "Fresh Fruit",
            "vegetable" => "Organic Vegetable",
            "meat"      => "Premium Meat",
          },
          {
            "Fresh Fruit"       => "Frisches Obst",
            "Organic Vegetable" => "Biologisches Gemüse",
            "Premium Meat"      => "Hochwertiges Fleisch",
          }
        ]
}

(the example is a little bit silly, but you should get the idea of chaining maps)

You can use keys :map and :maps interchangeably much like :modifier, :modifiers.

Notes
  • Maps can be used together with modifiers, but this makes less sense as you can always apply complex mapping rules in the modifiers themselves.
  • If used together with modifiers, maps are applied after modifiers.
  • If a map does not have a key corresponding to a transformed value, the result is nil, so be careful.
  • Maps are applied in the order they are defined in the schema. Be cautious of the order if you have overlapping or conflicting mapping rules.
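
The chaining and the nil behaviour from the notes above can be reproduced with plain hashes. The following is a sketch, not the gem's code: `apply_maps` is a hypothetical helper and `TO_GERMAN` is a made-up second map added for the example.

```ruby
# Hypothetical helper: pass a value through each map in turn. Hash#[] returns
# nil for a missing key, which is why an unmapped value ends up as nil.
def apply_maps(value, maps)
  maps.reduce(value) { |current, map| map[current] }
end

FUEL_TYPES = { 1 => 'Petrol', 2 => 'Diesel', nil => 'Unknown' }.freeze
# Made-up second map, used only to show chaining.
TO_GERMAN  = { 'Petrol' => 'Benzin', 'Diesel' => 'Diesel', 'Unknown' => 'Unbekannt' }.freeze

[1, 2, 3, nil].map { |fuel| apply_maps(fuel, [FUEL_TYPES]) }
# => ["Petrol", "Diesel", nil, "Unknown"]

[1, 2, 3, nil].map { |fuel| apply_maps(fuel, [FUEL_TYPES, TO_GERMAN]) }
# => ["Benzin", "Diesel", nil, "Unbekannt"]
```

Note that the value 3 has no entry in the first map, so it becomes nil and stays nil through the second map, while a source nil is mapped explicitly because nil is a key.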

Nested schemas

JDE supports nested schemas. Just provide your element with a type of array and add a schema key for its data.

E.g. this is a valid real-life schema with nested data:

{
  "name": "$.Name",
  "code": "$.Code",
  "services": "$.Services[*].Code",
  "locations": {
    "path": "$.Locations[*]",
    "type": "array",
    "schema": {
      "name": "$.Name",
      "type": "$.Type",
      "code": "$.Code"
    }
  }
}

Nested schemas can also be applied to objects, not only arrays. See specs for more examples.
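
As a rough illustration of the locations part of that schema, here is a hand-written plain-Ruby equivalent. It does not call the gem. The input data and the `remap_locations` helper are hypothetical, and the sketch assumes that the nested schema runs once per element matched by the path and that a single match is returned as a scalar.

```ruby
require 'json'

# Hypothetical input; the key names follow the paths used in the schema.
PROVIDER = JSON.parse(<<~JSON)
  {
    "Name": "Acme Logistics",
    "Code": "ACM",
    "Locations": [
      { "Name": "Berlin Hub", "Type": "warehouse", "Code": "BER", "Extra": 1 },
      { "Name": "Paris Desk", "Type": "office",    "Code": "PAR", "Extra": 2 }
    ]
  }
JSON

# Hand-written equivalent of the nested mapping: the inner schema is applied
# to each location, keeping only the three requested keys.
def remap_locations(source)
  source['Locations'].map do |location|
    {
      'name' => location['Name'],
      'type' => location['Type'],
      'code' => location['Code']
    }
  end
end

puts JSON.pretty_generate('locations' => remap_locations(PROVIDER))
```

Keys that the inner schema does not mention, such as "Extra" here, are dropped from the output.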

Configuration Options

The JsonDataExtractor gem provides a configuration option to control the behavior when encountering invalid modifiers.

Strict Modifiers

By default, the gem operates in strict mode, which means that if an invalid modifier is encountered, an ArgumentError will be raised. This ensures that only valid modifiers are applied to the extracted data.

To change this behavior and allow the use of invalid modifiers without raising an error, you can configure the gem to operate in non-strict mode.

JsonDataExtractor.configure do |config|
  config.strict_modifiers = false
end

When strict_modifiers is set to false, any invalid modifiers will be ignored, and the original value will be returned without applying any modification.

It is important to note that enabling non-strict mode should be done with caution, as it can lead to unexpected behavior if there are typos or incorrect modifiers specified in the schema.

By default, strict_modifiers is set to true, providing a safe and strict behavior. However, you can customize this configuration option according to your specific needs.
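
The difference between the two modes can be sketched in plain Ruby. This is an illustration of the behaviour described above for symbol modifiers, not the gem's implementation; `apply_modifier` and its `strict:` keyword are hypothetical.

```ruby
# Hypothetical helper mirroring the described behaviour for symbol modifiers.
def apply_modifier(value, modifier, strict: true)
  return value.public_send(modifier) if value.respond_to?(modifier)
  raise ArgumentError, "cannot apply #{modifier.inspect} to #{value.inspect}" if strict

  value # non-strict: leave the value untouched
end

apply_modifier('42', :to_i)                  # => 42
apply_modifier(42, :downcase, strict: false) # => 42 (modifier ignored)

begin
  apply_modifier(42, :downcase)              # strict by default
rescue ArgumentError => e
  puts e.message                             # cannot apply :downcase to 42
end
```

In non-strict mode a typo such as :downcsae would pass silently and leave the value unchanged, which is the risk the caution above refers to.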

TODO

Update this readme for better usage cases. Add info on arrays and modifiers.

Development

After checking out the repo, run bin/setup to install dependencies. Then, run rake spec to run the tests. You can also run bin/console for an interactive prompt that will allow you to experiment.

To install this gem onto your local machine, run bundle exec rake install. To release a new version, update the version number in version.rb, and then run bundle exec rake release, which will create a git tag for the version, push git commits and tags, and push the .gem file to rubygems.org.

Contributing

Bug reports and pull requests are welcome on GitHub at https://github.com/austerlitz/json_data_extractor. This project is intended to be a safe, welcoming space for collaboration, and contributors are expected to adhere to the Contributor Covenant code of conduct.

License

The gem is available as open source under the terms of the MIT License.

Code of Conduct

Everyone interacting in the JsonDataExtractor project’s codebases, issue trackers, chat rooms and mailing lists is expected to follow the code of conduct.

Package last updated on 11 Dec 2024
