offline-sort

0.2.0

Sort arbitrarily large collections of data with limited memory usage. Given an enumerable and a sort_by proc, this gem will break the input data into sorted chunks, persist the chunks, and return an Enumerator. Data read from this enumerator will be in its final sorted order.

The size of the chunks and the strategy for serializing and deserializing the data are configurable. The gem comes with built-in strategies for Marshal, MessagePack, and YAML.
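The chunk-and-merge approach described above can be illustrated with a minimal sketch. This is not the gem's implementation: external_sort_sketch and read_entry are hypothetical helpers, and the use of Tempfile, Marshal, and a linear scan over chunk heads are assumptions chosen for brevity.

```ruby
require 'tempfile'

# Hypothetical helper, NOT part of offline-sort: a sketch of an external merge sort.
def external_sort_sketch(enumerable, chunk_size:, &key)
  # Phase 1: sort each slice in memory and persist it as a stream of Marshal records.
  chunk_files = enumerable.each_slice(chunk_size).map do |slice|
    file = Tempfile.new('sort-chunk')
    file.binmode
    slice.sort_by(&key).each { |entry| Marshal.dump(entry, file) }
    file.rewind
    file
  end

  # Phase 2: lazily merge the chunks by repeatedly taking the smallest head.
  # The enumerator is single-pass: the chunk files are deleted once it is drained.
  Enumerator.new do |yielder|
    heads = chunk_files.map { |file| read_entry(file) }
    loop do
      live = heads.each_index.reject { |i| heads[i].nil? }
      break if live.empty?

      winner = live.min_by { |i| key.call(heads[i].first) }
      yielder << heads[winner].first
      heads[winner] = read_entry(chunk_files[winner])
    end
    chunk_files.each(&:close!)
  end
end

# Returns the next entry wrapped in an array, or nil at end of file,
# so that nil or false entries are not mistaken for the end of a chunk.
def read_entry(file)
  file.eof? ? nil : [Marshal.load(file)]
end

sorted = external_sort_sketch([[4, 5, 6], [7, 8, 9], [1, 2, 3]], chunk_size: 1) do |array|
  array.first
end
sorted.to_a # => [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
```

Only one entry per chunk is held in memory during the merge, which is why memory use depends on the chunk size rather than on the total input size.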

The development of this gem is documented in this post from the Salsify Engineering Blog.

Installation

Add this line to your application's Gemfile:

gem 'offline-sort'

And then execute:

$ bundle

Or install it yourself as:

$ gem install offline-sort

Usage

  require 'offline_sort'

  arrays = [ [4,5,6], [7,8,9], [1,2,3] ]
  
  # Create a sorted enumerator; the block returns the sort key for each entry
  sorted = OfflineSort.sort(arrays, chunk_size: 1) do |array|
    array.first
  end
  
  # Stream results in sorted order
  sorted.each do |entry|
    # e.g. write to a file
  end

The example above creates 3 files with 1 array each, then outputs them in sorted order. Try different values of chunk_size to find the best speed/memory combination for your use case. In general, larger chunk sizes use more memory but run faster.
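As a rough guide to how chunk_size affects the number of chunks, the sketch below counts slices with Enumerable#each_slice. It assumes the input is split into consecutive groups of at most chunk_size entries, which matches the 3-file example but is not confirmed beyond it. chunk_count is a hypothetical helper, not part of the gem.

```ruby
# Hypothetical helper: how many chunks an input would produce,
# ASSUMING consecutive slices of at most chunk_size entries each.
def chunk_count(enumerable, chunk_size)
  enumerable.each_slice(chunk_size).count
end

chunk_count([[4, 5, 6], [7, 8, 9], [1, 2, 3]], 1) # => 3
chunk_count(1..1_000, 100)                        # => 10
chunk_count(1..1_001, 100)                        # => 11
```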

Sorting is not limited to arrays. You can use anything that can be expressed in an Enumerable#sort_by block.
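To illustrate the kind of block this allows, the sketch below sorts hashes by a single field and by a composite key. It uses plain Enumerable#sort_by so that it runs without the gem; the records and lambda names are made up for the example. The same blocks should be usable in place of the array.first block in the usage example.

```ruby
records = [
  { 'name' => 'carol', 'age' => 25 },
  { 'name' => 'alice', 'age' => 29 },
  { 'name' => 'bob',   'age' => 29 }
]

# Sort key: a single field.
by_name = ->(record) { record['name'] }

# Sort key: age first, then name as a tie-breaker. Arrays compare element by element.
by_age_then_name = ->(record) { [record['age'], record['name']] }

names_by_name = records.sort_by(&by_name).map { |r| r['name'] }
# => ["alice", "bob", "carol"]

names_by_age = records.sort_by(&by_age_then_name).map { |r| r['name'] }
# => ["carol", "alice", "bob"]
```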

Using MessagePack

MessagePack serialization is faster than the default Ruby Marshal strategy. To enable it, follow these steps:

$ gem install msgpack

# msgpack must be required before offline_sort
require 'msgpack'
require 'offline_sort'

Requiring MessagePack before you require offline_sort will automatically enable MessagePack serialization in the gem.

Limitations

The MessagePack serialize/deserialize process stringifies hash keys, so it is important to write your sort_by block in terms of string keys.
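The effect on a sort_by block can be seen in the sketch below. To stay runnable without the msgpack gem, it does not call MessagePack: simulate_round_trip is a hypothetical stand-in that reproduces only the top-level key stringification described above, using Hash#transform_keys.

```ruby
# Hypothetical stand-in for a serialize/deserialize round trip.
# It ONLY simulates the stringification of top-level hash keys.
def simulate_round_trip(record)
  record.transform_keys(&:to_s)
end

original = { id: 2, name: 'widget' }
restored = simulate_round_trip(original)

symbol_key = ->(record) { record[:id] }   # written against symbol keys
string_key = ->(record) { record['id'] }  # written against string keys

symbol_key.call(restored) # => nil, the symbol key no longer exists
string_key.call(restored) # => 2
```

A block that returns nil for every record gives the sort nothing meaningful to compare, so the string-key form is the one to use here.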

Contributing

  • Fork it
  • Create your feature branch (git checkout -b my-new-feature)
  • Commit your changes (git commit -am 'Add some feature')
  • Push to the branch (git push origin my-new-feature)
  • Create new Pull Request

Package last updated on 18 Oct 2021
