arquero-arrow

Arrow serialization support for Arquero.

  • Version 0.2.0 (latest)
  • Maintainers: 1
The toArrow(input, options) method encodes either an Arquero table or an array of objects into the Apache Arrow format. This package wraps the apache-arrow JavaScript library in a convenient interface and adds more performant encoders for standard integer, float, date, boolean, and string dictionary types.

API Documentation

# toArrow(input, options)

Create an Apache Arrow table for an input dataset. The input data can be either an Arquero table or an array of standard JavaScript objects. This method will throw an error if type inference fails or if the generated columns have differing lengths.
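To make the row-oriented input shape and the length check concrete, here is a minimal sketch in plain JavaScript. It illustrates the idea only and is not the package's implementation; rowsToColumns and checkLengths are hypothetical helper names introduced for this example.

```javascript
// Hypothetical helper: pivot an array of row objects into named column arrays.
function rowsToColumns(rows, names) {
  const columns = {};
  for (const name of names) {
    columns[name] = rows.map(row => row[name]);
  }
  return columns;
}

// Hypothetical helper: throw if the columns do not all have the same length.
function checkLengths(columns) {
  const lengths = new Set(Object.values(columns).map(col => col.length));
  if (lengths.size > 1) {
    throw new Error('Columns have differing lengths.');
  }
  return true;
}

const cols = rowsToColumns(
  [{ x: 1, y: 3.4 }, { x: 2, y: 1.6 }],
  ['x', 'y']
);
checkLengths(cols); // passes: both columns have two entries
```

Each row object contributes one value per named column, so a well-formed object array always yields equal-length columns.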

  • input: An input dataset to convert to Arrow format. If array-valued, the data should consist of an array of objects where each entry represents a row and named properties represent columns. Otherwise, the input data should be an Arquero table.
  • options: Options for Arrow encoding.
    • columns: Ordered list of column names to include. If function-valued, the function should accept the input data as a single argument and return an array of column name strings.

    • limit: The maximum number of rows to include (default Infinity).

    • offset: The row offset indicating how many initial rows to skip (default 0).

    • types: An optional object indicating the Arrow data type to use for named columns. If specified, the value should be an object with column names for keys and Arrow data types for values. If a column's data type is not explicitly provided, type inference will be performed.

      Type values can either be instantiated Arrow DataType instances (for example, new Float64(), new DateMillisecond(), etc.) or type enum codes (Type.Float64, Type.Date, Type.Dictionary). For convenience, arquero-arrow re-exports the apache-arrow Type enum object (see examples below). High-level types map to specific data type instances as follows:

      • Type.Date → new DateMillisecond()
      • Type.Dictionary → new Dictionary(new Utf8(), new Int32())
      • Type.Float → new Float64()
      • Type.Int → new Int32()
      • Type.Interval → new IntervalYearMonth()
      • Type.Time → new TimeMillisecond()

      Types that require additional parameters (including List, Struct, and Timestamp) cannot be specified using type codes. Instead, use data type constructors from apache-arrow, such as new List(new Int32()).
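The columns, limit, and offset options described above can be read as a selection step applied before encoding. The sketch below mimics that documented behavior on a plain object array. It is an assumption about the semantics, not code from arquero-arrow, and selectRows is a hypothetical name.

```javascript
// Hypothetical helper mirroring the documented option semantics.
function selectRows(rows, { columns, limit = Infinity, offset = 0 } = {}) {
  // 'columns' may be an array of names or a function of the input data
  const names = typeof columns === 'function'
    ? columns(rows)
    : columns || Object.keys(rows[0] || {});

  // skip 'offset' initial rows, then keep at most 'limit' rows
  return rows.slice(offset, offset + limit).map(row => {
    const out = {};
    for (const name of names) out[name] = row[name];
    return out;
  });
}

const rows = [
  { x: 1, y: 3.4, z: 'a' },
  { x: 2, y: 1.6, z: 'b' },
  { x: 3, y: 5.4, z: 'c' },
  { x: 4, y: 7.1, z: 'd' }
];

// keep columns 'y' and 'x' (in that order), skip one row, take two rows
const picked = selectRows(rows, { columns: ['y', 'x'], limit: 2, offset: 1 });
```

With the library itself, the same options object would be passed as the second argument to toArrow.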

Examples

Encode Arrow data from an input Arquero table:

const { table } = require('arquero');
const { toArrow, Type } = require('arquero-arrow');

// create Arquero table
const dt = table({
  x: [1, 2, 3, 4, 5],
  y: [3.4, 1.6, 5.4, 7.1, 2.9]
});

// encode as an Arrow table (infer data types)
// here, infers Uint8 for 'x' and Float64 for 'y'
const at1 = toArrow(dt);

// encode into Arrow table (set explicit data types)
const at2 = toArrow(dt, {
  types: {
    x: Type.Uint16,
    y: Type.Float32
  }
});

// serialize Arrow table to a transferable byte array
const bytes = at1.serialize();
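
The comment in the example above notes that Uint8 is inferred for 'x' and Float64 for 'y'. One plausible way such inference could work is to pick the narrowest numeric type that holds every value. The sketch below is an assumption for illustration, not the library's actual inference logic; inferNumberType is a hypothetical name and it returns type names as plain strings.

```javascript
// Hypothetical rule: choose the narrowest type name that can hold every value.
function inferNumberType(values) {
  // any non-integer value forces a floating point type
  if (!values.every(v => Number.isInteger(v))) return 'Float64';

  const min = Math.min(...values);
  const max = Math.max(...values);

  if (min >= 0) {
    // non-negative integers: try unsigned types from narrow to wide
    return max < 2 ** 8 ? 'Uint8'
      : max < 2 ** 16 ? 'Uint16'
      : max < 2 ** 32 ? 'Uint32'
      : 'Float64';
  }

  // signed integers: the bound accounts for the extra negative value
  const bound = Math.max(-min - 1, max);
  return bound < 2 ** 7 ? 'Int8'
    : bound < 2 ** 15 ? 'Int16'
    : bound < 2 ** 31 ? 'Int32'
    : 'Float64';
}

const xType = inferNumberType([1, 2, 3, 4, 5]);
const yType = inferNumberType([3.4, 1.6, 5.4, 7.1, 2.9]);
```

Under this rule the two example columns resolve to 'Uint8' and 'Float64', matching the comment in the example.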

Register a toArrow() method for all Arquero tables:

const { internal: { ColumnTable }, table } = require('arquero');
const { toArrow } = require('arquero-arrow');

// add new method to Arquero tables
// the argument is the same options object accepted by toArrow
ColumnTable.prototype.toArrow = function(options) {
  return toArrow(this, options);
};

// create Arquero table, encode as an Arrow table (infer data types)
const at = table({
  x: [1, 2, 3, 4, 5],
  y: [3.4, 1.6, 5.4, 7.1, 2.9]
}).toArrow();

Encode Arrow data from an input object array:

const { toArrow } = require('arquero-arrow');

// encode object array as an Arrow table (infer data types)
const at = toArrow([
  { x: 1, y: 3.4 },
  { x: 2, y: 1.6 },
  { x: 3, y: 5.4 },
  { x: 4, y: 7.1 },
  { x: 5, y: 2.9 }
]);

Build Instructions

To build and develop locally:

Package last updated on 03 Feb 2021
