dictadata.org: Open source software for Civic Data Engineering & Analytics

@dictadata/storage-junctions 1.2.3

Node.js library for distributed data storage access and streaming transfers.

A storage junction provides a normalized, plug-in interface to a specific data source such as data file, database table, document collection, key/value store, etc.

Supported Storage Sources

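| model | encoding | store | recall | retrieve | dull | streamable | key-value | documents | tables |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| csv | yes | no | no | - | no | yes | no | no | yes |
| json | yes | no | no | - | no | yes | no | yes | yes |
| parquet | yes | no | no | - | no | yes | no | yes | yes |
| xlsx (Excel) | yes | - | - | - | - | yes | no | no | yes |
| rest | yes | - | - | yes | - | yes | - | - | yes |
| elasticsearch | yes | yes | yes | yes | yes | yes | yes | yes | yes |
| mysql | yes | yes | yes | yes | yes | yes | no | - | yes |
| redshift | yes | yes | yes | yes | yes | yes | no | - | yes |
| *mssql | | | | | | | no | - | yes |
| *postgresql | | | | | | | no | - | yes |
| *mongodb | | | | | | | yes | yes | yes |
| -memcache | | | | | | | yes | no | no |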

* In the plans for future development. ‐ Not planned, but will be developed as needed.

Supported File Storage Systems

File storage systems provide read and write streams to objects (files) on local and cloud storage systems. GZip compression is handled seamlessly based on the filename extension .gz.

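| model | list | read | write | scan |
| --- | --- | --- | --- | --- |
| local | yes | yes | yes | yes |
| FTP | yes | yes | yes | yes |
| AWS S3 | yes | yes | yes | yes |
| *Azure ADLS | - | - | - | - |
| *Google CS | - | - | - | - |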

* Not currently in plans for development. ‐ Not planned, but will be developed as needed.
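
The extension-based GZip handling described above can be illustrated with Node's built-in `fs` and `zlib` modules. This is a minimal sketch for local files only, not the library's implementation; `isGzipped`, `openReadStream` and `openWriteStream` are hypothetical names introduced here.

```javascript
const fs = require('fs');
const path = require('path');
const zlib = require('zlib');

// True when the filename ends in .gz (case-insensitive).
function isGzipped(filename) {
  return path.extname(filename).toLowerCase() === '.gz';
}

// Hypothetical helper: open a local file for reading and
// decompress on the fly when the extension is .gz.
function openReadStream(filename) {
  const raw = fs.createReadStream(filename);
  return isGzipped(filename) ? raw.pipe(zlib.createGunzip()) : raw;
}

// Hypothetical helper: open a local file for writing and
// compress on the fly when the extension is .gz.
function openWriteStream(filename) {
  const out = fs.createWriteStream(filename);
  if (!isGzipped(filename)) return out;
  const gzip = zlib.createGzip();
  gzip.pipe(out);
  return gzip; // callers write plain data; gzip forwards compressed bytes
}
```

Note that `pipe()` does not forward errors between streams, so production code would more likely use `stream.pipeline()` to wire these together.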

Storage Memory Trace

A storage memory trace (SMT) is a data source definition. It is made up of four parts.

| SMT | Description |
| --- | --- |
| model | The type of storage source which determines how to communicate with the storage source. |
| locus | The location or address of the data source such as a file folder, URL or database connection string. |
| schema | The name of the container that holds the data such as file name, table, collection or bucket. |
| key | In addition to defining a key it determines how to address data stored in the schema. |

An SMT can be represented as a string with its parts separated by pipe | characters, or as a JSON object. Special characters in an SMT string should be URL encoded.

csv|/path/to/folder/|filename.csv|*
mysql|connection string|tablename|=column1,column2
elastic|node address|index|!field
{
  "model": "mysql",
  "locus": "connection string",
  "schema": "tablename",
  "key": "=column1,column2"
}
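
To make the two SMT representations concrete, here is a minimal sketch of converting between the string and object forms. `parseSMT` and `formatSMT` are hypothetical helpers, not part of the library's documented API, and the choice to percent-encode only `|` and `%` when formatting is an assumption.

```javascript
// Hypothetical helpers for illustration; not the library's API.
const SMT_PARTS = ['model', 'locus', 'schema', 'key'];

// Convert a pipe-delimited SMT string into an object,
// URL-decoding each of the four parts.
function parseSMT(smt) {
  if (smt !== null && typeof smt === 'object') return { ...smt };
  const parts = String(smt).split('|');
  if (parts.length !== SMT_PARTS.length) {
    throw new Error(`expected ${SMT_PARTS.length} SMT parts, got ${parts.length}`);
  }
  const result = {};
  SMT_PARTS.forEach((name, i) => {
    result[name] = decodeURIComponent(parts[i]);
  });
  return result;
}

// Convert an SMT object back into a string. Assumption: only the
// pipe separator and the percent sign need encoding.
function formatSMT(smt) {
  return SMT_PARTS
    .map((name) => String(smt[name]).replace(/[%|]/g, (ch) => encodeURIComponent(ch)))
    .join('|');
}

console.log(parseSMT('mysql|connection string|tablename|=column1,column2'));
console.log(formatSMT({ model: 'csv', locus: '/path/to/folder/', schema: 'filename.csv', key: '*' }));
```

`decodeURIComponent` throws a `URIError` on a stray `%`, which is one reason special characters in an SMT string need to be encoded consistently.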

SMT Key Formats

| Format | Description | Examples |
| --- | --- | --- |
| `!` | Key. Keys are used to store and recall data. A single ! character denotes that the data source assigns keys. Field names following a ! will be used to compute the key. Useful for key-value or document stores. | `!`, `!userid`, `!lastname+firstname` |
| `=` | Primary key. Field name(s) must follow the = character. Values must be supplied for key fields when calling store, recall and dull functions. Useful for structured data like database tables. | `=userid`, `=lastname+firstname` |
| `*` | Any or All. If primary key(s) are specified in the schema encodings then this is effectively equivalent to = key format. Otherwise, * is a generic placeholder primarily used when the source is only used for searching or streaming transfers. | `*` |
| uid | UID. A unique ID value (key) that addresses a specific piece of data on the data source. Similar to ! as the UID is a specific key. Used as the default value if no key is passed to store, recall and dull functions. Otherwise, the storage junction will behave the same as the bare ! key format. Rarely useful except in special cases. | `1234`, `default` |
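
The key formats above can be told apart by their first character. The following is a sketch of that classification only; `parseKey` and its return shape are hypothetical, and accepting both `+` and `,` as field separators is an assumption based on the examples in this document.

```javascript
// Hypothetical helper: classify an SMT key string by its format.
function parseKey(key) {
  if (key === '*') return { format: 'any', fields: [] };

  const prefix = key[0];
  if (prefix === '!' || prefix === '=') {
    // Field names follow the prefix; a bare "!" has none.
    const fields = key.slice(1).split(/[+,]/).filter(Boolean);
    if (prefix === '=' && fields.length === 0) {
      throw new Error('primary key format requires at least one field name');
    }
    return { format: prefix === '!' ? 'key' : 'primary', fields };
  }

  // Anything else is treated as a literal unique ID.
  return { format: 'uid', uid: key };
}

console.log(parseKey('!lastname+firstname'));
console.log(parseKey('=column1,column2'));
console.log(parseKey('1234'));
```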

Storage Engram Encoding

{
  "model": "*",
  "locus": "*",
  "schema": "my_schema",
  "key": "=Foo",
  "fields": {
    "Foo": {
      "name": "Foo",
      "type": "keyword",
      "size": 0,
      "default": null,
      "isNullable": false,
      "isKey": true,
      "label": "Foo"
    },
    "Bar": {
      "name": "Bar",
      "type": "integer",
      "size": 0,
      "default": null,
      "isNullable": true,
      "isKey": false,
      "label": "Bar"
    },
    ...
  }
}

Storage Retrieval Pattern

pattern: {
  match: {
    "Foo": "first",
    "Bar": { "gte": 0, "lte": 1000 }
  },
  count: 3,
  order: { "Bar": "desc" },
  fields: ["Foo","Bar","Baz"]
}
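
To show how the parts of a retrieval pattern might combine, here is a sketch that applies one to an in-memory array. It is not how any storage junction executes a pattern (a database junction would translate it into a query); `applyPattern` is a hypothetical name, and the operator semantics for `gt`, `gte`, `lt` and `lte` are assumed from their names.

```javascript
// Assumed meaning of the range operators used in match expressions.
const OPERATORS = new Map([
  ['gt', (value, x) => value > x],
  ['gte', (value, x) => value >= x],
  ['lt', (value, x) => value < x],
  ['lte', (value, x) => value <= x],
]);

// A match expression is either a literal (equality) or an object of operators.
function matchesExpr(value, expr) {
  if (expr === null || typeof expr !== 'object') return value === expr;
  return Object.entries(expr).every(([op, x]) => {
    const test = OPERATORS.get(op);
    if (!test) throw new Error(`unsupported operator: ${op}`);
    return test(value, x);
  });
}

// Hypothetical helper: filter, sort, limit, then project.
function applyPattern(constructs, pattern) {
  const { match = {}, order = {}, count, fields } = pattern;

  let results = constructs.filter((c) =>
    Object.entries(match).every(([field, expr]) => matchesExpr(c[field], expr)));

  const orderEntries = Object.entries(order);
  results.sort((a, b) => {
    for (const [field, direction] of orderEntries) {
      if (a[field] === b[field]) continue;
      const cmp = a[field] < b[field] ? -1 : 1;
      return direction === 'desc' ? -cmp : cmp;
    }
    return 0;
  });

  if (count !== undefined) results = results.slice(0, count);

  if (fields) {
    results = results.map((c) =>
      Object.fromEntries(fields.filter((f) => f in c).map((f) => [f, c[f]])));
  }
  return results;
}
```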

Data Transforms

// example transform with filter and field mapping
"transform": {
  "filter": {
    "match": {
      "Bar": "row",
      "Baz": { "gt": 100 }
    },
    "drop": {
      "Baz": 5678
    }
  },
  "select": {
    "inject_before": {
      "Fie": "where's fum?"
    },
    "fields": {
      "Foo": "Foo",
      "Bar": "Bar",
      "Baz": "Bazzy"
    },
    "remove": [ "Fobe" ],
    "inject_after": {
      "Fie": "override the fum"
    }
  }
}

FilterTransform

  // example filter transform

  transforms: {
    "filter": {
      // match all expressions to forward
      match: {
        "field1": 'value',
        "field2": {gt: 100, lt: 200}
      },
      // match all expressions to drop
      drop: {
        "field1": 'value',
        "field2": { lt: 0 }
      }
    }
  };
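
The forward/drop rule can be sketched as a predicate wrapped in a Node object-mode `Transform` stream. This is an illustration under assumptions, not the library's FilterTransform: `shouldForward` and `createFilterStream` are hypothetical names, and the sketch assumes a construct is forwarded when it satisfies every `match` expression and does not satisfy every `drop` expression.

```javascript
const { Transform } = require('stream');

// Assumed meaning of the range operators.
const OPERATORS = new Map([
  ['gt', (value, x) => value > x],
  ['gte', (value, x) => value >= x],
  ['lt', (value, x) => value < x],
  ['lte', (value, x) => value <= x],
]);

// True when the construct satisfies every expression in the set.
function matchesAll(construct, expressions) {
  return Object.entries(expressions).every(([field, expr]) => {
    const value = construct[field];
    if (expr === null || typeof expr !== 'object') return value === expr;
    return Object.entries(expr).every(([op, x]) => {
      const test = OPERATORS.get(op);
      if (!test) throw new Error(`unsupported operator: ${op}`);
      return test(value, x);
    });
  });
}

// Hypothetical predicate: decide whether a construct passes the filter.
function shouldForward(construct, filter) {
  const { match, drop } = filter;
  if (match && !matchesAll(construct, match)) return false;
  if (drop && Object.keys(drop).length > 0 && matchesAll(construct, drop)) return false;
  return true;
}

// Hypothetical factory: an object-mode stream that forwards passing constructs.
function createFilterStream(filter) {
  return new Transform({
    objectMode: true,
    transform(construct, _encoding, callback) {
      try {
        if (shouldForward(construct, filter)) this.push(construct);
        callback();
      } catch (err) {
        callback(err);
      }
    },
  });
}
```

Keeping the predicate separate from the stream makes the rule easy to test without any stream plumbing.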

SelectTransform

  // example fields transform

  transforms: {
    "select": {
      // inject new fields or set defaults in case of missing values
      "inject_before": {
        "newField": <value>,
        "existingField": <default value>
      },

      // select a list of fields
      "fields": ["field1", "field2", ... ],
      // or select and map fields using dot notation
      // { src: dest, ...}
      "fields": {
        "field1": "Field1",
        "object1.subfield":  "FlatField"
      },

      // remove fields from the new construct
      "remove": ["field1", "field2"],

      // inject new fields or override existing values
      "inject_after": {
        "newField": <value>,
        "existingField": <override value>
      }

    }
  };

Order of operations

  • inject_before
  • select, mapping or copy
  • remove
  • inject_after

AggregateTransform

Summarize and/or aggregate a stream of objects. The functionality is similar to SQL GROUP BY with aggregate functions like SUM, or to Elasticsearch's _search aggregations.

  // example aggregate Summary transform
  // summary totals for myField
  // format: "newField": { "function": "field name" }
  {
    "transforms": {
      "aggregate": {
        "mySum": {"sum": "myField"},
        "myMin": {"min": "myField"},
        "myMax": {"max": "myField"},
        "myAvg": {"avg": "myField"},
        "myCount": {"count": "myField"}
      }
    }
  }

  // Example aggregate Group By transform
  // format: "group by field": { "newField": { "function": "field name" }}
  {
    "transforms": {
      "aggregate": {
        "field1": {
          "subTotal": { "sum": "field2" },
          "count": { "count": "field2" } }
      }
    }
  }
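
The two aggregate formats can be illustrated with a sketch that works on an in-memory array rather than a stream. `aggregate` and `summarize` are hypothetical helpers, and the output shape (one object for a summary, one object per group for a group-by) is an assumption, not the library's documented result format.

```javascript
// Reducers for the five functions named in the examples.
const REDUCERS = {
  sum: (values) => values.reduce((a, b) => a + b, 0),
  min: (values) => Math.min(...values),
  max: (values) => Math.max(...values),
  avg: (values) => values.reduce((a, b) => a + b, 0) / values.length,
  count: (values) => values.length,
};

// Evaluate { newField: { fn: fieldName } } over a list of constructs.
function summarize(constructs, spec) {
  const out = {};
  for (const [newField, expr] of Object.entries(spec)) {
    const [[fn, field]] = Object.entries(expr);
    if (!Object.prototype.hasOwnProperty.call(REDUCERS, fn)) {
      throw new Error(`unsupported aggregate function: ${fn}`);
    }
    const values = constructs.map((c) => c[field]).filter((v) => v != null);
    out[newField] = REDUCERS[fn](values);
  }
  return out;
}

// Hypothetical helper: a spec whose inner values are all strings is a
// summary; otherwise each top-level key is treated as a group-by field.
function aggregate(constructs, spec) {
  const entries = Object.entries(spec);
  const isSummary = entries.every(([, expr]) =>
    Object.values(expr).every((v) => typeof v === 'string'));
  if (isSummary) return [summarize(constructs, spec)];

  const results = [];
  for (const [groupField, groupSpec] of entries) {
    const groups = new Map();
    for (const c of constructs) {
      const key = c[groupField];
      if (!groups.has(key)) groups.set(key, []);
      groups.get(key).push(c);
    }
    for (const [key, members] of groups) {
      results.push({ [groupField]: key, ...summarize(members, groupSpec) });
    }
  }
  return results;
}
```

A streaming version would keep running totals per group instead of collecting members, which is the more likely design for large transfers.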

Storage-Junctions Functions

getEncoding()

putEncoding(encoding)

store(construct)

recall(key)

retrieve(pattern)

createReadStream(options)

createWriteStream(options)

createTransform(tfType, options)

getFileSystem()

Transform Plugins

  • Codify - Infer field encodings from examining a stream of objects.
  • Aggregate - Summarize a data stream similar to SQL GROUP BY and SUM.
  • Fields - field selection and mappings.
  • Filter - select constructs to forward or drop.
  • MetaStats - calculate meta statistics about fields for a stream of constructs.

FileSystem Plugins

  • fs - Local file system support for Windows, Linux and macOS.
  • ftp - FTP (File Transfer Protocol) servers.
  • s3 - AWS S3 storage.


Package last updated on 13 Jan 2021
