supersalmon

Stream processor for xlsx files

Version: 1.6.2 (latest)

Version published: 5 years ago

Maintainers: 1

Created: 6 years ago

Supersalmon

Features

Process xlsx files while streaming
Records are processed one at a time by a user-supplied, promise-returning function
Strives for memory efficiency, with automatic backpressure handling
Line count support (not all xlsx files contain this metadata)
Can transform column names
Rejects on non-xlsx data
Battle-tested in production code
Skips empty rows (maybe it's a caveat)
Limit parsing: process a file to a specified row, then stop
Can process rows in chunks (for multiple inserts)
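
The one-at-a-time processing and row limit described above can be sketched with Node's built-in streams. This is only an illustration of the idea, not supersalmon's implementation: `processSequentially` is a hypothetical helper, and the zero-based row index is an assumption.

```javascript
const { Readable } = require('stream')

// Hypothetical helper, not part of supersalmon.
// Consumes an object stream one record at a time: the next record is not
// requested from the iterator until the handler's promise has settled.
async function processSequentially (rowStream, onRow, limit = Infinity) {
  let count = 0
  if (limit <= 0) return count
  for await (const row of rowStream) {
    // Assumption: the row index passed to the handler starts at 0
    await onRow(row, count)
    count += 1
    // Leaving the loop early destroys the underlying stream
    if (count >= limit) break
  }
  return count
}
```

For example, `processSequentially(Readable.from(rows), row => repository.insert(row), 10)` would stop after ten rows, assuming `repository.insert` returns a promise.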

Caveats

Only parses the first sheet in a workbook
Requires the first row to contain column names
Skips empty rows (maybe it's a feature)
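
To make the last two caveats concrete, here is a minimal sketch of what they imply for the output, using plain arrays instead of a real xlsx file. `rowsToRecords` is a hypothetical helper, and treating `undefined`, `null` and `''` as empty cells is an assumption, not the library's documented rule.

```javascript
// Hypothetical helper for illustration; not supersalmon's implementation.
// rows: array of arrays of cell values, where the first row holds column names.
function rowsToRecords (rows, mapColumn = name => name) {
  const [header = [], ...body] = rows
  // Wrap mapColumn so Array#map's extra arguments are not passed to it
  const keys = header.map(name => mapColumn(name))
  const records = []
  for (const cells of body) {
    // Assumption: a row counts as empty when every cell is undefined, null or ''
    const isEmpty = cells.every(cell => cell === undefined || cell === null || cell === '')
    if (isEmpty) continue
    const record = {}
    keys.forEach((key, i) => { record[key] = cells[i] })
    records.push(record)
  }
  return records
}
```

Because empty rows are dropped, the position of a record in the output does not necessarily match the row number in the spreadsheet.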

Example Usage

  const { createReadStream } = require('fs')
  const supersalmon = require('supersalmon')

  // The promise is resolved when the stream is completely processed
  // It resolves to the count of processed rows
  const processed = await supersalmon({

    // (required) Any readable stream will work - remember that only the first sheet will be parsed
    inputStream: createReadStream('huge.xlsx'),

    hasHeaders: true,

    // (optional) returns data alongside their formats (such as 'DD/MM/YYYY' for dates) - see tests
    hasFormats: false,

    // (required) transform column names - column names will become the key names of the processed objects
    mapColumns: cols => colName => colName.toLowerCase().trim(),

    // (optional) this function is called when the line count metadata is encountered in the stream
    onLineCount: lineCount => notifyLineCount(lineCount),

    // enable or disable the underlying xlsx-stream-reader formatting feature
    formatting: false,

    // if true, skips a cell that has no corresponding column name instead of throwing an error
    lenientColumnMapping: false,

    // Returns rows in arrays of 3 elements
    chunkSize: 3
  }).processor({
    // (optional) parse until the 10th line then destroy streams and return
    limit: 10,
    // the row index is provided as the second parameter
    onRow: ({name, surname}, i) => {
      doSomethingWithRowIndex(i)
      // return the promise so that the next row waits for the insert to settle
      return repository.insert({ name, surname })
    },
  });

  // Alternatively, it is possible to access the object stream directly

  const stream = supersalmon({ /* config */ }).stream()

  // it is possible to take only the first n records
  const stream10 = supersalmon({ /* config */ }).stream(10)


  stream.pipe(myOtherStream)

Or see tests
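
As a rough picture of what chunked processing means for batched inserts, the sketch below groups rows into arrays of a fixed size. `chunkRows` is a hypothetical helper written for this illustration. Emitting a shorter final chunk when the row count is not a multiple of the chunk size is an assumption about the behaviour, so check the tests for what the library actually does.

```javascript
// Hypothetical helper for illustration; not part of supersalmon.
// Groups any iterable of rows into arrays of at most `size` elements.
function * chunkRows (rows, size) {
  if (!Number.isInteger(size) || size < 1) {
    throw new RangeError('size must be a positive integer')
  }
  let batch = []
  for (const row of rows) {
    batch.push(row)
    if (batch.length === size) {
      yield batch
      batch = []
    }
  }
  // Assumption: leftover rows are emitted as a final, shorter chunk
  if (batch.length > 0) yield batch
}
```

Each chunk can then be handed to a single bulk insert instead of issuing one insert per row.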

TODO List

Issues and PRs more than welcome!

Credits

A large part of the code base has been adapted from xlsx-stream-reader

Package last updated on 05 Aug 2019
