Supersalmon
Features
- Process xlsx files while streaming
- Records are processed one at a time by a user-supplied, promise-returning function
- Strives for memory efficiency
- Automatic backpressure handling
- Line count support (not all xlsx files contain this metadata)
- Can transform column names
- Will reject on non-xlsx data
- Battle-tested in production code
- Skips empty rows (maybe it's a caveat)
- Limit parsing: process a file up to a specified row, then stop
- Allows processing rows in chunks (e.g. for bulk inserts; see the chunked example below)
Caveats
- Only parses the first sheet in a workbook
- Requires the first row to contain column names (these become the keys of each row object; see the sketch below)
- Skips empty rows (maybe it's a feature)
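Because row objects are keyed by the (optionally transformed) header names, mapColumns can be used to normalize messy headers into stable keys. A minimal sketch using the API from the example below; the header strings and the normalize helper are hypothetical:

// Hypothetical headers ' First Name ' and 'SURNAME' become the keys
// first_name and surname on every row object
const normalize = colName => colName.trim().toLowerCase().replace(/\s+/g, '_')

await supersalmon({
  inputStream: createReadStream('contacts.xlsx'),
  hasHeaders: true,
  mapColumns: cols => colName => normalize(colName)
}).processor({
  onRow: ({ first_name, surname }) => repository.insert({ first_name, surname })
})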
Example Usage
const { createReadStream } = require('fs')
const supersalmon = require('supersalmon')

const processed = await supersalmon({
  inputStream: createReadStream('huge.xlsx'),
  hasHeaders: true,
  hasFormats: false,
  mapColumns: cols => colName => colName.toLowerCase().trim(),
  onLineCount: lineCount => notifyLineCount(lineCount),
  formatting: false,
  lenientColumnMapping: false
}).processor({
  limit: 10, // stop after the tenth row
  onRow: ({ name, surname }, i) => {
    doSomethingWithRowIndex(i)
    // return the promise so backpressure can be applied
    return repository.insert({ name, surname })
  }
})
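The chunked processing mentioned in the feature list pairs naturally with bulk inserts. A minimal sketch, assuming that when chunkSize is set onRow receives an array of up to chunkSize rows (repository.bulkInsert is a hypothetical helper):

const chunked = await supersalmon({
  inputStream: createReadStream('huge.xlsx'),
  hasHeaders: true,
  chunkSize: 3 // assumption: rows now arrive in arrays of up to 3
}).processor({
  onRow: rows => repository.bulkInsert(rows) // one bulk insert per chunk
})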
The rows can also be consumed as a stream:

const stream = supersalmon({ /* same options as above */ }).stream()
const stream10 = supersalmon({ /* same options as above */ }).stream(10) // stop after 10 rows
stream.pipe(myOtherStream)
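One way to consume that stream, as a sketch: assuming stream() emits one row object at a time in object mode, any object-mode Writable (such as myOtherStream above) can apply backpressure through its write callback:

const { Writable } = require('stream')

const sink = new Writable({
  objectMode: true, // assumption: the stream emits row objects, not buffers
  write (row, _enc, done) {
    // calling done() only after the insert completes keeps backpressure intact
    repository.insert(row).then(() => done(), done)
  }
})

supersalmon({ /* same options as above */ }).stream().pipe(sink)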
Or see the tests.
TODO List
Issues and PRs are more than welcome!
Credits
A large part of the codebase has been adapted from xlsx-stream-reader.