
Security News
Attackers Are Hunting High-Impact Node.js Maintainers in a Coordinated Social Engineering Campaign
Multiple high-impact npm maintainers confirm they have been targeted in the same social engineering campaign that compromised Axios.
read-excel-file-tmp
Advanced tools
Read small to medium `*.xlsx` files in a browser or Node.js. Parse to JSON with a strict schema.
read-excel-fileRead small to medium *.xlsx files in a browser or Node.js. Parse to JSON with a strict schema.
There have been some complaints about this library not being able to handle large *.xlsx spreadsheets. It's true that this library's main point have been usability and convenience, and not performance or the ability to handle huge datasets. For example, the time of parsing a 2000 rows / 20 columns file is about 3 seconds, and when parsing a 30k+ rows file, it may throw a RangeError: Maximum call stack size exceeded. So, for handling huge datasets, use something like xlsx package instead. This library is suitable for handling small to medium *.xlsx files.
On March 9th, 2020, GitHub, Inc. silently banned my account (and all my libraries) without any notice. I opened a support ticked but they didn't answer. Because of that, I had to move all my libraries to GitLab.
npm install read-excel-file --save
If you're not using a bundler then use a standalone version from a CDN.
<input type="file" id="input" />
import readXlsxFile from 'read-excel-file'
const input = document.getElementById('input')
input.addEventListener('change', () => {
readXlsxFile(input.files[0]).then((rows) => {
// `rows` is an array of rows
// each row being an array of cells.
})
})
const readXlsxFile = require('read-excel-file/node');
// File path.
readXlsxFile('/path/to/file').then((rows) => {
// `rows` is an array of rows
// each row being an array of cells.
})
// Readable Stream.
readXlsxFile(fs.createReadStream('/path/to/file')).then((rows) => {
...
})
XLSX format has no dedicated "date" type so dates are stored internally as simply numbers along with a "format" (e.g. "MM/DD/YY"). When using readXlsx() with schema parameter all dates get parsed correctly in any case. But if using readXlsx() without schema parameter (to get "raw" data) then this library attempts to guess whether a cell value is a date or not by examining the cell "format" (e.g. "MM/DD/YY"), so in most cases dates are detected and parsed automatically. For exotic cases one can pass an explicit dateFormat parameter (e.g. "MM/DD/YY") to instruct the library to parse numbers with such "format" as dates:
readXlsxFile(file, { dateFormat: 'MM/DD/YY' })
To convert rows to JSON pass schema option to readXlsxFile(). It will return { rows, errors } object instead of just rows.
// An example *.xlsx document:
// -----------------------------------------------------------------------------------------
// | START DATE | NUMBER OF STUDENTS | IS FREE | COURSE TITLE | CONTACT | STATUS |
// -----------------------------------------------------------------------------------------
// | 03/24/2018 | 10 | true | Chemistry | (123) 456-7890 | SCHEDULED |
// -----------------------------------------------------------------------------------------
const schema = {
'START DATE': {
prop: 'date',
type: Date
// Excel stores dates as integers.
// E.g. '24/03/2018' === 43183.
// Such dates are parsed to UTC+0 timezone with time 12:00 .
},
'NUMBER OF STUDENTS': {
prop: 'numberOfStudents',
type: Number,
required: true
},
// 'COURSE' is not a real Excel file column name,
// it can be any string — it's just for code readability.
'COURSE': {
prop: 'course',
type: {
'IS FREE': {
prop: 'isFree',
type: Boolean
// Excel stored booleans as numbers:
// `1` is `true` and `0` is `false`.
// Such numbers are parsed to booleans.
},
'COURSE TITLE': {
prop: 'title',
type: String
}
}
},
'CONTACT': {
prop: 'contact',
required: true,
type: (value) => {
const number = parsePhoneNumber(value)
if (!number) {
throw new Error('invalid')
}
return number
}
},
'STATUS': {
prop: 'status',
type: String,
oneOf: [
'SCHEDULED',
'STARTED',
'FINISHED'
]
}
}
readXlsxFile(file, { schema }).then(({ rows, errors }) => {
// `errors` have shape `{ row, column, error, value }`.
errors.length === 0
rows === [{
date: new Date(2018, 2, 24),
numberOfStudents: 10,
course: {
isFree: true,
title: 'Chemistry'
},
contact: '+11234567890',
status: 'SCHEDULED'
}]
})
If no type is specified then the cell value is returned "as is".
There are also some additional exported types:
Integer for parsing integer Numbers.URL for parsing URLs.Email for parsing email addresses.A schema entry for a column may also define an optional validate(value) function for validating the parsed value: in that case, it must throw an Error if the value is invalid. The validate(value) function is only called when value exists.
convertToJson() function is also exported as a standalone one from read-excel-file/schema
import convertToJson from "read-excel-file/schema"
// `data` is an array of rows, each row being an array of cells.
// `schema` is a "to JSON" convertion schema (see above).
const objects = convertToJson(data, schema)
Sometimes, a developer might want to use some other (more advanced) solution for schema parsing and validation (like yup). If a developer passes a map instead of a schema to readXlsxFile(), then it would just map each data row to a JSON object without doing any parsing or validation.
// An example *.xlsx document:
// ------------------------------------------------------------
// | START DATE | NUMBER OF STUDENTS | IS FREE | COURSE TITLE |
// ------------------------------------------------------------
// | 03/24/2018 | 10 | true | Chemistry |
// ------------------------------------------------------------
const map = {
'START DATE': 'date',
'NUMBER OF STUDENTS': 'numberOfStudents',
'COURSE': {
'course': {
'IS FREE': 'isFree',
'COURSE TITLE': 'title'
}
}
}
readXlsxFile(file, { map }).then(({ rows }) => {
rows === [{
date: new Date(2018, 2, 24),
numberOfStudents: 10,
course: {
isFree: true,
title: 'Chemistry'
}
}]
})
A React component for displaying schema parsing/validation errors could look like this:
import { parseExcelDate } from 'read-excel-file'
function ParseExcelError({ children: error }) {
// Get a human-readable value.
let value = error.value
if (error.type === Date) {
value = parseExcelDate(value).toString()
}
// Render error summary.
return (
<div>
<code>"{error.error}"</code>
{' for value '}
<code>"{value}"</code>
{' in column '}
<code>"{error.column}"</code>
{error.type && ' of type '}
{error.type && <code>"{error.type.name}"</code>}
{' in row '}
<code>"{error.row}"</code>
</div>
)
}
When using a schema there's also an optional transformData(data) parameter which can be used for the cases when the spreadsheet rows/columns aren't in the correct format. For example, the heading row may be missing, or there may be some purely presentational or empty rows. Example:
readXlsxFile(file, {
schema,
transformData(data) {
// Adds header row to the data.
return [['ID', 'NAME', ...]].concat(data)
// Removes empty rows.
return data.filter(row => row.filter(column => column !== null).length > 0)
}
})
See testing index.d.ts.
Node.js *.xlxs parser uses xpath and xmldom packages for XML parsing. The same packages could be used in a browser because all modern browsers (except IE 11) have native DOMParser built-in which could is used instead (meaning smaller footprint and better performance) but since Internet Explorer 11 support is still required the browser version doesn't use the native DOMParser and instead uses xpath and xmldom packages for XML parsing just like the Node.js version.
Dynamically calculated cells using formulas (SUM, etc) are not supported.
By default it reads the first sheet in the document. If you have multiple sheets in your spreadsheet then pass either sheet: number (sheet index, starting from 1) or sheet: string (sheet name) as part of the options argument (options.sheet is 1 by default):
readXlsxFile(file, { sheet: 2 }).then((data) => {
...
})
readXlsxFile(file, { sheet: 'Sheet1' }).then((data) => {
...
})
To get the list of sheets one can pass getSheets: true option:
readXlsxFile(file, { getSheets: true }).then((sheets) => {
// sheets === [{ name: 'Sheet1' }, { name: 'Sheet2' }]
})
One can use any npm CDN service, e.g. unpkg.com or jsdelivr.net
<script src="https://unpkg.com/read-excel-file@4.x/bundle/read-excel-file.min.js"></script>
<script>
var input = document.getElementById('input')
input.addEventListener('change', function() {
readXlsxFile(input.files[0]).then(function() {
// `rows` is an array of rows
// each row being an array of cells.
})
})
</script>
For XML parsing xmldom and xpath are used.
FAQs
Read small to medium `*.xlsx` files in a browser or Node.js. Parse to JSON with a strict schema.
We found that read-excel-file-tmp demonstrated a not healthy version release cadence and project activity because the last version was released a year ago. It has 1 open source maintainer collaborating on the project.
Did you know?

Socket for GitHub automatically highlights issues in each pull request and monitors the health of all your open source dependencies. Discover the contents of your packages and block harmful activity before you install or update your dependencies.

Security News
Multiple high-impact npm maintainers confirm they have been targeted in the same social engineering campaign that compromised Axios.

Security News
Axios compromise traced to social engineering, showing how attacks on maintainers can bypass controls and expose the broader software supply chain.

Security News
Node.js has paused its bug bounty program after funding ended, removing payouts for vulnerability reports but keeping its security process unchanged.