officeparser
Comparing versions 4.1.0 and 4.1.1
@@ -457,2 +457,3 @@ #!/usr/bin/env node
 .flat() // Flatten all the items object
+.filter(item => item.str != '') // Ignore the empty string items.
 .reduce((a, v) => (
@@ -559,3 +560,3 @@ {
 default:
-throw ERRORMSG.extensionUnsupported(extension);
+internalCallback(undefined, ERRORMSG.extensionUnsupported(extension)); // Call the internalCallback function which removes the temp files if required.
 }
@@ -562,0 +563,0 @@
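The `default:` change above swaps a `throw` for a call to `internalCallback`, whose comment says it removes the temp files. A minimal sketch of why that matters follows. Everything here except the `(data, err)` argument order is an assumption for illustration: `parseWithCleanup`, the temp-directory prefix, and the `'txt'` branch are hypothetical and are not officeparser's actual code.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical stand-in for an extension dispatch; not the library's real function.
function parseWithCleanup(extension, callback) {
    // Temp directory that has to be removed on every exit path.
    const tempDir = fs.mkdtempSync(path.join(os.tmpdir(), 'parse-demo-'));

    // Single exit point: clean up first, then hand the result or the error to the caller.
    const internalCallback = (data, err) => {
        fs.rmSync(tempDir, { recursive: true, force: true });
        callback(data, err);
    };

    switch (extension) {
        case 'txt': // Hypothetical supported type, only here to show the success path.
            internalCallback('parsed text', undefined);
            break;
        default:
            // A bare `throw` here would skip the cleanup and leave tempDir behind.
            internalCallback(undefined, new Error(`Unsupported extension: ${extension}`));
    }

    // Returned only so the caller can verify that the directory is gone.
    return tempDir;
}

const dir = parseWithCleanup('xyz', (data, err) => {
    console.log(err ? `error: ${err.message}` : data);
});
console.log(fs.existsSync(dir)); // false
```

Routing the unsupported-extension case through the same callback as every other outcome keeps cleanup in one place, so no branch can leave the temp directory behind.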
 {
 "name": "officeparser",
-"version": "4.1.0",
+"version": "4.1.1",
 "description": "A Node.js library to parse text out of any office file. Currently supports docx, pptx, xlsx, odt, odp, ods, pdf files.",
@@ -5,0 +5,0 @@ "main": "officeParser.js",
@@ -16,2 +16,3 @@ # officeParser
 #### Update
+* 2024/05/06 - Replaced pdf parsing support from pdf-parse library to natively building it using pdf.js library from Mozilla by analyzing its output. Added pdfjs-dist build as a local library.
 * 2023/11/25 - Fixed error catching when an error occurs within the parsing of a file, especially after decompressing it. Also fixed the problem with parallel parsing of files as we were using only timestamp in file names.
@@ -18,0 +19,0 @@ * 2023/10/24 - Revamped content parsing code. Fixed order of content in files, especially in word files where table information would always land up at the end of the text. Added config object as argument for parseOffice which can be used to set new line delimiter and multiple other configurations. Added support for parsing pdf files using the popular npm library pdf-parse. Removed support for individual file parsing functions.
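The first hunk of this diff adds `.filter(item => item.str != '')` between `.flat()` and `.reduce(...)`. A small sketch of what that step does is shown below. The input shape (one array of `{ str }` items per page) is an assumption modeled on the text items pdf.js produces, `collectText` is a hypothetical helper, and the body of the library's `.reduce(...)` is not visible in the diff, so a plain `join` stands in for it.

```javascript
// Hypothetical helper: flattens per-page text items and drops the empty ones.
function collectText(pages) {
    return pages
        .flat()                          // One flat list of items across all pages.
        .filter(item => item.str != '')  // Same predicate as the added diff line.
        .map(item => item.str)           // Stand-in for the elided reduce step.
        .join(' ');
}

// Assumed sample data; real text items carry more fields than `str`.
const pages = [
    [{ str: 'Hello' }, { str: '' }, { str: 'world' }],
    [{ str: '' }, { str: 'again' }],
];

console.log(collectText(pages)); // Hello world again
```

Without the filter, each empty item would still contribute a separator, so this sample would come out with doubled spaces.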
Major refactor
Supply chain risk: Package has recently undergone a major refactor. It may be unstable or indicate significant internal changes. Use caution when updating to versions that include significant changes.
Found 1 instance in 1 package