office-text-extractor
Comparing version 3.0.0-beta.3 to 3.0.0
{
"name": "office-text-extractor",
"version": "3.0.0-beta.3",
"version": "3.0.0",
"description": "Yet another library to extract text from MS Office and PDF files",
@@ -35,3 +35,4 @@ "keywords": [
"tsconfig.json",
"build/"
"build/",
"source/"
],
@@ -43,9 +44,17 @@ "type": "module",
"dependencies": {
"fflate": "0.8.0",
"file-type": "18.2.1",
"got": "12.6.0",
"js-yaml": "4.1.0",
"mammoth": "1.6.0",
"pdf-parse": "1.1.1"
"pdf-parse": "1.1.1",
"text-encoding": "0.7.0",
"xlsx": "0.18.5",
"xml2js": "0.6.0"
},
"devDependencies": {
"@types/js-yaml": "4.0.5",
"@types/node": "18.15.11",
"@types/text-encoding": "0.0.36",
"@types/xml2js": "0.4.11",
"ava": "5.3.0",
@@ -87,3 +96,3 @@ "np": "7.7.0",
"compile": "tsc",
"test": "run-s test:*",
"test": "run-s test:compile test:integration",
"test:compile": "tsc --noEmit",
@@ -90,0 +99,0 @@ "test:quality": "xo source/ test/",
@@ -43,5 +43,3 @@ # <div align="center"> `office-text-extractor` </div>
type of files
- [`decompress`](https://www.npmjs.com/package/decompress) - to unzip files
- [`read-chunk`](https://www.npmjs.com/package/read-chunk) - to read chunks of
data from large files
- [`fflate`](https://www.npmjs.com/package/fflate) - to unzip files
@@ -52,2 +50,4 @@ A big thank you to the contributors of these projects!
#### NodeJs
> **Note**
@@ -72,23 +72,29 @@ >
#### Browser
To use this package in the browser, fetch it using your preferred CDN:
```tsx
<script src="https://unpkg.com/office-text-extractor@latest/build/index.js"></script>
```
## Usage
```js
import { extractText } from 'office-text-extractor'
```ts
import { getTextExtractor } from 'office-text-extractor'
// Extract the text using `async-await`.
const text = await extractText('path/to/file')
// Create a new instance of the extractor.
const extractor = getTextExtractor()
// Extract text from a URL, file or buffer.
const location =
'https://raw.githubusercontent.com/gamemaker1/office-text-extractor/rewrite/test/fixtures/docs/pptx.pptx'
const text = await extractor.extractText({
input: location, // this can be a URL, a file path or a buffer
type: 'url', // this can be 'url', 'file' or 'buffer'
})
console.log(text)
// Extract the text using Promises.
extractText('path/to/file')
.then((text) => console.log(text))
.catch((error) => console.error(error))
```
> **Note**
>
> There is no support for browser environments yet. If you want to add support,
> please feel free to
> [open a pull request](https://github.com/gamemaker1/office-text-extractor/pulls).
## License
@@ -95,0 +101,0 @@
@@ -14,5 +14,6 @@ // tsconfig.json
"declaration": true,
"outDir": "build/",
"outDir": "build/"
"skipLibCheck": true
},
"include": ["source/"]
}
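The README change in this diff replaces `extractText('path/to/file')` with `extractor.extractText({ input, type })`, where `type` is `'url'`, `'file'` or `'buffer'`. For callers migrating from the old single-argument call, a small adapter could build that argument. The sketch below is only an illustration: `ExtractorInput` and `toExtractorInput` are hypothetical names that are not part of the package, and the exact types `extractText` accepts for `input` are assumed from the README snippet rather than confirmed.

```typescript
// Hypothetical helper, not part of office-text-extractor. It builds the
// `{ input, type }` object shown in the 3.0.0 README usage example.
type ExtractorInput =
  | { input: string; type: 'url' }
  | { input: string; type: 'file' }
  | { input: Uint8Array; type: 'buffer' }

function toExtractorInput(source: string | Uint8Array): ExtractorInput {
  // Binary data (including Node.js Buffers) is passed through as a buffer.
  if (typeof source !== 'string') return { input: source, type: 'buffer' }
  // Strings starting with http:// or https:// are treated as URLs.
  if (/^https?:\/\//i.test(source)) return { input: source, type: 'url' }
  // Any other string is assumed to be a path on disk.
  return { input: source, type: 'file' }
}

// Possible usage with the API shown in the README (assumed, not verified):
//   const extractor = getTextExtractor()
//   const text = await extractor.extractText(toExtractorInput('path/to/file'))
```

Keeping the decision in one function means an old call site only has to wrap its existing argument, and the assumption that plain strings are file paths stays visible in one place.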
Major refactor
Supply chain risk: Package has recently undergone a major refactor. It may be unstable or indicate significant internal changes. Use caution when updating to versions that include significant changes.
Found 1 instance in 1 package
Network access
Supply chain risk: This module accesses the network.
Found 1 instance in 1 package
No v1
Quality: Package is not semver >=1. This means it is not stable and does not support ^ ranges.
Found 1 instance in 1 package
+ Added fflate@0.8.0
+ Added js-yaml@4.1.0
+ Added text-encoding@0.7.0
+ Added xlsx@0.18.5
+ Added xml2js@0.6.0
+ Added adler-32@1.3.1 (transitive)
+ Added argparse@2.0.1 (transitive)
+ Added cfb@1.2.2 (transitive)
+ Added codepage@1.15.0 (transitive)
+ Added crc-32@1.2.2 (transitive)
+ Added fflate@0.8.0 (transitive)
+ Added frac@1.1.2 (transitive)
+ Added js-yaml@4.1.0 (transitive)
+ Added sax@1.3.0 (transitive)
+ Added ssf@0.11.2 (transitive)
+ Added text-encoding@0.7.0 (transitive)
+ Added wmf@1.0.2 (transitive)
+ Added word@0.3.0 (transitive)
+ Added xlsx@0.18.5 (transitive)
+ Added xml2js@0.6.0 (transitive)
+ Added xmlbuilder@11.0.1 (transitive)