office-text-extractor - npm Package Compare versions

package.json

		{
		"name": "office-text-extractor",
		"version": "3.0.0-beta.3",
		"version": "3.0.0",
		"description": "Yet another library to extract text from MS Office and PDF files",
		@@ -35,3 +35,4 @@ "keywords": [
		"tsconfig.json",
		"build/"
		"build/",
		"source/"
		],
		@@ -43,9 +44,17 @@ "type": "module",
		"dependencies": {
		"fflate": "0.8.0",
		"file-type": "18.2.1",
		"got": "12.6.0",
		"js-yaml": "4.1.0",
		"mammoth": "1.6.0",
		"pdf-parse": "1.1.1"
		"pdf-parse": "1.1.1",
		"text-encoding": "0.7.0",
		"xlsx": "0.18.5",
		"xml2js": "0.6.0"
		},
		"devDependencies": {
		"@types/js-yaml": "4.0.5",
		"@types/node": "18.15.11",
		"@types/text-encoding": "0.0.36",
		"@types/xml2js": "0.4.11",
		"ava": "5.3.0",
		@@ -87,3 +96,3 @@ "np": "7.7.0",
		"compile": "tsc",
		"test": "run-s test:*",
		"test": "run-s test:compile test:integration",
		"test:compile": "tsc --noEmit",
		@@ -90,0 +99,0 @@ "test:quality": "xo source/ test/",

readme.md

		@@ -43,5 +43,3 @@ # <div align="center"> `office-text-extractor` </div>
		type of files
		- [`decompress`](https://www.npmjs.com/package/decompress) - to unzip files
		- [`read-chunk`](https://www.npmjs.com/package/read-chunk) - to read chunks of
		data from large files
		- [`fflate`](https://www.npmjs.com/package/fflate) - to unzip files

		@@ -52,2 +50,4 @@ A big thank you to the contributors of these projects!

		#### NodeJs

		> Note
		@@ -72,23 +72,29 @@ >

		#### Browser

		To use this package in the browser, fetch it using your preferred CDN:

		```tsx
		<script src="https://unpkg.com/office-text-extractor@latest/build/index.js"></script>
		```

		## Usage

		```js
		import { extractText } from 'office-text-extractor'
		```ts
		import { getTextExtractor } from 'office-text-extractor'

		// Extract the text using `async-await`.
		const text = await extractText('path/to/file')
		// Create a new instance of the extractor.
		const extractor = getTextExtractor()

		// Extract text from a URL, file or buffer.
		const location =
		'https://raw.githubusercontent.com/gamemaker1/office-text-extractor/rewrite/test/fixtures/docs/pptx.pptx'
		const text = await extractor.extractText({
		input: location, // this can be a URL, a file path or a buffer
		type: 'url', // this can be 'url', 'file' or 'buffer'
		})

		console.log(text)

		// Extract the text using Promises.
		extractText('path/to/file')
		.then((text) => console.log(text))
		.catch((error) => console.error(error))
		```

		> Note
		>
		> There is no support for browser environments yet. If you want to add support,
		> please feel free to
		> [open a pull request](https://github.com/gamemaker1/office-text-extractor/pulls).

		## License
		@@ -95,0 +101,0 @@
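
The usage hunk above replaces the single path argument with an options object carrying `input` and `type`. As a minimal sketch, assuming callers migrating from the beta still hold bare strings or byte arrays, a small helper could fill in the `type` field. The names `inferInputType` and `toExtractOptions` are hypothetical and not part of the package, and the URL check is only a simple `http(s)://` prefix test.

```ts
// Hypothetical migration helper; not part of office-text-extractor.
type InputType = 'url' | 'file' | 'buffer'

type ExtractOptions = {
	input: string | Uint8Array
	type: InputType
}

// Guess the `type` value from the shape of the input.
// Assumption: any string that does not start with http:// or https://
// is treated as a file path.
function inferInputType(input: string | Uint8Array): InputType {
	if (typeof input !== 'string') return 'buffer'
	return /^https?:\/\//i.test(input) ? 'url' : 'file'
}

// Build an options object in the shape shown in the 3.0.0 readme.
function toExtractOptions(input: string | Uint8Array): ExtractOptions {
	return { input, type: inferInputType(input) }
}
```

The result could then be passed along as `extractor.extractText(toExtractOptions(location))`, assuming the call signature shown in the readme hunk.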

tsconfig.json

		@@ -14,5 +14,6 @@ // tsconfig.json
		"declaration": true,
		"outDir": "build/"
		"outDir": "build/",
		"skipLibCheck": true
		},
		"include": ["source/"]
		}

build/index.d.ts

build/index.js

build/lib.d.ts

build/lib.js

build/parsers/doc.d.ts

build/parsers/doc.js

build/parsers/pdf.d.ts

build/parsers/pdf.js

build/util.d.ts

build/util.js
