office-text-extractor
Yet another library to extract text from MS Office (docx, pptx, xlsx) and PDF (pdf) files.
There are other great libraries that do the same job and have inspired this project, such as:
This module uses some amazing existing libraries that perform better than the ones that originally existed in this module, and are therefore used instead:
This module also uses:
xml2js - to convert the MS Office XML files into JSON
js-yaml - to convert JSON into YAML
file-type - to detect the mime type of files
decompress - to unzip files
read-chunk - to read chunks of data from large files
A big thank you to the contributors of these projects.
To use this in an npm project, simply type in:
npm install office-text-extractor
There is no support for browser environments yet. If you want to add support, please feel free to open a pull request.
// Importing the library (pick the form that matches your project's module system):
// CommonJS import
const extractText = require('office-text-extractor');
// ES import
import extractText from 'office-text-extractor';
// Extracting text:
// Async-await way (inside an async function or an ES module)
const text = await extractText('path/to/file');
console.log(text);
// Promise way
extractText('path/to/file')
  .then((text) => {
    console.log(text);
  })
  .catch((err) => {
    console.error(err);
  });
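When several files need processing, the promise-based usage above can be wrapped so that one unreadable file does not abort the rest. The following is a minimal sketch, not part of the library: `extractAll` is a hypothetical helper, and it only assumes that the function passed in as `extract` takes a path and returns a promise of text, as `extractText` does in the examples above.

```javascript
// Hypothetical helper (NOT part of office-text-extractor).
// `extract` is assumed to be a function (path) => Promise<string>,
// such as the extractText function imported above.
async function extractAll(paths, extract) {
  // The async wrapper turns synchronous throws into rejections too,
  // so every path ends up with a settled result.
  const settled = await Promise.allSettled(paths.map(async (p) => extract(p)));

  const texts = {};  // path -> extracted text
  const errors = {}; // path -> error raised for that path

  settled.forEach((result, i) => {
    if (result.status === 'fulfilled') {
      texts[paths[i]] = result.value;
    } else {
      errors[paths[i]] = result.reason;
    }
  });

  return { texts, errors };
}
```

It could be called as `const { texts, errors } = await extractAll(['path/to/a', 'path/to/b'], extractText);`, after which `errors` holds the failures keyed by path. Passing the extractor as a parameter is a design choice for this sketch only; it keeps the helper independent of how the library is imported.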
If you want to help out, please do open a pull request.
Copyright (c) 2021, gamemaker1
Permission to use, copy, modify, and/or distribute this software for any purpose with or without fee is hereby granted, provided that the above copyright notice and this permission notice appear in all copies.
THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.