@arbs.io/asset-extractor-wasm
This npm package offers a straightforward method to extract text content from various binary and text file formats. The package comes with a pre-built configuration that works out-of-the-box, requiring no additional setup. It is designed for use in Browse
Caution: This package is currently in development and should be treated as a preview release (pre-v1.0)
Welcome to `@arbs.io/asset-extractor-wasm`, an npm package that provides a straightforward method to extract content from a wide range of binary and text file formats. The package is pre-configured and requires no additional setup. It is designed to be compatible with both browsers and Node.js environments, including Visual Studio Code extensions.
The current version of the package supports content extraction from an extensive list of MIME types, including but not limited to:
Text | Media | Extension | MIME type |
---|---|---|---|
✅ | ⚫ | txt | text/plain |
✅ | ✅ | docx | application/vnd.openxmlformats-officedocument.wordprocessingml.document |
✅ | ✅ | pptx | application/vnd.openxmlformats-officedocument.presentationml.presentation |
🔲 | 🔲 | xlsx | application/vnd.openxmlformats-officedocument.spreadsheetml.sheet |
✅ | ✅ | odp | application/vnd.oasis.opendocument.presentation |
✅ | ✅ | ods | application/vnd.oasis.opendocument.spreadsheet |
✅ | ✅ | odt | application/vnd.oasis.opendocument.text |
✅ | 🔲 | xml | text/xml |
✅ | 🔲 | pdf | application/pdf |
✅ | 🔲 | html | text/html |
✅ | 🔲 | epub | application/epub+zip |
✅ | 🔲 | mobi | application/x-mobipocket-ebook |
✅: Completed 🔲: Coming soon ⚫: Not Applicable
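If you want to check support before handing a file to the package, the table above can be mirrored as a small lookup. This is only a sketch: `SUPPORT` and `canExtractText` are hypothetical names that are not part of the package, and the values are a hand-copied snapshot of the table that may drift as the package evolves.

```typescript
// Hypothetical helper, not part of @arbs.io/asset-extractor-wasm.
// media: null means "not applicable" (the ⚫ marker in the table).
type Support = { text: boolean; media: boolean | null }

const SUPPORT: Record<string, Support> = {
  'text/plain': { text: true, media: null },
  'application/vnd.openxmlformats-officedocument.wordprocessingml.document': { text: true, media: true },
  'application/vnd.openxmlformats-officedocument.presentationml.presentation': { text: true, media: true },
  'application/vnd.openxmlformats-officedocument.spreadsheetml.sheet': { text: false, media: false },
  'application/vnd.oasis.opendocument.presentation': { text: true, media: true },
  'application/vnd.oasis.opendocument.spreadsheet': { text: true, media: true },
  'application/vnd.oasis.opendocument.text': { text: true, media: true },
  'text/xml': { text: true, media: false },
  'application/pdf': { text: true, media: false },
  'text/html': { text: true, media: false },
  'application/epub+zip': { text: true, media: false },
  'application/x-mobipocket-ebook': { text: true, media: false },
}

// Returns true only for MIME types the table marks as completed for text.
function canExtractText(mimetype: string): boolean {
  return SUPPORT[mimetype]?.text ?? false
}
```

Unknown MIME types fall through to `false`, so a caller can skip unsupported files without special-casing them.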
We are always looking to expand the capabilities of `@arbs.io/asset-extractor-wasm`. If you need support for additional file formats, please submit an enhancement issue on the project's repository. We value your feedback and contributions as they help us improve this package for the broader developer community.
To install the package, use the following npm command:
```shell
npm install @arbs.io/asset-extractor-wasm
```
This command will add the package to your project's dependencies.
Here's an example of how to extract text from a buffer. If the file type is binary, the mime-type is verified using file-type.
```typescript
import * as fs from 'fs'
import {
  createDocumentParser,
  getTextPlain, // imported in the original example; not used below
} from '@arbs.io/asset-extractor-wasm'

export const documentParserExample = () => {
  // Read the file and pass its bytes to the parser as a Uint8Array
  const buf = fs.readFileSync('./data_source/microservices.docx')
  const documentParser = createDocumentParser(new Uint8Array(buf))

  // Optional chaining guards against a missing result or empty contents
  console.log(`mimetype: (${documentParser?.mimetype})`)
  console.log(`extension: (${documentParser?.extension})`)
  console.log(`content [text/plain]: (${documentParser?.contents?.text})`)
}
```
This example demonstrates how to read a file, convert it to a `Uint8Array`, and then extract the assets.
The `DocumentParser` object provides the following properties:

- `mimetype`: The MIME type of the buffer, determined by the binary signature.
- `extension`: The file extension of the buffer, determined by the binary signature.
- `contents`: The `ParserContent` extracted from the buffer (text, images, ...), or `null`.

```typescript
interface DocumentParser {
  mimetype: string
  extension: string
  contents: ParserContent | null
}
```
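Because `contents` can be `null`, and the earlier example uses optional chaining on the parser result itself, callers may want to handle both cases in one place. The following is a minimal sketch under those assumptions: `summarize` is a hypothetical helper, not a package export, and the interfaces are redeclared locally only so the snippet stands alone.

```typescript
// Local copies of the documented shapes so the sketch is self-contained.
interface ContentData {
  identity: string
  mimetype: string
  data: string
}
interface ParserContent {
  text: string | null
  media: ContentData[] | null
}
interface DocumentParser {
  mimetype: string
  extension: string
  contents: ParserContent | null
}

// Hypothetical helper: builds a one-line summary and tolerates a missing
// parser result, null contents, null text, and null media.
function summarize(parser: DocumentParser | undefined): string {
  if (!parser) return 'no parser result'
  const text = parser.contents?.text ?? ''
  const mediaCount = parser.contents?.media?.length ?? 0
  return `${parser.extension} (${parser.mimetype}): ${text.length} chars, ${mediaCount} media`
}
```

Whether the parser can actually return `undefined` is an assumption drawn from the optional chaining in the example above; if it cannot, the first guard is harmless.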
The `ParserContent` object provides the following properties:

- `text`: Text content of the buffer. There is only ever a single text content for each buffer.
- `media`: Array of all media assets embedded in the buffer (images, audio, video, ...).

```typescript
interface ParserContent {
  text: string | null
  media: ContentData[] | null
}
```
The `ContentData` object provides the following properties:

- `identity`: The identity of the embedded binary object. For example: image1.png
- `mimetype`: The MIME type of the embedded data. For example: image/png
- `data`: The raw binary data of the asset, base64-encoded.

```typescript
interface ContentData {
  identity: string
  mimetype: string
  data: string
}
```
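Since `data` is a base64 string, it usually has to be decoded before the asset can be written to disk or displayed. The sketch below shows one way to do that, assuming `data` is plain base64 with no `data:` URI prefix (the source does not say either way). `decodeMedia` and `DecodedAsset` are hypothetical names, not package exports, and `atob` is used because it is available in browsers and recent Node.js versions.

```typescript
// Local copy of the documented shape so the sketch is self-contained.
interface ContentData {
  identity: string
  mimetype: string
  data: string
}

// Hypothetical result type: the same asset with its bytes decoded.
interface DecodedAsset {
  identity: string
  mimetype: string
  bytes: Uint8Array
}

// Decodes every media entry; a null media array yields an empty list.
// Assumption: `data` is standard base64 without a "data:" prefix.
function decodeMedia(media: ContentData[] | null): DecodedAsset[] {
  return (media ?? []).map(({ identity, mimetype, data }) => ({
    identity,
    mimetype,
    // atob returns a binary string; map each character to its byte value
    bytes: Uint8Array.from(atob(data), (ch) => ch.charCodeAt(0)),
  }))
}
```

In Node.js the resulting `bytes` can be passed directly to `fs.writeFileSync`, using `identity` as the file name.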
We hope you find `@arbs.io/asset-extractor-wasm` useful for your projects. If you have any questions, issues, or suggestions, please feel free to open an issue on our GitHub repository. We appreciate your support and are committed to making this package even better for the developer community.
FAQs
The npm package @arbs.io/asset-extractor-wasm receives a total of 15 weekly downloads. As such, @arbs.io/asset-extractor-wasm popularity was classified as not popular.
We found that @arbs.io/asset-extractor-wasm demonstrated a healthy version release cadence and project activity because the last version was released less than a year ago. It has 0 open source maintainers collaborating on the project.