@171h/docx-to-vfile
> **Note** > This repository is automatically generated from the [main parser monorepo](https://github.com/TrialAndErrorOrg/parsers). Please submit any issues or pull requests there.
Reads a `.docx` file and stores its components in vfile format to be processed by other tools, like `reoff-parse`.
Currently extremely dumb: it just stores everything in memory, so there is no streaming output. Reading the file itself does happen in streams.
Based on docxtract
This package reads a `.docx` file and stores its components in vfile format to be processed by other tools, like `reoff-parse`. This is the first step in a pipeline to convert a `.docx` file to many other formats using the `unified` ecosystem.
A `.docx` document is just a zip file with a bunch of XML and other files (such as images) in it. This package unzips the `.docx` file, reads the XML files and images and stores them in a `VFile` object, which is a virtual file format that can be used by other tools in the `unified` ecosystem.
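To make the "just a zip file" point concrete, here is a minimal sketch that checks whether a byte array starts with the ZIP local file header signature (`PK\x03\x04`), which is how a non-empty ZIP archive normally begins. The helper name `looksLikeZip` is hypothetical and is not part of this package.

```typescript
// ZIP local file header signature: the bytes "PK\x03\x04".
const ZIP_SIGNATURE = [0x50, 0x4b, 0x03, 0x04]

/**
 * Hypothetical helper (not part of docx-to-vfile): returns true when the
 * given bytes start with the ZIP local file header signature.
 */
function looksLikeZip(bytes: Uint8Array): boolean {
  if (bytes.length < ZIP_SIGNATURE.length) return false
  return ZIP_SIGNATURE.every((byte, index) => bytes[index] === byte)
}
```

A check like this could be run on the first bytes of an upload before handing it to `docxToVFile`, to reject files that are obviously not ZIP-based.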
Probably only to read a `.docx` file to feed into `reoff-parse` or something similar, or if you want to access the raw data of a `.docx` file for some reason.
This package is ESM only. In Node.js (version 12.20+, 14.14+, 16.0+, 18.0+), install as
pnpm add docx-to-vfile
# or with yarn
# yarn add docx-to-vfile
# or with npm
# npm install docx-to-vfile
import { docxToVFile } from 'docx-to-vfile'
Pass a path to a `.docx` file
const file = await docxToVFile('path/to/file.docx')
Pass a Blob
const blob = await fetch('https://path/to/file.docx').then((res) => res.blob())
const file = await docxToVFile(blob)
Pass a Buffer
import { readFile } from 'fs/promises'
const buffer = await readFile('path/to/file.docx')
const file = await docxToVFile(buffer)
Pass a ReadStream
import { createReadStream } from 'fs'
const file = await docxToVFile(createReadStream('path/to/file.docx'))
import { docxToVFile } from 'docx-to-vfile/browser'
Pass a File
<input type="file" />
document.querySelector('input[type="file"]')?.addEventListener('change', async (e) => {
const file = await docxToVFile(e.target.files[0])
})
Using the default settings, the main value of the VFile will be the content of the main document, and the data will contain the content of the other files in the .docx archive. Media files will be stored in the media property.
const output = {
  data: {
    'word/footnotes.xml': '<?xml version ...',
    '_rels/.rels': '<?xml version ...',
    // ...
    relations: {
      rId9: 'footnotes.xml',
      rId8: 'endnotes.xml',
      // ...
    },
    media: {
      'media/image1.png': '<Blob>', // placeholder for the binary image data
    },
  },
  value: '[the content of word/document.xml, the main document]',
  // other vfile stuff
  messages: [],
  history: [],
  cwd: './',
}
String(output) === output.value // true
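As a rough illustration of working with that shape, the sketch below lists the XML and `.rels` parts stored on `data`, skipping the `relations` and `media` objects. The `DocxLikeFile` and `DocxLikeData` interfaces and the `listXmlParts` helper are hypothetical stand-ins that only mirror the example output above; they are not types exported by this package.

```typescript
// Hypothetical structural types that mirror the example output above.
interface DocxLikeData {
  relations?: Record<string, string>
  media?: Record<string, unknown>
  [key: string]: unknown
}

interface DocxLikeFile {
  value: unknown
  data: DocxLikeData
}

/**
 * Hypothetical helper: returns the sorted names of all string-valued
 * `.xml` / `.rels` entries on `file.data`.
 */
function listXmlParts(file: DocxLikeFile): string[] {
  return Object.entries(file.data)
    .filter(
      ([key, content]) =>
        typeof content === 'string' && /\.(xml|rels)$/.test(key),
    )
    .map(([key]) => key)
    .sort()
}
```

Because the helper only relies on the structure shown in the example, it can be tried against a hand-written object before wiring it up to a real `docxToVFile` result.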
docxToVFile()
Takes a docx file as an ArrayBuffer and returns a VFile with the contents of the document.xml file as the root, and the contents of the other xml files as data.
docxToVFile(file: ArrayBuffer, userOptions: Options = {}): Promise<VFile>;
Name | Type | Description |
---|---|---|
file | ArrayBuffer | The docx file as an ArrayBuffer |
userOptions | Options | - |
`Promise<VFile>`
A VFile with the contents of the document.xml file as the root, and the contents of the other xml files as data.
Defined in: src/lib/docx-to-vfile-unzipit.ts:90
DocxData
The data attribute of a VFile. It is set to the DataMap interface in the vfile module.
`Data` ↳ `DocxData`
`[key: XMLOrRelsString]: string | undefined`
media
object
The media files in the .docx file
Overrides: Data.media
Defined in: src/lib/docx-to-vfile-unzipit.ts:45
relations
object
The relations between the .xml files in the .docx file
Overrides: Data.relations
Defined in: src/lib/docx-to-vfile-unzipit.ts:49
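A minimal sketch of how the `relations` map might be used to look up the content of a related part. It assumes, based only on the example output earlier (`rId9: 'footnotes.xml'` next to a `'word/footnotes.xml'` key), that relation targets are relative to `word/`; that prefix, the `PartsWithRelations` interface, and the `resolveRelation` helper are all assumptions for illustration.

```typescript
// Hypothetical structural type: a data object with a relations map.
interface PartsWithRelations {
  relations: Record<string, string>
  [key: string]: unknown
}

/**
 * Hypothetical helper: resolves a relationship id (e.g. 'rId9') to the XML
 * string of the part it points at. `baseDir` is an assumption: targets are
 * treated as relative to 'word/'.
 */
function resolveRelation(
  data: PartsWithRelations,
  relationId: string,
  baseDir = 'word/',
): string | undefined {
  const target = data.relations[relationId]
  if (target === undefined) return undefined
  const content = data[`${baseDir}${target}`]
  return typeof content === 'string' ? content : undefined
}
```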
DocxVFile
Extends VFile with a custom data attribute
This information should be on the VFile interface; this type is only used in contexts where you just want to know the type of the data attribute, e.g. when writing a library that does something with the output of `docxToVFile`.
`VFile` ↳ `DocxVFile`
cwd
`string`
Base of path (default: `process.cwd()` or `'/'` in browsers).
Inherited from: VFile.cwd
Defined in: node_modules/.pnpm/vfile@5.3.7/node_modules/vfile/lib/index.d.ts:53
data
Overrides: VFile.data
Defined in: src/lib/docx-to-vfile-unzipit.ts:80
history
`string[]`
List of filepaths the file moved between.
The first is the original path and the last is the current path.
Inherited from: VFile.history
Defined in: node_modules/.pnpm/vfile@5.3.7/node_modules/vfile/lib/index.d.ts:47
map
`undefined | null | Map`
Source map.
This type is equivalent to the `RawSourceMap` type from the `source-map` module.
Inherited from: VFile.map
Defined in: node_modules/.pnpm/vfile@5.3.7/node_modules/vfile/lib/index.d.ts:85
messages
`VFileMessage[]`
List of messages associated with the file.
Inherited from: VFile.messages
Defined in: node_modules/.pnpm/vfile@5.3.7/node_modules/vfile/lib/index.d.ts:39
result
unknown
Custom, non-string, compiled, representation.
This is used by unified to store non-string results. One example is when turning markdown into React nodes.
Inherited from: VFile.result
Defined in: node_modules/.pnpm/vfile@5.3.7/node_modules/vfile/lib/index.d.ts:76
stored
boolean
Whether a file was saved to disk.
This is used by vfile reporters.
Inherited from: VFile.stored
Defined in: node_modules/.pnpm/vfile@5.3.7/node_modules/vfile/lib/index.d.ts:67
value
Value
Raw value.
Inherited from: VFile.value
Defined in: node_modules/.pnpm/vfile@5.3.7/node_modules/vfile/lib/index.d.ts:59
basename
Get the basename (including extname) (example: `'index.min.js'`).
basename(): undefined | string;
`undefined | string`
Inherited from: VFile.basename
Defined in: node_modules/.pnpm/vfile@5.3.7/node_modules/vfile/lib/index.d.ts:123
Set basename (including extname) (`'index.min.js'`).
Cannot contain path separators (`'/'` on unix, macOS, and browsers, `'\'` on windows).
Cannot be nullified (use `file.path = file.dirname` instead).
basename(arg: undefined | string): void;
| Name | Type |
| :---- | :---------- |
| `arg` | `undefined` \| `string` |
void
Inherited from: VFile.basename
Defined in: node_modules/.pnpm/vfile@5.3.7/node_modules/vfile/lib/index.d.ts:119
dirname
Get the parent path (example: `'~'`).
dirname(): undefined | string;
`undefined | string`
Inherited from: VFile.dirname
Defined in: node_modules/.pnpm/vfile@5.3.7/node_modules/vfile/lib/index.d.ts:111
Set the parent path (example: `'~'`).
Cannot be set if there’s no `path` yet.
dirname(arg: undefined | string): void;
| Name | Type |
| :---- | :---------- |
| `arg` | `undefined` \| `string` |
void
Inherited from: VFile.dirname
Defined in: node_modules/.pnpm/vfile@5.3.7/node_modules/vfile/lib/index.d.ts:107
extname
Get the extname (including dot) (example: `'.js'`).
extname(): undefined | string;
`undefined | string`
Inherited from: VFile.extname
Defined in: node_modules/.pnpm/vfile@5.3.7/node_modules/vfile/lib/index.d.ts:135
Set the extname (including dot) (example: `'.js'`).
Cannot contain path separators (`'/'` on unix, macOS, and browsers, `'\'` on windows).
Cannot be set if there’s no `path` yet.
extname(arg: undefined | string): void;
| Name | Type |
| :---- | :---------- |
| `arg` | `undefined` \| `string` |
void
Inherited from: VFile.extname
Defined in: node_modules/.pnpm/vfile@5.3.7/node_modules/vfile/lib/index.d.ts:131
path
Get the full path (example: `'~/index.min.js'`).
path(): string;
string
Inherited from: VFile.path
Defined in: node_modules/.pnpm/vfile@5.3.7/node_modules/vfile/lib/index.d.ts:101
Set the full path (example: `'~/index.min.js'`).
Cannot be nullified.
You can set a file URL (a `URL` object with a `file:` protocol) which will be turned into a path with `url.fileURLToPath`.
path(arg: string): void;
Name | Type |
---|---|
arg | string |
void
Inherited from: VFile.path
Defined in: node_modules/.pnpm/vfile@5.3.7/node_modules/vfile/lib/index.d.ts:95
stem
Get the stem (basename w/o extname) (example: `'index.min'`).
stem(): undefined | string;
`undefined | string`
Inherited from: VFile.stem
Defined in: node_modules/.pnpm/vfile@5.3.7/node_modules/vfile/lib/index.d.ts:147
Set the stem (basename w/o extname) (example: `'index.min'`).
Cannot contain path separators (`'/'` on unix, macOS, and browsers, `'\'` on windows).
Cannot be nullified (use `file.path = file.dirname` instead).
stem(arg: undefined | string): void;
| Name | Type |
| :---- | :---------- |
| `arg` | `undefined` \| `string` |
void
Inherited from: VFile.stem
Defined in: node_modules/.pnpm/vfile@5.3.7/node_modules/vfile/lib/index.d.ts:143
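To show how `dirname`, `basename`, `extname`, and `stem` relate to one another, here is a simplified sketch that splits a `/`-separated path into those four parts. It is not vfile's implementation: `splitPath` and `PathParts` are hypothetical names, and edge cases such as absolute paths or Windows separators are deliberately ignored.

```typescript
// Hypothetical result type for the illustration below.
interface PathParts {
  dirname: string
  basename: string
  extname: string
  stem: string
}

/**
 * Hypothetical helper: splits a simple '/'-separated relative path.
 * Only meant to illustrate how the accessors relate to each other.
 */
function splitPath(path: string): PathParts {
  const slash = path.lastIndexOf('/')
  // Without a directory part, fall back to '.' (the current directory).
  const dirname = slash === -1 ? '.' : path.slice(0, slash)
  const basename = path.slice(slash + 1)
  const dot = basename.lastIndexOf('.')
  // A leading dot (as in '.gitignore') is not treated as an extension.
  const extname = dot > 0 ? basename.slice(dot) : ''
  const stem = extname ? basename.slice(0, dot) : basename
  return { dirname, basename, extname, stem }
}
```

For the path `'~/index.min.js'` used in the examples above, this yields `'~'`, `'index.min.js'`, `'.js'`, and `'index.min'` respectively.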
fail()
Create a fatal error associated with the file.
Its `fatal` is set to `true` and `file` is set to the current file path.
It's added to `file.messages`.
👉 Note: a fatal error means that a file is no longer processable.
Message.
fail(reason: string | VFileMessage | Error, place?: null | Node<Data> | NodeLike | Position | Point, origin?: null | string): never;
| Name | Type | Description |
| :-------- | :------- | :------------- |
| `reason` | `string` \| `VFileMessage` \| `Error` | Reason for message, uses the stack and message of the error if given. |
| `place?` | `null` \| `Node<Data>` \| `NodeLike` \| `Position` \| `Point` | Place in file where the message occurred. |
| `origin?` | `null` \| `string` | Place in code where the message originates (example: `'my-package:my-rule'` or `'my-rule'`). |
never
Message.
Inherited from: VFile.fail
Defined in: node_modules/.pnpm/vfile@5.3.7/node_modules/vfile/lib/index.d.ts:220
info()
Create an info message associated with the file.
Its `fatal` is set to `null` and `file` is set to the current file path.
It's added to `file.messages`.
info(reason: string | VFileMessage | Error, place?: null | Node<Data> | NodeLike | Position | Point, origin?: null | string): VFileMessage;
| Name | Type | Description |
| :-------- | :------- | :------------- |
| `reason` | `string` \| `VFileMessage` \| `Error` | Reason for message, uses the stack and message of the error if given. |
| `place?` | `null` \| `Node<Data>` \| `NodeLike` \| `Position` \| `Point` | Place in file where the message occurred. |
| `origin?` | `null` \| `string` | Place in code where the message originates (example: `'my-package:my-rule'` or `'my-rule'`). |
VFileMessage
Message.
Inherited from: VFile.info
Defined in: node_modules/.pnpm/vfile@5.3.7/node_modules/vfile/lib/index.d.ts:195
message()
Create a warning message associated with the file.
Its `fatal` is set to `false` and `file` is set to the current file path.
It's added to `file.messages`.
message(reason: string | VFileMessage | Error, place?: null | Node<Data> | NodeLike | Position | Point, origin?: null | string): VFileMessage;
| Name | Type | Description |
| :-------- | :------- | :------------- |
| `reason` | `string` \| `VFileMessage` \| `Error` | Reason for message, uses the stack and message of the error if given. |
| `place?` | `null` \| `Node<Data>` \| `NodeLike` \| `Position` \| `Point` | Place in file where the message occurred. |
| `origin?` | `null` \| `string` | Place in code where the message originates (example: `'my-package:my-rule'` or `'my-rule'`). |
VFileMessage
Message.
Inherited from: VFile.message
Defined in: node_modules/.pnpm/vfile@5.3.7/node_modules/vfile/lib/index.d.ts:174
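The three message methods differ in the `fatal` value they set: `true` for `fail()`, `false` for `message()`, and `null` for `info()`. A small sketch of how a consumer might map that flag to a severity label follows; `severityOf`, `hasFatal`, and the `Severity` type are hypothetical helpers, and treating `undefined` like `null` is an assumption.

```typescript
type Severity = 'error' | 'warning' | 'info'

/**
 * Hypothetical helper: maps the `fatal` flag of a message to a label.
 * true -> fail(), false -> message(), null -> info().
 * `undefined` is treated like `null` (an assumption).
 */
function severityOf(fatal: boolean | null | undefined): Severity {
  if (fatal === true) return 'error'
  if (fatal === false) return 'warning'
  return 'info'
}

/** Hypothetical helper: true when any message in the list is fatal. */
function hasFatal(
  messages: ReadonlyArray<{ fatal?: boolean | null }>,
): boolean {
  return messages.some((message) => message.fatal === true)
}
```

Checking something like `hasFatal(file.messages)` after processing is one way to decide whether the file is still processable.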
toString()
Serialize the file.
toString(encoding?: null | BufferEncoding): string;
| Name | Type | Description |
| :---------- | :----- | :--------------- |
| `encoding?` | `null` \| `BufferEncoding` | Character encoding to understand `value` as when it’s a `Buffer` (default: `'utf8'`). |
string
Serialized file.
Inherited from: VFile.toString
Defined in: node_modules/.pnpm/vfile@5.3.7/node_modules/vfile/lib/index.d.ts:157
Options
include?
`string[] | RegExp[] | ((key: string) => boolean) | "all" | "allWithDocumentXML"`
Include only the specified files on the `data` attribute of the VFile.
This may be useful if you want to only do something with a subset of the files in the docx file, and don't intend to use 'reoff-stringify' to turn the VFile back into a docx file.
`"allWithDocumentXML"` also includes `word/document.xml`, even though that is already the root of the VFile. Useful if you really want to mimic the original docx file.
You should keep it at the default value if you intend to use 'reoff-stringify' to turn the VFile back into a docx file.
Default: `'all'`
Defined in: src/lib/docx-to-vfile-unzipit.ts:30
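One plausible way to read the `include` option is as something that normalizes to a predicate over file names. The sketch below is not the package's implementation. It assumes that strings match file names exactly, that regular expressions are tested against the full name, and that `'all'` leaves out `word/document.xml` (since only `'allWithDocumentXML'` is described as including it). `Include` and `toPredicate` are hypothetical names.

```typescript
// Hypothetical alias mirroring the documented type of `include`.
type Include =
  | string[]
  | RegExp[]
  | ((key: string) => boolean)
  | 'all'
  | 'allWithDocumentXML'

const MAIN_DOCUMENT = 'word/document.xml'

/**
 * Hypothetical helper: turns an `include` value into a predicate.
 * Assumptions: strings match exactly, 'all' skips the main document.
 */
function toPredicate(include: Include = 'all'): (key: string) => boolean {
  if (include === 'allWithDocumentXML') return () => true
  if (include === 'all') return (key) => key !== MAIN_DOCUMENT
  if (typeof include === 'function') return include
  const patterns: ReadonlyArray<string | RegExp> = include
  return (key) =>
    patterns.some((pattern) =>
      typeof pattern === 'string' ? pattern === key : pattern.test(key),
    )
}
```

With the real API, the equivalent call would presumably look like `docxToVFile(input, { include: [/^word\//] })`, using only the option name documented above.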
withoutMedia?
boolean
Whether or not to include media in the VFile.
By default, images are included on the `data.media` attribute of the VFile as an object of ArrayBuffers, which are accessible both client-side and server-side.
Default: `false`
Defined in: src/lib/docx-to-vfile-unzipit.ts:16
XMLOrRelsString
`${string}.xml` | `${string}.rels`
Defined in: src/lib/docx-to-vfile-unzipit.ts:71
`docx-to-vfile` currently does not read macros, so it is not vulnerable to potential security issues with macros.
It does not, however, perform any other security checks, so maliciously crafted docx files could cause problems when, for example, parsed with `rehype`.
`reoff-parse`: Parse the output of `docx-to-vfile` into a `VFile` with an `ooxast` tree.
GPL-3.0-or-later © Thomas F. K. Jorna