mdast-util-to-nlcst
mdast utility to transform to nlcst.
This package is a utility that takes an mdast (markdown) syntax tree as input and turns it into nlcst (natural language).
This project is useful when you want to deal with ASTs and inspect the natural language inside markdown. Unfortunately, there is no way yet to apply changes to the nlcst back into mdast.
The hast utility hast-util-to-nlcst does the same but uses an HTML tree as input.
The remark plugin remark-retext wraps this utility to do the same at a higher-level (easier) abstraction.
This package is ESM only. In Node.js (version 16+), install with npm:
npm install mdast-util-to-nlcst
In Deno with esm.sh:
import {toNlcst} from 'https://esm.sh/mdast-util-to-nlcst@7'
In browsers with esm.sh:
<script type="module">
import {toNlcst} from 'https://esm.sh/mdast-util-to-nlcst@7?bundle'
</script>
Say we have the following example.md:
Some *foo*sball.
…and next to it a module example.js:
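import {fromMarkdown} from 'mdast-util-from-markdown'
import {toNlcst} from 'mdast-util-to-nlcst'
import {ParseEnglish} from 'parse-english'
import {read} from 'to-vfile'
import {inspect} from 'unist-util-inspect'

// Read the markdown into a virtual file; `toNlcst` needs it alongside the tree.
const file = await read('example.md')
// Parse the markdown into an mdast tree.
const mdast = fromMarkdown(file)
// Turn the mdast tree into an nlcst tree, using the English parser.
const nlcst = toNlcst(mdast, file, ParseEnglish)

console.log(inspect(nlcst))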
Yields:
RootNode[1] (1:1-1:17, 0-16)
└─0 ParagraphNode[1] (1:1-1:17, 0-16)
    └─0 SentenceNode[4] (1:1-1:17, 0-16)
        ├─0 WordNode[1] (1:1-1:5, 0-4)
        │   └─0 TextNode "Some" (1:1-1:5, 0-4)
        ├─1 WhiteSpaceNode " " (1:5-1:6, 4-5)
        ├─2 WordNode[2] (1:7-1:16, 6-15)
        │   ├─0 TextNode "foo" (1:7-1:10, 6-9)
        │   └─1 TextNode "sball" (1:11-1:16, 10-15)
        └─3 PunctuationNode "." (1:16-1:17, 15-16)
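The offsets in the tree point back into the original markdown, which is why the emphasis markers around "foo" are skipped (offsets 5 and 9 appear in no node). As a minimal sketch of reading such a tree, the sentence below is copied by hand from the output above (with positions reduced to offsets), and the helpers `leaves` and `words` are hypothetical names for this illustration, not part of the package:

```javascript
// The content of example.md.
const markdown = 'Some *foo*sball.'

// Small helper to build a position that only carries offsets.
const at = (start, end) => ({start: {offset: start}, end: {offset: end}})

// Hand-copied from the `SentenceNode` printed above.
const sentence = {
  type: 'SentenceNode',
  children: [
    {type: 'WordNode', children: [{type: 'TextNode', value: 'Some', position: at(0, 4)}]},
    {type: 'WhiteSpaceNode', value: ' ', position: at(4, 5)},
    {
      type: 'WordNode',
      children: [
        {type: 'TextNode', value: 'foo', position: at(6, 9)},
        {type: 'TextNode', value: 'sball', position: at(10, 15)}
      ]
    },
    {type: 'PunctuationNode', value: '.', position: at(15, 16)}
  ]
}

// Collect all nodes without children, in document order.
function leaves(node) {
  return node.children ? node.children.flatMap(leaves) : [node]
}

// Collect each word as one string, joining the text nodes inside it.
function words(node) {
  if (node.type === 'WordNode') {
    return [leaves(node).map((leaf) => leaf.value).join('')]
  }
  return node.children ? node.children.flatMap(words) : []
}

// Slice the markdown with each leaf's offsets.
const sliced = leaves(sentence).map((leaf) =>
  markdown.slice(leaf.position.start.offset, leaf.position.end.offset)
)

console.log(words(sentence))
console.log(sliced)
```

Each slice matches the `value` of its node, so a word that spans markup ("foosball") still maps back to exact places in the source.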
This package exports the identifier toNlcst.
There is no default export.
toNlcst(tree, file, Parser[, options])
Turn an mdast tree into an nlcst tree.
👉 Note: tree must have positional info and file must be a VFile corresponding to tree.
Parameters:
tree (MdastNode): mdast tree to transform
file (VFile): virtual file
Parser (ParserConstructor or ParserInstance): parser to use
options (Options, optional): configuration
Returns:
nlcst tree (NlcstNode).
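Because tree must carry positional info, it can help to check a tree before passing it on, for example when it was built by hand rather than parsed from a file. The following is a rough sketch only: `isPoint` and `hasPositions` are hypothetical helpers, and the assumption that every node needs a numeric line and column at both ends is this example's, not a statement of what the package checks internally.

```javascript
// Hypothetical guard, not part of mdast-util-to-nlcst.
// A point is assumed to need a numeric `line` and `column`.
function isPoint(point) {
  return Boolean(point) && typeof point.line === 'number' && typeof point.column === 'number'
}

// Check that a node and all of its descendants have a start and end point.
function hasPositions(node) {
  const position = node.position
  if (!position || !isPoint(position.start) || !isPoint(position.end)) {
    return false
  }
  return (node.children || []).every(hasPositions)
}

// A parsed-looking tree: every node has a position.
const parsed = {
  type: 'root',
  position: {start: {line: 1, column: 1}, end: {line: 1, column: 5}},
  children: [
    {
      type: 'paragraph',
      position: {start: {line: 1, column: 1}, end: {line: 1, column: 5}},
      children: [
        {
          type: 'text',
          value: 'Some',
          position: {start: {line: 1, column: 1}, end: {line: 1, column: 5}}
        }
      ]
    }
  ]
}

// A hand-built tree: no positions at all.
const generated = {
  type: 'root',
  children: [{type: 'paragraph', children: [{type: 'text', value: 'Some'}]}]
}

console.log(hasPositions(parsed))
console.log(hasPositions(generated))
```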
Options
Configuration (TypeScript type).
ignore
List of mdast node types to ignore (Array<string>, optional).
The types 'table', 'tableRow', and 'tableCell' are always ignored.
Say we have the following file example.md:
A paragraph.

> A paragraph in a block quote.
…and if we now transform with ignore: ['blockquote'], we get:
RootNode[2] (1:1-3:1, 0-14)
├─0 ParagraphNode[1] (1:1-1:13, 0-12)
│   └─0 SentenceNode[4] (1:1-1:13, 0-12)
│       ├─0 WordNode[1] (1:1-1:2, 0-1)
│       │   └─0 TextNode "A" (1:1-1:2, 0-1)
│       ├─1 WhiteSpaceNode " " (1:2-1:3, 1-2)
│       ├─2 WordNode[1] (1:3-1:12, 2-11)
│       │   └─0 TextNode "paragraph" (1:3-1:12, 2-11)
│       └─3 PunctuationNode "." (1:12-1:13, 11-12)
└─1 WhiteSpaceNode "\n\n" (1:13-3:1, 12-14)
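Conceptually, ignore drops a node together with everything inside it, which is why the paragraph inside the block quote is missing from the output above. A minimal sketch of that idea on a hand-written mdast tree follows; `keptTypes` is a hypothetical helper for illustration, and the package's actual traversal may differ.

```javascript
// The types the README says are always ignored.
const alwaysIgnored = ['table', 'tableRow', 'tableCell']

// Hypothetical helper: list the types of all nodes that survive `ignore`,
// in document order. An ignored node takes its whole subtree with it.
function keptTypes(node, ignore = []) {
  const skip = new Set([...alwaysIgnored, ...ignore])
  if (skip.has(node.type)) {
    return []
  }
  const children = node.children || []
  return [node.type, ...children.flatMap((child) => keptTypes(child, ignore))]
}

// Hand-written mdast tree for the example.md above (positions left out).
const mdast = {
  type: 'root',
  children: [
    {type: 'paragraph', children: [{type: 'text', value: 'A paragraph.'}]},
    {
      type: 'blockquote',
      children: [
        {
          type: 'paragraph',
          children: [{type: 'text', value: 'A paragraph in a block quote.'}]
        }
      ]
    }
  ]
}

console.log(keptTypes(mdast))
console.log(keptTypes(mdast, ['blockquote']))
```

With `['blockquote']`, only the root, the first paragraph, and its text remain.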
source
List of mdast node types to mark as nlcst source nodes (Array<string>, optional).
The type 'inlineCode' is always marked as source.
Say we have the following file example.md:
A paragraph.

> A paragraph in a block quote.
…and if we now transform with source: ['blockquote'], we get:
RootNode[3] (1:1-3:32, 0-45)
├─0 ParagraphNode[1] (1:1-1:13, 0-12)
│   └─0 SentenceNode[4] (1:1-1:13, 0-12)
│       ├─0 WordNode[1] (1:1-1:2, 0-1)
│       │   └─0 TextNode "A" (1:1-1:2, 0-1)
│       ├─1 WhiteSpaceNode " " (1:2-1:3, 1-2)
│       ├─2 WordNode[1] (1:3-1:12, 2-11)
│       │   └─0 TextNode "paragraph" (1:3-1:12, 2-11)
│       └─3 PunctuationNode "." (1:12-1:13, 11-12)
├─1 WhiteSpaceNode "\n\n" (1:13-3:1, 12-14)
└─2 ParagraphNode[1] (3:1-3:32, 14-45)
    └─0 SentenceNode[1] (3:1-3:32, 14-45)
        └─0 SourceNode "> A paragraph in a block quote." (3:1-3:32, 14-45)
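The SourceNode above holds the raw markdown of the block quote, including the `>` marker, rather than parsed words. One way to picture this is as a slice of the file taken at the node's offsets. The sketch below assumes exactly that; `toSourceNode` is a hypothetical helper and the block quote node is written by hand with the offsets from the tree above.

```javascript
// The content of example.md.
const markdown = 'A paragraph.\n\n> A paragraph in a block quote.'

// The type the README says is always marked as source.
const alwaysSource = ['inlineCode']

// Hand-written mdast node with the position shown in the tree above.
const blockquote = {
  type: 'blockquote',
  position: {
    start: {line: 3, column: 1, offset: 14},
    end: {line: 3, column: 32, offset: 45}
  }
}

// Hypothetical helper: if the node's type is marked as source, return an
// nlcst-like source node whose value is the raw text at the node's offsets.
// Returns undefined for any other type.
function toSourceNode(node, doc, source = []) {
  if (![...alwaysSource, ...source].includes(node.type)) {
    return undefined
  }
  return {
    type: 'SourceNode',
    value: doc.slice(node.position.start.offset, node.position.end.offset),
    position: node.position
  }
}

console.log(toSourceNode(blockquote, markdown, ['blockquote']))
console.log(toSourceNode(blockquote, markdown))
```

Marking a type as source keeps its text in the tree while shielding it from the natural language parser.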
ParserConstructor
Create a new parser (TypeScript type).
type ParserConstructor = new () => ParserInstance
ParserInstance
nlcst parser (TypeScript type).
For example, parse-dutch, parse-english, or parse-latin.
type ParserInstance = {
  tokenizeSentencePlugins: ((node: NlcstSentence) => undefined)[]
  tokenizeParagraphPlugins: ((node: NlcstParagraph) => undefined)[]
  tokenizeRootPlugins: ((node: NlcstRoot) => undefined)[]
  parse(value: string | null | undefined): NlcstRoot
  tokenize(value: string | null | undefined): Array<NlcstSentenceContent>
}
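To make the two types concrete, here is a toy class with the fields and methods listed above, plus one possible way to accept either a constructor or an instance. Everything in it is an assumption for illustration: `ToyParser` and `resolveParser` are hypothetical names, the whitespace-splitting tokenizer is far cruder than parse-english or parse-latin, and the sketch only shows the shape of the interface.

```javascript
// Hypothetical parser with the shape of `ParserInstance`.
class ToyParser {
  constructor() {
    // Plugin lists from the type; left empty in this sketch.
    this.tokenizeSentencePlugins = []
    this.tokenizeParagraphPlugins = []
    this.tokenizeRootPlugins = []
  }

  // Split on runs of whitespace, keeping the whitespace as its own node.
  tokenize(value) {
    const parts = String(value || '').split(/(\s+)/).filter(Boolean)
    return parts.map((part) =>
      /^\s+$/.test(part)
        ? {type: 'WhiteSpaceNode', value: part}
        : {type: 'WordNode', children: [{type: 'TextNode', value: part}]}
    )
  }

  // Wrap all tokens in a single sentence, paragraph, and root.
  parse(value) {
    const sentence = {type: 'SentenceNode', children: this.tokenize(value)}
    const paragraph = {type: 'ParagraphNode', children: [sentence]}
    return {type: 'RootNode', children: [paragraph]}
  }
}

// Hypothetical helper: classes are functions, so construct those and pass
// instances through untouched.
function resolveParser(Parser) {
  return typeof Parser === 'function' ? new Parser() : Parser
}

const fromConstructor = resolveParser(ToyParser)
const instance = new ToyParser()
const fromInstance = resolveParser(instance)

console.log(fromConstructor.tokenize('Some text').map((node) => node.type))
console.log(fromInstance === instance)
```

A class such as `ToyParser` matches `ParserConstructor` (`new () => ParserInstance`), while `new ToyParser()` matches `ParserInstance`.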
This package is fully typed with TypeScript.
It exports the types Options, ParserConstructor, and ParserInstance.
Projects maintained by the unified collective are compatible with maintained versions of Node.js.
When we cut a new major release, we drop support for unmaintained versions of Node.
This means we try to keep the current release line, mdast-util-to-nlcst@^7, compatible with Node.js 16.
Use of mdast-util-to-nlcst does not involve hast, so there are no openings for cross-site scripting (XSS) attacks.
mdast-util-to-hast: transform mdast to hast
hast-util-to-nlcst: transform hast to nlcst
hast-util-to-mdast: transform hast to mdast
hast-util-to-xast: transform hast to xast
hast-util-sanitize: sanitize hast nodes
See contributing.md in syntax-tree/.github for ways to get started.
See support.md for ways to get help.
This project has a code of conduct. By interacting with this repository, organization, or community you agree to abide by its terms.