gfm-escape
...the only escaper passing backtranslation tests.
GfmEscape is an enterprise-grade library for transforming untagged plain text to CommonMark and GitHub Flavored Markdown (GFM).
There are neat and configurable markup converters like Turndown, which even allows transforming any markup that can be converted to HTML first.
While conversion of inline and block constructs is well covered, little attention is paid to transforming the text content itself. This is especially tricky with non-delimited "extended" autolinks, which make escaping heavily context-dependent.
In short:
GfmEscape addresses these issues without a significant performance penalty, as it is based on UnionReplacer. See below for more details.
In browsers:
<script src="https://unpkg.com/gfm-escape/dist/gfm-escape.umd.js"></script>
Using npm:
npm install gfm-escape
In Node.js:
const GfmEscape = require('gfm-escape');
escaper = new GfmEscape(escapingOptions[, syntax])
newStr = escaper.escape(str[, gfmContext[, metadata]])
A created GfmEscape instance is intended to be reused and shared in your code.
escapingOptions: option object defining how to perform escaping. Its keys correspond to individual replaces. When a replace option is set to any truthy value, suboption defaults are applied and can be overridden by the passed suboptions. A single option object can be reused for instantiating escapers for different syntaxes; some options simply become irrelevant.
The current full options are:
{
  strikethrough: { // default false
    optimizeForDoubleTilde: false,
  },
  extAutolink: { // default false
    breakUrl: false,
    breakWww: false,
    breaker: '<!-- -->',
    allowedTransformations: [ 'entities', 'commonmark' ],
    allowAddHttpScheme: false,
  },
  table: true, // default false
  emphasisNonDelimiters: { // default true
    maxIntrawordUnderscoreRun: false,
  },
}
See below for more details.
syntax: suggests the syntax the escaper is built for. The predefined syntaxes are available as members of GfmEscape.Syntax:
  text: normal text, the default.
  linkDestination: text rendered [sometext](here).
  cmAutolink: text rendered <here>. Please note that a valid CommonMark autolink must contain a URI scheme, which cannot be addressed by the escaper. When deciding whether a CommonMark autolink is an appropriate construct to use, we suggest using the isEncodable(str) and wouldBeUnaltered(str) methods on the Syntax.cmAutolink object.
  codeSpan: text rendered `here`.
When escaping, gfmContext is extra contextual information to be considered. The contexts have no defaults, i.e. they are falsy by default. The following contexts are available:
{
  inLink: true, // indicates suppressing nested links
  inTable: true, // indicates extra escaping of table contents
}
When escaping, metadata is an extra input-output parameter that collects metadata about the actual escaping. Currently, metadata is used for the codeSpan syntax, where two output parameters, delimiter and space, are passed:
const escaper = new GfmEscape({ table: true }, GfmEscape.Syntax.codeSpan);
const x = {};
const context = { inTable: true };
const output = escaper.escape('`array|string`', context, x);
console.log(`${x.delimiter}${x.space}${output}${x.space}${x.delimiter}`);
// `` `array\|string` ``
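The delimiter and space values reflect CommonMark's code span rules: the delimiter must be a backtick run of a length that does not occur in the content, and a padding space is needed when the content starts or ends with a backtick. The following is a minimal sketch of that idea, assuming a hypothetical helper named codeSpanFence. It is not part of gfm-escape, and the library may derive these values differently.

```javascript
// Hypothetical helper (not part of gfm-escape): derive a code span delimiter
// and padding for arbitrary content, following CommonMark code span rules.
function codeSpanFence(content) {
  // Collect the lengths of all backtick runs found in the content.
  const runLengths = new Set((content.match(/`+/g) || []).map((run) => run.length));
  // Pick the shortest delimiter whose length matches none of those runs.
  let length = 1;
  while (runLengths.has(length)) length += 1;
  const delimiter = '`'.repeat(length);
  // Padding keeps a leading or trailing backtick from merging with the delimiter.
  const space = content.startsWith('`') || content.endsWith('`') ? ' ' : '';
  return { delimiter, space };
}

const { delimiter, space } = codeSpanFence('`array\\|string`');
console.log(`${delimiter}${space}\`array\\|string\`${space}${delimiter}`);
// `` `array\|string` ``
```

For the already escaped content from the example above, the sketch yields a two-backtick delimiter and a single padding space, which matches the output shown there.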
strikethrough
Defaults to false, i.e. '~' is not special and it is not escaped.
Suboptions:
optimizeForDoubleTilde: only sequences of two tildes are escaped. Default false.
extAutolink
Defaults to false, i.e. autolinks are not detected and do not form a special case for escaping.
Suboptions:
breakUrl: if a string capable of forming an extended URL autolink is encountered, it is broken to prevent that. E.g. https://orchi.tech becomes https://<!-- -->orchi.tech. Default false.
breakWww: if a string capable of forming an extended www autolink is encountered, it is broken to prevent that. E.g. www.orchi.tech becomes www.<!-- -->orchi.tech. Default false.
breaker: a sequence used to break extended autolinks, used both for breaking and terminating. Default <!-- -->. Please note that some Markdown renderers like Redcarpet do not support HTML comments; tag sequences like <span></span> or an artificial <nolink> can be used instead.
allowedTransformations: array of transformations that are allowed if an extended autolink-like string needs to be transformed to retain the expected target and text. The order indicates priority. Defaults to ['entities', 'commonmark']. Available transformations are:
  'keep': always the most preferred, no reason to set it explicitly.
  'entities': entity name references are used to escape trailing characters. E.g. *http://orchi.tech,* becomes \*http://orchi.tech,&ast;.
  'commonmark': a CommonMark autolink is used to delimit the actual link part. E.g. *http://orchi.tech,* becomes \*<http://orchi.tech>,\*.
  'breakup': the autolink-like string is broken, so that it is not interpreted as an autolink. E.g. *https://orchi.tech,* becomes \*https://<!-- -->orchi.tech,\*.
  'breakafter': the autolink-like string is terminated after the actual link part. E.g. *https://orchi.tech,* becomes \*https://orchi.tech<!-- -->,\*. This transformation is the default fallback, no reason to set it explicitly.
allowAddHttpScheme: add the http:// scheme when a transformation needs it to work. E.g. *www.orchi.tech,* would become \*<http://www.orchi.tech>,\* with the commonmark transformation.
How to choose the options:
'breakafter' might be a better option in some situations.
'breakUrl', as users may still expect www links to be autolinked in the plain text.
emphasisNonDelimiters
Defaults to true, i.e. intraword emphasis delimiters are not escaped if it is safe not to escape them. E.g. in My account is joe_average., the underscore stays unescaped as joe_average, not joe\_average.
Suboptions:
maxIntrawordUnderscoreRun: if defined, it sets the maximum length of an intraword underscore run to be kept as is. E.g. for 1 and input joe_average or joe__average, the output would be joe_average or joe\_\_average. This is helpful for some renderers like Redcarpet. Defaults to undefined.
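The effect of maxIntrawordUnderscoreRun can be approximated with a single replacement pass. The sketch below uses a hypothetical helper, escapeIntrawordUnderscores, which is not part of gfm-escape. It assumes that "intraword" means "surrounded by ASCII letters or digits", and it deliberately ignores leading and trailing underscores, which a real escaper has to handle separately.

```javascript
// Hypothetical helper (not part of gfm-escape): keep intraword underscore
// runs up to maxRun characters long and backslash-escape longer ones.
// Assumption: "intraword" means surrounded by ASCII letters or digits.
function escapeIntrawordUnderscores(text, maxRun) {
  return text.replace(/(?<=[A-Za-z0-9])_+(?=[A-Za-z0-9])/g, (run) =>
    (run.length <= maxRun ? run : run.replace(/_/g, '\\_')));
}

console.log(escapeIntrawordUnderscores('joe_average or joe__average', 1));
// joe_average or joe\_\_average
```

With a limit of 1, the single underscore is kept and the double underscore is escaped, as in the example above.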
table
Defaults to false, i.e. table pipes are not escaped. If enabled, rendering of table delimiter rows is suppressed by escaping their pipes, and all pipes are escaped when in table context.
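The two behaviors described for the table option can be illustrated with a small sketch. The helper escapeTablePipes and the DELIMITER_ROW pattern are hypothetical and not part of gfm-escape. The pattern is a simplified approximation of a GFM delimiter row, and the sketch does not account for pipes that are already escaped.

```javascript
// Hypothetical sketch (not part of gfm-escape).
// Simplified approximation of a GFM table delimiter row such as "---|:-:".
const DELIMITER_ROW = /^ {0,3}\|?\s*:?-+:?\s*(\|\s*:?-+:?\s*)*\|?\s*$/;

function escapeTablePipes(text, inTable) {
  return text
    .split('\n')
    .map((line) => {
      // Outside a table, only lines that could start a table are touched.
      const looksLikeDelimiterRow = line.includes('|') && DELIMITER_ROW.test(line);
      return inTable || looksLikeDelimiterRow ? line.replace(/\|/g, '\\|') : line;
    })
    .join('\n');
}

console.log(escapeTablePipes('array|string', true));
// array\|string
console.log(escapeTablePipes('array|string', false));
// array|string
```

Outside a table, an ordinary pipe is harmless, so only lines that look like a delimiter row are escaped. Inside a table cell, every pipe would end the cell, so all of them are escaped.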
Terminology:
Specs:
Reference implementations examined:
While cmark-gfm is somewhat of a reference implementation of the GFM Spec, we have found a few interesting details...
cmark_gfm-001: Contrary to the GFM spec stating "All such recognized autolinks can only come at the beginning of a line, after whitespace, or any of the delimiting characters *, _, ~, and (", it seems this applies just to extended www autolinks in cmark-gfm. E.g. .https://orchi.tech is recognized as an autolink by this library. We follow this.
cmark_gfm-002: Contrary to the GFM spec, extended autolinks in cmark-gfm do not treat [\v\f] as space, while CM autolinks do. We follow this.
cmark_gfm-003: cmark-gfm considers < as valid for autolink detection and trims the resulting link afterwards. So https://or_chi.tech.< leads to autolinking of https://or_chi.tech, although this wouldn't form an autolink without the trailing <. We follow this, but non-explicit extended autolink transformations would break the autolink detection, which is probably good. E.g. with the default settings, https://or_chi.tech.< leads to https://or_chi.tech.&lt; (wouldn't be detected as an extended autolink by cmark-gfm), while https://or_chi.tech.<~ leads to <https://or_chi.tech>\<\~ (forced CM autolink).
cmark_gfm-004: GFM spec says "If an autolink ends in a semicolon (;), we check to see if it appears to resemble an entity reference; if the preceding text is & followed by one or more alphanumeric characters. If so, it is excluded from the autolink..." Alphabetic references cmark-gfm
cmark_gfm-005: Backslash escape in link destination, e.g. [foo](http://orchi.tech/foo\_bar) does not prevent an entity reference from being interpreted in rendered HTML. We use entity encoding instead, i.e. &amp;.