@nodable/entities
Standalone, zero-dependency XML/HTML entity replacement with:
- 5 entity categories processed in a fixed, predictable order
- Persistent vs. input entity separation — no state leaks between documents
- getInstance() — clean per-document reset without cloning
- Composable named entity groups (HTML, currency, math, arrows, numeric refs)
- Security limits — cap total expansions and expanded length per document
- Granular limit targeting — apply limits to any subset of categories
- postCheck hook — inspect or sanitize the fully resolved string
Installation
npm install @nodable/entities
Quick Start
import EntityReplacer from '@nodable/entities';
const replacer = new EntityReplacer({ default: true });
replacer.replace('5 &lt; 10 &amp;&amp; x &gt; 0'); // '5 < 10 && x > 0'
With named entity groups:
import EntityReplacer, { COMMON_HTML, CURRENCY_ENTITIES } from '@nodable/entities';
const replacer = new EntityReplacer({
default: true,
system: { ...COMMON_HTML, ...CURRENCY_ENTITIES },
});
replacer.replace('&copy; 2024 &mdash; Price: &pound;9.99'); // '© 2024 — Price: £9.99'
Entity Categories
Entities are processed in this fixed order — not configurable:
persistent external → input/runtime → system → default → amp
persistent external — Caller-supplied configuration entities
Entities set at configuration time that survive across all documents. Never wiped by getInstance(). Set via setExternalEntities() or addExternalEntity() / addEntity().
const replacer = new EntityReplacer({ default: true });
replacer.setExternalEntities({ brand: 'Acme Corp', product: 'Widget Pro' });
replacer.replace('&brand; makes &product;');
input / runtime — Per-document DOCTYPE entities
Entities injected by the parser from the document's DOCTYPE block. Stored separately from persistent entities and wiped on every getInstance() call so they cannot leak between documents.
Set via addInputEntities(). Never call this manually — BaseOutputBuilder calls it automatically.
system — Named entity groups
Opt-in. Trusted programmer-supplied groups. Compose freely:
import EntityReplacer, {
COMMON_HTML,
CURRENCY_ENTITIES,
MATH_ENTITIES,
ARROW_ENTITIES,
NUMERIC_ENTITIES,
} from '@nodable/entities';
const replacer = new EntityReplacer({
system: { ...COMMON_HTML, ...MATH_ENTITIES },
});
| Group | Entities |
| --- | --- |
| COMMON_HTML | © ® ™ — – … « » ‘ ’ “ ” • ¶ § ° ½ ¼ ¾ |
| CURRENCY_ENTITIES | ¢ £ ¥ € &inr; ¤ ƒ |
| MATH_ENTITIES | × ÷ ± − ² ³ ‰ ∞ ∑ ∏ √ ≠ ≤ ≥ |
| ARROW_ENTITIES | ← ↑ → ↓ ↔ ⇐ ⇑ ⇒ ⇓ ⇔ |
| NUMERIC_ENTITIES | &#NNN; decimal and &#xHH; hex refs — any valid Unicode code point |
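As a rough illustration of what decoding decimal and hex references involves, here is a minimal sketch. It is not the library's implementation: `decodeNumericRefs` is a hypothetical helper, and the choice to leave out-of-range references untouched is an assumption.

```javascript
// Hypothetical decoder for &#NNN; (decimal) and &#xHH; (hex) references.
function decodeNumericRefs(str) {
  return str.replace(/&#(?:x([0-9a-fA-F]+)|([0-9]+));/g, (match, hex, dec) => {
    const codePoint = hex !== undefined ? parseInt(hex, 16) : parseInt(dec, 10);
    // Assumption: leave the reference as written when it is outside the
    // Unicode range or falls in the surrogate block.
    const isSurrogate = codePoint >= 0xd800 && codePoint <= 0xdfff;
    if (codePoint > 0x10ffff || isSurrogate) return match;
    return String.fromCodePoint(codePoint);
  });
}

// Leading zeros are harmless because parseInt ignores them.
const decoded = decodeNumericRefs('&#65;&#x42;&#0067;'); // 'ABC'
```

`String.fromCodePoint` is used rather than `String.fromCharCode` so that code points above U+FFFF come out as a proper surrogate pair.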
default — Built-in XML entities
Always on unless explicitly disabled.
amp — Final pass
&amp; → &
Processed after all other categories to prevent double-expansion:
&amp;lt; → &lt; ✓ (not <)
&amp;amp; → &amp; ✓ (not &)
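To see why the ordering matters, the sketch below runs the same passes in two different orders. It is an illustration only: the `decode` helper and both tables are hypothetical stand-ins, not the package's internals.

```javascript
// Hypothetical pass tables: each entry is [pattern, replacement].
const XML_DEFAULTS = [
  [/&lt;/g, '<'],
  [/&gt;/g, '>'],
  [/&quot;/g, '"'],
  [/&apos;/g, "'"],
];
const AMP = [/&amp;/g, '&'];

// Apply each pass to the string in the order given.
function decode(str, passes) {
  return passes.reduce((out, [regex, val]) => out.replace(regex, val), str);
}

// amp last: '&amp;lt;' is decoded exactly once.
const ampLast = decode('&amp;lt;', [...XML_DEFAULTS, AMP]); // '&lt;'

// amp first: the '&' it produces is picked up again by the &lt; pass.
const ampFirst = decode('&amp;lt;', [AMP, ...XML_DEFAULTS]); // '<'
```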
Constructor API
const replacer = new EntityReplacer({
default: true,
amp: true,
system: false,
maxTotalExpansions: 0,
maxExpandedLength: 0,
applyLimitsTo: 'external',
postCheck: null,
});
EntityReplacer Instance Methods
replace(str)
Replace all entity references in str. Returns str unchanged (same reference) if no & is present — fast path.
replacer.replace('Tom &amp; Jerry &lt;cartoons&gt;'); // 'Tom & Jerry <cartoons>'
setExternalEntities(map)
Replace the full set of persistent external entities. These survive across all documents and are not cleared by getInstance().
replacer.setExternalEntities({ brand: 'Acme', year: '2025' });
Calling this a second time replaces the entire persistent map. Values containing & are silently skipped.
addExternalEntity(key, value)
Append a single persistent external entity without disturbing the rest.
replacer.addExternalEntity('brand', 'Acme');
replacer.addExternalEntity('year', '2025');
addInputEntities(map)
Inject input/runtime (DOCTYPE) entities for the current document. These are stored separately from persistent entities and wiped on the next getInstance() call. Also resets per-document expansion counters.
replacer.addInputEntities(doctypeEntityMap);
Values containing & are silently skipped. Accepts pre-built { regex, val } or { regx, val } objects as produced by DocTypeReader.
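The acceptance rules above can be sketched as a small normalizer. This is a guess at the shape of the logic, not the package's code: `normalizeInputEntities` and `escapeRegExp` are hypothetical names, and applying the & check to pre-built objects as well as plain strings is an assumption.

```javascript
// Escape characters that are special inside a regular expression.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical normalizer: turns a DOCTYPE entity map into { regex, val } entries.
function normalizeInputEntities(map) {
  const table = {};
  for (const [name, entry] of Object.entries(map)) {
    const isObject = typeof entry === 'object' && entry !== null;
    const val = isObject ? entry.val : entry;
    // Values that contain "&" are skipped without raising an error.
    if (typeof val !== 'string' || val.includes('&')) continue;
    // Accept either spelling of a pre-built pattern, otherwise build one.
    const regex = isObject
      ? entry.regex || entry.regx
      : new RegExp(`&${escapeRegExp(name)};`, 'g');
    if (regex instanceof RegExp) table[name] = { regex, val };
  }
  return table;
}

const normalized = normalizeInputEntities({
  version: '1.0',
  company: { regx: /&company;/g, val: 'Nodable' },
  nested: 'see &version;', // skipped: the value contains "&"
});
```

Dropping values that contain & means an entity can never expand into another reference, which is what keeps nested expansion out of the picture.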
getInstance()
Reset all per-document state and return this.
Clears:
- input/runtime entities (DOCTYPE)
- _totalExpansions counter
- _expandedLength counter
Preserves:
- persistent external entities set via setExternalEntities() / addExternalEntity()
- all constructor config
The builder factory calls this when creating a new builder instance, ensuring each document starts clean whether or not it has a DOCTYPE.
getInstance() {
const builder = new MyBuilder(this.config);
builder.entityParser = this.entityVP.getInstance();
return builder;
}
Document-to-Document Safety
A key design goal is that entities from one document never bleed into the next. Here's how the two categories work together:
Document 1 parse:
factory.getInstance() → evp.getInstance() [clears input, resets counters]
builder sees DOCTYPE → evp.addInputEntities({ version: '1.0' })
builder processes values → evp.parse('&brand; v&version;') → 'Acme v1.0'
Document 2 parse (no DOCTYPE):
factory.getInstance() → evp.getInstance() [clears &version;, resets counters]
no DOCTYPE → addInputEntities() not called
builder processes values → evp.parse('&brand; v&version;') → 'Acme v&version;'
(&brand; is persistent and still resolves; &version; was cleared, so it is left as written)
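The same two-document flow can be reproduced with a miniature model of the two stores. This is a sketch under stated assumptions: `TinyReplacer` is hypothetical, it resolves references in a single pass rather than the fixed category order, and it only mirrors the method names used in this README.

```javascript
// Hypothetical miniature of the persistent/input split; not the library's code.
class TinyReplacer {
  constructor() {
    this.persistent = new Map(); // survives across documents
    this.input = new Map();      // wiped on every getInstance()
  }
  setExternalEntities(map) {
    this.persistent = new Map(Object.entries(map));
  }
  addInputEntities(map) {
    this.input = new Map(Object.entries(map));
  }
  getInstance() {
    this.input.clear(); // per-document reset
    return this;
  }
  replace(str) {
    return str.replace(/&([\w.-]+);/g, (ref, name) => {
      if (this.persistent.has(name)) return this.persistent.get(name);
      if (this.input.has(name)) return this.input.get(name);
      return ref; // unknown reference: leave as written
    });
  }
}

const tiny = new TinyReplacer();
tiny.setExternalEntities({ brand: 'Acme' });

// Document 1 declares &version; in its DOCTYPE.
tiny.getInstance().addInputEntities({ version: '1.0' });
const doc1 = tiny.replace('&brand; v&version;'); // 'Acme v1.0'

// Document 2 has no DOCTYPE, so &version; is no longer defined.
const doc2 = tiny.getInstance().replace('&brand; v&version;'); // 'Acme v&version;'
```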
Security Controls
Expansion count limit
Caps the number of entity references that may be expanded per document.
const replacer = new EntityReplacer({ maxTotalExpansions: 1000 });
Throws Error if exceeded:
[EntityReplacer] Entity expansion count limit exceeded: 1001 > 1000
Expanded length limit
Caps the total number of characters added by entity expansion per document.
const replacer = new EntityReplacer({ maxExpandedLength: 65536 });
Throws Error if exceeded:
[EntityReplacer] Expanded content length limit exceeded: 65537 > 65536
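Both limits amount to per-document counters that are checked after every expansion. The sketch below shows one way this could work. `makeLimiter` is hypothetical, and two details are assumptions rather than documented behavior: that 0 means "no limit", and that "characters added" is the replacement length minus the length of the reference it replaces.

```javascript
// Hypothetical per-document limiter; error messages follow the ones shown above.
function makeLimiter({ maxTotalExpansions = 0, maxExpandedLength = 0 } = {}) {
  let totalExpansions = 0;
  let expandedLength = 0;
  return {
    // Call once per expanded reference, e.g. record('&brand;', 'Acme Corp').
    record(reference, value) {
      totalExpansions += 1;
      // Assumption: count only the characters added beyond the reference itself.
      expandedLength += Math.max(0, value.length - reference.length);
      if (maxTotalExpansions > 0 && totalExpansions > maxTotalExpansions) {
        throw new Error(
          `[EntityReplacer] Entity expansion count limit exceeded: ${totalExpansions} > ${maxTotalExpansions}`
        );
      }
      if (maxExpandedLength > 0 && expandedLength > maxExpandedLength) {
        throw new Error(
          `[EntityReplacer] Expanded content length limit exceeded: ${expandedLength} > ${maxExpandedLength}`
        );
      }
    },
    // Called at the start of each document.
    reset() {
      totalExpansions = 0;
      expandedLength = 0;
    },
  };
}

const limiter = makeLimiter({ maxTotalExpansions: 2 });
limiter.record('&a;', 'x');
limiter.record('&a;', 'x');
let limitMessage = null;
try {
  limiter.record('&a;', 'x'); // third expansion exceeds the cap of 2
} catch (err) {
  limitMessage = err.message;
}
```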
applyLimitsTo
Controls which entity categories count against the two limits above. The value is a single category name, 'all', or an array of category names:
applyLimitsTo: 'external'
applyLimitsTo: 'all'
applyLimitsTo: ['external', 'system']
applyLimitsTo: ['external', 'default']
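One way to handle the three accepted shapes is to normalize them into a set up front. The sketch below is hypothetical throughout: `limitedCategories` is not part of the package, and the list of names that 'all' expands to is a guess based on the category names used in the examples above.

```javascript
// Assumption: these are the category names that limits can target.
const LIMITABLE_CATEGORIES = ['external', 'system', 'default'];

// Hypothetical normalizer: string, 'all', or array becomes a Set of names.
function limitedCategories(applyLimitsTo = 'external') {
  if (applyLimitsTo === 'all') return new Set(LIMITABLE_CATEGORIES);
  const list = Array.isArray(applyLimitsTo) ? applyLimitsTo : [applyLimitsTo];
  return new Set(list);
}

// A replacement pass could then ask whether its category is counted.
const counted = limitedCategories(['external', 'system']);
const countsSystem = counted.has('system');   // true
const countsDefault = counted.has('default'); // false
```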
postCheck Hook
Fires once on the fully resolved string, after all categories have been processed. Not called if the string is unchanged (no & present or no matches found).
postCheck: (resolved: string, original: string) => string
- resolved: the string after all entity replacements
- original: the input string before any replacement
- Must return a string
- To reject the expansion, return original
- To sanitize, return a modified version of resolved
Examples:
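// Reject: fall back to the original text if the expansion produced tag-like markup
postCheck: (resolved, original) =>
  /<[a-z]/i.test(resolved) ? original : resolved
// Sanitize: strip anything that looks like a tag from the resolved text
postCheck: (resolved) =>
  resolved.replace(/<[^>]*>/g, '')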
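The calling convention described above can be sketched as a thin wrapper around the hook. This is illustrative only: `finish` is a hypothetical name, and throwing a TypeError on a non-string return is an assumption about how the "must return a string" rule might be enforced.

```javascript
// Hypothetical wrapper showing when the hook runs; not the library's code.
function finish(original, resolved, postCheck) {
  // Skip the hook when nothing was replaced or no hook is configured.
  if (resolved === original || typeof postCheck !== 'function') return resolved;
  const checked = postCheck(resolved, original);
  if (typeof checked !== 'string') {
    throw new TypeError('postCheck must return a string');
  }
  return checked;
}

// Same rejecting hook as in the first example above.
const rejectTags = (resolved, original) =>
  /<[a-z]/i.test(resolved) ? original : resolved;

const passed = finish('a &amp; b', 'a & b', rejectTags);      // 'a & b'
const blocked = finish('&lt;b&gt;hi', '<b>hi', rejectTags);   // '&lt;b&gt;hi'
```

Returning `original` keeps the entity references encoded, so downstream code never sees the markup they would have produced.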
EntitiesValueParser — flex-xml-parser adapter
EntitiesValueParser wraps EntityReplacer and implements the ValueParser interface used by @nodable/flexible-xml-parser.
Setup
import { EntitiesValueParser, COMMON_HTML } from '@nodable/entities';
const evp = new EntitiesValueParser({
system: COMMON_HTML,
maxTotalExpansions: 500,
});
evp.setExternalEntities({ brand: 'Acme', product: 'Widget' });
myBuilder.registerValueParser('entity', evp);
const parser = new XMLParser({ OutputBuilder: myBuilder });
parser.parse(xml);
Constructor options
All EntityReplacerOptions are accepted, plus one extra option, entities:
new EntitiesValueParser({
default: true,
system: COMMON_HTML,
maxTotalExpansions: 1000,
postCheck: (resolved, original) => resolved,
entities: { copy: '©', trade: '™', brand: 'Acme Corp' },
})
setExternalEntities(map)
Replace the full persistent entity map. These entities survive across all documents.
evp.setExternalEntities({ brand: 'Acme', copy: '©' });
addEntity(key, value)
Append a single persistent external entity. Previously registered entities are preserved.
evp.addEntity('copy', '©');
evp.addEntity('trade', '™');
evp.addEntity('year', '2024');
Throws if key contains & or ;, or if value contains &.
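The validation rule can be written down in a few lines. The sketch below is hypothetical (`assertValidEntity` and its error messages are made up for illustration); only the rule itself comes from the text above.

```javascript
// Hypothetical validator: rejects keys with "&" or ";" and values with "&".
function assertValidEntity(key, value) {
  if (/[&;]/.test(key)) {
    throw new Error(`Invalid entity key "${key}": must not contain & or ;`);
  }
  if (String(value).includes('&')) {
    throw new Error(`Invalid entity value for "${key}": must not contain &`);
  }
}

assertValidEntity('year', '2024'); // passes silently

let rejected = null;
try {
  assertValidEntity('loop', '&loop;&loop;'); // value refers to an entity
} catch (err) {
  rejected = err.message;
}
```

Refusing & in values means a registered entity cannot refer to another entity, which presumably is what rules out recursive expansion.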
getInstance() — called by builder factory
Reset per-document state (input entities + counters) and return this. The builder factory calls this each time it creates a new builder instance.
getInstance() {
const builder = new CompactObjBuilder(this._config);
builder.entityParser = this._entityVP.getInstance();
return builder;
}
addInputEntities(entities) — called automatically
Receives the DOCTYPE entity map from BaseOutputBuilder once per parse. Resets per-document expansion counters. Accepts both plain string values and { regx, val } objects from DocTypeReader.
parse(val, context?)
Implements the ValueParser interface. context is accepted but ignored. Returns non-string input unchanged.
Custom Entity Tables
Pass any plain object as default or system to replace the built-in set:
const myEntities = {
br: { regex: /&br;/g, val: '\n' },
tab: { regex: /&tab;/g, val: '\t' },
};
const replacer = new EntityReplacer({ default: myEntities });
replacer.replace('line1&br;line2&tab;indented');
Extend the built-in tables via spreading:
import EntityReplacer, { DEFAULT_XML_ENTITIES } from '@nodable/entities';
const replacer = new EntityReplacer({
default: { ...DEFAULT_XML_ENTITIES, br: { regex: /&br;/g, val: '\n' } },
});
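To make the { regex, val } shape concrete, here is a small sketch of how such a table could be applied. `applyTable` and `XML_DEFAULTS_SKETCH` are hypothetical; the latter is a stand-in for DEFAULT_XML_ENTITIES whose real contents are not reproduced here.

```javascript
// Hypothetical stand-in for the built-in table, using the same entry shape.
const XML_DEFAULTS_SKETCH = {
  lt: { regex: /&lt;/g, val: '<' },
  gt: { regex: /&gt;/g, val: '>' },
};

// Spreading copies the built-in entries, then adds (or overrides) custom ones.
const extended = {
  ...XML_DEFAULTS_SKETCH,
  br: { regex: /&br;/g, val: '\n' },
};

// Apply every entry of a table to a string, in insertion order.
function applyTable(str, table) {
  let out = str;
  for (const { regex, val } of Object.values(table)) {
    // The function form keeps "$" sequences in val from being interpreted.
    out = out.replace(regex, () => val);
  }
  return out;
}

const text = applyTable('a &lt; b&br;c', extended); // 'a < b\nc'
```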
Comparison with entities npm package
| Feature | entities | @nodable/entities |
| --- | --- | --- |
| XML entity decoding | ✅ | ✅ |
| HTML entity decoding | ✅ full ~2000 | ✅ grouped, composable |
| Numeric refs with leading zeros | ✅ | ✅ |
| DOCTYPE / external entity injection | ❌ | ✅ |
| Persistent vs. input entity separation | ❌ | ✅ |
| Per-document reset via getInstance() | ❌ | ✅ |
| Expansion count limit | ❌ | ✅ |
| Expanded length limit | ❌ | ✅ |
| applyLimitsTo granularity | ❌ | ✅ |
| postCheck hook | ❌ | ✅ |
| Encoding / HTML escaping | ✅ | ❌ out of scope |
| Zero dependencies | ✅ | ✅ |
TypeScript
Full TypeScript declarations are included via index.d.ts. No @types/ package needed.
import EntityReplacer, {
EntitiesValueParser,
COMMON_HTML,
EntityTable,
EntityReplacerOptions,
EntitiesValueParserOptions,
} from '@nodable/entities';
const opts: EntityReplacerOptions = {
default: true,
system: COMMON_HTML,
maxTotalExpansions: 500,
postCheck: (resolved, original) =>
/<script/i.test(resolved) ? original : resolved,
};
const replacer = new EntityReplacer(opts);
replacer.setExternalEntities({ brand: 'Acme' });
replacer.getInstance();
replacer.addInputEntities({ version: '1.0' });
const evpOpts: EntitiesValueParserOptions = {
system: COMMON_HTML,
entities: { brand: 'Acme' },
};
const evp = new EntitiesValueParser(evpOpts);
evp.addEntity('copy', '©');
evp.getInstance();
evp.addInputEntities({ company: 'Nodable' });
const result: string = evp.parse('&lt;&copy;&brand;');
License
MIT