@exodus/bytes
Uint8Array conversion to and from base64, base32, base58, hex, utf8, utf16, bech32 and wif
And a TextEncoder / TextDecoder polyfill
Strict
Performs proper input validation, ensures no garbage-in-garbage-out
Tested on Node.js, Deno, Bun, browsers (including Servo), Hermes, QuickJS and barebone engines in CI
Fast
10-20x faster than Buffer polyfill
2-10x faster than iconv-lite
The figures above are for the JS fallback
It's up to 100x when a native implementation is available
e.g. in utf8fromString on Hermes / React Native or fromHex in Chrome
Also:
3-8x faster than bs58
10-30x faster than @scure/base (or >100x on Node.js <25)
Faster in utf8toString / utf8fromString than Buffer or TextDecoder / TextEncoder on Node.js
See Performance for more info
TextEncoder / TextDecoder polyfill
import { TextDecoder, TextEncoder } from '@exodus/bytes/encoding.js'
Less than half the bundle size of text-encoding, whatwg-encoding or iconv-lite (gzipped or not), and is much faster.
See also lite version.
Spec compliant, passing WPT and covered with extra tests.
Moreover, tests for this library uncovered bugs in all major implementations.
Faster than Node.js native implementation on Node.js.
Runs (and passes WPT) on Node.js built without ICU.
Caveat: TextDecoder / TextEncoder APIs are lossy by default per spec.
These are only provided as a compatibility layer; prefer hardened APIs in new code.
- TextDecoder can (and should) be used with the { fatal: true } option for all purposes demanding correctness / lossless transforms.
- TextEncoder does not support a fatal mode per spec, it always performs replacement.
That is not suitable for hashing, cryptography or consensus applications.
Otherwise there would be non-equal strings with equal signatures and hashes: the collision is caused by the lossy transform of a JS string to bytes.
Such non-well-formed strings also survive e.g. JSON.stringify/JSON.parse or being sent over the network.
Use strict APIs in new applications, see utf8fromString / utf16fromString below.
Those throw on non-well-formed strings by default.
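The collision described above can be reproduced with the platform's built-in TextEncoder / TextDecoder, which follow the same spec. The following is a minimal sketch that uses only standard globals and none of this library's APIs; the variable names are illustrative.

```javascript
// Two different JS strings, each ending in a different lone surrogate.
const a = 'id\ud800'
const b = 'id\udfff'

// TextEncoder replaces any lone surrogate with U+FFFD (bytes EF BF BD),
// so both strings produce identical bytes.
const encoder = new TextEncoder()
const bytesA = encoder.encode(a)
const bytesB = encoder.encode(b)
const sameBytes = bytesA.length === bytesB.length && bytesA.every((x, i) => x === bytesB[i])

// The non-well-formed string survives a JSON round trip unchanged.
const roundTripped = JSON.parse(JSON.stringify(a))

// Default TextDecoder replaces invalid bytes; { fatal: true } throws instead.
const invalid = Uint8Array.of(0x69, 0x64, 0xff)
const lossy = new TextDecoder().decode(invalid)
let strictError = null
try {
  new TextDecoder('utf-8', { fatal: true }).decode(invalid)
} catch (err) {
  strictError = err
}

console.log(a === b) // false
console.log(sameBytes) // true
console.log(roundTripped === a) // true
console.log(lossy === 'id\ufffd') // true
console.log(strictError instanceof TypeError) // true
```

Anything that hashes or signs the encoded bytes would treat a and b as the same message, which is why a throwing conversion is preferable in those contexts.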
Lite version
If you don't need support for legacy multi-byte encodings, you can use the lite import:
import { TextDecoder, TextEncoder } from '@exodus/bytes/encoding-lite.js'
This reduces the bundle size 10x:
from 90 KiB gzipped for @exodus/bytes/encoding.js to 9 KiB gzipped for @exodus/bytes/encoding-lite.js.
(For comparison, text-encoding module is 190 KiB gzipped, and iconv-lite is 194 KiB gzipped).
It still supports utf-8, utf-16le, utf-16be and all single-byte encodings specified by the spec;
the only difference is the lack of support for legacy multi-byte encodings.
See the list of encodings.
API
@exodus/bytes/utf8.js
import { utf8fromString, utf8toString } from '@exodus/bytes/utf8.js'
import { utf8fromStringLoose, utf8toStringLoose } from '@exodus/bytes/utf8.js'
utf8fromString(str, format = 'uint8')
utf8fromStringLoose(str, format = 'uint8')
utf8toString(arr)
utf8toStringLoose(arr)
@exodus/bytes/utf16.js
import { utf16fromString, utf16toString } from '@exodus/bytes/utf16.js'
import { utf16fromStringLoose, utf16toStringLoose } from '@exodus/bytes/utf16.js'
utf16fromString(str, format = 'uint16')
utf16fromStringLoose(str, format = 'uint16')
utf16toString(arr, 'uint16')
utf16toStringLoose(arr, 'uint16')
@exodus/bytes/single-byte.js
import { createSinglebyteDecoder } from '@exodus/bytes/single-byte.js'
import { windows1252toString } from '@exodus/bytes/single-byte.js'
Decode the legacy single-byte encodings according to the Encoding standard (§9 and §14.5).
Supports all single-byte encodings listed in the standard:
ibm866, iso-8859-2, iso-8859-3, iso-8859-4, iso-8859-5, iso-8859-6, iso-8859-7, iso-8859-8,
iso-8859-8-i, iso-8859-10, iso-8859-13, iso-8859-14, iso-8859-15, iso-8859-16, koi8-r, koi8-u,
macintosh, windows-874, windows-1250, windows-1251, windows-1252, windows-1253, windows-1254,
windows-1255, windows-1256, windows-1257, windows-1258, x-mac-cyrillic and x-user-defined.
createSinglebyteDecoder(encoding, loose = false)
Create a decoder for a supported one-byte encoding, given its lowercased name as encoding.
Returns a function decode(arr) that decodes bytes to a string.
windows1252toString(arr)
Decode windows-1252 bytes to a string.
Also supports ascii and latin-1 as those are strict subsets of windows-1252.
There is no loose variant for this encoding, all bytes can be decoded.
Same as:
const windows1252toString = createSinglebyteDecoder('windows-1252')
@exodus/bytes/multi-byte.js
import { createMultibyteDecoder } from '@exodus/bytes/multi-byte.js'
Decode the legacy multi-byte encodings according to the Encoding standard (§10, §11, §12, §13).
Supports all legacy multi-byte encodings listed in the standard:
gbk, gb18030, big5, euc-jp, iso-2022-jp, shift_jis, euc-kr.
createMultibyteDecoder(encoding, loose = false)
Create a decoder for a supported legacy multi-byte encoding, given its lowercased name as encoding.
Returns a function decode(arr, stream = false) that decodes bytes to a string.
The returned function keeps internal state between calls while stream = true is used.
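Why a streaming decoder needs state can be illustrated with the standard TextDecoder and UTF-8, where one character is split across two chunks. This is only an analogy: it uses the built-in { stream: true } option rather than the decode(arr, stream) signature of this module, and UTF-8 rather than a legacy multi-byte encoding.

```javascript
// The euro sign is three bytes in UTF-8: E2 82 AC. Split it across two chunks.
const chunk1 = Uint8Array.of(0xe2, 0x82)
const chunk2 = Uint8Array.of(0xac)

// Without streaming, each chunk is decoded in isolation and the
// incomplete sequences are replaced with U+FFFD.
const stateless = new TextDecoder().decode(chunk1) + new TextDecoder().decode(chunk2)

// With streaming, the decoder remembers the pending bytes between calls.
const decoder = new TextDecoder()
const streamed = decoder.decode(chunk1, { stream: true }) + decoder.decode(chunk2)

console.log(streamed === '\u20ac') // true
console.log(stateless.includes('\ufffd')) // true
```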
@exodus/bytes/bigint.js
import { fromBigInt, toBigInt } from '@exodus/bytes/bigint.js'
fromBigInt(bigint, { length, format = 'uint8' })
toBigInt(arr)
@exodus/bytes/hex.js
import { fromHex, toHex } from '@exodus/bytes/hex.js'
fromHex(string)
toHex(arr)
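To illustrate what strict input validation means for hex, the sketch below contrasts the lenient behaviour of Node.js Buffer with a throwing decoder. strictFromHex is a hypothetical helper written for this example only; it is not the library's implementation, and details such as whether uppercase digits are accepted are assumptions.

```javascript
// Node.js Buffer silently stops at the first invalid hex pair.
const lenient = Buffer.from('abzz', 'hex')

// Hypothetical strict decoder (illustration only, not the library's code).
function strictFromHex(str) {
  if (typeof str !== 'string') throw new TypeError('Input is not a string')
  if (str.length % 2 !== 0 || !/^[0-9a-fA-F]*$/.test(str)) {
    throw new SyntaxError('Input is not a hex string')
  }
  const out = new Uint8Array(str.length / 2)
  for (let i = 0; i < out.length; i++) {
    out[i] = parseInt(str.slice(i * 2, i * 2 + 2), 16)
  }
  return out
}

let rejected = false
try {
  strictFromHex('abzz')
} catch (err) {
  rejected = err instanceof SyntaxError
}

console.log(lenient.length) // 1 (only 0xab was decoded, 'zz' was dropped)
console.log(rejected) // true
```

The lenient path returns a shorter result without any signal that input was discarded, which is the garbage-in-garbage-out case a strict API is meant to rule out.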
@exodus/bytes/base64.js
import { fromBase64, toBase64 } from '@exodus/bytes/base64.js'
import { fromBase64url, toBase64url } from '@exodus/bytes/base64.js'
import { fromBase64any } from '@exodus/bytes/base64.js'
fromBase64(str, { format = 'uint8', padding = 'both' })
fromBase64url(str, { format = 'uint8', padding = false })
fromBase64any(str, { format = 'uint8', padding = 'both' })
toBase64(arr, { padding = true })
toBase64url(arr, { padding = false })
@exodus/bytes/base32.js
import { fromBase32, toBase32 } from '@exodus/bytes/base32.js'
import { fromBase32hex, toBase32hex } from '@exodus/bytes/base32.js'
fromBase32(str, { format = 'uint8', padding = 'both' })
fromBase32hex(str, { format = 'uint8', padding = 'both' })
toBase32(arr, { padding = false })
toBase32hex(arr, { padding = false })
@exodus/bytes/bech32.js
import { fromBech32, toBech32 } from '@exodus/bytes/bech32.js'
import { fromBech32m, toBech32m } from '@exodus/bytes/bech32.js'
import { getPrefix } from '@exodus/bytes/bech32.js'
getPrefix(str, limit = 90)
fromBech32(str, limit = 90)
toBech32(prefix, bytes, limit = 90)
fromBech32m(str, limit = 90)
toBech32m(prefix, bytes, limit = 90)
@exodus/bytes/base58.js
import { fromBase58, toBase58 } from '@exodus/bytes/base58.js'
import { fromBase58xrp, toBase58xrp } from '@exodus/bytes/base58.js'
fromBase58(str, format = 'uint8')
toBase58(arr)
fromBase58xrp(str, format = 'uint8')
toBase58xrp(arr)
@exodus/bytes/base58check.js
import { fromBase58check, toBase58check } from '@exodus/bytes/base58check.js'
import { fromBase58checkSync, toBase58checkSync } from '@exodus/bytes/base58check.js'
import { makeBase58check } from '@exodus/bytes/base58check.js'
On non-Node.js, requires peer dependency @exodus/crypto to be installed.
async fromBase58check(str, format = 'uint8')
async toBase58check(arr)
fromBase58checkSync(str, format = 'uint8')
toBase58checkSync(arr)
makeBase58check(hashAlgo, hashAlgoSync)
@exodus/bytes/wif.js
import { fromWifString, toWifString } from '@exodus/bytes/wif.js'
import { fromWifStringSync, toWifStringSync } from '@exodus/bytes/wif.js'
On non-Node.js, requires peer dependency @exodus/crypto to be installed.
async fromWifString(string, version)
fromWifStringSync(string, version)
async toWifString({ version, privateKey, compressed })
toWifStringSync({ version, privateKey, compressed })
@exodus/bytes/encoding.js
import { TextDecoder, TextEncoder } from '@exodus/bytes/encoding.js'
import { getBOMEncoding, legacyHookDecode, labelToName, normalizeEncoding } from '@exodus/bytes/encoding.js'
Implements the Encoding standard: TextDecoder, TextEncoder, some hooks (see below).
new TextDecoder(label = 'utf-8', { fatal = false, ignoreBOM = false })
TextDecoder implementation/polyfill.
new TextEncoder()
TextEncoder implementation/polyfill.
labelToName(label)
Implements the spec's "get an encoding from a string label" algorithm.
Converts an encoding label to its name, as a case-sensitive string.
If an encoding with that label does not exist, returns null.
All encoding names are also valid labels for corresponding encodings.
normalizeEncoding(label)
Converts an encoding label to its name, as an ASCII-lowercased string.
If an encoding with that label does not exist, returns null.
This is the same as decoder.encoding getter,
except that it:
It is identical to:
labelToName(label)?.toLowerCase() ?? null
All encoding names are also valid labels for corresponding encodings.
getBOMEncoding(input)
Implements the BOM sniff legacy hook.
Given a TypedArray or an ArrayBuffer instance input, returns one of:
- 'utf-8', if input starts with the UTF-8 byte order mark.
- 'utf-16le', if input starts with the UTF-16LE byte order mark.
- 'utf-16be', if input starts with the UTF-16BE byte order mark.
- null otherwise.
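The checks involved can be sketched in a few lines. sniffBOM below is a hypothetical stand-in that follows the byte order marks defined by the Encoding standard; it is not the library's implementation of getBOMEncoding.

```javascript
// Hypothetical helper illustrating the BOM sniff steps (not the library's code).
function sniffBOM(input) {
  // Accept an ArrayBuffer or any TypedArray and view it as raw bytes.
  const bytes = input instanceof ArrayBuffer
    ? new Uint8Array(input)
    : new Uint8Array(input.buffer, input.byteOffset, input.byteLength)

  // UTF-8 BOM: EF BB BF
  if (bytes.length >= 3 && bytes[0] === 0xef && bytes[1] === 0xbb && bytes[2] === 0xbf) return 'utf-8'
  // UTF-16BE BOM: FE FF
  if (bytes.length >= 2 && bytes[0] === 0xfe && bytes[1] === 0xff) return 'utf-16be'
  // UTF-16LE BOM: FF FE
  if (bytes.length >= 2 && bytes[0] === 0xff && bytes[1] === 0xfe) return 'utf-16le'
  return null
}

console.log(sniffBOM(Uint8Array.of(0xef, 0xbb, 0xbf, 0x68))) // 'utf-8'
console.log(sniffBOM(Uint8Array.of(0xff, 0xfe, 0x68, 0x00))) // 'utf-16le'
console.log(sniffBOM(Uint8Array.of(0x68, 0x69))) // null
```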
legacyHookDecode(input, fallbackEncoding = 'utf-8')
Implements the decode legacy hook.
Given a TypedArray or an ArrayBuffer instance input and an optional fallbackEncoding encoding label,
sniffs the encoding from the BOM, falling back to fallbackEncoding, and then
decodes the input using that encoding, skipping the BOM if it was present.
Notes:
- BOM-sniffed encoding takes precedence over the fallbackEncoding option per spec. Use with care.
- Always operates in non-fatal mode, aka replacement. It can convert different byte sequences to equal strings.
This method is similar to the following code, except that it doesn't support encoding labels and only expects a lowercased encoding name:
new TextDecoder(getBOMEncoding(input) ?? fallbackEncoding).decode(input)
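The note about different byte sequences decoding to equal strings can be observed with the standard TextDecoder in its default non-fatal mode. This is a small sketch with illustrative inputs, using the built-in class rather than legacyHookDecode itself.

```javascript
// Default (non-fatal) UTF-8 decoding, with a fresh decoder per call.
const decode = (bytes) => new TextDecoder('utf-8').decode(bytes)

// A leading UTF-8 BOM is skipped, so these two inputs decode to the same string.
const withBOM = Uint8Array.of(0xef, 0xbb, 0xbf, 0x68, 0x69)
const withoutBOM = Uint8Array.of(0x68, 0x69)

// Invalid bytes are replaced with U+FFFD, so distinct invalid inputs collide too.
const invalidA = Uint8Array.of(0x68, 0xff)
const invalidB = Uint8Array.of(0x68, 0xfe)

console.log(decode(withBOM) === decode(withoutBOM)) // true
console.log(decode(invalidA) === decode(invalidB)) // true
```

Neither pair of inputs can be told apart after decoding, so the result should not be used to identify or authenticate the original bytes.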
@exodus/bytes/encoding-lite.js
import { TextDecoder, TextEncoder } from '@exodus/bytes/encoding-lite.js'
import { getBOMEncoding, legacyHookDecode, labelToName, normalizeEncoding } from '@exodus/bytes/encoding-lite.js'
The exact same exports as @exodus/bytes/encoding.js are also exported as
@exodus/bytes/encoding-lite.js, with the difference that the lite version does not load
multi-byte TextDecoder encodings by default to reduce bundle size 10x.
The only affected encodings are: gbk, gb18030, big5, euc-jp, iso-2022-jp, shift_jis, euc-kr
and their labels when used with TextDecoder.
Legacy single-byte encodings are loaded by default in both cases.
TextEncoder and hooks for standards (including labelToName / normalizeEncoding) do not have any behavior
differences in the lite version and support the full range of inputs.
To avoid inconsistencies, the exported classes and methods are exactly the same objects.
> lite = require('@exodus/bytes/encoding-lite.js')
[Module: null prototype] {
TextDecoder: [class TextDecoder],
TextEncoder: [class TextEncoder],
getBOMEncoding: [Function: getBOMEncoding],
labelToName: [Function: labelToName],
legacyHookDecode: [Function: legacyHookDecode],
normalizeEncoding: [Function: normalizeEncoding]
}
> new lite.TextDecoder('big5').decode(Uint8Array.of(0x25))
Uncaught:
Error: Legacy multi-byte encodings are disabled in /encoding-lite.js, use /encoding.js for full encodings range support
> full = require('@exodus/bytes/encoding.js')
[Module: null prototype] {
TextDecoder: [class TextDecoder],
TextEncoder: [class TextEncoder],
getBOMEncoding: [Function: getBOMEncoding],
labelToName: [Function: labelToName],
legacyHookDecode: [Function: legacyHookDecode],
normalizeEncoding: [Function: normalizeEncoding]
}
> full.TextDecoder === lite.TextDecoder
true
> new full.TextDecoder('big5').decode(Uint8Array.of(0x25))
'%'
> new lite.TextDecoder('big5').decode(Uint8Array.of(0x25))
'%'
License
MIT