@solana/codecs-strings

Codecs for strings of different sizes and encodings

2.0.0-preview.3.20240710221308.98f0c2d018230b4a6928bbf614b72ea82d7d17ac
Source
npm

Version published: 2 months ago

Weekly downloads: 122K (down 14.21%)

Maintainers: 14

Created: 12 months ago

This package contains codecs for strings of different sizes and encodings. It can be used standalone, but it is also exported as part of the Solana JavaScript SDK @solana/web3.js@experimental.

This package is also part of the @solana/codecs package which acts as an entry point for all codec packages as well as for their documentation.

Sizing string codecs

The @solana/codecs-strings package offers a variety of string codecs such as utf8, base58, and base64, which are discussed in more detail below. Before digging into the available string codecs, however, it's important to understand the different sizing strategies available for them.

By default, all available string codecs will return a VariableSizeCodec<string> meaning that:

When encoding a string, all bytes necessary to encode the string will be used.
When decoding a byte array at a given offset, all bytes starting from that offset will be decoded as a string.

For instance, here's how you can encode/decode utf8 strings without any size boundary:

import { getUtf8Codec } from '@solana/codecs-strings';

const codec = getUtf8Codec();

codec.encode('hello');
// 0x68656c6c6f
//   └-- Any bytes necessary to encode our content.

codec.decode(new Uint8Array([0x68, 0x65, 0x6c, 0x6c, 0x6f]));
// 'hello'
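To make the second point concrete, here is a minimal sketch of what "decode everything from the offset" means. It uses only the standard TextDecoder, and decodeUtf8FromOffset is a hypothetical helper written for illustration, not a function exported by this package.

```typescript
// Hypothetical helper (not part of @solana/codecs-strings): a variable-size
// UTF-8 decode that reads from `offset` to the end of the buffer.
function decodeUtf8FromOffset(bytes: Uint8Array, offset = 0): string {
    return new TextDecoder().decode(bytes.slice(offset));
}

// A one-byte header (0x01), then 'hello', then two unrelated bytes ('!!').
const buffer = new Uint8Array([0x01, 0x68, 0x65, 0x6c, 0x6c, 0x6f, 0x21, 0x21]);

// Without a size boundary, the unrelated trailing bytes are decoded as well.
const decoded = decodeUtf8FromOffset(buffer, 1); // 'hello!!'
```

This is why an unbounded string codec is usually only safe as the last field of a data structure.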

This might be what you want (for example, when a string sits at the end of a data structure), but in many cases you will want a size boundary for your string. You can achieve this by composing your string codec with the fixCodecSize or addCodecSizePrefix functions.

The fixCodecSize function accepts a fixed byte length and returns a FixedSizeCodec<string> that will always use that number of bytes to encode and decode a string. Any string longer or shorter than that size will be truncated or padded respectively. Here's how you can use it with a utf8 codec:

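import { fixCodecSize } from '@solana/codecs-core';
import { getUtf8Codec } from '@solana/codecs-strings';

const codec = fixCodecSize(getUtf8Codec(), 5);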

codec.encode('hello');
// 0x68656c6c6f
//   └-- The exact 5 bytes of content.

codec.encode('hello world');
// 0x68656c6c6f
//   └-- The truncated 5 bytes of content.

codec.encode('hell');
// 0x68656c6c00
//   └-- The padded 5 bytes of content.

codec.decode(new Uint8Array([0x68, 0x65, 0x6c, 0x6c, 0x6f, 0xff, 0xff, 0xff, 0xff]));
// 'hello'
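The truncate-or-pad behaviour can be sketched with standard APIs only. The helpers below (encodeFixedUtf8, decodeFixedUtf8, toHex) are hypothetical names used for illustration and are not the library's implementation; in particular, stripping trailing zero bytes on decode is an assumption of this sketch.

```typescript
// Hypothetical helpers for illustration; not the library's implementation.
function encodeFixedUtf8(value: string, size: number): Uint8Array {
    const content = new TextEncoder().encode(value);
    const out = new Uint8Array(size); // starts zero-filled, which provides the padding
    out.set(content.slice(0, size)); // copies at most `size` bytes, truncating the rest
    return out;
}

function decodeFixedUtf8(bytes: Uint8Array, size: number, offset = 0): string {
    // Only the `size` bytes after `offset` belong to the string.
    const chunk = bytes.slice(offset, offset + size);
    // Assumption of this sketch: trailing zero bytes are padding and are dropped.
    let end = chunk.length;
    while (end > 0 && chunk[end - 1] === 0) end--;
    return new TextDecoder().decode(chunk.slice(0, end));
}

const toHex = (bytes: Uint8Array): string =>
    Array.from(bytes, b => b.toString(16).padStart(2, '0')).join('');

const exact = toHex(encodeFixedUtf8('hello', 5)); // '68656c6c6f'
const truncated = toHex(encodeFixedUtf8('hello world', 5)); // '68656c6c6f'
const padded = toHex(encodeFixedUtf8('hell', 5)); // '68656c6c00'
const decodedFixed = decodeFixedUtf8(
    new Uint8Array([0x68, 0x65, 0x6c, 0x6c, 0x6f, 0xff, 0xff, 0xff, 0xff]),
    5,
); // 'hello'
```

Note that truncation happens at the byte level, so a fixed size that cuts through a multi-byte UTF-8 character leaves an incomplete sequence at the end of the buffer.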

The addCodecSizePrefix function accepts an additional number codec that will be used to encode and decode a size prefix for the string. This prefix allows us to know when to stop reading the string when decoding a given byte array. Here's how you can use it with a utf8 codec:

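import { addCodecSizePrefix } from '@solana/codecs-core';
import { getU32Codec } from '@solana/codecs-numbers';
import { getUtf8Codec } from '@solana/codecs-strings';

const codec = addCodecSizePrefix(getUtf8Codec(), getU32Codec());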

codec.encode('hello');
// 0x0500000068656c6c6f
//   |       └-- The 5 bytes of content.
//   └-- 4-byte prefix telling us to read 5 bytes.

codec.decode(new Uint8Array([0x05, 0x00, 0x00, 0x00, 0x68, 0x65, 0x6c, 0x6c, 0x6f, 0xff, 0xff, 0xff, 0xff]));
// "hello"
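As a rough sketch of the size-prefix layout, the following uses a DataView to write and read a 4-byte little-endian length before the UTF-8 content, matching the bytes shown in the example. The helper names encodePrefixedUtf8 and decodePrefixedUtf8 are hypothetical and only illustrate the idea.

```typescript
// Hypothetical helpers for illustration; not the library's implementation.
function encodePrefixedUtf8(value: string): Uint8Array {
    const content = new TextEncoder().encode(value);
    const out = new Uint8Array(4 + content.length);
    // 4-byte little-endian prefix holding the byte length of the content.
    new DataView(out.buffer).setUint32(0, content.length, true);
    out.set(content, 4);
    return out;
}

// Returns the decoded string and the offset of the first byte after it.
function decodePrefixedUtf8(bytes: Uint8Array, offset = 0): [string, number] {
    const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
    const size = view.getUint32(offset, true);
    const start = offset + 4;
    const value = new TextDecoder().decode(bytes.slice(start, start + size));
    return [value, start + size];
}

const prefixed = encodePrefixedUtf8('hello');
// prefixed holds 05 00 00 00 68 65 6c 6c 6f

const [text, nextOffset] = decodePrefixedUtf8(
    new Uint8Array([0x05, 0x00, 0x00, 0x00, 0x68, 0x65, 0x6c, 0x6c, 0x6f, 0xff, 0xff, 0xff, 0xff]),
);
// text is 'hello' and nextOffset is 9: the trailing 0xff bytes are left untouched.
```

Returning the next offset is what lets a caller keep decoding whatever follows the string in the same buffer.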

Now, let's take a look at the available string encodings. Just remember that you can use the fixCodecSize or addCodecSizePrefix functions on any of these encodings to add a size boundary to them.

Utf8 codec

The getUtf8Codec function encodes and decodes a UTF-8 string to and from a byte array.

const bytes = getUtf8Codec().encode('hello'); // 0x68656c6c6f
const value = getUtf8Codec().decode(bytes); // "hello"

As usual, separate getUtf8Encoder and getUtf8Decoder functions are also available.

const bytes = getUtf8Encoder().encode('hello'); // 0x68656c6c6f
const value = getUtf8Decoder().decode(bytes); // "hello"

Base 64 codec

The getBase64Codec function encodes and decodes a base-64 string to and from a byte array.

const bytes = getBase64Codec().encode('hello+world'); // 0x85e965a3ec28ae57
const value = getBase64Codec().decode(bytes); // "hello+world"

As usual, separate getBase64Encoder and getBase64Decoder functions are also available.

const bytes = getBase64Encoder().encode('hello+world'); // 0x85e965a3ec28ae57
const value = getBase64Decoder().decode(bytes); // "hello+world"

Base 58 codec

The getBase58Codec function encodes and decodes a base-58 string to and from a byte array.

const bytes = getBase58Codec().encode('heLLo'); // 0x1b6a3070
const value = getBase58Codec().decode(bytes); // "heLLo"

As usual, separate getBase58Encoder and getBase58Decoder functions are also available.

const bytes = getBase58Encoder().encode('heLLo'); // 0x1b6a3070
const value = getBase58Decoder().decode(bytes); // "heLLo"

Base 16 codec

The getBase16Codec function encodes and decodes a base-16 string to and from a byte array.

const bytes = getBase16Codec().encode('deadface'); // 0xdeadface
const value = getBase16Codec().decode(bytes); // "deadface"

As usual, separate getBase16Encoder and getBase16Decoder functions are also available.

const bytes = getBase16Encoder().encode('deadface'); // 0xdeadface
const value = getBase16Decoder().decode(bytes); // "deadface"

Base 10 codec

The getBase10Codec function encodes and decodes a base-10 string to and from a byte array.

const bytes = getBase10Codec().encode('1024'); // 0x0400
const value = getBase10Codec().decode(bytes); // "1024"

As usual, separate getBase10Encoder and getBase10Decoder functions are also available.

const bytes = getBase10Encoder().encode('1024'); // 0x0400
const value = getBase10Decoder().decode(bytes); // "1024"

Base X codec

The getBaseXCodec function accepts a custom alphabet of X characters and creates a base-X codec using that alphabet. It does so by iteratively dividing by X and handling leading zeros.

The base-10 and base-58 codecs use this base-X codec under the hood.

const alphabet = '0ehlo';
const bytes = getBaseXCodec(alphabet).encode('hello'); // 0x05bd
const value = getBaseXCodec(alphabet).decode(bytes); // "hello"

As usual, separate getBaseXEncoder and getBaseXDecoder functions are also available.

const bytes = getBaseXEncoder(alphabet).encode('hello'); // 0x05bd
const value = getBaseXDecoder(alphabet).decode(bytes); // "hello"
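The "iteratively dividing by X and handling leading zeros" strategy can be sketched as a conversion between base-X digits and base-256 digits. The version below is a simplified illustration under stated assumptions: baseXToBytes and bytesToBaseX are hypothetical names, and the sketch uses plain digit arrays, which may differ from how the library performs the arithmetic.

```typescript
// Hypothetical helpers for illustration; the library's implementation may differ.
function baseXToBytes(alphabet: string, value: string): Uint8Array {
    const base = alphabet.length;
    const chars = Array.from(value);
    const zeroChar = alphabet.charAt(0);

    // Each leading "zero" character maps to one leading 0x00 byte.
    let leadingZeros = 0;
    while (leadingZeros < chars.length && chars[leadingZeros] === zeroChar) leadingZeros++;

    // Little-endian base-256 digits of the remaining number.
    const digits: number[] = [];
    for (const char of chars.slice(leadingZeros)) {
        let carry = alphabet.indexOf(char);
        if (carry < 0) throw new Error(`Character "${char}" is not in the alphabet`);
        // digits = digits * base + carry
        for (let i = 0; i < digits.length; i++) {
            const current = (digits[i] ?? 0) * base + carry;
            digits[i] = current % 256;
            carry = Math.floor(current / 256);
        }
        while (carry > 0) {
            digits.push(carry % 256);
            carry = Math.floor(carry / 256);
        }
    }
    const zeros: number[] = new Array(leadingZeros).fill(0);
    return Uint8Array.from(zeros.concat(digits.reverse()));
}

function bytesToBaseX(alphabet: string, bytes: Uint8Array): string {
    const base = alphabet.length;

    // Each leading 0x00 byte maps back to one leading "zero" character.
    let leadingZeros = 0;
    while (leadingZeros < bytes.length && bytes[leadingZeros] === 0) leadingZeros++;

    // Little-endian base-X digits of the remaining number.
    const digits: number[] = [];
    for (let b = leadingZeros; b < bytes.length; b++) {
        let carry = bytes[b] ?? 0;
        // digits = digits * 256 + carry
        for (let i = 0; i < digits.length; i++) {
            const current = (digits[i] ?? 0) * 256 + carry;
            digits[i] = current % base;
            carry = Math.floor(current / base);
        }
        while (carry > 0) {
            digits.push(carry % base);
            carry = Math.floor(carry / base);
        }
    }
    const tail = digits.reverse().map(d => alphabet.charAt(d)).join('');
    return alphabet.charAt(0).repeat(leadingZeros) + tail;
}

const sketchBytes = baseXToBytes('0ehlo', 'hello'); // bytes 0x05 0xbd
const sketchValue = bytesToBaseX('0ehlo', sketchBytes); // 'hello'
```

With the alphabet '0ehlo', the string 'hello' is the base-5 number 2-1-3-3-4, which is 1469 in decimal, or 0x05bd.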

Re-slicing base X codec

The getBaseXResliceCodec function also creates a base-X codec but uses a different strategy. It re-slices bytes into custom chunks of bits that are then mapped to a provided alphabet. The number of bits per chunk is also provided and should typically be set to log2(alphabet.length).

This is typically used to create codecs whose alphabet’s length is a power of 2 such as base-16 or base-64.

const bytes = getBaseXResliceCodec('elho', 2).encode('hellolol'); // 0x85dd
const value = getBaseXResliceCodec('elho', 2).decode(bytes); // "hellolol"

As usual, separate getBaseXResliceEncoder and getBaseXResliceDecoder functions are also available.

const bytes = getBaseXResliceEncoder('elho', 2).encode('hellolol'); // 0x85dd
const value = getBaseXResliceDecoder('elho', 2).decode(bytes); // "hellolol"
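The re-slicing strategy can be illustrated with a small bit accumulator that regroups chunks of one width into chunks of another. This is a sketch under assumptions: reslice, resliceEncode and resliceDecode are hypothetical names, and the sketch packs bits most-significant first, which is the convention base-16 and base-64 use.

```typescript
// Hypothetical helpers for illustration; the library's implementation may differ.
function reslice(input: number[], inputBits: number, outputBits: number, useRemainder: boolean): number[] {
    const output: number[] = [];
    const mask = (1 << outputBits) - 1;
    let accumulator = 0;
    let bitsInAccumulator = 0;
    for (const value of input) {
        // Append the new chunk to the low end of the accumulator.
        accumulator = (accumulator << inputBits) | value;
        bitsInAccumulator += inputBits;
        // Emit as many full output chunks as are available, highest bits first.
        while (bitsInAccumulator >= outputBits) {
            bitsInAccumulator -= outputBits;
            output.push((accumulator >> bitsInAccumulator) & mask);
        }
        accumulator &= (1 << bitsInAccumulator) - 1; // keep only the unread bits
    }
    if (useRemainder && bitsInAccumulator > 0) {
        // Left-align leftover bits in a final chunk, padding with zeros.
        output.push((accumulator << (outputBits - bitsInAccumulator)) & mask);
    }
    return output;
}

// String to bytes: map characters to alphabet indices, then regroup into bytes.
function resliceEncode(alphabet: string, bits: number, value: string): Uint8Array {
    const indices = Array.from(value, c => alphabet.indexOf(c));
    return Uint8Array.from(reslice(indices, bits, 8, false));
}

// Bytes to string: regroup bytes into `bits`-wide chunks, then map to characters.
function resliceDecode(alphabet: string, bits: number, bytes: Uint8Array): string {
    return reslice(Array.from(bytes), 8, bits, true)
        .map(i => alphabet.charAt(i))
        .join('');
}

const hexAlphabet = '0123456789abcdef';
const resliced = resliceEncode(hexAlphabet, 4, 'deadface'); // bytes 0xde 0xad 0xfa 0xce
const roundTrip = resliceDecode(hexAlphabet, 4, resliced); // 'deadface'
```

With a 16-character alphabet and 4 bits per chunk, every byte maps to exactly two characters, which is why this strategy suits alphabets whose length is a power of 2.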

To read more about the available codecs and how to use them, check out the documentation of the main @solana/codecs package.

Package last updated on 10 Jul 2024
