Huge News!Announcing our $40M Series B led by Abstract Ventures.Learn More →

puppet-pson

Package Overview

Dependencies

Advanced tools

Install Socket

Detect and block malicious and high-risk dependencies

Install

puppet-pson

1.1.10
Rubygems

Version published: 2 years ago

Maintainers: 1

Created: 2 years ago

Source

PSON

PSON is a variant of JSON that puppet uses for serializing data to transmit across the network or store on disk. Whereas JSON requires that the serialized form is valid unicode (usually UTF-8), PSON is 8-bit ASCII, which allows it to represent arbitrary byte sequences in strings.

PSON was forked from upstream pure JSON v1.1.9 and patches were added to allow binary data.

Puppet uses the MIME types "pson" and "text/pson" to refer to PSON.

JSON Strings

A JSON string is encoded in Unicode. The string must start and end with " (ASCII 0x22). Between these characters, double quote, reverse solidus (backslash) and control characters (\u0000 - \u001F) must be escaped, all others may be escaped.

The escape process replaces each code point with "reverse solidus, followed by the lowercase letter u, followed by four hexadecimal digits that encode the character's code point." For example, the ASCII record separator character (\u001F) is serialized in JSON as:

0x5C 0x75 0x30 0x30 0x31 0x45

In addition, the double quote, reverse solidus and some printable control characters are commonly escaped using a shorter form, using a single reverse solidus:

| Byte | ASCII Character | Encoded Sequence | Encoded ASCII Sequence |
| ---- | --------------- | ---------------- | ---------------------- |
| 0x22 | "               | 0x5C, 0x22       | \"                     |
| 0x5c | \               | 0x5C, 0x5C       | \\                     |
| 0x08 | Backspace       | 0x5C, 0x62       | \b                     |
| 0x09 | Horizontal Tab  | 0x5C, 0x74       | \t                     |
| 0x0A | Line Feed       | 0x5C, 0x6E       | \n                     |
| 0x0C | Form Feed       | 0x5C, 0x66       | \f                     |
| 0x0D | Carriage Return | 0x5C, 0x72       | \r                     |

UTF-8/16/32 encodings define a set of rules for how code points are encoded as bytes. However, there are some byte sequences that are not a valid encoding for any code point. For example, in UTF-8, 0x80 is used to indicate a two byte sequence, so the following fails:

JSON.generate("\x80")
#=> JSON::GeneratorError (source sequence is illegal/malformed utf-8)

Differences from JSON

PSON does not differ from JSON in its representation of objects, arrays, numbers, booleans, and null values. PSON does serialize strings differently from JSON.

PSON shares the same encoding process as JSON, except that a PSON string is a sequence of 8-bit ASCII values. So the string containing 0x80 can be serialized unescaped as PSON:

PSON.generate("\x80").bytes.to_a.map { |b| b.to_s(16) }
#=> ["22", "80", "22"]

One other difference is that PSON may produce strings with 8-bit ASCII encoding, unlike JSON:

PSON.generate("\u20AC").encoding
#=> #<Encoding:ASCII-8BIT>
JSON.generate("\u20AC").encoding
#=> #<Encoding:UTF-8>

Finally, PSON does not roundtrip values that are not arrays or hashes in the same way as JSON:

JSON.parse(JSON.generate("\x1E"))
#=> "\u001E"
PSON.parse(PSON.generate("\x1E"))
#=> PSON::ParserError (source '"\u001e"' not in PSON!)

whereas PSON can roundtrip an array (or hash) containing that value:

PSON.parse(PSON.generate(["\x1E"]))
#=> ["\u001E"]

Decoding PSON Using JSON Parsers

Many languages have JSON parsers already, which can often be used to parse PSON data. Although JSON requires that it is encoded as unicode most parsers will produce usable output from PSON if they are instructed to interpret the input as Latin-1 encoding.

In all these examples there is a file available called data.pson that contains the ruby structure { "data" => "\x07\x08\xC3\xC3" } encoded as PSON (the value is an invalid unicode sequence). In bytes the data is:

0x7b 0x22 0x64 0x61 0x74 0x61 0x22 0x3a 0x22 0x5c 0x75 0x30 0x30 0x30 0x37 0x5c 0x62 0xc3 0xc3 0x22 0x7d

which represents:

 {    "    d    a    t    a    "   <sp>  "    \    u    0    0    0    7    \    b    0xc3 0xc3 "    }

Python Example:

>>> import json
>>> json.load(open("data.pson"), "latin_1")
{u'data': u'\x07\x08\xc3\xc3'}

Clojure Example:

user> (parse-string (slurp "data.pson" :encoding "ISO-8859-1"))
{"data" "^G\bÃÃ"}

Ruby Example:

irb> JSON.parse(File.read('data.pson', encoding: "ISO-8859-1"))
=> {"data"=>"\a\bÃÃ"}

FAQs

What is puppet-pson?

Is puppet-pson well maintained?

Package last updated on 21 Mar 2023

Did you know?

Socket for GitHub automatically highlights issues in each pull request and monitors the health of all your open source dependencies. Discover the contents of your packages and block harmful activity before you install or update your dependencies.

Install

puppet-pson

PSON

JSON Strings

Differences from JSON

Decoding PSON Using JSON Parsers

Related posts

Data Theft Repackaged: A Case Study in Malicious Wrapper Packages on npm

Malicious npm Package Typosquats Popular TypeScript ESLint Plugin, Exfiltrates Data and Enables Remote Exploitation