sast-json-schema

Meta-schema for the Static Application Security Testing (SAST) of JSON Schemas

Version 0.4.1 (latest) on npm · 1 maintainer · 1.3K weekly downloads (-31.76%)


High-level functionality

  • Ensure strictness of interpretation.
  • Ensure integers and numbers are within a safe range.
  • Ensure strings have defined allowed values and maxLength.
  • Ensure arrays have defined items and maxItems.
  • Ensure objects have defined properties, and maxProperties when needed.
  • Ensure patterns follow safe RegExp usage.
  • Ensure $id and $ref values resolve safely.
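
To make the goals above concrete, here is a minimal sketch of a checker that walks a schema and reports nodes missing an upper bound. The helper name findUnboundedNodes and its three rules are assumptions for illustration only; the real meta-schema applies many more checks and may word its findings differently.

```javascript
// Hypothetical helper (not part of sast-json-schema): reports subschemas
// that lack the upper bounds described in the list above.
function findUnboundedNodes(schema, path = "") {
  const findings = [];
  if (schema === null || typeof schema !== "object") return findings;

  if (schema.type === "string" && !("maxLength" in schema)) {
    findings.push(`${path}: string without maxLength`);
  }
  if (schema.type === "array" && !("maxItems" in schema)) {
    findings.push(`${path}: array without maxItems`);
  }
  if (schema.type === "integer" && !("minimum" in schema && "maximum" in schema)) {
    findings.push(`${path}: integer without minimum and maximum`);
  }

  // Recurse into the two most common containers only; a full crawler
  // would also visit $defs, allOf, prefixItems, and so on.
  for (const [key, sub] of Object.entries(schema.properties ?? {})) {
    findings.push(...findUnboundedNodes(sub, `${path}/properties/${key}`));
  }
  if (schema.items) {
    findings.push(...findUnboundedNodes(schema.items, `${path}/items`));
  }
  return findings;
}

const exampleSchema = {
  type: "object",
  additionalProperties: false,
  maxProperties: 3,
  properties: {
    name: { type: "string", maxLength: 64 },
    tags: { type: "array", items: { type: "string" } },
    age: { type: "integer", minimum: 0, maximum: 150 },
  },
};

const findings = findUnboundedNodes(exampleSchema);
console.log(findings);
```

Here only tags and its items are reported, because name and age already carry explicit bounds.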

Installation

Requires Node.js >=24.

npm install sast-json-schema

How to run

Manually

import Ajv from "ajv/dist/2020.js"
import sastSchema from "sast-json-schema" with { type: "json" }
import schema from "path/to/schema.json" with { type: "json" }

// Your schema should compile under strictTypes:true.
const userAjv = new Ajv({ strictTypes: true })
if (!userAjv.validateSchema(schema)) {
  console.error(userAjv.errors)
}

// The meta-schema itself uses strictTypes:false because it validates
// subschemas that may legally be `false` (boolean-schema form).
const sastAjv = new Ajv({ strictTypes: false })
const isSchemaSecure = sastAjv.compile(sastSchema)
if (!isSchemaSecure(schema)) {
  console.error(isSchemaSecure.errors)
}

Per-draft entry points are also exported: sast-json-schema/2020-12, /2019-09, /draft-07, /draft-06, /draft-04. Each meta-schema is identified by a urn:willfarrell:sast-json-schema:<spec> URN. Shared primitives (safePattern, safeUrl, etc.) are available via sast-json-schema/$defs.
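
When schemas from several drafts live in one repository, the entry point can be chosen from each schema's $schema value. The sketch below is an assumption about how you might wire that up: entryPointFor is a hypothetical helper, and only the module specifiers come from the paragraph above.

```javascript
// Official $schema identifiers (trailing "#" removed) mapped to the
// per-draft entry points named in this README.
const ENTRY_POINTS = new Map([
  ["https://json-schema.org/draft/2020-12/schema", "sast-json-schema/2020-12"],
  ["https://json-schema.org/draft/2019-09/schema", "sast-json-schema/2019-09"],
  ["http://json-schema.org/draft-07/schema", "sast-json-schema/draft-07"],
  ["http://json-schema.org/draft-06/schema", "sast-json-schema/draft-06"],
  ["http://json-schema.org/draft-04/schema", "sast-json-schema/draft-04"],
]);

// Hypothetical helper: returns the module specifier for a schema, or null
// when the schema does not declare $schema (the caller decides the default).
function entryPointFor(schema) {
  if (typeof schema.$schema !== "string") return null;
  const uri = schema.$schema.replace(/#$/, "");
  const specifier = ENTRY_POINTS.get(uri);
  if (specifier === undefined) {
    throw new Error(`Unsupported $schema: ${schema.$schema}`);
  }
  return specifier;
}

const specifier = entryPointFor({
  $schema: "http://json-schema.org/draft-07/schema#",
  type: "string",
  maxLength: 10,
});
console.log(specifier);
```

The returned specifier can then be passed to a dynamic import, presumably with the same { type: "json" } import attribute used in the manual example.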

CLI

npx sast-json-schema path/to/schema.json

Options:

  • --override-max-depth <n>: Override max depth limit (default: 32)
  • --override-max-items <n>: Override max items limit (default: 1024)
  • --override-max-properties <n>: Override max properties limit (default: 1024)
  • --ignore <instancePath>: Suppress errors by instancePath or instancePath:keyword (repeatable). Paths use RFC 6901 JSON Pointer encoding (~ to ~0, / to ~1)
  • --offline: Skip SSRF DNS resolution for remote $ref URLs (useful in airgapped CI)
  • --lang <code>: Downstream language whose deserialization-vector names to deny in property keys. Default is default (union of every named language). See language coverage below
  • --format <human|json|sarif>: Output format. json emits a JSON array of error objects on stdout; sarif emits a SARIF 2.1.0 log for GitHub code-scanning, SonarQube, Semgrep and other security pipelines; human is the default
  • -v, --version: Show version
  • -h, --help: Show this help
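
Because the schema under test is the instance being validated, an instancePath here points into your schema file. The RFC 6901 escaping that --ignore expects can be sketched as follows; ignoreValue is a hypothetical helper name, and the instancePath:keyword form is taken from the option description above.

```javascript
// RFC 6901 escaping. "~" must be replaced first; otherwise the "~1"
// produced for "/" would be escaped a second time.
function encodePointerSegment(segment) {
  return String(segment).replace(/~/g, "~0").replace(/\//g, "~1");
}

// Joins already-split segments into a JSON Pointer such as /properties/name.
function toInstancePath(segments) {
  return segments.map((segment) => `/${encodePointerSegment(segment)}`).join("");
}

// Hypothetical helper: builds the value passed to --ignore, optionally
// narrowed to a single keyword.
function ignoreValue(segments, keyword) {
  const path = toInstancePath(segments);
  return keyword ? `${path}:${keyword}` : path;
}

const wholePath = ignoreValue(["properties", "a/b"]);
const oneKeyword = ignoreValue(["properties", "x~y", "items"], "maxItems");
console.log(wholePath, oneKeyword);
```

The resulting strings can be passed as repeated --ignore arguments on the command line.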

Exit codes

| Code | Meaning |
|------|---------|
| 0 | No issues found |
| 1 | Schema has security issues |
| 2 | Usage/tool error (bad args, unreadable file, invalid JSON, unsupported $schema) |
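
In CI it is often useful to treat "the schema is insecure" differently from "the tool could not run". A minimal sketch, assuming you capture the exit status yourself; interpretExit is a hypothetical helper, not something the package exports.

```javascript
// Exit codes as documented in the table above.
const EXIT_MEANINGS = new Map([
  [0, "No issues found"],
  [1, "Schema has security issues"],
  [2, "Usage/tool error"],
]);

// Hypothetical helper: separates findings (1) from tool failures
// (2, unknown codes, or a null status when the process was killed).
function interpretExit(status) {
  return {
    status,
    meaning: EXIT_MEANINGS.get(status) ?? "Unknown exit code",
    schemaInsecure: status === 1,
    toolFailed: status !== 0 && status !== 1,
  };
}

// Typical wiring (not executed here): spawnSync from node:child_process
// reports the exit code in `.status`.
//   const { status } = spawnSync("npx", ["sast-json-schema", "path/to/schema.json"]);
const outcome = interpretExit(2);
console.log(outcome.meaning);
```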

Also available via ajv-cmd:

ajv sast --fail path/to/schema.json

Known Limitations

  • $ref: "#" (self-reference) is rejected. The meta-schema requires $ref values to have at least one character after #. Bare self-references ($ref: "#") are blocked to prevent infinite recursion in validators. If you need a self-referencing schema, use a named $defs entry and reference it explicitly.
  • contentMediaType does not flag XSS-risky media types. The meta-schema validates that contentMediaType follows IANA format (RFC 6838) but does not warn about types whose content can execute scripts when rendered, such as text/html, application/xhtml+xml, or image/svg+xml. If your application renders content based on this annotation, ensure it is sanitized to prevent XSS.
  • contentMediaType does not flag XXE-risky media types. XML-family types (application/xml, text/xml, application/soap+xml, application/xml-dtd, application/xml-external-parsed-entity, image/svg+xml) are accepted without warning. If your consumer parses these payloads, configure the XML parser to disable external entity resolution and DTD processing. See SECURITY.md.
  • format: "regex" does not validate regex safety. A schema using format: "regex" validates that input strings are syntactically valid regular expressions, but the meta-schema does not ensure those regex strings are safe from ReDoS. If your application compiles user-provided regex strings, use runtime ReDoS checking on the input (for JavaScript, e.g. redos-detector).
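
The first limitation above suggests replacing a bare $ref: "#" with a named $defs entry. The two schema objects below sketch that rewrite for a tree-shaped structure. findBareSelfRefs is a hypothetical helper for locating the offending references, and whether the second schema passes every other check in the meta-schema is not verified here.

```javascript
// Rejected form: a bare "#" points back at the document root.
const selfReferencing = {
  type: "object",
  properties: {
    children: { type: "array", maxItems: 16, items: { $ref: "#" } },
  },
};

// Suggested form: name the recursive shape under $defs and reference it
// explicitly from both the root and the recursive position.
const namedReference = {
  $ref: "#/$defs/node",
  $defs: {
    node: {
      type: "object",
      properties: {
        children: {
          type: "array",
          maxItems: 16,
          items: { $ref: "#/$defs/node" },
        },
      },
    },
  },
};

// Hypothetical helper: lists the locations of bare "#" references.
// Path segments are not RFC 6901 escaped in this sketch.
function findBareSelfRefs(node, path = "") {
  if (node === null || typeof node !== "object") return [];
  const hits = node.$ref === "#" ? [`${path}/$ref`] : [];
  for (const [key, value] of Object.entries(node)) {
    hits.push(...findBareSelfRefs(value, `${path}/${key}`));
  }
  return hits;
}

const before = findBareSelfRefs(selfReferencing);
const after = findBareSelfRefs(namedReference);
console.log(before, after);
```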

Meta-schema only

  • Prototype-pollution denylist does not cover patternProperties keys. The meta-schema rejects __proto__, constructor, and prototype as literal keys in properties, $defs, definitions, dependentSchemas, dependentRequired, and required. It does NOT reject these names when introduced via a patternProperties regex key, because any literal denylist (^__proto__$) is trivially bypassed by equivalent regexes (^_{2}proto_{2}$, ^[_][_]proto__$, ^.{9}$). Enforced by the CLI: crawlSchema compiles each patternProperties key and tests it against the denylisted names. Consumers using the meta-schema standalone (without cli.js / analyze()) get property-key protection but not patternProperties protection.
  • Language-specific deserialization-vector names are not in the meta-schema. Only __proto__, constructor, prototype are rejected at the meta-schema layer (the universal baseline). Names like @type (Java), $type (.NET), __class__ (Python), isa (Objective-C), __struct__ (Elixir), or PHP magic methods are enforced only at the CLI / analyze() layer via --lang. See Language coverage.
  • Depth limits are a runtime concern. Deeply nested schemas could cause stack overflow during recursive validation. Configure your validator's depth limits (e.g. AJV does not limit recursion depth by default). Enforced by the CLI, see --override-max-depth.
  • Min/max logical consistency not enforced. A schema with minimum: 100, maximum: 1 (an impossible range) will pass validation. This cannot be reliably enforced in JSON Schema alone and would require a wrapper function. Unit tests for your schema are recommended and would catch this type of error. Enforced by the CLI.
  • pattern regex validation has known gaps. The check rejects negated character classes [^...] as broad denylist matchers (use allowlist patterns like [\p{L}\p{N}] instead), blocks nested quantifiers like (a+)+, backreferences, identical overlapping quantifiers like [a-z]+[a-z]+, semantically identical overlapping quantifiers like \d+[0-9]+, and superset overlaps like \w+\d+ (where \w ⊃ \d). Bare alternation at the top level (^a|b$) is rejected, but alternation across sibling groups (^(a)|(b)$) is not detected at the meta-schema level (it is enforced by the CLI). The check cannot detect non-identical overlapping quantifiers (e.g. [a-z]+\\w+ where \\w ⊃ [a-z]). Use runtime ReDoS checking for full protection.
  • Remote $ref URLs can be SSRF vectors. The meta-schema restricts $ref to # (local) or https:// URLs and blocks private IP ranges (dotted-decimal, hex 0x, and decimal representations), but DNS-based bypasses (domains resolving to internal IPs) cannot be detected at the schema level. Ensure your validator is configured to disallow or restrict remote schema loading (e.g., use ajv.addSchema() instead of allowing external fetches). Dereferencing before running SAST is recommended. Enforced by the CLI.
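
Two of the checks described above are easy to picture in code: compiling each patternProperties key and testing it against the denylisted names, and comparing min/max pairs. This is a minimal sketch of the idea, not the CLI's crawlSchema implementation. The function names are hypothetical, and extending the range check beyond minimum/maximum is an assumption.

```javascript
const DENYLIST = ["__proto__", "constructor", "prototype"];

// Compile every patternProperties key and report those that would admit
// a denylisted name, however the regex happens to be spelled.
function riskyPatternKeys(schema) {
  const hits = [];
  for (const source of Object.keys(schema.patternProperties ?? {})) {
    let regex;
    try {
      regex = new RegExp(source, "u");
    } catch {
      continue; // an invalid pattern is a separate problem
    }
    const matches = DENYLIST.filter((name) => regex.test(name));
    if (matches.length > 0) hits.push({ pattern: source, matches });
  }
  return hits;
}

// Report min/max pairs that can never be satisfied together.
function impossibleRanges(schema) {
  const pairs = [
    ["minimum", "maximum"],
    ["minLength", "maxLength"],
    ["minItems", "maxItems"],
    ["minProperties", "maxProperties"],
  ];
  return pairs
    .filter(([low, high]) =>
      typeof schema[low] === "number" &&
      typeof schema[high] === "number" &&
      schema[low] > schema[high])
    .map(([low, high]) => `${low} (${schema[low]}) > ${high} (${schema[high]})`);
}

const patternHits = riskyPatternKeys({
  type: "object",
  patternProperties: {
    "^_{2}proto_{2}$": { type: "string", maxLength: 8 },
    "^.{9}$": { type: "string", maxLength: 8 },
    "^[a-z]{1,8}$": { type: "string", maxLength: 8 },
  },
});
const rangeHits = impossibleRanges({ type: "integer", minimum: 100, maximum: 1 });
console.log(patternHits, rangeHits);
```

The second pattern shows why a literal denylist is not enough: ^.{9}$ never mentions a name, yet it admits both __proto__ and prototype.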

Language coverage

JSON Schemas are language-agnostic, but the JSON they validate gets deserialized into objects in many different languages, each of which has its own set of "magic" property names that downstream libraries may interpret as type discriminators, runtime hooks, or pollution vectors. The --lang flag selects which language's deserialization-vector names to deny in property keys (properties, $defs, definitions, dependentSchemas, dependentRequired, required, and patternProperties regex keys).

The meta-schema itself enforces a universal baseline of __proto__, constructor, prototype regardless of --lang: those names are dangerous in every named entry below. Language-specific extras are enforced additively at the CLI / analyze() layer.

For a list of JSON-Schema validators per language, see json-schema.org/tools#validator.

| Language | --lang | Extras over JS baseline |
|----------|--------|-------------------------|
| JavaScript / TypeScript / Node.js | js | (none, the universal baseline) |
| Python | py | __class__, __init__, __globals__, __builtins__, __import__, __reduce__, __subclasses__, __dict__, __mro__ |
| Ruby | rb | __send__, json_class, instance_eval, instance_variable_set, singleton_class |
| Rust | rs | (none. serde is type-safe; baseline applies because specs often pass through JS tooling) |
| Java | java | @type, @class (Jackson / Fastjson polymorphic markers) |
| Kotlin | kotlin | alias of java (JVM/Jackson) |
| Clojure | clojure | alias of java (JVM/Cheshire) |
| C# | cs | $type, __type, @odata.type (Json.NET, DataContractJsonSerializer, OData) |
| VB.NET | vb | alias of cs |
| F# | fsharp | alias of cs |
| ASP.NET / ASPX | cs | shares the .NET serializer stack |
| PHP | php | __construct, __destruct, __wakeup, __sleep, __serialize, __unserialize, __call, __callStatic, __get, __set, __isset, __unset, __toString, __invoke, __set_state, __clone, __debugInfo |
| Objective-C | objc | isa, class, superclass, description, init, _cmd (Obj-C runtime + KVC + performSelector:) |
| Swift | swift | alias of objc (mixed Obj-C interop; pure Codable is type-safe) |
| Elixir | ex | __struct__, __exception__, __protocol__ (BEAM struct-identifier keys when JSON is decoded with :keys => :atoms) |
| Lua | lua | metamethod names: __index, __newindex, __call, __metatable, __tostring, __name, __pairs, __eq, __lt, __le, __add, __sub, __mul, __div, __mod, __pow, __concat, __len, __unm, __band, __bor, __bxor, __bnot, __shl, __shr, __idiv, __close, __gc |
| Union of every named language above | default | every extra above (the implicit default) |
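
The additive model in the table above (universal baseline, plus per-language extras, plus aliases, with default as the union) can be sketched as a small lookup. This is an illustration under assumptions, not the CLI's code: only a few languages are copied from the table, and deniedNames and deniedKeys are hypothetical names. Maps are used instead of plain objects so that a --lang value such as constructor cannot resolve through the prototype chain.

```javascript
const BASELINE = ["__proto__", "constructor", "prototype"];

// Abbreviated copy of the table above (only a few languages shown).
const EXTRAS = new Map([
  ["js", []],
  ["rs", []],
  ["java", ["@type", "@class"]],
  ["cs", ["$type", "__type", "@odata.type"]],
  ["ex", ["__struct__", "__exception__", "__protocol__"]],
]);

const ALIASES = new Map([
  ["kotlin", "java"],
  ["clojure", "java"],
  ["vb", "cs"],
  ["fsharp", "cs"],
]);

// Resolve a --lang value to the full set of denied property names.
function deniedNames(lang = "default") {
  if (lang === "default") {
    return new Set([...BASELINE, ...[...EXTRAS.values()].flat()]);
  }
  const key = ALIASES.get(lang) ?? lang;
  const extras = EXTRAS.get(key);
  if (extras === undefined) throw new Error(`Unknown --lang value: ${lang}`);
  return new Set([...BASELINE, ...extras]);
}

// List the keys under `properties` that the chosen language denies.
function deniedKeys(schema, lang) {
  const denied = deniedNames(lang);
  return Object.keys(schema.properties ?? {}).filter((key) => denied.has(key));
}

const polymorphic = {
  type: "object",
  properties: {
    "@type": { type: "string", maxLength: 64 },
    name: { type: "string", maxLength: 64 },
  },
};

console.log(deniedKeys(polymorphic, "kotlin"), deniedKeys(polymorphic, "js"));
```

With the js setting only the baseline applies, so @type passes; with kotlin (an alias of java) or the default union it is reported.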

Languages without a dedicated entry

These ecosystems have JSON-Schema validators but either deserialize type-safely (no magic-name attack class) or their deserialization risks aren't expressible as a property-name denylist. The universal __proto__ / constructor / prototype baseline still applies via the meta-schema.

| Language | Reason |
|----------|--------|
| Go | encoding/json is reflection-by-struct-tag; no magic keys |
| C / C++ | nlohmann/json + valijson are type-safe; no runtime polymorphism via key names |
| Erlang / Elixir (BEAM, raw :atoms mode) | Attack class is atom-table exhaustion DoS when user keys are interned via binary_to_atom/1. Use binary_to_existing_atom/1, or Jason.decode/2 without :keys => :atoms. See SECURITY.md. |
| Common Lisp | cl-json symbol-interning has the same exhaustion shape. Set cl-json:*json-symbols-package* to :keyword or nil. See SECURITY.md. |
| Perl | JSON::PP does not auto-bless; magic only kicks in if convert_blessed is set, and the marker key is library-defined |
| Julia | JSONSchema.jl + JSON3.jl are type-safe |

If you have a language-specific deserialization vector that fits the magic-name pattern and isn't covered, please open an issue.

Supported keywords per draft

All meta-schemas reject keywords not listed in their respective JSON Schema spec (e.g. draft-04 rejects const because it was introduced in draft-06). Keywords that ARE in a given spec but are rejected here on security grounds are flagged below.

| Keyword | draft-04 | draft-06 | draft-07 | 2019-09 | 2020-12 | Notes |
|---------|----------|----------|----------|---------|---------|-------|
| type, enum, not | | | | | | |
| allOf/anyOf/oneOf | | | | | | |
| $ref | | | | | | Restricted to local #… or HTTPS; SSRF-checked |
| $id / id | | | | | | HTTPS URL, URN, or plain name |
| definitions | | | | | | |
| $defs | n/a | n/a | n/a | | | |
| title, description, default | | | | | | |
| const | n/a | | | | | Type-locked to declared type |
| contains | n/a | | | | | Requires maxContains + uniqueItems |
| propertyNames | n/a | | | | | |
| if/then/else | n/a | n/a | | | | |
| contentMediaType, contentEncoding | n/a | n/a | | | | Allow-listed per RFC 6838 / RFC-standard |
| contentSchema | n/a | n/a | n/a | | | |
| readOnly / writeOnly | n/a | | | | | Rejected in draft-06 (annotation-only, misleading for strictness); accepted but ignored later |
| deprecated | n/a | n/a | n/a | | | Annotation-only; type-checked as boolean. Rejected in older drafts where it isn't in spec |
| dependencies | | | | n/a | n/a | Array or subschema form; removed in 2019-09+, prefer dependentRequired / dependentSchemas |
| dependentRequired | n/a | n/a | n/a | | | |
| dependentSchemas | n/a | n/a | n/a | | | |
| prefixItems | n/a | n/a | n/a | n/a | | |
| unevaluatedProperties/unevaluatedItems | n/a | n/a | n/a | | | Required for object/array strictness |

Legend: ✓ supported · rejected on security grounds · n/a not in spec for that draft.
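
The "not in spec for that draft" rule can be sketched as a lookup of the draft in which each keyword first appeared. The table below is a partial copy based on the n/a columns above, and keywordsNotInDraft is a hypothetical helper that inspects only the top level of a schema. The actual meta-schemas enforce this declaratively and recursively.

```javascript
const DRAFTS = ["draft-04", "draft-06", "draft-07", "2019-09", "2020-12"];

// Draft in which each keyword first appears (partial list).
const INTRODUCED_IN = new Map([
  ["const", "draft-06"],
  ["contains", "draft-06"],
  ["propertyNames", "draft-06"],
  ["if", "draft-07"],
  ["then", "draft-07"],
  ["else", "draft-07"],
  ["contentMediaType", "draft-07"],
  ["contentEncoding", "draft-07"],
  ["$defs", "2019-09"],
  ["contentSchema", "2019-09"],
  ["deprecated", "2019-09"],
  ["dependentRequired", "2019-09"],
  ["dependentSchemas", "2019-09"],
  ["unevaluatedProperties", "2019-09"],
  ["unevaluatedItems", "2019-09"],
  ["prefixItems", "2020-12"],
]);

// Returns the top-level keywords that are newer than the given draft.
function keywordsNotInDraft(schema, draft) {
  const target = DRAFTS.indexOf(draft);
  if (target === -1) throw new Error(`Unknown draft: ${draft}`);
  return Object.keys(schema).filter((keyword) => {
    const since = INTRODUCED_IN.get(keyword);
    return since !== undefined && DRAFTS.indexOf(since) > target;
  });
}

const candidate = { type: "string", const: "a", deprecated: true };
console.log(keywordsNotInDraft(candidate, "draft-04"));
console.log(keywordsNotInDraft(candidate, "draft-07"));
```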

OWASP ASVS 5.0.0 (2026-03)

The following requirements should be considered when writing JSON Schemas used for input validation of an API endpoint.

V1 Encoding and Sanitization

  • 1.2.9: Verify that the application escapes special characters in regular expressions to prevent them from being misinterpreted as metacharacters.
  • 1.3.3: Verify that data being passed to a potentially dangerous context is sanitized beforehand to enforce safety measures, such as only allowing characters which are safe for this context and trimming input which is too long.
  • 1.3.6: Verify that the application protects against Server-side Request Forgery (SSRF) attacks, by validating untrusted data against an allowlist of protocols, domains, paths and ports and sanitizing potentially dangerous characters before using the data to call another service.
  • 1.3.12: Verify that regular expressions are free from elements causing exponential backtracking, and ensure untrusted input is sanitized to mitigate ReDoS or Runaway Regex attacks.
  • 1.4.2: Verify that sign, range, and input validation techniques are used to prevent integer overflows.
  • 1.5.2: Verify that deserialization of untrusted data enforces safe input handling, such as using an allowlist of object types or restricting client-defined object types, to prevent deserialization attacks.
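
For requirement 1.4.2, one way to picture a "safe range" check on an integer subschema is shown below. It assumes "safe" means exactly representable as a JavaScript number; the limits this package actually applies may differ, and hasSafeIntegerBounds is a hypothetical helper.

```javascript
// Assumption: a bound is "safe" when Number.isSafeInteger accepts it,
// i.e. it lies within ±(2^53 - 1).
function hasSafeIntegerBounds(schema) {
  const { minimum, maximum } = schema;
  return (
    Number.isSafeInteger(minimum) &&
    Number.isSafeInteger(maximum) &&
    minimum <= maximum
  );
}

const quantity = { type: "integer", minimum: 0, maximum: 10_000 };
const unbounded = { type: "integer", minimum: 0 };
const tooWide = { type: "integer", minimum: 0, maximum: 2 ** 53 };

console.log(
  hasSafeIntegerBounds(quantity),
  hasSafeIntegerBounds(unbounded),
  hasSafeIntegerBounds(tooWide),
);
```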

V2 Validation and Business Logic

  • 2.1.1: Verify that the application's documentation defines input validation rules for how to check the validity of data items against an expected structure. This could be common data formats such as credit card numbers, email addresses, telephone numbers, or it could be an internal data format.
  • 2.1.2: Verify that the application's documentation defines how to validate the logical and contextual consistency of combined data items, such as checking that suburb and ZIP code match.
  • 2.2.1: Verify that input is validated to enforce business or functional expectations for that input. This should either use positive validation against an allow list of values, patterns, and ranges, or be based on comparing the input to an expected structure and logical limits according to predefined rules.
  • 2.2.3: Verify that the application ensures that combinations of related data items are reasonable according to the pre-defined rules.

V4 API and Web Service

  • 4.1.1: Verify that every HTTP response with a message body contains a Content-Type header field that matches the actual content of the response, including the charset parameter to specify safe character encoding (e.g., UTF-8, ISO-8859-1).
  • 4.1.4: Verify that only HTTP methods that are explicitly supported by the application or its API (including OPTIONS during preflight requests) can be used and that unused methods are blocked.
  • 4.2.2: Verify that when generating HTTP messages, the Content-Length header field does not conflict with the length of the content as determined by the framing of the HTTP protocol, in order to prevent request smuggling attacks.
  • 4.2.5: Verify that, if the application builds and sends requests, it uses validation, sanitization, or other mechanisms to avoid creating URIs or HTTP request header fields which are too long to be accepted by the receiving component.

V15 Secure Coding and Architecture

  • 15.3.3: Verify that the application has countermeasures to protect against mass assignment attacks by limiting allowed fields per controller and action, e.g., it is not possible to insert or update a field value when it was not intended to be part of that action.
  • 15.3.5: Verify that the application explicitly ensures that variables are of the correct type and performs strict equality and comparator operations to avoid type juggling or type confusion vulnerabilities.
  • 15.3.7: Verify that the application has defenses against HTTP parameter pollution attacks, particularly if the application framework makes no distinction about the source of request parameters (query string, body parameters, cookies, or header fields).

Sources

Contributions

Contributions are most welcome. If something was missed, please reach out. I'd also love for security experts to give it an audit.

Package last updated on 04 May 2026