
apify-schema-tools
This is a tool for Apify Actor developers.
It generates JSON schemas and TypeScript types for the Actor input and dataset from a single source of truth, with a few extra features.
As a quick example, assume you have a project that looks like this:
my-project
├── .actor
│   ├── actor.json
│   ├── dataset_schema.json
│   └── input_schema.json
└── src-schemas
    ├── dataset-item.json   <-- source file for dataset
    └── input.json          <-- source file for input
After running this script, you will have:
my-project
├── .actor
│   ├── actor.json
│   ├── dataset_schema.json   <-- updated with the definitions from src-schemas
│   └── input_schema.json     <-- updated with the definitions from src-schemas
├── src
│   └── generated
│       ├── dataset.ts        <-- TypeScript types generated from src-schemas
│       ├── input-utils.ts    <-- utilities to fill input default values
│       └── input.ts          <-- TypeScript types generated from src-schemas
└── src-schemas
    ├── dataset-item.json
    └── input.json
These instructions will quickly get you to a point where you can use
apify-schema-tools to generate your schemas and TypeScript types.
Let's assume you are starting from a new project created from an Apify template.
Install apify-schema-tools:
npm i -D apify-schema-tools
Then run the init command:
npx apify-schema-tools init
This command will:
- Create the src-schemas folder with input.json and dataset-item.json files.
- Create the .actor files if they don't exist.
- Add configuration to your package.json.
- Add a generate script to your package.json.
You can then generate the schemas and types by running the sync command:
npx apify-schema-tools sync
In your Actor code, you can use the generated types like this:
import { Actor } from 'apify';
import type { DatasetItem } from './generated/dataset.ts';
import type { Input } from './generated/input.ts';
import { getInputWithDefaultValues, type InputWithDefaults } from './generated/input-utils.ts';

await Actor.init();

const input: InputWithDefaults = getInputWithDefaultValues(await Actor.getInput<Input>());

// ...

await Actor.pushData<DatasetItem>({
    title: '...',
    url: '...',
    text: '...',
    timestamp: '...',
});

await Actor.exit();
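The generated input-utils.ts is not shown here, but the idea behind getInputWithDefaultValues can be sketched as follows. The Input fields and default values below are hypothetical, chosen only for illustration; the real generated code may differ:

```typescript
// Hypothetical Input type mirroring a schema with two properties.
interface Input {
  startUrls?: { url: string }[];
  maxPages?: number;
}

type InputWithDefaults = Required<Input>;

// Defaults as they would be declared in the input schema (illustrative values).
const inputDefaults: InputWithDefaults = {
  startUrls: [],  // mirrors "default": [] in the schema
  maxPages: 10,   // hypothetical default
};

function getInputWithDefaultValues(input: Input | null): InputWithDefaults {
  // Fields present in the input win; missing fields fall back to the defaults.
  return { ...inputDefaults, ...(input ?? {}) };
}

// maxPages is overridden, startUrls falls back to its default.
console.log(getInputWithDefaultValues({ maxPages: 5 }));
```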
You can configure apify-schema-tools in two ways:
The init command automatically adds configuration to your package.json. You can also manually add an apify-schema-tools section to customize the behavior:
{
    "name": "my-actor",
    "version": "1.0.0",
    "apify-schema-tools": {
        "input": ["input", "dataset"],
        "output": ["json-schemas", "ts-types"],
        "srcInput": "src-schemas/input.json",
        "srcDataset": "src-schemas/dataset-item.json",
        "outputTSDir": "src/generated",
        "includeInputUtils": true
    }
}
You can also pass options directly to the sync command. You can check which options are available:
$ npx apify-schema-tools --help
usage: apify-schema-tools [-h] {init,sync,check} ...

Apify Schema Tools - Generate JSON schemas and TypeScript files for Actor input and output dataset.

positional arguments:
  {init,sync,check}
    init                Initialize the Apify Schema Tools project with default settings.
    sync                Generate JSON schemas and TypeScript files from the source schemas.
    check               Check the schemas for consistency and correctness.

optional arguments:
  -h, --help            show this help message and exit

$ npx apify-schema-tools sync --help
usage: apify-schema-tools sync [-h] [-i [{input,dataset} ...]] [-o [{json-schemas,ts-types} ...]] [--src-input SRC_INPUT] [--src-dataset SRC_DATASET] [--add-input ADD_INPUT] [--add-dataset ADD_DATASET] [--input-schema INPUT_SCHEMA] [--dataset-schema DATASET_SCHEMA] [--output-ts-dir OUTPUT_TS_DIR]
                               [--deep-merge] [--include-input-utils {true,false}]

optional arguments:
  -h, --help            show this help message and exit
  -i [{input,dataset} ...], --input [{input,dataset} ...]
                        specify which sources to use for generation (default: input,dataset)
  -o [{json-schemas,ts-types} ...], --output [{json-schemas,ts-types} ...]
                        specify what to generate (default: json-schemas,ts-types)
  --src-input SRC_INPUT
                        path to the input schema source file (default: src-schemas/input.json)
  --src-dataset SRC_DATASET
                        path to the dataset schema source file (default: src-schemas/dataset-item.json)
  --add-input ADD_INPUT
                        path to an additional schema to merge into the input schema (default: undefined)
  --add-dataset ADD_DATASET
                        path to an additional schema to merge into the dataset schema (default: undefined)
  --input-schema INPUT_SCHEMA
                        the path of the destination input schema file (default: .actor/input_schema.json)
  --dataset-schema DATASET_SCHEMA
                        the path of the destination dataset schema file (default: .actor/dataset_schema.json)
  --output-ts-dir OUTPUT_TS_DIR
                        path where to save generated TypeScript files (default: src/generated)
  --deep-merge          whether to deep merge additional schemas into the main schema (default: false)
  --include-input-utils {true,false}
                        include input utilities in the generated TypeScript files: 'input' input and 'ts-types' output are required (default: true)
If you prefer to set up your project manually instead of using the init command, you can follow these steps:
Create the src-schemas folder:
mkdir src-schemas
Create input.json and dataset-item.json inside the src-schemas folder. Here is some example content for input.json:
{
    "title": "Input schema for Web Scraper",
    "type": "object",
    "schemaVersion": 1,
    "properties": {
        "startUrls": {
            "type": "array",
            "title": "Start URLs",
            "description": "List of URLs to scrape",
            "default": [],
            "editor": "requestListSources",
            "items": {
                "type": "object",
                "properties": {
                    "url": { "type": "string" }
                }
            }
        }
    },
    "required": ["startUrls"],
    "additionalProperties": false
}
And for dataset-item.json:
{
    "$schema": "http://json-schema.org/draft-07/schema#",
    "title": "Dataset schema for Web Scraper",
    "type": "object",
    "properties": {
        "title": {
            "type": "string",
            "title": "Title",
            "description": "Page title"
        },
        "url": {
            "type": "string",
            "title": "URL",
            "description": "Page URL"
        },
        "text": {
            "type": "string",
            "title": "Text content",
            "description": "Extracted text"
        },
        "timestamp": {
            "type": "string",
            "title": "Timestamp",
            "description": "When the data was scraped"
        }
    },
    "required": ["title", "url"]
}
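From a dataset schema like this one, the generated dataset.ts would plausibly contain a type along the following lines. This is a hand-written sketch, with required fields taken from the schema's required list; the actual generated code may differ:

```typescript
// Sketch of a DatasetItem type derived from the example dataset schema:
// "title" and "url" are in the "required" list, the rest are optional.
interface DatasetItem {
  title: string;
  url: string;
  text?: string;
  timestamp?: string;
}

// A minimal valid item only needs the required fields.
const item: DatasetItem = {
  title: 'Example Domain',
  url: 'https://example.com/',
};
console.log(item.title);
```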
Create .actor/dataset_schema.json and enter some empty content:
{
    "actorSpecification": 1,
    "fields": {},
    "views": {}
}
Reference the schemas in .actor/actor.json:
{
    "actorSpecification": 1,
    "...": "...",
    "input": "./input_schema.json",
    "storages": {
        "dataset": "./dataset_schema.json"
    },
    "...": "..."
}
Finally, run the sync command:
npx apify-schema-tools sync
The sync command includes interactive conflict resolution to help you handle schema inconsistencies.
When the tool detects conflicts between your source schemas and existing target schemas,
it will prompt you to choose which version to keep.
Conflicts occur when there are differences between your source schema files and the schemas that would be generated in the target locations.
By default, when conflicts are detected, the tool will prompt you interactively to resolve each conflict:
⚠️ Field [properties > startUrls > description] in the source schema differs from
the target schema. Choose which to keep: (Use arrow keys)
❯ [source] List of URLs to scrape
[target] List of URLs to parse
⚠️ Property "searchTerm" was removed from the source schema. What do you want to do? (Use arrow keys)
❯ Confirm deletion
Restore field
For automated scripts or CI/CD pipelines, you can use these options:
Force mode (--force)
Automatically resolves all conflicts by preferring the source schema:
npx apify-schema-tools sync --force
This resolves every conflict in favor of the source schema, without prompting.
Fail on conflict (--fail-on-conflict)
Stops execution and exits with an error code when conflicts are detected:
npx apify-schema-tools sync --fail-on-conflict
The check command allows you to verify that your generated schemas and TypeScript files are up-to-date with your source schemas.
This is particularly useful in CI/CD pipelines to ensure that developers haven't forgotten to run the generation after making changes to the source schemas.
npx apify-schema-tools check
You can add the check command to your CI pipeline to automatically detect when schemas need to be regenerated:
{
    "scripts": {
        "generate": "apify-schema-tools sync",
        "check-schemas": "apify-schema-tools check",
        "test": "npm run check-schemas && npm run test:unit"
    }
}
The check command accepts the same configuration options as the sync command,
either through package.json configuration or command-line arguments,
ensuring it checks the same files that would be generated.
Ignoring descriptions (--ignore-descriptions)
The check command can ignore the title and description fields in the source and target schemas, and in their properties.
This allows you to edit your descriptions and change how your Actor appears on the Apify platform
without having to run this tool to synchronize the schemas, while still checking for semantic correctness:
npx apify-schema-tools check --ignore-descriptions
The next time someone runs the sync command,
they will be prompted to resolve the conflicts in the descriptions.
For example, when type is "array", the property items is forbidden unless editor is "select".
This feature is useful when working in monorepos. It allows you to define a single common schema across all the Actors in the repo, and to add or override the title, the description, and some properties when necessary.
To use it, use the parameters --add-input and --add-dataset, e.g.:
npx apify-schema-tools sync \
--input input,dataset \
--output json-schemas,ts-types \
--src-input ../src-schemas/input.json \
--src-dataset ../src-schemas/dataset-item.json \
--add-input src-schemas/input.json \
--add-dataset src-schemas/dataset-item.json
You can also define the order of the properties in the merged schema.
To do so, add a position field to the properties. The script will follow these rules: properties are sorted by ascending position; properties without a position are placed last; a position defined in the additional schema overrides the one in the source schema.
An example:
# Source input schema
{
    "title": "My input schema",
    "description": "My input properties",
    "type": "object",
    "properties": {
        "a": { "type": "string", "position": 3 },
        "b": { "type": "string" }, // will be last, because it has no position
        "c": { "type": "string", "position": 1 }
    },
    "required": ["a"],
    "additionalProperties": false
}

# Additional input schema
{
    "description": "My input properties, a bit changed", // will override the description
    "type": "object",
    "properties": {
        "c": { "type": "boolean", "position": 5 }, // will override also the position
        "d": { "type": "string", "position": 1 } // will be first
    },
    "required": ["c", "d"], // will be merged with the source required parameters
    "additionalProperties": false
}

# Final input schema
{
    "title": "My input schema",
    "description": "My input properties, a bit changed",
    "type": "object",
    "properties": {
        "d": { "type": "string" },
        "a": { "type": "string" },
        "c": { "type": "boolean" },
        "b": { "type": "string" }
    },
    "required": ["a", "c", "d"],
    "additionalProperties": false
}
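The resulting order in the example can be reproduced with a simple sort. A sketch of the position rule, using the property names and positions from the example above (not the tool's actual code):

```typescript
// Merged properties with their effective positions: c's position was
// overridden to 5 by the additional schema, b has no position.
const props: Record<string, { position?: number }> = {
  a: { position: 3 },
  b: {},
  c: { position: 5 },
  d: { position: 1 },
};

// Sort ascending by position; properties without one sink to the end.
const ordered = Object.entries(props)
  .sort(([, x], [, y]) =>
    (x.position ?? Number.MAX_SAFE_INTEGER) - (y.position ?? Number.MAX_SAFE_INTEGER))
  .map(([name]) => name);

console.log(ordered); // [ 'd', 'a', 'c', 'b' ]
```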
Use the option --deep-merge to merge object properties and array items, instead of overwriting every definition.
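Conceptually, deep merging combines overlapping object values instead of replacing them wholesale. A minimal sketch of the difference, illustrative only and not the tool's actual implementation:

```typescript
type Json = { [k: string]: any };

// Shallow merge: "properties" from extra replaces the base object entirely.
function shallowMerge(base: Json, extra: Json): Json {
  return { ...base, ...extra };
}

// Deep merge: plain objects are merged recursively; other values are replaced.
function deepMerge(base: Json, extra: Json): Json {
  const out: Json = { ...base };
  for (const [k, v] of Object.entries(extra)) {
    out[k] =
      v && typeof v === 'object' && !Array.isArray(v) &&
      out[k] && typeof out[k] === 'object' && !Array.isArray(out[k])
        ? deepMerge(out[k], v)
        : v;
  }
  return out;
}

const source = { properties: { a: { type: 'string' }, b: { type: 'string' } } };
const extra  = { properties: { b: { type: 'boolean' } } };

console.log(Object.keys(shallowMerge(source, extra).properties)); // [ 'b' ]
console.log(Object.keys(deepMerge(source, extra).properties));    // [ 'a', 'b' ]
```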