reducto-cli

CLI for Reducto document processing

PyPI

Version: 0.1.2

Maintainers: 1

Reducto CLI

A command-line tool for document parsing, structured data extraction, and document editing — powered by Reducto's document intelligence API.

Parse PDFs, images, spreadsheets, and Office documents into clean Markdown. Extract structured JSON using schemas. Edit documents with natural language instructions. Process single files or entire directories.

Documentation | Reducto Studio | API Quickstart | Python SDK | Claude Code Plugin

Installation

pip install reducto-cli

Requires Python 3.11 or later.

Authentication

Authenticate using the built-in device code flow, which opens a browser to Reducto Studio:

reducto login

This saves your API key to ~/.reducto/config.yaml.

Alternatively, set the REDUCTO_API_KEY environment variable directly:

export REDUCTO_API_KEY="your_api_key_here"

Get an API key by signing up at studio.reducto.ai.

Quick Start

# Parse a PDF into Markdown
reducto parse invoice.pdf

# Parse an entire folder of documents
reducto parse ./contracts/

# Extract structured data using a JSON Schema
reducto extract invoice.pdf -s schema.json

# Edit a document with natural language
reducto edit form.pdf -i "Fill in the client name as 'Acme Corp'"

Commands

Parse Command

Converts documents into structured Markdown, preserving layout, tables, and figures. Uses Reducto's Parse API with agentic OCR and vision-language models.

reducto parse <path> [options]

Output is written to <filename>.parse.md with YAML front matter containing the job ID and processing duration.

Options

Flag	Description
`--agentic`	Enables agentic processing for tables, text, and figures. Higher accuracy, higher latency. Use for complex layouts or low-quality scans.
`--change-tracking`	Returns `<s>`, `<u>`, and `<change>` tags for strikethrough, underlined, and revised text. Useful for contracts and legal redlines.
`--highlights`	Includes highlighted text in the output.
`--hyperlinks`	Includes embedded hyperlinks in the output.
`--comments`	Includes document comments in the output.

Examples

# Basic parse
reducto parse document.pdf

# High-accuracy parse for complex layouts
reducto parse scanned_report.pdf --agentic

# Parse a contract with revision tracking
reducto parse contract.pdf --change-tracking

# Parse with all metadata preserved
reducto parse document.pdf --hyperlinks --comments --highlights

# Combine flags
reducto parse legal_doc.pdf --agentic --change-tracking --comments

Extract Command

Pulls structured data from documents according to a JSON Schema you provide. Maps unstructured content — invoices, receipts, forms, contracts, financial statements — into machine-readable JSON.

reducto extract <path> --schema <schema>

The schema can be a path to a .json file or an inline JSON string. Output is saved as <filename>.extract.json.

The CLI automatically reuses existing parse results: if a .parse.md file exists for a document, its recorded job ID is used via jobid:// references to skip re-parsing.
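The reuse described above can be pictured with a small sketch: read the front matter of a `.parse.md` file, pick out the job ID, and form a `jobid://` reference. This is not the CLI's implementation. The front matter key `job_id`, the `duration` field name, and the helper names are assumptions made for illustration, and the parser handles only flat `key: value` lines.

```python
from typing import Dict, Optional, Tuple


def split_front_matter(text: str) -> Tuple[Dict[str, str], str]:
    """Split Markdown text into (front matter dict, body).

    Only flat `key: value` lines between two `---` delimiters are handled;
    a real YAML parser would be more robust.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text
    meta: Dict[str, str] = {}
    for i, line in enumerate(lines[1:], start=1):
        if line.strip() == "---":
            return meta, "\n".join(lines[i + 1:])
        key, sep, value = line.partition(":")
        if sep:
            meta[key.strip()] = value.strip().strip("\"'")
    # No closing delimiter found: treat everything as body.
    return {}, text


def jobid_reference(parse_md_text: str, key: str = "job_id") -> Optional[str]:
    """Return a jobid:// reference, or None when no job ID is recorded.

    The key name `job_id` is hypothetical; check a real .parse.md file.
    """
    meta, _ = split_front_matter(parse_md_text)
    job_id = meta.get(key)
    return f"jobid://{job_id}" if job_id else None


# Hypothetical file contents, for illustration only.
sample = "---\njob_id: abc123\nduration: 4.2\n---\n# Invoice\n\nTotal: 42\n"
print(jobid_reference(sample))  # jobid://abc123
```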

Schema Requirements

Must be a valid JSON Schema document.
The top-level type must be object — arrays and primitives are not permitted at the top level.
Schemas can be provided as file paths or inline JSON strings.
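A quick local check of these requirements before calling the CLI might look like the sketch below. It only confirms that the input is JSON and that the top-level type is `object`; it does not perform full JSON Schema validation. The function names are hypothetical, and treating any argument that starts with `{` as inline JSON is an assumption of this sketch, not documented CLI behavior.

```python
import json
from pathlib import Path
from typing import Any, Dict


def top_level_is_object(schema: Any) -> bool:
    """True when the schema is a JSON object whose top-level type is 'object'."""
    return isinstance(schema, dict) and schema.get("type") == "object"


def load_schema(schema_arg: str) -> Dict[str, Any]:
    """Load a schema from an inline JSON string or a .json file path."""
    text = schema_arg.strip()
    if not text.startswith("{"):
        # Assumption: anything that does not look like inline JSON is a path.
        text = Path(schema_arg).read_text(encoding="utf-8")
    schema = json.loads(text)
    if not top_level_is_object(schema):
        raise ValueError("top-level schema type must be 'object'")
    return schema


schema = load_schema('{"type": "object", "properties": {"total": {"type": "number"}}}')
print(sorted(schema["properties"]))  # ['total']
```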

Example Schema

{
  "type": "object",
  "properties": {
    "vendor_name": { "type": "string" },
    "invoice_number": { "type": "string" },
    "date": { "type": "string" },
    "line_items": {
      "type": "array",
      "items": {
        "type": "object",
        "properties": {
          "description": { "type": "string" },
          "quantity": { "type": "number" },
          "unit_price": { "type": "number" },
          "total": { "type": "number" }
        },
        "required": ["description", "quantity", "unit_price", "total"]
      }
    },
    "total_amount": { "type": "number" }
  },
  "required": ["vendor_name", "invoice_number", "line_items", "total_amount"]
}

Examples

# Extract using a schema file
reducto extract invoice.pdf -s schemas/invoice.json

# Extract from a folder of invoices
reducto extract ./invoices/ -s schemas/invoice.json

# Extract with inline JSON schema
reducto extract receipt.pdf -s '{"type":"object","properties":{"total":{"type":"number"},"date":{"type":"string"}},"required":["total","date"]}'
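Hand-writing inline JSON inside shell quotes is error-prone. One way to produce the argument, sketched here with Python's standard `json` and `shlex` modules, is to serialize the schema compactly and let `shlex.quote` handle the quoting for POSIX shells. The variable names are illustrative.

```python
import json
import shlex

schema = {
    "type": "object",
    "properties": {
        "total": {"type": "number"},
        "date": {"type": "string"},
    },
    "required": ["total", "date"],
}

# Compact separators keep the inline schema on one short line.
inline = json.dumps(schema, separators=(",", ":"))

# shlex.quote wraps the JSON so a POSIX shell passes it as a single argument.
command = f"reducto extract receipt.pdf -s {shlex.quote(inline)}"
print(command)
```

The printed command matches the inline example above.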

Edit Command

Modifies documents using natural language instructions. Uploads the document, applies edits via the Reducto Edit API, and downloads the result.

reducto edit <path> --instructions "<instructions>"

Edited files are saved as <filename>.edited.<extension> (e.g., form.pdf becomes form.edited.pdf).
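Scripts that post-process edited files need to predict where the output lands. The naming rule above can be expressed with `pathlib`, as in this minimal sketch. The helper name `edited_path` is hypothetical, and the behavior for names with several dots is an assumption that follows `pathlib`'s notion of stem and suffix.

```python
from pathlib import Path


def edited_path(source: Path) -> Path:
    """Map <filename>.<extension> to <filename>.edited.<extension>."""
    return source.with_name(f"{source.stem}.edited{source.suffix}")


print(edited_path(Path("form.pdf")))  # form.edited.pdf
```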

Parameter	Required	Description
`path`	Yes	Path to a file or directory.
`--instructions`, `-i`	Yes	Natural language instructions for the edits.

Examples

# Fill out a PDF form
reducto edit application.pdf -i "Fill in: Name: Jane Smith, Date: 2025-03-15, check 'Agree to terms'"

# Update a contract
reducto edit contract.pdf -i "Fill in the client name as 'Acme Corporation' and set the effective date to January 15, 2025"

# Batch edit a folder of forms
reducto edit ./forms/ -i "Set the company name to 'Globex Inc' in all header fields"

Tips for Effective Instructions

Be specific about which elements to modify (headers, tables, specific fields).
Reference content by name or position when possible.
Describe the desired outcome, not the process.
For batch operations, write instructions that apply uniformly across all files.

Supported File Types

Category	Extensions
PDF	`.pdf`
Images	`.png`, `.jpg`, `.jpeg`
Office Documents	`.doc`, `.docx`, `.ppt`, `.pptx`
Spreadsheets	`.xls`, `.xlsx`, `.numbers`

All commands accept a single file or a directory. Directories are scanned recursively and only supported file types are processed. Generated output files (.parse.md, .extract.json) are automatically excluded from processing.
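To preview which files a directory run would pick up, the selection rule can be approximated as follows. This is a sketch of the documented behavior, not the CLI's code: the extension set is copied from the table above, while case-insensitive matching and the helper name `collect_inputs` are assumptions.

```python
import tempfile
from pathlib import Path
from typing import List

SUPPORTED = {
    ".pdf", ".png", ".jpg", ".jpeg",
    ".doc", ".docx", ".ppt", ".pptx",
    ".xls", ".xlsx", ".numbers",
}
GENERATED_SUFFIXES = (".parse.md", ".extract.json")


def collect_inputs(path: Path) -> List[Path]:
    """Return supported files under `path` (a file or a directory), sorted."""
    if path.is_file():
        candidates = [path]
    else:
        candidates = [p for p in path.rglob("*") if p.is_file()]
    return sorted(
        p for p in candidates
        if p.suffix.lower() in SUPPORTED
        and not p.name.lower().endswith(GENERATED_SUFFIXES)
    )


# Demonstration on a throwaway directory with hypothetical file names.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "scans").mkdir()
    for name in ("a.pdf", "a.parse.md", "notes.txt", "scans/b.XLSX"):
        (root / name).write_text("placeholder")
    for found in collect_inputs(root):
        print(found.relative_to(root).as_posix())
```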

Use Cases

Invoice and Receipt Processing

Parse invoices from any vendor format, then extract line items, totals, and payment details into structured JSON for your accounting pipeline.

reducto parse ./invoices/
reducto extract ./invoices/ -s schemas/invoice.json
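When this two-step workflow runs from a scheduled job, it can be wrapped in a small Python script. The sketch below only uses the commands and the `-s` flag shown above. The function names are hypothetical, and `check=True` assumes the CLI exits with a non-zero status on failure, which the documentation does not state.

```python
import subprocess
from typing import List


def pipeline_commands(folder: str, schema: str) -> List[List[str]]:
    """Build the parse-then-extract command lines for one folder."""
    return [
        ["reducto", "parse", folder],
        ["reducto", "extract", folder, "-s", schema],
    ]


def run_pipeline(folder: str, schema: str) -> None:
    """Run both steps in order, stopping at the first failing command."""
    for argv in pipeline_commands(folder, schema):
        # Assumption: a failed run returns a non-zero exit code.
        subprocess.run(argv, check=True)


for argv in pipeline_commands("./invoices/", "schemas/invoice.json"):
    print(" ".join(argv))
```

Calling `run_pipeline("./invoices/", "schemas/invoice.json")` requires the CLI to be installed and authenticated.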

Contract and Legal Document Review

Parse contracts with change tracking to surface redlines and revisions. Extract key clauses, dates, and party names for contract management systems.

reducto parse contract.pdf --agentic --change-tracking --comments
reducto extract contract.pdf -s schemas/contract_terms.json

Form Processing and Auto-Fill

Edit PDF and DOCX forms programmatically — fill fields, check boxes, and populate tables without manual data entry.

reducto edit onboarding_form.pdf -i "Fill in employee name: Alex Chen, start date: 2025-04-01, department: Engineering, select 'Full-time' for employment type"

Financial Statement Analysis

Extract tables and figures from bank statements, earnings reports, and tax documents into structured data for financial modeling.

reducto extract quarterly_report.pdf -s schemas/financial_statement.json

Medical and Insurance Document Processing

Parse lab reports, claims forms, and patient intake documents. Reducto is HIPAA compliant for healthcare workflows.

reducto parse lab_results.pdf --agentic
reducto extract claim_form.pdf -s schemas/insurance_claim.json

Batch Document Digitization

Convert entire folders of scanned documents, presentations, and spreadsheets into searchable Markdown for knowledge bases or RAG pipelines.

reducto parse ./legacy_docs/ --agentic

Feeding Data to LLM Pipelines

Parse documents into clean Markdown optimized for LLM consumption, then use the structured output as context for retrieval-augmented generation (RAG) systems.

# Parse into LLM-ready Markdown
reducto parse ./knowledge_base/

# Or extract specific fields for structured RAG
reducto extract ./knowledge_base/ -s schemas/document_metadata.json

How It Works

Upload — The CLI uploads your document to Reducto's API.
Process — Reducto applies agentic OCR, layout detection, and vision-language models to understand document structure.
Return — Parsed Markdown, extracted JSON, or edited documents are downloaded to your local filesystem.

Files within a directory are processed concurrently. Parse results are cached locally (.parse.md files with job IDs), so subsequent extract commands skip re-parsing.
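The documentation does not say how the concurrency is implemented. As a rough mental model only, a thread pool that maps a per-file worker over the inputs behaves similarly. Everything in this sketch is an assumption: the thread pool, the `max_workers` default, and the stand-in worker, which does no uploading or parsing.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable


def process_concurrently(
    files: Iterable[str],
    worker: Callable[[str], str],
    max_workers: int = 4,
) -> Dict[str, str]:
    """Run `worker` on every file in parallel and map file -> result."""
    file_list = list(files)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so zip pairs results correctly.
        results = list(pool.map(worker, file_list))
    return dict(zip(file_list, results))


def stand_in_worker(name: str) -> str:
    """Placeholder for the real upload and processing of one file."""
    return f"processed:{name}"


print(process_concurrently(["a.pdf", "b.docx"], stand_in_worker))
```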

Configuration

Method	Details
Device code login	`reducto login` — opens browser, saves key to `~/.reducto/config.yaml`
Environment variable	`export REDUCTO_API_KEY="your_key"` — takes precedence over saved config
Manual entry	The CLI prompts for manual key entry as a fallback

The config file is stored at ~/.reducto/config.yaml with 0600 permissions.
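The precedence rule and the file permissions can be checked from a script. The sketch below shows one way to model both on a POSIX system. The helper names are hypothetical, and the saved key is passed in directly because the layout of `config.yaml` is not documented here.

```python
import os
import stat
import tempfile
from pathlib import Path
from typing import Mapping, Optional


def resolve_api_key(env: Mapping[str, str], saved_key: Optional[str]) -> Optional[str]:
    """Prefer REDUCTO_API_KEY from the environment, then the saved key."""
    return env.get("REDUCTO_API_KEY") or saved_key


def is_owner_only(path: Path) -> bool:
    """True when the file mode is exactly 0600 (owner read/write only)."""
    return stat.S_IMODE(path.stat().st_mode) == 0o600


print(resolve_api_key({"REDUCTO_API_KEY": "from_env"}, "from_config"))  # from_env

# Demonstrate the permission check on a throwaway file.
with tempfile.TemporaryDirectory() as tmp:
    demo = Path(tmp) / "config.yaml"
    demo.write_text("placeholder")
    os.chmod(demo, 0o600)
    print(is_owner_only(demo))  # True
```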

Related Projects

Project	Description
Reducto Python SDK	Full Python client for the Reducto API (`pip install reductoai`)
Reducto Node.js SDK	Node.js client for the Reducto API (`npm install reductoai`)
Reducto Go SDK	Go client for the Reducto API
Reducto Claude Code Plugins	Official Reducto plugins for Claude Code
Reducto Studio	No-code web interface for document processing

Resources

Reducto Documentation — API reference, guides, and tutorials
API Quickstart — Get started with the Reducto API
Security & Compliance — SOC 2 Type II, HIPAA, and data handling policies
Reducto Website — Product overview and company information
PyPI Package — Package registry listing
