filecrystal

Universal file parser for PDFs, images, xlsx/xls, docx — outputs Markdown and prompt-defined JSON via any OpenAI-compatible API.

Source: npm · Version: 0.5.3 (latest) · Maintainers: 1 · Weekly downloads: 20 (17.65%)

Universal file parser for PDFs, images, xlsx/xls and docx — with structured field extraction via any OpenAI-compatible API.

Why

One consistent ParseResult for every supported file format, plus a Markdown-first pipeline. Plug in any OpenAI-compatible provider (OpenAI / Moonshot / DeepSeek / Alibaba Bailian (阿里百炼) / self-hosted vLLM) for OCR, seal detection and prompt-driven field extraction. Switching providers means changing baseUrl and model, not rewriting code.

Install

pnpm add filecrystal
# or
npm i filecrystal

Quick start (CLI)

Two focused subcommands: extract (files → Markdown) and structure (Markdown / files → prompt-defined JSON).

# 1. Parse files to Markdown — defaults to writing next to each input
filecrystal extract ./a.pdf ./b.xlsx
# → ./a.md  ./b.md

# Write to a dedicated directory
filecrystal extract ./*.pdf --out ./out/

# 2. Extract structured fields with a prompt (file or inline)
filecrystal structure ./out/a.md --prompt ./prompts/contract.prompt.md
filecrystal structure ./out/a.md --prompt-text 'Output JSON: {"title":"..."}'

Full option reference: docs/CLI.md.
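
As a rough illustration of the output-path behaviour described above (next to each input by default, or inside the --out directory), here is a minimal sketch. The helper name markdownPathFor is hypothetical and not part of filecrystal; it only mirrors the a.pdf → a.md mapping shown in the commands, and how the CLI handles two inputs that share a base name is not covered here.

```typescript
import * as path from 'node:path';

// Hypothetical helper (not a filecrystal API): where the Markdown for one
// input would land, following the behaviour shown in the commands above.
function markdownPathFor(input: string, outDir?: string): string {
  const { dir, name } = path.parse(input);
  // Without an output directory the .md file sits next to the input;
  // with one, it goes into that directory under the same base name.
  return path.join(outDir ?? dir, `${name}.md`);
}

console.log(markdownPathFor('./a.pdf'));
console.log(markdownPathFor('./b.xlsx', './out/'));
```

The second call corresponds to the --out ./out/ form, which is why the structure examples read from ./out/a.md.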

Credentials

# Default OCR/LLM backend: any OpenAI-compatible provider.
# Alibaba Bailian (百炼, Qwen) is the default preset when you use DashScope.
export FILECRYSTAL_MODEL_BASE_URL=https://dashscope.aliyuncs.com/compatible-mode/v1
export FILECRYSTAL_MODEL_API_KEY=sk-your-key-here

# Optional model overrides
export FILECRYSTAL_VISION_MODEL=qwen-vl-ocr-latest        # OCR + seal detection
export FILECRYSTAL_TEXT_MODEL=qwen3.6-plus                # structure stage
export FILECRYSTAL_VISION_MODEL_THINKING=false            # Qwen3 reasoning for OCR
export FILECRYSTAL_TEXT_MODEL_THINKING=false              # Qwen3 reasoning for structure

# Optional concurrency tuning
export FILECRYSTAL_FILE_CONCURRENCY=20   # CLI file-level parallelism (extract + structure)
export FILECRYSTAL_OCR_CONCURRENCY=24    # process-wide OCR / vision pool; lower if rate-limited
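
If you wire these variables into your own Node.js code, a small reader like the one below can help catch typos early. This is a sketch under assumptions: readFilecrystalEnv and the FilecrystalEnv shape are hypothetical, the strict "true"/"false" parsing is a choice made for this example, and no default values are filled in because the defaults used by filecrystal are not stated here.

```typescript
type Env = Record<string, string | undefined>;

// Hypothetical shape for this sketch; filecrystal's own config types may differ.
interface FilecrystalEnv {
  baseUrl?: string;
  apiKey?: string;
  visionModel?: string;
  textModel?: string;
  visionThinking?: boolean;
  textThinking?: boolean;
  fileConcurrency?: number;
  ocrConcurrency?: number;
}

// "true" / "false" (case-insensitive) become booleans; unset stays undefined.
function parseFlag(name: string, env: Env): boolean | undefined {
  const raw = env[name]?.trim().toLowerCase();
  if (!raw) return undefined;
  if (raw === 'true') return true;
  if (raw === 'false') return false;
  throw new Error(`${name} must be "true" or "false", got "${env[name]}"`);
}

// Concurrency values must be positive integers; unset stays undefined.
function parseCount(name: string, env: Env): number | undefined {
  const raw = env[name]?.trim();
  if (!raw) return undefined;
  const n = Number(raw);
  if (!Number.isInteger(n) || n < 1) {
    throw new Error(`${name} must be a positive integer, got "${raw}"`);
  }
  return n;
}

function readFilecrystalEnv(env: Env): FilecrystalEnv {
  return {
    baseUrl: env.FILECRYSTAL_MODEL_BASE_URL,
    apiKey: env.FILECRYSTAL_MODEL_API_KEY,
    visionModel: env.FILECRYSTAL_VISION_MODEL,
    textModel: env.FILECRYSTAL_TEXT_MODEL,
    visionThinking: parseFlag('FILECRYSTAL_VISION_MODEL_THINKING', env),
    textThinking: parseFlag('FILECRYSTAL_TEXT_MODEL_THINKING', env),
    fileConcurrency: parseCount('FILECRYSTAL_FILE_CONCURRENCY', env),
    ocrConcurrency: parseCount('FILECRYSTAL_OCR_CONCURRENCY', env),
  };
}

// Literal values copied from the exports above; pass process.env in real code.
const example = readFilecrystalEnv({
  FILECRYSTAL_MODEL_BASE_URL: 'https://dashscope.aliyuncs.com/compatible-mode/v1',
  FILECRYSTAL_MODEL_API_KEY: 'sk-your-key-here',
  FILECRYSTAL_VISION_MODEL_THINKING: 'false',
  FILECRYSTAL_OCR_CONCURRENCY: '24',
});
console.log(example);
```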

Aliyun OCR provider

For OCR-only Markdown extraction you can use Aliyun OCR directly, without an OpenAI-compatible vision model:

export FILECRYSTAL_OCR_PROVIDER=aliyun-ocr
export FILECRYSTAL_ALIYUN_ACCESS_KEY_ID=your-access-key-id
export FILECRYSTAL_ALIYUN_ACCESS_KEY_SECRET=your-access-key-secret

filecrystal extract ./scan.pdf --out ./out/

Aliyun OCR uses RecognizeAdvanced with automatic rotation and table output on by default, so scanned forms and payment tables can render as Markdown tables without extra flags. filecrystal structure still needs text LLM credentials (FILECRYSTAL_MODEL_BASE_URL + FILECRYSTAL_MODEL_API_KEY) because that stage runs prompt-defined JSON extraction after Markdown is produced.
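
The split in credentials between the two stages can be summarised in a small pre-flight check. This is only a sketch: missingCredentials is a hypothetical helper, and it assumes that extract with the default provider needs the model credentials (which may not hold for inputs that never reach OCR, such as spreadsheets or PDFs with a text layer).

```typescript
type Stage = 'extract' | 'structure';
type Env = Record<string, string | undefined>;

const MODEL_VARS = ['FILECRYSTAL_MODEL_BASE_URL', 'FILECRYSTAL_MODEL_API_KEY'];
const ALIYUN_VARS = [
  'FILECRYSTAL_ALIYUN_ACCESS_KEY_ID',
  'FILECRYSTAL_ALIYUN_ACCESS_KEY_SECRET',
];

// Hypothetical helper: names of the variables still missing for a stage.
function missingCredentials(stage: Stage, env: Env): string[] {
  // Only the extract stage can run on Aliyun OCR keys alone; structure
  // always needs the text LLM credentials.
  const useAliyun = stage === 'extract' && env.FILECRYSTAL_OCR_PROVIDER === 'aliyun-ocr';
  const required = useAliyun ? ALIYUN_VARS : MODEL_VARS;
  return required.filter((name) => !env[name]);
}

const aliyunOnly: Env = {
  FILECRYSTAL_OCR_PROVIDER: 'aliyun-ocr',
  FILECRYSTAL_ALIYUN_ACCESS_KEY_ID: 'your-access-key-id',
  FILECRYSTAL_ALIYUN_ACCESS_KEY_SECRET: 'your-access-key-secret',
};
console.log(missingCredentials('extract', aliyunOnly));   // nothing missing
console.log(missingCredentials('structure', aliyunOnly)); // both model variables
```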

Quick start (library)

import {
  createFileParser,
  createStructuredExtractor,
  parseMany,
  toMarkdown,
  toStructureSource,
} from 'filecrystal';

// --- Mock mode — works offline, deterministic placeholders ---
const parser = createFileParser({ mode: 'mock' });
const { raw, source } = await parser.parse('./contract.pdf');
const md = toMarkdown({ raw, source });

// --- API mode ---
// Shared provider config, reused by the parser and the extractor below.
const openai = {
  baseUrl: process.env.FILECRYSTAL_MODEL_BASE_URL!,
  apiKey: process.env.FILECRYSTAL_MODEL_API_KEY!,
  models: { ocr: 'qwen-vl-ocr-latest', vision: 'qwen-vl-max', text: 'qwen3.6-plus' },
};
const apiParser = createFileParser({ mode: 'api', openai });

// --- Batch: many files concurrently ---
const batch = await parseMany(apiParser, ['./a.pdf', './b.xlsx'], { concurrency: 3 });

// --- Structured extraction: every source becomes Markdown text first,
//     then one prompt bundles them in the order given (single LLM call by default). ---
const extractor = createStructuredExtractor({ mode: 'api', openai });
const sources = batch.items
  .filter((i) => i.ok && i.result)
  .map((i) => toStructureSource(i.result!));  // → { name, text } per source

// Optional prompt; this inline example mirrors the CLI --prompt-text usage.
const customPromptMarkdown = 'Output JSON: {"title":"..."}';
const { extracted } = await extractor.extract(sources, {
  prompt: customPromptMarkdown,
});
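
To make the concurrency option more concrete, the sketch below shows one common way to run a batch with a bounded number of in-flight tasks while recording per-item success or failure. It is not filecrystal's implementation of parseMany: mapWithConcurrency is a hypothetical stand-in, and the error field on each item is an assumption (only ok and result appear in the snippet above).

```typescript
// One entry per input, mirroring the `ok` / `result` fields used above.
// The `input` and `error` fields are assumptions added for this sketch.
interface BatchItem<T> {
  input: string;
  ok: boolean;
  result?: T;
  error?: unknown;
}

async function mapWithConcurrency<T>(
  inputs: string[],
  worker: (input: string) => Promise<T>,
  concurrency: number,
): Promise<BatchItem<T>[]> {
  const items: BatchItem<T>[] = new Array(inputs.length);
  let next = 0;

  // Each runner claims the next unprocessed index until none are left,
  // so at most `concurrency` workers are active at any time.
  async function run(): Promise<void> {
    while (next < inputs.length) {
      const index = next++;
      const input = inputs[index];
      try {
        items[index] = { input, ok: true, result: await worker(input) };
      } catch (error) {
        // A failing file is recorded instead of aborting the whole batch.
        items[index] = { input, ok: false, error };
      }
    }
  }

  const runnerCount = Math.max(1, Math.min(concurrency, inputs.length));
  await Promise.all(Array.from({ length: runnerCount }, run));
  return items;
}
```

Results keep the input order regardless of completion order, which matters when a later stage bundles sources in sequence.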

Supported formats

| Format | Notes |
| --- | --- |
| xlsx / xls | SheetJS; cells, merges, formulas |
| pdf | text-layer first, OCR fallback via @napi-rs/canvas + sharp preprocess |
| jpg / png | sharp preprocessing (EXIF rotate, long-edge 2000, JPEG q85) → OCR |
| docx / doc | mammoth main + word-extractor fallback; embedded images scanned for seals |
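
When filtering inputs before handing them to the parser, the format list can be expressed as a simple extension lookup. This is a sketch only: formatFamilyOf and the family names are hypothetical, the lookup covers just the extensions named in the list above, and filecrystal itself may detect formats differently (for example by content rather than extension).

```typescript
type FormatFamily = 'spreadsheet' | 'pdf' | 'image' | 'document';

// Extensions taken from the supported-formats list; nothing else is assumed.
const EXTENSION_FAMILY = new Map<string, FormatFamily>([
  ['xlsx', 'spreadsheet'],
  ['xls', 'spreadsheet'],
  ['pdf', 'pdf'],
  ['jpg', 'image'],
  ['png', 'image'],
  ['docx', 'document'],
  ['doc', 'document'],
]);

// Hypothetical helper: returns undefined for unlisted or missing extensions.
function formatFamilyOf(fileName: string): FormatFamily | undefined {
  const dot = fileName.lastIndexOf('.');
  if (dot < 0) return undefined;
  return EXTENSION_FAMILY.get(fileName.slice(dot + 1).toLowerCase());
}

const inputs = ['contract.pdf', 'Report.XLSX', 'notes.txt'];
const supported = inputs.filter((f) => formatFamilyOf(f) !== undefined);
console.log(supported);
```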

Output shape (library)

interface ParseResult {
  schemaVersion: '1.0';
  parsedAt: string;
  parserVersion: string;
  source: ParsedSource;       // filePath, fileName, fileFormat, fileHash, pageCount, ...
  raw: ParsedRaw;             // pages | sheets | sections | fullText | seals | signatures
  extracted?: Record<string, unknown>;  // only when options.prompt — prompt owns the schema
  metrics: ParseMetrics;      // quality / performance / cost (CNY)
  warnings?: string[];
}
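
A consumer would typically check schemaVersion before reading anything else, then treat extracted and warnings as optional. The sketch below illustrates that pattern with a local stand-in type: ParseResultLike declares only a few of the fields named above, the nested field types are assumptions, and the sample object is made-up illustration data rather than real parser output.

```typescript
// Minimal local stand-in for the library type; nested field types are assumptions.
interface ParseResultLike {
  schemaVersion: string;
  source: { fileName: string; fileFormat: string };
  extracted?: Record<string, unknown>;
  warnings?: string[];
}

// Hypothetical helper: one-line summary of a parse result.
function summarize(result: ParseResultLike): string {
  if (result.schemaVersion !== '1.0') {
    throw new Error(`unsupported schemaVersion: ${result.schemaVersion}`);
  }
  // `extracted` is only present when a prompt was supplied.
  const fields = Object.keys(result.extracted ?? {}).length;
  const warnings = result.warnings?.length ?? 0;
  const { fileName, fileFormat } = result.source;
  return `${fileName} (${fileFormat}): ${fields} extracted field(s), ${warnings} warning(s)`;
}

// Illustrative sample only.
const sample: ParseResultLike = {
  schemaVersion: '1.0',
  source: { fileName: 'contract.pdf', fileFormat: 'pdf' },
  extracted: { title: 'Service Agreement' },
};
console.log(summarize(sample));
```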

Full TypeScript contract: specs/001-file-parser/contracts/types.d.ts. JSON Schema at runtime:

import { getParseResultJsonSchema } from 'filecrystal/schema';
console.log(getParseResultJsonSchema());

Integration examples

Both integration surfaces — CLI and SDK — produce the same output shape. Pick either for your workflow.

| Surface | When to use | Demo |
| --- | --- | --- |
| CLI | shell scripts · CI pipelines · language-agnostic integrations | examples/cli-workflow.sh |
| SDK | Node.js apps · custom pre/post-processing · tight error handling | examples/sdk-workflow.mjs |

Both walk through the same two stages: extract → Markdown, then structure → prompt-defined JSON. See examples/README.md for the quick-start commands.

Development

pnpm install
pnpm build
pnpm test
pnpm typecheck
pnpm lint

Spec-driven docs live in specs/001-file-parser/. Contributing guide: CONTRIBUTING.md.

License

MIT — see LICENSE.

Keywords

file-parser

Package last updated on 28 Apr 2026