
layoutscribe
LLM-powered layout & text extraction for PDFs, slides, and Word docs
LLM-only, agentic parser that converts PDF / PPTX / DOCX into clean Markdown, plain text, and layout JSON (with normalized bounding boxes).
Built with LangGraph (agent orchestration), LiteLLM (provider-agnostic multimodal calls), and MLflow (tracing).
No OCR engines, no heuristic parsers. Rendering to images is allowed; all structure and text understanding is done by a multimodal LLM.
Layout JSON blocks carry type, bbox[0..1], text, and conf.
0.1 (alpha) released; see CHANGELOG.md and docs/ROADMAP.md.
Requires Python 3.10+.
pip install layoutscribe
Optional extras:
# Office file support (PPTX/DOCX rendering via python-pptx / python-docx)
pip install "layoutscribe[office]"
# Development tools (ruff, black, pytest)
pip install "layoutscribe[dev]"
Runtime notes:
python-pptx, python-docx (install with [office])
Set provider keys as environment variables (see CONFIGURATION.md). Example .env:
OPENAI_API_KEY=sk-...
LAYOUTSCRIBE_DPI=180
layoutscribe parse ./samples/report.pdf \
  --llm openai/gpt-4o \
  --outputs markdown text layout_json \
  --output-dir ./artifacts/report \
  --dpi 180 --parallel-pages 6 --budget-usd 0.50
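The --dpi value determines how large each rendered page image is before it reaches the model. As a rough illustration, the sketch below converts a page size in PDF points (1/72 inch) to pixels at a given DPI. The helper name rendered_size_px is hypothetical and not part of layoutscribe, and the rounding the actual renderer applies may differ.

```python
def rendered_size_px(width_pt: float, height_pt: float, dpi: int) -> tuple[int, int]:
    """Approximate pixel size of a page rasterized at `dpi`.

    PDF page sizes are expressed in points, where 1 point = 1/72 inch,
    so the scale factor from points to pixels is dpi / 72.
    """
    scale = dpi / 72.0
    return round(width_pt * scale), round(height_pt * scale)


# US Letter is 612 x 792 points; at 180 DPI the scale factor is 2.5.
print(rendered_size_px(612, 792, 180))  # (1530, 1980)
```

Higher DPI gives the model more detail per page but produces larger images, which usually means more cost per page against --budget-usd.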
import asyncio

from layoutscribe.api import parse as ls_parse


async def main() -> None:
    doc = await ls_parse(
        path="samples/report.pdf",
        outputs=["markdown", "text", "layout_json"],
        llm="openai/gpt-4o",
        dpi=180,
        parallel_pages=6,
        budget_usd=0.50,
        save_intermediate=True,
    )
    print(doc.metadata)
    print(doc.markdown[:1000])


if __name__ == "__main__":
    asyncio.run(main())
./artifacts/report/
  document.md
  document.txt
  layout.json
  overlays/
    page-0001.png
    page-0002.png
  intermediate/
    page-0001.json
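Because bounding boxes are normalized to [0..1], they have to be scaled by the page image size before they can be drawn on or compared with an overlay. The sketch below shows one way to do that. It assumes each block is a dict with type, bbox, text, and conf keys, and that bbox is ordered [x0, y0, x1, y1] with the origin at the top-left. Neither the key layout nor the coordinate order is confirmed here, so check layout.json from your own run before relying on it. The helper bbox_to_pixels and the sample block are hypothetical.

```python
from typing import TypedDict


class Block(TypedDict):
    """Assumed shape of one layout block (not confirmed against layout.json)."""
    type: str
    bbox: list[float]  # assumed order: [x0, y0, x1, y1], each in [0, 1]
    text: str
    conf: float


def bbox_to_pixels(
    bbox: list[float], width_px: int, height_px: int
) -> tuple[int, int, int, int]:
    """Scale a normalized bbox to integer pixel coordinates."""
    x0, y0, x1, y1 = bbox
    for value in (x0, y0, x1, y1):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"bbox value {value} is outside [0, 1]")
    return (
        round(x0 * width_px),
        round(y0 * height_px),
        round(x1 * width_px),
        round(y1 * height_px),
    )


# Hypothetical block on a 1600 x 2000 pixel page image.
block: Block = {
    "type": "heading",
    "bbox": [0.25, 0.5, 0.75, 0.75],
    "text": "Quarterly Report",
    "conf": 0.97,
}
print(bbox_to_pixels(block["bbox"], 1600, 2000))  # (400, 1000, 1200, 1500)
```

Keeping coordinates normalized in the JSON means the same layout can be projected onto renders at any DPI.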
See docs/CONFIGURATION.md for provider-specific env vars, defaults, and precedence. MLflow tracing is opt-in via --trace-mlflow.
LiteLLM reads provider keys from environment variables. Set only those you need:
# OpenAI
OPENAI_API_KEY=sk-...
# Azure OpenAI
AZURE_OPENAI_API_KEY=...
AZURE_OPENAI_ENDPOINT=https://<your-resource>.openai.azure.com/
AZURE_OPENAI_API_VERSION=2024-02-15-preview
# Anthropic
ANTHROPIC_API_KEY=...
# Google (Gemini)
GOOGLE_API_KEY=...
Use --llm to pick a model via LiteLLM:
--llm openai/gpt-4o
--llm azure/<deployment_name>
--llm anthropic/claude-3.5-sonnet
--llm google/gemini-1.5-pro
Notes:
License: Apache-2.0 (see LICENSE).