pymupdf-layout

Commercial extension for PyMuPDF

pip / PyPI
Version: 1.26.6
Maintainers: 1

PyMuPDF Layout

PyMuPDF Layout is a fast, lightweight layout analysis Python package integrated with PyMuPDF that produces clean, structured data from PDFs. It is accurate and, unlike vision-based models, does not need GPUs.

While other tools train machine learning models on rendered page images, PyMuPDF Layout trains Graph Neural Networks directly on PDF internals. This gives us accuracy at 10× the speed utilizing CPU-only resources.

License: PolyForm Noncommercial

Features

  • 📚 Structured data extraction from your documents in Markdown, JSON or TXT format
  • 🧐 Advanced document page layout understanding, including semantic markup for titles, headings, headers, footers, tables, images and text styling
  • 🔍 Detect and isolate header and footer patterns on each page

Usage

PyMuPDF Layout works alongside PyMuPDF4LLM's to_markdown method. Once PyMuPDF Layout is activated, just call to_markdown and PyMuPDF Layout will work behind the scenes to analyse documents and deliver improved results.

You can also get a JSON or TXT format of the data with to_json or to_text.

Extract Structured Data

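import pymupdf.layout
pymupdf.layout.activate()  # activate layout analysis first; pymupdf4llm is imported afterwards
import pymupdf4llm

source = "input.pdf"  # placeholder: path to your own PDF
doc = pymupdf.open(source)
md = pymupdf4llm.to_markdown(doc)  # Markdown output
json_text = pymupdf4llm.to_json(doc)  # JSON output (named so it does not shadow the json module)
txt = pymupdf4llm.to_text(doc)  # plain-text output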
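
To keep the results, the three outputs can be written next to the source PDF. The following is a minimal sketch, not part of the package: `SUFFIXES`, `output_path` and `export_all` are hypothetical names introduced here, and the sketch assumes that `to_markdown`, `to_json` and `to_text` each return a string when called with default arguments.

```python
from pathlib import Path

# Hypothetical mapping from output format to file suffix (our own choice).
SUFFIXES = {"markdown": ".md", "json": ".json", "text": ".txt"}


def output_path(pdf_path, fmt):
    """Return the path the given format would be saved to, beside the PDF."""
    return Path(pdf_path).with_suffix(SUFFIXES[fmt])


def export_all(pdf_path):
    """Extract a PDF in all three formats and save each beside the source file."""
    # Imports are deferred so this module loads even without PyMuPDF installed.
    import pymupdf.layout
    pymupdf.layout.activate()  # same order as the usage snippet above
    import pymupdf4llm

    doc = pymupdf.open(str(pdf_path))
    # Assumption: each call returns a str with default arguments.
    results = {
        "markdown": pymupdf4llm.to_markdown(doc),
        "json": pymupdf4llm.to_json(doc),
        "text": pymupdf4llm.to_text(doc),
    }
    for fmt, content in results.items():
        output_path(pdf_path, fmt).write_text(content, encoding="utf-8")
    return results
```

Calling `export_all("report.pdf")` would then be expected to produce `report.md`, `report.json` and `report.txt` in the same directory, provided the assumptions above hold.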

Try It!

Try PyMuPDF Layout on our PyMuPDF website.
