papershift

Convert PDF documents and images to Markdown format with AI assistance

Version: 0.1.2
Maintainers: 1

PaperShift

A Python library for converting PDF documents and images to Markdown format with AI assistance. Shift from scanned documents and images to editable, searchable text.

Features

  • Converts PDF documents to well-formatted Markdown
  • Converts image files (PNG, JPG, etc.) to well-formatted Markdown
  • Processes documents and images in parallel for faster conversion
  • Optimized memory usage with batch processing
  • Fast mode option for quicker processing with lower resolution
  • Detailed progress reporting
  • Customizable AI model selection
  • Adaptive resolution based on output requirements

Installation

pip install papershift

Usage

PDF to Markdown

from papershift import convert_pdf_to_markdown

# Basic usage
markdown_content = convert_pdf_to_markdown(
    pdf_path="path/to/your/document.pdf",
    api_key="your-openrouter-api-key"
)

# Advanced usage with options
markdown_content = convert_pdf_to_markdown(
    pdf_path="path/to/your/document.pdf",
    output_dir="output_folder",
    dpi=300,
    target_height_px=2048,
    model="openrouter/google/gemini-2.0-flash-001",
    api_key="your-openrouter-api-key",
    max_workers=4,
    batch_size=5,
    fast_mode=True
)

# Save the output
with open("output.md", "w", encoding="utf-8") as f:
    f.write(markdown_content)
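
The examples above pass the API key as a string literal. A minimal sketch of reading it from the environment instead is shown here. The variable name `OPENROUTER_API_KEY` and the helper `get_api_key` are assumptions made for illustration; they are not part of papershift.

```python
import os


def get_api_key(env_var: str = "OPENROUTER_API_KEY") -> str:
    """Return the API key stored in `env_var`, or raise a clear error.

    Both the helper and the default variable name are hypothetical.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable first.")
    return key
```

The returned value can then be passed as `api_key=get_api_key()` to `convert_pdf_to_markdown`, which keeps the key out of source control.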

Image to Markdown

from papershift import convert_image_to_markdown, convert_images_to_markdown

# Convert a single image
markdown_content = convert_image_to_markdown(
    image_path="path/to/your/image.jpg",
    api_key="your-openrouter-api-key"
)

# Convert multiple images with combined output
markdown_content = convert_images_to_markdown(
    image_paths=["image1.jpg", "image2.png", "image3.jpg"],
    output_dir="output_folder",
    api_key="your-openrouter-api-key",
    combined_output=True
)

# Convert multiple images with separate outputs
markdown_files = convert_images_to_markdown(
    image_paths=["image1.jpg", "image2.png", "image3.jpg"],
    output_dir="output_folder",
    api_key="your-openrouter-api-key",
    combined_output=False
)
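
When the images live in one folder, the `image_paths` list can be built rather than typed by hand. The sketch below uses only the standard library. The helper `collect_image_paths` and the suffix set are assumptions for illustration; the formats papershift actually accepts may differ.

```python
from pathlib import Path

# Assumed set of suffixes; adjust to the formats you actually use.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg"}


def collect_image_paths(folder, suffixes=IMAGE_SUFFIXES):
    """Return image file paths in `folder` as strings, sorted lexicographically.

    Hypothetical helper. Subdirectories are skipped, and suffix matching
    is case-insensitive.
    """
    folder = Path(folder)
    return sorted(
        str(p)
        for p in folder.iterdir()
        if p.is_file() and p.suffix.lower() in suffixes
    )
```

The result can be passed as `image_paths=collect_image_paths("scans")`. Because the sort is lexicographic, `page10.png` comes before `page2.png`; zero-pad file names if page order matters in a combined output.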

Configuration Options

PDF to Markdown Options

| Parameter | Description | Default |
| --- | --- | --- |
| pdf_path | Path to the PDF file | (Required) |
| output_dir | Directory to save the output markdown files | None |
| dpi | DPI for image rendering | 300 |
| target_height_px | Target height in pixels | 2048 |
| aspect_threshold | Aspect ratio threshold for height adjustment | 1.5 |
| prompt | Text prompt to send with each page image | "Convert this document to markdown" |
| model | The model to use for processing | "openrouter/google/gemini-2.0-flash-001" |
| api_key | OpenRouter API key | None |
| site_url | Optional site URL for OpenRouter | None |
| app_name | Optional app name for OpenRouter | None |
| combined_output | If True, returns a single string with all pages combined | True |
| verbose | If True, prints progress information | False |
| max_workers | Maximum number of worker processes for PDF conversion | 4 |
| batch_size | Number of pages to process in a single batch | 5 |
| quality | Image quality (1-100) for JPEG compression in fast mode | 95 |
| fast_mode | If True, uses reduced resolution and JPEG format for faster processing | False |
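
To make the `batch_size` option concrete, here is a minimal sketch of how a document's pages could be split into batches. It illustrates the general idea only and is not papershift's implementation; the helper `page_batches` is hypothetical.

```python
def page_batches(page_count, batch_size=5):
    """Yield ranges of zero-based page indices, at most `batch_size` per range.

    Hypothetical helper for illustration only.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, page_count, batch_size):
        # The last batch may be shorter than batch_size.
        yield range(start, min(start + batch_size, page_count))


# A 12-page document with the default batch size gives batches of 5, 5, and 2 pages.
print([len(batch) for batch in page_batches(12)])  # prints [5, 5, 2]
```

Under this scheme a smaller batch size keeps fewer rendered pages in flight at once, at the cost of more batches.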

Image to Markdown Options

| Parameter | Description | Default |
| --- | --- | --- |
| image_path / image_paths | Path to the image file or list of image paths | (Required) |
| output_dir | Directory to save the output markdown files | None |
| target_height_px | Target height in pixels | 2048 |
| aspect_threshold | Aspect ratio threshold for height adjustment | 1.5 |
| prompt | Text prompt to send with each image | "Convert this image to markdown" |
| model | The model to use for processing | "openrouter/google/gemini-2.0-flash-001" |
| api_key | OpenRouter API key | None |
| site_url | Optional site URL for OpenRouter | None |
| app_name | Optional app name for OpenRouter | None |
| combined_output | If True, returns a single string with all images combined | True |
| verbose | If True, prints progress information | False |
| max_workers | Maximum number of worker processes for parallel processing | 4 |
| quality | Image quality (1-100) for JPEG compression in fast mode | 95 |
| fast_mode | If True, uses reduced resolution and JPEG format for faster processing | False |
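
The `max_workers` option caps how many conversions run at once. The sketch below shows the general pattern with the standard library. It uses threads rather than the worker processes the table mentions, purely to keep the example short, and it is not papershift's implementation. `convert_all` and `fake_convert` are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor


def convert_all(paths, convert_one, max_workers=4):
    """Apply `convert_one` to every path using at most `max_workers` threads.

    Hypothetical helper. Results are returned in the same order as `paths`.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(convert_one, paths))


def fake_convert(path):
    # Stand-in for a real conversion call; returns a Markdown heading only.
    return f"# {path}\n"


pages = convert_all(["image1.jpg", "image2.png"], fake_convert, max_workers=2)
```

`Executor.map` preserves input order, so the pieces can be joined into one document without re-sorting.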

Dependencies

  • PyMuPDF: PDF processing library
  • Pillow: Image processing library
  • litellm: LLM API integration
  • openrouter: API for accessing various AI models
  • python-dotenv: Environment variable management
