
A Python library for converting PDF documents and images to Markdown format with AI assistance. Shift from scanned documents and images to editable, searchable text.
pip install papershift
from papershift import convert_pdf_to_markdown

# Basic usage
markdown_content = convert_pdf_to_markdown(
    pdf_path="path/to/your/document.pdf",
    api_key="your-openrouter-api-key"
)

# Advanced usage with options
markdown_content = convert_pdf_to_markdown(
    pdf_path="path/to/your/document.pdf",
    output_dir="output_folder",
    dpi=300,
    target_height_px=2048,
    model="openrouter/google/gemini-2.0-flash-001",
    api_key="your-openrouter-api-key",
    max_workers=4,
    batch_size=5,
    fast_mode=True
)

# Save the output
with open("output.md", "w", encoding="utf-8") as f:
    f.write(markdown_content)

from papershift import convert_image_to_markdown, convert_images_to_markdown

# Convert a single image
markdown_content = convert_image_to_markdown(
    image_path="path/to/your/image.jpg",
    api_key="your-openrouter-api-key"
)

# Convert multiple images with combined output
markdown_content = convert_images_to_markdown(
    image_paths=["image1.jpg", "image2.png", "image3.jpg"],
    output_dir="output_folder",
    api_key="your-openrouter-api-key",
    combined_output=True
)

# Convert multiple images with separate outputs
markdown_files = convert_images_to_markdown(
    image_paths=["image1.jpg", "image2.png", "image3.jpg"],
    output_dir="output_folder",
    api_key="your-openrouter-api-key",
    combined_output=False
)
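
The examples above pass the OpenRouter key as a string literal. A minimal sketch of reading it from the environment instead is shown here. The variable name `OPENROUTER_API_KEY` and the helpers `load_api_key` and `convert` are assumptions made for illustration; nothing in the examples above indicates that papershift reads any environment variable itself.

```python
import os


def load_api_key(env_var: str = "OPENROUTER_API_KEY") -> str:
    """Return the API key stored in env_var (a hypothetical variable name)."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set {env_var} before running the conversion")
    return key


def convert(pdf_path: str) -> str:
    """Convert one PDF using only the arguments shown in the basic usage example."""
    # Imported here so the helper above stays usable without papershift installed.
    from papershift import convert_pdf_to_markdown

    return convert_pdf_to_markdown(pdf_path=pdf_path, api_key=load_api_key())
```

Keeping the key out of source files avoids committing it by accident; the explicit error makes a missing key visible before any request is sent.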

Parameters for convert_pdf_to_markdown:

Parameter | Description | Default |
---|---|---|
pdf_path | Path to the PDF file | (Required) |
output_dir | Directory to save the output markdown files | None |
dpi | DPI for image rendering | 300 |
target_height_px | Target height in pixels | 2048 |
aspect_threshold | Aspect ratio threshold for height adjustment | 1.5 |
prompt | Text prompt to send with each page image | "Convert this document to markdown" |
model | The model to use for processing | "openrouter/google/gemini-2.0-flash-001" |
api_key | OpenRouter API key | None |
site_url | Optional site URL for OpenRouter | None |
app_name | Optional app name for OpenRouter | None |
combined_output | If True, returns a single string with all pages combined | True |
verbose | If True, prints progress information | False |
max_workers | Maximum number of worker processes for PDF conversion | 4 |
batch_size | Number of pages to process in a single batch | 5 |
quality | Image quality (1-100) for JPEG compression in fast mode | 95 |
fast_mode | If True, uses reduced resolution and JPEG format for faster processing | False |
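
As a rough sketch, the speed-related parameters in the table above can be grouped into named presets and checked before the call. The preset names, the `pdf_options` helper, and the value `quality=80` are illustrative assumptions, not part of papershift; only the keyword names and the 1-100 quality range come from the table.

```python
from typing import Any

# Illustrative presets; every key is a parameter listed in the table above.
PRESETS: dict[str, dict[str, Any]] = {
    "accurate": {"dpi": 300, "target_height_px": 2048, "fast_mode": False},
    "fast": {"fast_mode": True, "quality": 80, "max_workers": 4, "batch_size": 5},
}


def pdf_options(preset: str = "accurate", **overrides: Any) -> dict[str, Any]:
    """Merge a preset with overrides and validate the documented ranges."""
    options = {**PRESETS[preset], **overrides}
    if not 1 <= options.get("quality", 95) <= 100:
        raise ValueError("quality must be between 1 and 100")
    for name in ("max_workers", "batch_size"):
        if options.get(name, 1) < 1:
            raise ValueError(f"{name} must be at least 1")
    return options


# Possible usage, assuming papershift is installed:
# from papershift import convert_pdf_to_markdown
# markdown_content = convert_pdf_to_markdown(
#     pdf_path="path/to/your/document.pdf",
#     api_key="your-openrouter-api-key",
#     **pdf_options("fast", verbose=True),
# )
```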

Parameters for convert_image_to_markdown and convert_images_to_markdown:

Parameter | Description | Default |
---|---|---|
image_path / image_paths | Path to the image file or list of image paths | (Required) |
output_dir | Directory to save the output markdown files | None |
target_height_px | Target height in pixels | 2048 |
aspect_threshold | Aspect ratio threshold for height adjustment | 1.5 |
prompt | Text prompt to send with each image | "Convert this image to markdown" |
model | The model to use for processing | "openrouter/google/gemini-2.0-flash-001" |
api_key | OpenRouter API key | None |
site_url | Optional site URL for OpenRouter | None |
app_name | Optional app name for OpenRouter | None |
combined_output | If True, returns a single string with all images combined | True |
verbose | If True, prints progress information | False |
max_workers | Maximum number of worker processes for parallel processing | 4 |
quality | Image quality (1-100) for JPEG compression in fast mode | 95 |
fast_mode | If True, uses reduced resolution and JPEG format for faster processing | False |
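
Since `image_paths` takes a list, one way to convert a whole folder is to collect the paths first. The sketch below is an assumption-laden example: `collect_image_paths`, `convert_folder`, and the suffix set are hypothetical, and the README does not state which image formats papershift accepts beyond the `.jpg` and `.png` files in its examples.

```python
from pathlib import Path

# Suffixes seen in the usage examples, plus ".jpeg" as an assumption.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def collect_image_paths(folder: str) -> list[str]:
    """Return image files directly inside folder, sorted by path."""
    return [
        str(path)
        for path in sorted(Path(folder).iterdir())
        if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES
    ]


def convert_folder(folder: str, api_key: str) -> str:
    """Convert every image in folder into one combined Markdown string."""
    # Imported here so collect_image_paths works without papershift installed.
    from papershift import convert_images_to_markdown

    return convert_images_to_markdown(
        image_paths=collect_image_paths(folder),
        api_key=api_key,
        combined_output=True,
    )
```

Sorting the paths keeps the page order predictable when the files are named sequentially, for example `page_001.png`, `page_002.png`.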