
Research
Malicious npm Packages Impersonate Flashbots SDKs, Targeting Ethereum Wallet Credentials
Four npm packages disguised as cryptographic tools steal developer credentials and send them to attacker-controlled Telegram infrastructure.
A powerful Python package for converting PDF files to EPUB format via Markdown with intelligent layout detection, AI-powered postprocessing, and seamless CLI/API integration.
# Basic installation
pip install pdf2epub
# Full installation with all features
pip install "pdf2epub[full]"
# Convert a PDF to EPUB
pdf2epub document.pdf
# Advanced options
pdf2epub book.pdf --start-page 10 --max-pages 50 --langs "English,German"
pip3 uninstall torch torchvision torchaudio
pip3 install --pre torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/nightly/cpu
import torch
print(torch.__version__) # PyTorch version
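print(torch.cuda.is_available()) # True when a CUDA-capable build detects an NVIDIA GPU (ROCm builds also report through this call)
print(torch.backends.mps.is_available()) # True on Apple Silicon when the MPS backend is usable
print(torch.version.hip) # ROCm (HIP) version on AMD builds, None otherwise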
import pdf2epub
# Simple conversion
pdf2epub.convert_pdf_to_markdown("document.pdf", "output/")
pdf2epub.convert_markdown_to_epub("output/", "final/")
# Advanced usage with AI enhancement
processor = pdf2epub.AIPostprocessor("output/")
processor.run_postprocessing("document.md", "anthropic")
pip install pdf2epub
Includes core functionality with minimal dependencies.
pip install "pdf2epub[full]"
Includes all features: PDF processing, AI postprocessing, and GPU acceleration.
pip install "pdf2epub[dev]"
Includes development tools: testing, linting, and formatting.
NVIDIA CUDA:
pip install "pdf2epub[full]"
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121
AMD ROCm:
pip install "pdf2epub[full]"
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/rocm6.2
# Required for AI postprocessing
export ANTHROPIC_API_KEY="your-anthropic-api-key"
# Optional: Control GPU usage
export CUDA_VISIBLE_DEVICES="0" # Use specific GPU
export CUDA_VISIBLE_DEVICES="" # Force CPU-only mode
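The environment variables above can be checked from Python before a conversion starts. The following is a minimal sketch using only the standard library; the helper names `ai_postprocessing_enabled` and `visible_gpu_ids` are illustrative assumptions and are not part of pdf2epub.

```python
import os


def ai_postprocessing_enabled(env=None):
    """Return True when a non-empty ANTHROPIC_API_KEY is present."""
    env = os.environ if env is None else env
    return bool(env.get("ANTHROPIC_API_KEY", "").strip())


def visible_gpu_ids(env=None):
    """Interpret CUDA_VISIBLE_DEVICES.

    Returns None when the variable is unset (no restriction applied),
    an empty list when it is set to "" (CPU-only mode),
    and otherwise the listed device identifiers as strings.
    """
    env = os.environ if env is None else env
    raw = env.get("CUDA_VISIBLE_DEVICES")
    if raw is None:
        return None
    return [part.strip() for part in raw.split(",") if part.strip()]


if __name__ == "__main__":
    print("AI postprocessing configured:", ai_postprocessing_enabled())
    print("Visible GPUs:", visible_gpu_ids())
```

Passing a plain dictionary as `env` makes both helpers easy to exercise without touching the real environment.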
import pdf2epub
# Configure default settings
pdf2epub.config.set_default_batch_multiplier(3)
pdf2epub.config.set_default_ai_provider("anthropic")
Run the test suite:
pytest # Run all tests
pytest --cov=pdf2epub # Run with coverage
pytest tests/test_pdf2md.py # Run specific test file
Current test coverage: 49% with 100% pass rate (41/41 tests)
Create custom AI postprocessing providers:
from pdf2epub.postprocessing.ai import AIPostprocessor

class CustomAIProvider:
    @staticmethod
    def getjsonparams(system_prompt: str, request: str) -> str:
        # Implement your AI API integration here.
        # process_with_custom_ai is a placeholder for your own client call.
        return process_with_custom_ai(system_prompt, request)

# Register and use your provider.
# work_dir and markdown_file are placeholders for your own output directory and Markdown file.
processor = AIPostprocessor(work_dir)
processor.register_provider("custom", CustomAIProvider)
processor.run_postprocessing(markdown_file, "custom")
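To make the registration pattern above concrete without installing the package, here is a self-contained sketch of how a name-to-provider registry could dispatch to a static `getjsonparams` method. `ProviderRegistry` and `EchoProvider` are hypothetical stand-ins written for illustration; they are not pdf2epub's `AIPostprocessor`, whose internals may differ.

```python
from typing import Dict


class ProviderRegistry:
    """Hypothetical stand-in that maps provider names to provider classes."""

    def __init__(self) -> None:
        self._providers: Dict[str, type] = {}

    def register_provider(self, name: str, provider: type) -> None:
        # Reject classes that do not expose the expected static method.
        if not callable(getattr(provider, "getjsonparams", None)):
            raise TypeError("provider must define getjsonparams(system_prompt, request)")
        self._providers[name] = provider

    def run(self, name: str, system_prompt: str, request: str) -> str:
        # Look up the provider by name and delegate the request to it.
        try:
            provider = self._providers[name]
        except KeyError:
            raise ValueError(f"unknown provider: {name!r}") from None
        return provider.getjsonparams(system_prompt, request)


class EchoProvider:
    """Toy provider that upper-cases the request instead of calling an AI API."""

    @staticmethod
    def getjsonparams(system_prompt: str, request: str) -> str:
        return request.upper()


registry = ProviderRegistry()
registry.register_provider("echo", EchoProvider)
result = registry.run("echo", "Fix OCR errors.", "hello")
print(result)
```

Validating the provider at registration time surfaces a missing method early, rather than in the middle of a long conversion.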
Document Type | Pages | Processing Time | Memory Usage |
---|---|---|---|
Research Paper | 20 | 45 seconds | 2.1 GB |
Technical Book | 200 | 6 minutes | 4.8 GB |
Magazine | 50 | 2 minutes | 1.9 GB |
Results on NVIDIA RTX 3080 with 16GB RAM
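The figures in the table can be turned into a rough throughput number for capacity planning. The sketch below only restates the table's own values as pages per minute; the function name `pages_per_minute` is illustrative, and the results say nothing about other hardware or document types.

```python
# (pages, processing time in seconds), copied from the benchmark table
benchmarks = {
    "Research Paper": (20, 45),
    "Technical Book": (200, 6 * 60),
    "Magazine": (50, 2 * 60),
}


def pages_per_minute(pages: int, seconds: float) -> float:
    """Convert a page count and a duration in seconds into pages per minute."""
    return pages / seconds * 60


throughput = {
    name: round(pages_per_minute(pages, seconds), 1)
    for name, (pages, seconds) in benchmarks.items()
}

for name, rate in throughput.items():
    print(f"{name}: {rate} pages/min")
```

On these numbers the three document types land between 25 and roughly 33 pages per minute.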
Feature | PDF2EPUB | calibre | pandoc |
---|---|---|---|
AI Enhancement | โ | โ | โ |
Layout Detection | โ | ⚠️ | ⚠️ |
GPU Acceleration | โ | โ | โ |
Python API | โ | ⚠️ | ⚠️ |
Plugin System | โ | โ | โ |
CLI Interface | โ | โ | โ |
FROM python:3.11-slim
RUN pip install pdf2epub[full]
WORKDIR /workspace
ENTRYPOINT ["pdf2epub"]
- name: Convert PDFs
  run: |
    pip install pdf2epub[full]
    pdf2epub documents/*.pdf
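The workflow step above relies on the shell to expand `documents/*.pdf`. An alternative is to drive the CLI from Python, one file at a time, so that a single failure does not hide the rest. This is a sketch under two assumptions: that the CLI accepts one PDF path per invocation (as in the `pdf2epub document.pdf` example), and that it exits with a non-zero status on failure. The helper names `build_commands` and `convert_all` are illustrative.

```python
import subprocess
from pathlib import Path


def build_commands(directory, pattern="*.pdf"):
    """Return one `pdf2epub <file>` command per matching file, in sorted order."""
    return [["pdf2epub", str(path)] for path in sorted(Path(directory).glob(pattern))]


def convert_all(directory):
    """Run every command and return the paths whose conversion failed.

    Assumes the pdf2epub CLI is on PATH and signals failure via its exit code.
    """
    failed = []
    for command in build_commands(directory):
        completed = subprocess.run(command)
        if completed.returncode != 0:
            failed.append(command[1])
    return failed


if __name__ == "__main__":
    failures = convert_all("documents")
    if failures:
        raise SystemExit("Failed conversions: " + ", ".join(failures))
```

Building the commands separately from running them keeps the file selection testable without invoking the converter.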
import time

import pdf2epub

def production_converter(pdf_path: str) -> dict:
    """Production-ready PDF conversion with error handling."""
    start_time = time.time()  # Start the clock before any work begins
    try:
        output_dir = pdf2epub.convert_pdf_to_markdown(
            pdf_path,
            batch_multiplier=2,  # Conservative memory usage
            max_pages=1000,  # Prevent runaway processing
        )
        epub_path = pdf2epub.convert_to_epub(output_dir)
        return {
            "status": "success",
            "markdown_path": output_dir,
            "epub_path": epub_path,
            "processing_time": time.time() - start_time,
        }
    except Exception as e:
        return {
            "status": "error",
            "error": str(e),
        }
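When `production_converter` is called for many files, the returned dictionaries can be rolled up into a single report. The sketch below assumes only the dictionary shape shown above (`status`, `error`, `processing_time`); the `summarize` helper is illustrative and is not part of pdf2epub.

```python
def summarize(results):
    """Aggregate a list of converter result dictionaries into one report."""
    succeeded = [r for r in results if r.get("status") == "success"]
    failed = [r for r in results if r.get("status") == "error"]
    return {
        "succeeded": len(succeeded),
        "failed": len(failed),
        # Collect error messages so they can be logged or reported together.
        "errors": [r.get("error", "") for r in failed],
        # Only successful runs carry a processing time.
        "total_processing_time": sum(r.get("processing_time", 0.0) for r in succeeded),
    }


report = summarize([
    {"status": "success", "markdown_path": "out/a", "epub_path": "out/a.epub", "processing_time": 2.0},
    {"status": "error", "error": "file not found"},
    {"status": "success", "markdown_path": "out/b", "epub_path": "out/b.epub", "processing_time": 3.5},
])
print(report)
```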
We welcome contributions! Please see our Contributing Guide for details.
git checkout -b feature-name
pytest
black .
See CONTRIBUTING.md for detailed guidelines.
This project is licensed under the MIT License - see the LICENSE file for details.
This project builds upon excellent open-source libraries:
Transform your PDFs into beautiful, accessible EPUBs with AI-powered enhancement!
FAQs
Convert PDF files to EPUB format via Markdown with intelligent layout detection
We found that pdf2epub demonstrated a healthy version release cadence and project activity because the last version was released less than a year ago. It has 1 open source maintainer collaborating on the project.