
A comprehensive PDF processing toolkit that converts PDFs to markdown with advanced AI-powered features for image and table analysis. Supports local files and URLs, preserves document structure, extracts high-quality images, detects tables using advanced ML models, and generates detailed content descriptions using multiple LLM providers including OpenAI and Google's Gemini.
Markdrop is a Python package that converts PDFs to markdown, extracts images and tables, and generates text descriptions of the extracted tables and images using several LLM clients, among other features. It is available on PyPI.
pip install markdrop
After installing the package, you can use the markdrop command-line interface.
1. Convert PDF to Markdown and HTML:
markdrop convert <input_path> --output_dir <output_directory> [--add_tables]
<input_path>: Path or URL to the input PDF file.
<output_directory>: Directory to save output files (default: output).
--add_tables: (Optional) Add downloadable tables to the HTML output.
Example:
markdrop convert my_document.pdf --output_dir processed_docs --add_tables
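For scripted use, the convert command above can be assembled and run from Python. The following is a minimal sketch, assuming the markdrop executable is on PATH. The helper names build_convert_command and run_convert are hypothetical and not part of markdrop; only the flags documented above are used.

```python
import subprocess
from typing import List


def build_convert_command(input_path: str, output_dir: str = "output",
                          add_tables: bool = False) -> List[str]:
    """Build the argument list for `markdrop convert` (hypothetical helper)."""
    cmd = ["markdrop", "convert", input_path, "--output_dir", output_dir]
    if add_tables:
        cmd.append("--add_tables")
    return cmd


def run_convert(input_path: str, output_dir: str = "output",
                add_tables: bool = False) -> None:
    """Run the CLI; raises CalledProcessError on a non-zero exit status."""
    subprocess.run(build_convert_command(input_path, output_dir, add_tables),
                   check=True)
```

Passing the arguments as a list rather than a single shell string avoids quoting problems with paths that contain spaces.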
2. Generate Descriptions for Images and Tables in a Markdown File:
markdrop describe <input_path> --output_dir <output_directory> --ai_provider <provider> [--remove_images] [--remove_tables]
<input_path>: Path to the markdown file.
<output_directory>: Directory to save the processed file (default: output).
<provider>: AI provider to use (gemini or openai).
--remove_images: (Optional) Remove images from the markdown file.
--remove_tables: (Optional) Remove tables from the markdown file.
Example:
markdrop describe my_markdown.md --output_dir described_content --ai_provider gemini --remove_images
3. Analyze Images in a PDF File:
markdrop analyze <input_path> --output_dir <output_directory> [--save_images]
<input_path>: Path or URL to the PDF file.
<output_directory>: Directory to save analysis results (default: output/analysis).
--save_images: (Optional) Save extracted images.
Example:
markdrop analyze report.pdf --output_dir pdf_analysis --save_images
4. Set Up API Keys for AI Providers:
markdrop setup <provider>
<provider>: The AI provider to set up (gemini or openai).
Example:
markdrop setup gemini
5. Generate Descriptions for Images (Standalone):
markdrop generate <input_path> --output_dir <output_directory> [--prompt <prompt_text>] [--llm_client <client1> <client2> ...]
<input_path>: Path to an image file or a directory of images.
<output_directory>: Directory to save the descriptions CSV (default: output/descriptions).
--prompt: (Optional) Prompt for the AI model (default: "Describe the image in detail.").
--llm_client: (Optional) List of LLM clients to use (default: gemini). Available: qwen, gemini, openai, llama-vision, molmo, pixtral.
Example:
markdrop generate my_images/ --output_dir image_descriptions --prompt "What is in this picture?" --llm_client gemini openai
from markdrop import markdrop, MarkDropConfig, add_downloadable_tables
from pathlib import Path
import logging
# Configure processing options
config = MarkDropConfig(
    image_resolution_scale=2.0,           # Scale factor for image resolution
    download_button_color='#444444',      # Color for download buttons in HTML
    log_level=logging.INFO,               # Logging detail level
    log_dir='logs',                       # Directory for log files
    excel_dir='markdropped-excel-tables'  # Directory for Excel table exports
)
# Process PDF document
input_doc_path = "path/to/input.pdf"
output_dir = Path('output_directory')
# Convert PDF and generate HTML with images and tables
html_path = markdrop(input_doc_path, str(output_dir), config)
# Add interactive table download functionality
downloadable_html = add_downloadable_tables(html_path, config)
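To process several PDFs with the call shown above, a small loop that gives each document its own output folder may be convenient. This is a sketch only: convert_all is a hypothetical helper, not part of markdrop, and the converter is passed in as a callable so the loop itself does not depend on the library.

```python
from pathlib import Path
from typing import Callable, Dict, Iterable


def convert_all(pdf_paths: Iterable[str], output_root: str,
                convert: Callable[[str, str], object]) -> Dict[str, object]:
    """Run `convert(pdf, out_dir)` for each PDF, one output folder per document."""
    results = {}
    for pdf in pdf_paths:
        # Name each output folder after the PDF file name without its extension.
        out_dir = Path(output_root) / Path(pdf).stem
        out_dir.mkdir(parents=True, exist_ok=True)
        results[pdf] = convert(pdf, str(out_dir))
    return results


# Intended use with the API shown above (requires markdrop to be installed;
# reuses the `config` object built earlier):
#   from markdrop import markdrop
#   convert_all(["a.pdf", "b.pdf"], "out", lambda p, o: markdrop(p, o, config))
```

Note that two PDFs with the same file name in different directories would share an output folder under this naming scheme.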
from markdrop import setup_keys, process_markdown, ProcessorConfig, AIProvider, logger
from pathlib import Path
# Set up API keys for AI providers
setup_keys(key='gemini') # or setup_keys(key='openai')
# Configure AI processing options
# Note: DEFAULT_IMAGE_PROMPT and DEFAULT_TABLE_PROMPT are not imported in this snippet;
# import them if your markdrop version exposes them, or pass your own prompt strings.
config = ProcessorConfig(
    input_path="path/to/markdown/file.md",  # Input markdown file path
    output_dir=Path("output_directory"),    # Output directory
    ai_provider=AIProvider.GEMINI,          # AI provider (GEMINI or OPENAI)
    remove_images=False,                    # Keep or remove original images
    remove_tables=False,                    # Keep or remove original tables
    table_descriptions=True,                # Generate table descriptions
    image_descriptions=True,                # Generate image descriptions
    max_retries=3,                          # Number of API call retries
    retry_delay=2,                          # Delay between retries in seconds
    gemini_model_name="gemini-1.5-flash",   # Gemini model for images
    gemini_text_model_name="gemini-pro",    # Gemini model for text
    image_prompt=DEFAULT_IMAGE_PROMPT,      # Custom prompt for image analysis
    table_prompt=DEFAULT_TABLE_PROMPT       # Custom prompt for table analysis
)
# Process markdown with AI descriptions
output_path = process_markdown(config)
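The max_retries and retry_delay options above describe retrying a failed API call after a fixed pause. The sketch below illustrates that general pattern; it is not markdrop's actual implementation. The helper name call_with_retries is hypothetical, and it assumes max_retries counts retries after the first attempt.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retries(func: Callable[[], T], max_retries: int = 3,
                      retry_delay: float = 2.0,
                      sleep: Callable[[float], None] = time.sleep) -> T:
    """Call func, retrying up to max_retries times with a fixed delay."""
    if max_retries < 0:
        raise ValueError("max_retries must be non-negative")
    # Assumption: one initial attempt plus max_retries retries.
    attempts = max_retries + 1
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except Exception:
            if attempt == attempts:
                raise  # out of retries: surface the last error
            sleep(retry_delay)
    raise RuntimeError("unreachable")
```

The sleep function is injectable so the behavior can be exercised without actually waiting.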
from markdrop import generate_descriptions
prompt = "Give textual highly detailed descriptions from this image ONLY, nothing else."
input_path = 'path/to/img_file/or/dir'
output_dir = 'data/output'
llm_clients = ['gemini', 'llama-vision'] # Available: ['qwen', 'gemini', 'openai', 'llama-vision', 'molmo', 'pixtral']
generate_descriptions(
    input_path=input_path,
    output_dir=output_dir,
    prompt=prompt,
    llm_client=llm_clients
)
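Because the client names are plain strings, a typo only surfaces once the library runs. A small pre-check against the list of available clients given above can catch it earlier. This is a sketch; validate_llm_clients and AVAILABLE_LLM_CLIENTS are hypothetical names, not part of markdrop.

```python
from typing import Iterable, List

# Client names copied from the "Available" list in this README.
AVAILABLE_LLM_CLIENTS = ("qwen", "gemini", "openai", "llama-vision", "molmo", "pixtral")


def validate_llm_clients(clients: Iterable[str]) -> List[str]:
    """Return the clients de-duplicated in order; raise ValueError on unknown names."""
    seen: List[str] = []
    unknown: List[str] = []
    for name in clients:
        if name not in AVAILABLE_LLM_CLIENTS:
            unknown.append(name)
        elif name not in seen:
            seen.append(name)
    if unknown:
        raise ValueError(
            f"Unknown LLM client(s): {unknown}; available: {list(AVAILABLE_LLM_CLIENTS)}"
        )
    return seen
```

The validated list can then be passed as the llm_client argument in the call above.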
markdrop(input_doc_path, output_dir, config)
Converts PDF to markdown and HTML with enhanced features.
Parameters:
input_doc_path (str): Path to input PDF file
output_dir (str): Output directory path
config (MarkDropConfig, optional): Configuration options for processing
add_downloadable_tables(html_path, config)
Adds interactive table download functionality to HTML output.
Parameters:
html_path (Path): Path to HTML file
config (MarkDropConfig, optional): Configuration options
MarkDropConfig
Configuration for PDF processing:
image_resolution_scale (float): Scale factor for image resolution (default: 2.0)
download_button_color (str): HTML color code for download buttons (default: '#444444')
log_level (int): Logging level (default: logging.INFO)
log_dir (str): Directory for log files (default: 'logs')
excel_dir (str): Directory for Excel table exports (default: 'markdropped-excel-tables')
ProcessorConfig
Configuration for AI processing:
input_path (str): Path to markdown file
output_dir (str): Output directory path
ai_provider (AIProvider): AI provider selection (GEMINI or OPENAI)
remove_images (bool): Whether to remove original images
remove_tables (bool): Whether to remove original tables
table_descriptions (bool): Generate table descriptions
image_descriptions (bool): Generate image descriptions
max_retries (int): Maximum API call retries
retry_delay (int): Delay between retries in seconds
gemini_model_name (str): Gemini model for image processing
gemini_text_model_name (str): Gemini model for text processing
image_prompt (str): Custom prompt for image analysis
table_prompt (str): Custom prompt for table analysis
Legacy function for basic PDF to markdown conversion.
Parameters:
source (str): Path to input PDF or URL
output_dir (str): Output directory path
verbose (bool): Enable detailed logging
Legacy function for basic image extraction.
Parameters:
source (str): Path to input PDF or URL
output_dir (str): Output directory path
verbose (bool): Enable detailed logging
Legacy function for basic table extraction.
Parameters:
pdf_path (str): Path to input PDF or URL
start_page (int, optional): Starting page number
end_page (int, optional): Ending page number
threshold (float, optional): Detection confidence threshold
output_dir (str): Output directory path
See run.py for an example.
We welcome contributions! Please see our Contributing Guidelines for details.
git clone https://github.com/shoryasethia/markdrop.git
cd markdrop
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
pip install -r requirements.txt
markdrop/
├── LICENSE
├── README.md
├── CONTRIBUTING.md
├── CHANGELOG.md
├── requirements.txt
├── setup.py
└── markdrop/
    ├── __init__.py
    ├── src
    |   └── markdrop-logo.png
    ├── main.py
    ├── process.py
    ├── api_setup.py
    ├── parse.py
    ├── utils.py
    ├── helper.py
    ├── ignore_warnings.py
    ├── run.py
    └── models/
        ├── __init__.py
        ├── .env
        ├── img_descriptions.py
        ├── logger.py
        ├── model_loader.py
        ├── responder.py
        └── setup_keys.py
This project is licensed under the MIT License - see the LICENSE file for details.
See CHANGELOG.md for version history.
Please note that this project follows our Code of Conduct.
FAQs
We found that markdrop demonstrated a healthy version release cadence and project activity because the last version was released less than a year ago. It has 2 open source maintainers collaborating on the project.