chat-with-pdf 0.4.1 · pip · PyPI · Maintainers: 1
# 📄 Chat with PDF


Chat with your PDF documents easily using local embeddings and powerful LLMs through a unified SDK. Upload any PDF and ask natural language questions about its content — powered by semantic search and AI.

## 🛠️ Installation

```bash
pip install chat-with-pdf
```

Or using Poetry:

```bash
poetry add chat-with-pdf
```

## ✨ Quickstart Example

1. Set your credentials and optionally choose a model/provider:

```bash
# Default provider key
export OPENAI_API_KEY="sk-your-openai-key"

# The model for the provider
export OPENAI_MODEL="gpt-4"

# Switch to another provider (e.g., perplexity, openai, or deepseek)
export LLM_PROVIDER="perplexity"
```

2. Use the SDK to chat with any PDF:

```python
from chat_with_pdf import PDFChat

# Local PDF file
chat = PDFChat("path/to/your/document.pdf")
print(chat.ask("Summarize the introduction section."))

# Remote URL
chat = PDFChat("https://example.com/sample.pdf")
print(chat.ask("What is the main point of this document?"))

# PDF in memory
with open("path/to/your/document.pdf", "rb") as f:
    data = f.read()
chat = PDFChat(data)
print(chat.ask("Give me a brief overview."))
```

## ⚙️ Configuration Options

Configure via environment variables (in order of precedence):

| Variable | Purpose | Default |
| --- | --- | --- |
| `LLM_PROVIDER` | Provider to use (`openai`, `perplexity`, `deepseek`) | `openai` |
| `OPENAI_API_KEY` | Your OpenAI API key | |
| `OPENAI_MODEL` | GPT model name (used for all providers) | `gpt-3.5-turbo` |
| `EMBEDDING_MODEL` | Embedding model | `all-MiniLM-L6-v2` |
| `DEFAULT_CHUNK_SIZE` | Characters per text chunk | `500` |
| `TOP_K_RETRIEVAL` | Number of chunks to retrieve per query | `5` |

💡 For local development, you can also create a `.env` file with these variables and the SDK will load it automatically.
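For example, a minimal `.env` file might look like this (all values are placeholders; any variable you omit falls back to the defaults in the table above):

```shell
LLM_PROVIDER=openai
OPENAI_API_KEY=sk-your-openai-key
OPENAI_MODEL=gpt-4
DEFAULT_CHUNK_SIZE=500
TOP_K_RETRIEVAL=5
```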

## 🔥 Advanced Usage

Override provider/model at runtime:

```python
from chat_with_pdf import PDFChat

# Use GPT-4 on OpenAI
chat = PDFChat("doc.pdf")
print(chat.ask("What are the key findings?", provider="openai", model="gpt-4"))

# Use DeepSeek
print(chat.ask("Summarize", provider="deepseek", model="deepseek-chat"))

# Use Perplexity
print(chat.ask("Summarize", provider="perplexity", model="sonar"))
```

## 📝 License

This project is licensed under the MIT License.

## 🌟 Acknowledgements
