TextFromImage

A powerful Python library for generating detailed descriptions of images using a range of AI models, including OpenAI's GPT models, Azure OpenAI, and Anthropic Claude. Ideal for applications that need image understanding, accessibility features, or content analysis. Supports both URLs and local file paths, with batch processing.
🌟 Key Features
- 🤖 Multiple AI Providers: Support for OpenAI, Azure OpenAI, and Anthropic Claude
- 🌐 Flexible Input: Support for both URLs and local file paths
- 📦 Batch Processing: Process multiple images (up to 20) concurrently
- 🔄 Flexible Integration: Easy-to-use API with multiple initialization options
- 🎯 Custom Prompting: Configurable prompts for targeted descriptions
- 🔑 Secure Authentication: Multiple authentication methods including environment variables
- 🛠️ Model Selection: Support for different model versions and configurations
- 📝 Type Hints: Full typing support for better development experience
📦 Installation
# Base installation
pip install textfromimage

# With the Azure OpenAI extra
pip install textfromimage[azure]

# With all optional extras
pip install textfromimage[all]
🚀 Quick Start
import textfromimage

# Initialize the OpenAI client (or set OPENAI_API_KEY in your environment)
textfromimage.openai.init(api_key="your-openai-api-key")

# Single images: URLs and local file paths are both supported
image_url = 'https://example.com/image.jpg'
local_image = '/path/to/local/image.jpg'

url_description = textfromimage.openai.get_description(image_path=image_url)
local_description = textfromimage.openai.get_description(image_path=local_image)

# Batch processing: URLs and local paths can be mixed freely
image_paths = [
    'https://example.com/image1.jpg',
    '/path/to/local/image2.jpg',
    'https://example.com/image3.jpg'
]
batch_results = textfromimage.openai.get_description_batch(
    image_paths=image_paths,
    concurrent_limit=3
)

for result in batch_results:
    if result.success:
        print(f"Success for {result.image_path}: {result.description}")
    else:
        print(f"Failed for {result.image_path}: {result.error}")
💡 Advanced Usage
🤖 Multiple Provider Support
# Anthropic Claude
textfromimage.claude.init(api_key="your-anthropic-api-key")

image_path = 'https://example.com/image.jpg'  # a single URL or local path, as in Quick Start

claude_description = textfromimage.claude.get_description(
    image_path=image_path,
    model="claude-3-sonnet-20240229"
)
claude_results = textfromimage.claude.get_description_batch(
    image_paths=image_paths,
    model="claude-3-sonnet-20240229",
    concurrent_limit=3
)

# Azure OpenAI
textfromimage.azure_openai.init(
    api_key="your-azure-openai-api-key",
    api_base="https://your-azure-endpoint.openai.azure.com/",
    deployment_name="your-deployment-name"
)
azure_description = textfromimage.azure_openai.get_description(
    image_path=image_path,
    system_prompt="Analyze this image in detail"
)
azure_results = textfromimage.azure_openai.get_description_batch(
    image_paths=image_paths,
    system_prompt="Analyze each image in detail",
    concurrent_limit=3
)
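
Because every provider module exposes the same get_description call shape, switching providers can be reduced to a single lookup. The describe helper below is an illustrative sketch, not part of the library, and assumes the chosen provider has already been initialized with its init() call:

def describe(image_path, provider="openai", **kwargs):
    # Hypothetical convenience wrapper: route the call to the chosen provider module
    providers = {
        "openai": textfromimage.openai,
        "claude": textfromimage.claude,
        "azure": textfromimage.azure_openai,
    }
    return providers[provider].get_description(image_path=image_path, **kwargs)

description = describe(image_path, provider="claude", model="claude-3-sonnet-20240229")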
🔧 Configuration Options
import os

# Credentials can be supplied via environment variables instead of init()
os.environ['OPENAI_API_KEY'] = 'your-openai-api-key'
os.environ['ANTHROPIC_API_KEY'] = 'your-anthropic-api-key'
os.environ['AZURE_OPENAI_API_KEY'] = 'your-azure-openai-api-key'
os.environ['AZURE_OPENAI_ENDPOINT'] = 'your-azure-endpoint'
os.environ['AZURE_OPENAI_DEPLOYMENT'] = 'your-deployment-name'

# Batch call with an explicit model, prompt, token limit, and concurrency
batch_results = textfromimage.openai.get_description_batch(
    image_paths=image_paths,
    model='gpt-4-vision-preview',
    prompt="Describe the main elements of each image",
    max_tokens=300,
    concurrent_limit=5
)
📋 Parameters and Types
from dataclasses import dataclass
from typing import List, Optional

# Single-image call
def get_description(
    image_path: str,
    prompt: str = "What's in this image?",
    max_tokens: int = 300,
    model: str = "gpt-4-vision-preview"
) -> str: ...

# Per-image result returned by batch calls
@dataclass
class BatchResult:
    success: bool
    description: Optional[str]
    error: Optional[str]
    image_path: str

# Batch call
def get_description_batch(
    image_paths: List[str],
    prompt: str = "What's in this image?",
    max_tokens: int = 300,
    model: str = "gpt-4-vision-preview",
    concurrent_limit: int = 3
) -> List[BatchResult]: ...
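
For example, these parameters can be combined in a targeted single-image call; the prompt text and token limit below are illustrative values, not defaults:

description = textfromimage.openai.get_description(
    image_path='https://example.com/image.jpg',
    prompt="List any text visible in this image",
    max_tokens=150
)
print(description)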
🔍 Error Handling
from textfromimage.utils import BatchResult  # result type returned by batch calls

# Single image: errors are raised as exceptions
try:
    description = textfromimage.openai.get_description(image_path=image_path)
except ValueError as e:
    print(f"Image processing error: {e}")
except RuntimeError as e:
    print(f"API error: {e}")

# Batch processing: per-image errors are reported on each BatchResult instead of raising
results = textfromimage.openai.get_description_batch(image_paths)

successful = [r for r in results if r.success]
failed = [r for r in results if not r.success]

for result in failed:
    print(f"Failed to process {result.image_path}: {result.error}")
🤝 Contributing
We welcome contributions! Here's how you can help:
- Fork the repository
- Create your feature branch (git checkout -b feature/AmazingFeature)
- Commit your changes (git commit -m 'Add some AmazingFeature')
- Push to the branch (git push origin feature/AmazingFeature)
- Open a Pull Request
📝 License
This project is licensed under the MIT License - see the LICENSE file for details.