Google News API Client

A robust Python client library for the Google News RSS feed API that provides both synchronous and asynchronous implementations with built-in rate limiting, caching, and error handling.
Features
- ✨ Comprehensive news search and retrieval functionality
  - Search by keywords with advanced filtering
  - Get top news by topic (WORLD, NATION, BUSINESS, TECHNOLOGY, etc.)
  - Batch search support for multiple queries
  - URL decoding for original article sources
- 🔄 Both synchronous and asynchronous APIs
  - GoogleNewsClient for synchronous operations
  - AsyncGoogleNewsClient for async/await support
- 🕒 Advanced time-based search capabilities
  - Date range filtering (after/before)
  - Relative time filtering (e.g., "1h", "24h", "7d")
  - Maximum 100 results for date-based searches
- 🚀 High performance features
  - In-memory caching with configurable TTL
  - Built-in rate limiting with token bucket algorithm
  - Automatic retries with exponential backoff
  - Concurrent batch searches in async mode
- 🌍 Multi-language and country support
  - ISO 639-1 language codes (e.g., "en", "fr", "de")
  - ISO 3166-1 country codes (e.g., "US", "GB", "DE")
  - Language-country combinations (e.g., "en-US", "fr-FR")
- 🛡️ Robust error handling
  - Specific exceptions for different error scenarios
  - Detailed error messages with context
  - Graceful fallbacks and retries
- 📦 Modern Python packaging with Poetry
Requirements
- Python 3.9 or higher
- Poetry (recommended for installation)
Installation
Using Poetry (recommended)
poetry add google-news-api
To install from source instead:
git clone https://github.com/ma2za/google-news-api.git
cd google-news-api
poetry install
Using pip
pip install google-news-api
Usage Examples
Synchronous Client
from google_news_api import GoogleNewsClient

# Create a client with a rate limit and a 5-minute response cache
client = GoogleNewsClient(
    language="en",
    country="US",
    requests_per_minute=60,
    cache_ttl=300
)

try:
    # Top stories by topic
    world_news = client.top_news(topic="WORLD", max_results=5)
    tech_news = client.top_news(topic="TECHNOLOGY", max_results=3)

    # Search within an explicit date range
    date_articles = client.search(
        "Ukraine war",
        after="2024-01-01",
        before="2024-03-01",
        max_results=5
    )

    # Search relative to now (last 24 hours)
    recent_articles = client.search(
        "climate change",
        when="24h",
        max_results=5
    )

    # Run several queries in one call
    batch_results = client.batch_search(
        queries=["AI", "machine learning", "deep learning"],
        when="7d",
        max_results=3
    )
    for topic, articles in batch_results.items():
        print(f"\nTop {topic} news:")
        for article in articles:
            print(f"- {article['title']} ({article['source']})")
            print(f"  Published: {article['published']}")
            print(f"  Summary: {article['summary'][:100]}...")
except Exception as e:
    print(f"An error occurred: {e}")
finally:
    # Drop the client reference once it is no longer needed
    del client
Asynchronous Client
import asyncio

from google_news_api import AsyncGoogleNewsClient


async def main():
    # The context manager closes the client when the block exits
    async with AsyncGoogleNewsClient(
        language="en",
        country="US",
        requests_per_minute=60
    ) as client:
        # Top stories by topic
        world_news = await client.top_news(topic="WORLD", max_results=3)
        tech_news = await client.top_news(topic="TECHNOLOGY", max_results=3)

        # Run several queries in one call
        batch_results = await client.batch_search(
            queries=["AI", "machine learning", "deep learning"],
            when="7d",
            max_results=3
        )
        for topic, articles in batch_results.items():
            print(f"\nTop {topic} news:")
            for article in articles:
                # Resolve the Google News link to the original article URL
                original_url = await client.decode_url(article['link'])
                print(f"- {article['title']} ({article['source']})")
                print(f"  Original URL: {original_url}")


if __name__ == "__main__":
    asyncio.run(main())
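The loop in the example awaits decode_url once per article, so the lookups run one after another. As a rough sketch, the same work could be fanned out with asyncio.gather and capped with a semaphore. The helper name decode_all, the max_concurrency parameter, the fake_decode stand-in, and the sample links are assumptions made for illustration so the snippet runs without network access; they are not part of the library. In real use you would pass client.decode_url as the decode argument.

```python
import asyncio


async def decode_all(decode, links, max_concurrency=5):
    """Decode many links concurrently, at most max_concurrency at a time."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def decode_one(link):
        async with semaphore:
            return await decode(link)

    # gather returns results in the same order as the input links
    return await asyncio.gather(*(decode_one(link) for link in links))


async def fake_decode(link):
    """Hypothetical stand-in for client.decode_url (no network access)."""
    await asyncio.sleep(0)
    return link.replace("https://news.google.com/", "https://example.com/")


sample_links = ["https://news.google.com/a", "https://news.google.com/b"]
decoded = asyncio.run(decode_all(fake_decode, sample_links, max_concurrency=2))
print(decoded)
```

Keeping max_concurrency small is a deliberate choice here: every decode still counts against the client's rate limit, so unbounded fan-out would mostly queue up behind it.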
Configuration
The library provides extensive configuration options through the client initialization:
Parameter | Description | Default | Example Values |
language | Two-letter language code (ISO 639-1) or language-country format | "en" | "en" , "fr" , "de" , "en-US" , "fr-FR" |
country | Two-letter country code (ISO 3166-1 alpha-2) | "US" | "US" , "GB" , "DE" , "JP" |
requests_per_minute | Rate limit threshold for API requests | 60 | 30 , 100 , 120 |
cache_ttl | Cache duration in seconds for responses | 300 | 600 , 1800 , 3600 |
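To make the requests_per_minute setting more concrete, here is a minimal token bucket sketch. It is not the library's internal implementation, which is not shown in this README; the class name TokenBucket, its methods, and the injectable clock are assumptions chosen for illustration. The bucket starts full, so a short burst is allowed, and then refills at requests_per_minute / 60 tokens per second.

```python
import time


class TokenBucket:
    """Illustrative token bucket; not the library's actual rate limiter."""

    def __init__(self, requests_per_minute, clock=time.monotonic):
        self.capacity = float(requests_per_minute)
        self.refill_rate = requests_per_minute / 60.0  # tokens per second
        self.tokens = self.capacity  # start full so an initial burst is allowed
        self._clock = clock
        self._last = clock()

    def _refill(self):
        now = self._clock()
        elapsed = now - self._last
        self._last = now
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_rate)

    def try_acquire(self):
        """Take one token if available; return True on success."""
        self._refill()
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False

    def seconds_until_available(self):
        """Time a caller would have to wait for the next token."""
        self._refill()
        if self.tokens >= 1.0:
            return 0.0
        return (1.0 - self.tokens) / self.refill_rate


# A fake clock keeps the example deterministic.
fake_now = [0.0]
bucket = TokenBucket(requests_per_minute=60, clock=lambda: fake_now[0])
burst = sum(bucket.try_acquire() for _ in range(61))  # the 61st request is refused
fake_now[0] += 1.0  # one second later, one token has been refilled
after_wait = bucket.try_acquire()
```

With requests_per_minute=60 the bucket admits a burst of 60 requests and then one more per second, which is why a lower value smooths traffic at the cost of latency.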
Available Topics
The top_news() method supports the following topics:
- "WORLD": World news
- "NATION": National news
- "BUSINESS": Business news
- "TECHNOLOGY": Technology news
- "ENTERTAINMENT": Entertainment news
- "SPORTS": Sports news
- "SCIENCE": Science news
- "HEALTH": Health news
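If topic names come from user input, it can help to check them before calling top_news(). The sketch below is a hypothetical helper, not part of the library: the names SUPPORTED_TOPICS and normalize_topic are assumptions, and whether the client itself accepts lower-case topics is not stated here, so the helper always returns the upper-case form listed above.

```python
SUPPORTED_TOPICS = frozenset({
    "WORLD", "NATION", "BUSINESS", "TECHNOLOGY",
    "ENTERTAINMENT", "SPORTS", "SCIENCE", "HEALTH",
})


def normalize_topic(topic: str) -> str:
    """Return the upper-case topic name, or raise ValueError if unsupported."""
    candidate = topic.strip().upper()
    if candidate not in SUPPORTED_TOPICS:
        allowed = ", ".join(sorted(SUPPORTED_TOPICS))
        raise ValueError(f"unknown topic {topic!r}; expected one of: {allowed}")
    return candidate


topic = normalize_topic(" technology ")
print(topic)
```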
Time-Based Search
The library supports two types of time-based search:
- Date Range Search
  - Use the after and before parameters
  - Format: YYYY-MM-DD
  - Maximum 100 results
  - Example: after="2024-01-01", before="2024-03-01"
- Relative Time Search
  - Use the when parameter
  - Hours: "1h" to "101h"
  - Days: any number of days (e.g., "7d", "30d")
  - Cannot be used with after/before
  - Example: when="24h" for the last 24 hours
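The rules above can be checked before a request is sent. The following is a minimal sketch under stated assumptions: validate_time_filters is a hypothetical helper rather than a library function, it raises the built-in ValueError instead of the library's own exceptions, and rejecting an after date later than the before date is an extra assumption not stated in this README.

```python
import re
from datetime import date

MAX_HOURS = 101  # upper bound for hour-based values, as documented above
_WHEN_RE = re.compile(r"(\d+)([hd])")
_DATE_RE = re.compile(r"\d{4}-\d{2}-\d{2}")


def _parse_date(name, value):
    if not _DATE_RE.fullmatch(value):
        raise ValueError(f"{name} must be in YYYY-MM-DD format, got {value!r}")
    # fromisoformat also rejects impossible dates such as 2024-02-30
    return date.fromisoformat(value)


def validate_time_filters(when=None, after=None, before=None):
    """Raise ValueError if the combination of time filters is invalid."""
    if when is not None:
        if after is not None or before is not None:
            raise ValueError("when cannot be combined with after/before")
        match = _WHEN_RE.fullmatch(when)
        if match is None:
            raise ValueError(f"invalid when value: {when!r}")
        amount, unit = int(match.group(1)), match.group(2)
        if amount < 1 or (unit == "h" and amount > MAX_HOURS):
            raise ValueError(f"when value out of range: {when!r}")
        return
    start = _parse_date("after", after) if after is not None else None
    end = _parse_date("before", before) if before is not None else None
    # Assumption: an inverted range is treated as an error
    if start is not None and end is not None and start > end:
        raise ValueError("after must not be later than before")


validate_time_filters(when="24h")
validate_time_filters(after="2024-01-01", before="2024-03-01")
```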
Article Structure
Each article in the results contains the following fields:
- title: Article title
- link: Google News article URL
- published: Publication date and time
- summary: Article summary/description
- source: News source name
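For code that passes articles around, a typed description of these fields can make mistakes easier to spot. The sketch below assumes every field is a string; the actual type of published is not specified here and may differ. The names Article and format_article, and the sample values, are hypothetical and used only for illustration.

```python
from typing import TypedDict


class Article(TypedDict):
    title: str
    link: str
    published: str  # assumption: a string; the real type may differ
    summary: str
    source: str


def format_article(article: Article, summary_chars: int = 100) -> str:
    """Render one article as a short, printable block of text."""
    summary = article["summary"]
    if len(summary) > summary_chars:
        summary = summary[:summary_chars].rstrip() + "..."
    return (
        f"- {article['title']} ({article['source']})\n"
        f"  Published: {article['published']}\n"
        f"  Summary: {summary}"
    )


# Made-up sample data, not real output from the API
sample: Article = {
    "title": "Example headline",
    "link": "https://news.google.com/example",
    "published": "2024-01-15",
    "summary": "A long summary text",
    "source": "Example Source",
}
print(format_article(sample, summary_chars=10))
```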
Error Handling
The library provides specific exceptions for different error scenarios:
from google_news_api.exceptions import (
    ConfigurationError,
    ValidationError,
    HTTPError,
    RateLimitError,
    ParsingError
)

# Assumes client is an existing GoogleNewsClient instance
try:
    articles = client.search("technology")
except RateLimitError as e:
    # The exception reports how long to wait before retrying
    print(f"Rate limit exceeded. Retry after {e.retry_after} seconds")
except HTTPError as e:
    print(f"HTTP error {e.status_code}: {str(e)}")
except ValidationError as e:
    print(f"Invalid parameters: {str(e)}")
except Exception as e:
    # Catch-all for anything not covered by the specific handlers above
    print(f"Unexpected error: {str(e)}")
Best Practices
Resource Management
- Use context managers (async with) for async clients
- Explicitly close synchronous clients when done
- Implement proper error handling and cleanup
Performance Optimization
- Utilize caching for frequently accessed queries
- Use the async client for concurrent operations
- Batch related requests to maximize cache efficiency
- Configure appropriate cache TTL based on your needs
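To illustrate what a cache TTL means in practice, here is a minimal in-memory TTL cache sketch. It is not the library's cache; the class name TTLCache, its get/set methods, and the injectable clock are assumptions for illustration. Entries are dropped lazily, the first time they are read after expiring.

```python
import time


class TTLCache:
    """Illustrative in-memory cache whose entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self._clock = clock
        self._store = {}  # key -> (expiry_time, value)

    def get(self, key, default=None):
        entry = self._store.get(key)
        if entry is None:
            return default
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._store[key]  # drop the stale entry lazily
            return default
        return value

    def set(self, key, value):
        self._store[key] = (self._clock() + self.ttl, value)


# A fake clock keeps the example deterministic.
fake_now = [0.0]
cache = TTLCache(ttl_seconds=300, clock=lambda: fake_now[0])
cache.set("climate change", ["article-1", "article-2"])
fresh = cache.get("climate change")    # served from the cache
fake_now[0] += 301                     # move past the 300-second TTL
expired = cache.get("climate change")  # entry has expired, returns None
```

A short TTL keeps results fresh for fast-moving queries, while a longer one saves requests for queries whose results change slowly.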
Rate Limiting
- Set requests_per_minute based on your requirements
- Implement exponential backoff for rate limit errors
- Monitor rate limit usage in production
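One way to implement the backoff advice is a small wrapper around any call that may hit the rate limit. This is a sketch under stated assumptions: call_with_backoff and its parameters are hypothetical, and the RateLimitError class below is a local stand-in so the snippet runs on its own. In real code you would import RateLimitError from google_news_api.exceptions, whose retry_after attribute is used in the Error Handling example.

```python
import random
import time


class RateLimitError(Exception):
    """Local stand-in for google_news_api.exceptions.RateLimitError."""

    def __init__(self, message, retry_after=None):
        super().__init__(message)
        self.retry_after = retry_after


def call_with_backoff(func, max_attempts=5, base_delay=1.0, max_delay=60.0,
                      jitter=0.1, sleep=time.sleep):
    """Call func(), retrying on RateLimitError with exponential backoff."""
    for attempt in range(max_attempts):
        try:
            return func()
        except RateLimitError as exc:
            if attempt == max_attempts - 1:
                raise  # out of attempts, let the caller handle it
            delay = min(max_delay, base_delay * 2 ** attempt)
            if exc.retry_after is not None:
                delay = max(delay, exc.retry_after)  # honor the server's hint
            # A little random jitter avoids many clients retrying in lockstep
            sleep(delay + random.uniform(0, delay * jitter))


# Demonstration: fail twice, then succeed; delays are recorded, not slept.
attempts = {"count": 0}


def flaky_search():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise RateLimitError("rate limit exceeded")
    return "ok"


delays = []
result = call_with_backoff(flaky_search, jitter=0.0, sleep=delays.append)
```

Passing sleep as a parameter is a design choice that keeps the wrapper testable; in production the default time.sleep applies.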
Development
Setting up the Development Environment
git clone https://github.com/ma2za/google-news-api.git
cd google-news-api
poetry install --with dev
pre-commit install
Running Tests
poetry run pytest
poetry run pytest --cov=google_news_api
pre-commit run --all-files
Contributing
- Fork the repository
- Create a feature branch (git checkout -b feature/amazing-feature)
- Make your changes
- Run tests and linting (poetry run pytest and poetry run flake8)
- Commit your changes (git commit -m 'Add amazing feature')
- Push to the branch (git push origin feature/amazing-feature)
- Open a Pull Request
License
This project is licensed under the MIT License - see the LICENSE file for details.
Author
Paolo Mazza (mazzapaolo2019@gmail.com)
Acknowledgments
Support
For issues, feature requests, or questions:
- Open an issue on GitHub
- Contact the author via email
- Check the examples directory for more usage scenarios
Time-Based Search
The library supports two types of time-based search:
Date Range Search
Use the after and before parameters to search within a specific date range:
articles = client.search(
    "Ukraine war",
    after="2024-01-01",
    before="2024-03-01",
    max_results=5
)
Relative Time Search
Use the when parameter for relative time searches:
articles = client.search("climate change", when="1h")
articles = client.search("climate change", when="24h")
articles = client.search("climate change", when="7d")
Notes:
- Date range parameters (after/before) must be in YYYY-MM-DD format
- Relative time (when) supports:
  - Hours (h): 1-101 hours (e.g., "1h", "24h", "101h")
  - Days (d): Any number of days (e.g., "1d", "7d", "30d")
- The when parameter cannot be used together with after or before
- All searches return articles sorted by relevance and recency
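Because date-based searches return at most 100 results, a long date range may be easier to cover as several shorter searches. The sketch below only computes the (after, before) pairs; split_date_range and window_days are hypothetical names, not library features. Adjacent windows share a boundary date because this README does not say whether before is inclusive, so callers may want to de-duplicate results, for example by link.

```python
from datetime import date, timedelta


def split_date_range(after, before, window_days=7):
    """Split [after, before] into consecutive windows of at most window_days."""
    if window_days < 1:
        raise ValueError("window_days must be at least 1")
    start = date.fromisoformat(after)
    end = date.fromisoformat(before)
    if start >= end:
        raise ValueError("after must be earlier than before")

    step = timedelta(days=window_days)
    windows = []
    while start < end:
        stop = min(start + step, end)  # the last window may be shorter
        windows.append((start.isoformat(), stop.isoformat()))
        start = stop
    return windows


windows = split_date_range("2024-01-01", "2024-01-20", window_days=7)
for window_after, window_before in windows:
    print(f"after={window_after!r}, before={window_before!r}")
```

Each pair can then be passed to client.search() as the after and before arguments, one request per window.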