
A fast Google Flights scraper that fetches and decodes flight information based on the user's filters. It is built around the Base64-encoded Protobuf string that Google Flights carries in its URL.
pip install google-flights
or, if you want to run it locally (with Playwright):
pip install google-flights[local]
from google_flights import create_filter, Passengers, FlightData, search_airline, search_airport
flight_filter = create_filter(
    flight_data=[
        FlightData(
            airlines=search_airline("RyanAir"),  # Airline codes - "FR" and "RK" - can be passed as list ["FR","RK"]
            date="2025-07-20",  # Date of departure
            from_airport=search_airport("Kaunas"),  # Departure airport ["KUN"]
            to_airport=["MAD"],  # Arrival airports
        ),
    ],
    trip="one-way",  # Trip type
    passengers=Passengers(adults=1, children=0, infants_in_seat=0, infants_on_lap=0),
    seat="economy",  # Seat type
    max_stops=1,  # Maximum number of stops
)
from google_flights import get_flights_from_filter
flight_data = get_flights_from_filter(flight_filter, data_source='js', mode="common")
✈️ Airline (code & name)
🔢 Flight number
🛫 Departure airport (code & name) & departure time
🛬 Arrival airport (code & name) & arrival time
📅 Departure date
⏱️ Travel duration
🛡️ Aircraft type
💺 Seat class (including seat pitch)
🌱 CO₂ emissions (grams)
💶 Price (EUR)
🔄 Layovers (direct or stops)
✨ Features (Wi‑Fi, in‑seat power, video, media streaming, etc.)
I'll keep working on this project, as I have plans for it, so contributions are always welcome!
Additional bug checking
Add more features, such as finding the shortest and most economical route
It all started with a simple thought: “When is the best time to purchase a flight ticket?” and I found this website:
It actually offers pretty decent information, even predictions on whether to wait or buy now. However, their training data was already outdated, resulting in false predictions. So, I decided to build my own version. What did I need?
A scraper!
Since Google doesn't provide a public API, we simple people who don't want to pay have to build our own scraper.
The idea is straightforward: build a scraper that returns all results based on your filters.
Initially, I tried Playwright. It worked (and remains in the project as an alternative), but I needed something different: faster and more stable.
Then I examined the Google Flights URL: https://www.google.com/travel/flights?tfs=GisSCjIwMjUtMDctMjUoATICQlQyAkZSagUSA01BRHIFEgNLVU5yBRIDVk5PQgEBSAGYAQI%3D
Notice the tfs parameter, tfs=GisSCjIwMjUtMDctMjUoATICQlQyAkZSagUSA01BRHIFEgNLVU5yBRIDVk5PQgEBSAGYAQI%3D, which looks like Base64. We need to decode it! How?
Raw Base64 decoding alone yields too much noise, though. Which serialization format does Google use? Correct: Protobuf!
And here is what we can see:
That's our data! Half of the job is done. Next, we need to parse the response and extract the results.
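The decoding step described above can be sketched with the standard library alone. This is a minimal illustration, not the project's actual code: `read_varint` and `parse_fields` are hypothetical helper names, and the parser only walks the generic Protobuf wire format (field number, wire type, value) without knowing Google's schema, so it makes no claim about what each field number means.

```python
import base64
from urllib.parse import unquote

# The tfs value copied from the URL above (still percent-encoded).
TFS = "GisSCjIwMjUtMDctMjUoATICQlQyAkZSagUSA01BRHIFEgNLVU5yBRIDVk5PQgEBSAGYAQI%3D"


def read_varint(buf, pos):
    """Read one Protobuf varint starting at pos; return (value, next_pos)."""
    result = shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:  # high bit clear: last byte of the varint
            return result, pos
        shift += 7


def parse_fields(buf):
    """Split a serialized message into (field_number, wire_type, value) tuples."""
    fields, pos = [], 0
    while pos < len(buf):
        key, pos = read_varint(buf, pos)
        number, wire_type = key >> 3, key & 0x07
        if wire_type == 0:    # varint
            value, pos = read_varint(buf, pos)
        elif wire_type == 1:  # fixed 64-bit
            value, pos = buf[pos:pos + 8], pos + 8
        elif wire_type == 2:  # length-delimited: string, bytes or nested message
            length, pos = read_varint(buf, pos)
            value, pos = buf[pos:pos + length], pos + length
        elif wire_type == 5:  # fixed 32-bit
            value, pos = buf[pos:pos + 4], pos + 4
        else:
            raise ValueError(f"unsupported wire type {wire_type}")
        fields.append((number, wire_type, value))
    return fields


# Undo the URL percent-encoding first, then the Base64 layer.
raw = base64.urlsafe_b64decode(unquote(TFS))
top_level = parse_fields(raw)

# The first top-level field is length-delimited, so try it as a nested message.
for number, wire_type, value in parse_fields(top_level[0][2]):
    print(number, wire_type, value)
```

Run against the URL above, the nested message exposes readable strings such as `2025-07-25`, `MAD`, `KUN` and `VNO`, which is the search filter hiding inside the parameter. A real implementation would normally use a compiled `.proto` schema instead of a hand-written wire parser.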
Using data embedded in the response's script tag (<script class="ds:1">), I'm able to get much more data than what is parsed from the HTML.
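One way to pull that embedded data out can be sketched as follows. Treat it as an illustration under assumptions rather than the project's real parser: `ScriptGrabber` and `extract_payload` are hypothetical names, `SAMPLE_HTML` is a tiny synthetic page, and the `AF_initDataCallback({... data:[...]})` wrapper with its payload is a guess at the layout, not a captured Google Flights response.

```python
import json
from html.parser import HTMLParser

# Synthetic stand-in for a real response; the payload shape is made up.
SAMPLE_HTML = """<html><body>
<script class="ds:0">AF_initDataCallback({key: 'ds:0', data:[1]});</script>
<script class="ds:1">AF_initDataCallback({key: 'ds:1', data:[null,[["FR",null,3]]], sideChannel: {}});</script>
</body></html>"""


class ScriptGrabber(HTMLParser):
    """Collect the text of every <script> tag with the given class."""

    def __init__(self, css_class):
        super().__init__()
        self.css_class = css_class
        self.inside = False
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("class") == self.css_class:
            self.inside = True

    def handle_endtag(self, tag):
        if tag == "script":
            self.inside = False

    def handle_data(self, data):
        if self.inside:
            self.chunks.append(data)


def extract_payload(html, css_class="ds:1"):
    """Return the JSON array that follows 'data:' in the matching script."""
    grabber = ScriptGrabber(css_class)
    grabber.feed(html)
    grabber.close()
    script = "".join(grabber.chunks)
    # Assumption: the array starts at the first '[' after the 'data:' key.
    start = script.index("[", script.index("data:"))
    # raw_decode parses one JSON value and ignores whatever trails it.
    payload, _ = json.JSONDecoder().raw_decode(script, start)
    return payload


payload = extract_payload(SAMPLE_HTML)
print(payload)
```

Using `raw_decode` avoids having to find the matching closing bracket by hand, because the decoder stops at the end of the first complete JSON value.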
There are a few challenges: some field meanings have to be hardcoded. For example, features (in-seat power, etc.) are encoded as: [null, null, null, null, null, null, null, null, null, null, null, 3]
The features array ends in 3, and that 3 sits in slot 11 (zero-based).
Slot 11 maps to Wi-Fi, and the code 3 means “for a fee” (whereas 2 means “free” and null means “not available”). The same logic applies to every other field, but you have to work out those mappings yourself.
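The Wi-Fi mapping described above can be written down as a small lookup. This is a sketch only: `WIFI_SLOT`, `WIFI_CODES` and `decode_wifi` are hypothetical names, and only the slot and codes mentioned in the text are covered.

```python
# Slot and codes as described in the text; other slots are not mapped here.
WIFI_SLOT = 11
WIFI_CODES = {None: "not available", 2: "free", 3: "for a fee"}


def decode_wifi(features):
    """Translate the Wi-Fi slot of a raw features array into a label."""
    # Arrays shorter than the slot index are treated as "no value".
    code = features[WIFI_SLOT] if len(features) > WIFI_SLOT else None
    return WIFI_CODES.get(code, f"unknown code {code}")


features = [None] * 11 + [3]
print(decode_wifi(features))
```

Keeping the mapping in a dictionary makes it easy to add further slots once their meaning has been worked out, and the fallback label flags codes that have not been seen before instead of failing.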
However, this approach is much faster than Playwright and provides richer data than pure HTML scraping!
I'm a Python developer with a deep passion for Data Science and ML Engineering.
I’m constantly exploring new techniques and tools to enhance my skills and dive into solving real-world problems through data-driven insights. Whether it's building models or optimizing data pipelines or analyzing data
FAQs
Google Flights scraper (API) implemented in Python
We found that google-flights demonstrated a healthy version release cadence and project activity because the last version was released less than a year ago. It has 1 open source maintainer collaborating on the project.