πŸ“… You're Invited: Meet the Socket team at RSAC (April 28 – May 1).RSVP β†’
Socket
Sign inDemoInstall
Socket

LightYtSearch

Package Overview
Dependencies
Maintainers
1
Alerts
File Explorer

Advanced tools

Socket logo

Install Socket

Detect and block malicious and high-risk dependencies

Install

LightYtSearch

Lightweight YouTube search scraper without using the official API

0.3.0
PyPI
Maintainers
1

πŸ” LightYtSearch

PyPI version License: GPL-3.0 Python Versions Downloads Commits Last Commit

✨ A lightweight Python package to search YouTube without using the official API ✨

πŸ“‹ Table of Contents

✨ Features

  • πŸ” Search for videos, playlists, and movies on YouTube
  • πŸ“Š Extract detailed information including titles, channels, view counts, and more
  • πŸ”‘ No API key required
  • 🌈 Colorful command-line interface
  • πŸ“„ JSON output support
  • 🌐 Configurable search parameters (language, region, etc.)
  • πŸ›‘οΈ Fault-tolerant with retry capabilities
  • πŸ—‚οΈ Save raw YouTube data for further analysis
  • πŸ–ΌοΈ Proxy support
  • πŸ”„ User-agent rotation

πŸš€ Installation

Install from PyPI using pip:

pip install LightYtSearch

Or install the development version from GitHub:

pip install git+https://github.com/Arrowar/LightYtSearch.git

🏁 Quick Start

from LightYtSearch import search_youtube

# Basic search
results = search_youtube("python tutorial", max_results=5)
print(f"Found {len(results)} results")

# Print titles and URLs
for item in results:
    print(f"{item['type']}: {item['title']}")
    print(f"URL: {item['url']}")
    print("---")

πŸ“– Usage

🐍 As a Python Module

from LightYtSearch import search_youtube

# Search for "python tutorial" and get up to 5 results
results = search_youtube("python tutorial", max_results=5)

# Process the results
for item in results:
    print(f"{item['type']}: {item['title']}")
    print(f"URL: {item['url']}")
    print("---")

βš™οΈ Function Parameters

The search_youtube() function accepts the following parameters:

ParameterTypeDescriptionDefault
querystrThe search term to look for on YouTube(Required)
max_resultsintMaximum number of results to return5
filter_typestrFilter results by type ('video', 'playlist', 'movie')None
timeoutintRequest timeout in seconds10
languagestrLanguage code for search results'en'
regionstrRegion code for search results'US'
save_jsonboolWhether to save results to a JSON fileFalse
output_filestrPath to save results JSON'results.json'
verboseboolWhether to print progress and results to consoleTrue
showResultsboolWhether to display search results in terminalTrue
retry_countintNumber of retries if request fails3
retry_delayintDelay between retries in seconds2
showTimeExecutionboolDisplay execution time for each major processFalse
save_raw_databoolSave raw YouTube data to JSON fileFalse
raw_data_dirstrDirectory to save raw YouTube data'./raw_data'

Note: The maximum possible value for max_results is 20 due to YouTube's page limitations.

πŸ’» Command-line Interface

# Basic search
LightYtSearch "python tutorial"

# Get 10 results (maximum is 20)
LightYtSearch "python tutorial" -n 10

# Export to JSON
LightYtSearch "python tutorial" -j > results.json

# Save to a file
LightYtSearch "python tutorial" -s -o my_results.json

# Save raw YouTube data
LightYtSearch "python tutorial" --save-raw-data --raw-data-dir ./my_raw_data

# Filter by type
LightYtSearch "python tutorial" --filter video

# Set custom timeout, language, and region
LightYtSearch "python tutorial" --timeout 15 --language fr --region FR

# Configure retry behavior
LightYtSearch "python tutorial" --retry-count 5 --retry-delay 3

# Show execution time
LightYtSearch "python tutorial" --time

🧰 CLI Arguments

ArgumentShortDescriptionDefault
querySearch term (required)
--max-results-nMaximum number of results5
--json-jOutput results as JSON
--save-sSave results to a file
--output-oOutput filenameresults.json
--quiet-qQuiet mode (minimal output)
--version-vShow version information
--filterFilter results by type
--timeoutRequest timeout in seconds10
--languageLanguage codeen
--regionRegion codeUS
--retry-countNumber of retries3
--retry-delayDelay between retries in seconds2
--timeShow execution time
--save-raw-dataSave raw YouTube data to JSON fileFalse
--no-save-raw-dataDo not save raw YouTube data
--raw-data-dirDirectory to save raw YouTube data./raw_data

πŸ”„ Data Structure

The function returns a list of dictionaries with different structures depending on the item type:

🎬 Video Results

{
    'type': 'video',
    'id': 'video_id',
    'title': 'Video title',
    'channel': {
        'name': 'Channel name',
        'id': 'channel_id',
        'url': 'https://www.youtube.com/channel/channel_id',
        'thumbnail': 'Channel thumbnail URL',
        'badges': ['Verified', 'Official Artist', etc.]
    },
    'views': {
        'full': '1,234,567 views',
        'short': '1.2M views'
    },
    'published': '2 years ago',
    'duration': '12:34',
    'description': 'Video description snippet',
    'thumbnails': [thumbnail objects],
    'rich_thumbnail': 'Moving thumbnail URL',
    'flags': {
        'is_live': False,
        'is_upcoming': False,
        'has_closed_captions': True
    },
    'upcoming_date': None,
    'overlay_icons': ['4K', 'HD', etc.],
    'url': 'https://www.youtube.com/watch?v=video_id'
}

πŸ“‘ Playlist Results

{
    'type': 'playlist',
    'id': 'playlist_id',
    'title': 'Playlist title',
    'channel': {
        'name': 'Channel name',
        'id': 'channel_id',
        'url': 'https://www.youtube.com/channel/channel_id'
    },
    'video_count': '42 videos',
    'thumbnails': [thumbnail objects],
    'video_previews': [
        {
            'title': 'Video title',
            'id': 'video_id',
            'url': 'https://www.youtube.com/watch?v=video_id'
        }
    ],
    'url': 'https://www.youtube.com/playlist?list=playlist_id'
}

πŸŽ₯ Movie Results

{
    'type': 'movie',
    'id': 'movie_id',
    'title': 'Movie title',
    'description': 'Movie description',
    'duration': '1:23:45',
    'publisher': 'Studio name',
    'metadata': ['2023', 'PG-13', etc.],
    'bottom_metadata': ['Adventure', 'Action', etc.],
    'thumbnails': [thumbnail objects],
    'is_vertical_poster': False,
    'badges': ['4K', 'HD', etc.],
    'offers': ['Rent $3.99', 'Buy $14.99', etc.],
    'url': 'https://www.youtube.com/watch?v=movie_id'
}

πŸ“ Examples

πŸ” Searching for Videos Only

from LightYtSearch import search_youtube

# Search for videos only
videos = search_youtube("python programming", filter_type="video", max_results=10)
for video in videos:
    print(f"Title: {video['title']}")
    print(f"Channel: {video['channel']['name']}")
    print(f"Views: {video['views']['full']}")
    print(f"Duration: {video['duration']}")
    print(f"URL: {video['url']}")
    print("---")

πŸ’Ύ Saving Results to a JSON File

from LightYtSearch import search_youtube

# Search and save to custom file
search_youtube("machine learning", max_results=15, save_json=True, output_file="ml_videos.json")

🌐 Using Different Region and Language

from LightYtSearch import search_youtube

# Search in Italian from Italy
results_it = search_youtube("ricette pasta", language="it", region="IT", max_results=5)

# Search in Spanish from Mexico
results_es = search_youtube("recetas mexicanas", language="es", region="MX", max_results=5)

πŸ—‚οΈ Saving Raw YouTube Data

from LightYtSearch import search_youtube

# Save raw YouTube data to a custom directory
search_youtube("data science", save_raw_data=True, raw_data_dir="./data_science_raw")

⚠️ Limitations

  • Maximum Results: This library can extract a maximum of 20 results per search query due to YouTube's initial page load limitations.
  • No Pagination: Currently doesn't support fetching more than the initial results page.
  • Subject to Changes: YouTube's HTML structure might change, which could affect the scraping functionality.
  • Rate Limiting: Excessive use might trigger YouTube's rate limiting mechanisms.
  • No Official Support: This is not using the official YouTube API and is therefore not officially supported by YouTube.

πŸ‘ Acknowledgments

  • Inspired by the need for a lightweight YouTube search solution without API key requirements
  • Thanks to all contributors who have helped shape this project
  • This library is for educational purposes and personal use

FAQs

Did you know?

Socket

Socket for GitHub automatically highlights issues in each pull request and monitors the health of all your open source dependencies. Discover the contents of your packages and block harmful activity before you install or update your dependencies.

Install

Related posts