wolfsoftware.github-extractor

Extract various information from the GitHub API.

Version 0.1.1

Overview

The GitHub Extractor package is a Python library designed to facilitate the extraction of data from GitHub.

This package provides functions to fetch information about repositories, including languages used, releases, contributors, topics, workflows, and more, with robust error handling and configuration support.

Features

  • List organizations for a user from GitHub.
  • List repositories for a user from GitHub.
  • List repositories for a specified organization from GitHub.
  • Support for authentication using GitHub API tokens.
  • Filtering of organizations and repositories based on given patterns.
  • Pagination handling for API requests.

Installation

You can install GitHub Extractor via pip:

pip install wolfsoftware.github-extractor

Usage

Getting Token Information

You can get basic information relating to the given token.

There is also a dedicated command-line tool for this: GitHub Token Validator.

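from wolfsoftware.github_extractor import get_token_information

config = {
    "token": "your_github_token",
}

# Assumption: like the other functions in this package, this takes the config dict.
token_information = get_token_information(config)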
Parameters

| Name | Required | Purpose |
| --- | --- | --- |
| token | Yes | Authentication for the GitHub API. |
| timeout | No | The timeout to use when talking to the GitHub API (default is 10 seconds). |
| slugs | No | Return the results as slugs (a list of names and nothing else). |
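
In practice the token is usually read from the environment rather than hard-coded. The following is a minimal sketch and not part of the package: the `build_config` helper and the `GITHUB_TOKEN` variable name are assumptions made for illustration; only the `token` and `timeout` keys come from the parameters listed above.

```python
import os


def build_config(env=None, timeout=10):
    """Build a config dict from environment variables (hypothetical helper)."""
    env = os.environ if env is None else env
    # Assumed variable name; the package does not mandate one.
    token = env.get("GITHUB_TOKEN")
    if not token:
        raise ValueError("GITHUB_TOKEN is not set")
    # "token" and "timeout" are the documented config keys.
    return {"token": token, "timeout": timeout}


# A plain dict stands in for os.environ so the example runs anywhere.
config = build_config({"GITHUB_TOKEN": "your_github_token"}, timeout=30)
print(config)
```

The resulting dictionary can then be passed to any of the functions described in this document.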

Getting User Information

You can get basic information relating to the authenticated user (the owner of the token). The information will be limited by the scope of the token.

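from wolfsoftware.github_extractor import get_authenticated_user

config = {
    "token": "your_github_token",
}

# Assumption: like the other functions in this package, this takes the config dict.
user = get_authenticated_user(config)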
Parameters

| Name | Required | Purpose |
| --- | --- | --- |
| token | Yes | Authentication for the GitHub API. |
| timeout | No | The timeout to use when talking to the GitHub API (default is 10 seconds). |
| slugs | No | Return the results as slugs (a list of names and nothing else). |

Listing Organizations

You can list the organizations that you are a member of. The function is available under both British and American English spellings.

from wolfsoftware.github_extractor import list_organisations, list_organizations

config = {
    "token": "your_github_token",
    "ignore_orgs": ["Test*"]
}

# Using British English spelling
organisations = list_organisations(config)

# Using American English spelling
organisations_us = list_organizations(config)
Parameters

| Name | Required | Purpose |
| --- | --- | --- |
| token | Yes | Authentication for the GitHub API. |
| timeout | No | The timeout to use when talking to the GitHub API (default is 10 seconds). |
| slugs | No | Return the results as slugs (a list of names and nothing else). |

Filtering Parameters

| Name | Required | Purpose |
| --- | --- | --- |
| include_orgs | No | A list of organisation names to include in the results. |
| ignore_orgs | No | A list of organisation names to exclude from the results. |
| get_members | No | Include organisation members in the results. |

Listing User Repositories

You can list repositories for a user with optional filters:

from wolfsoftware.github_extractor import list_user_repositories

config = {
    "token": "your_github_token",
    "ignore_repos": ["Test*"],
    "include_repos": ["Project*"]
}

repositories = list_user_repositories(config)
Parameters

| Name | Required | Purpose |
| --- | --- | --- |
| token | No | Authentication for the GitHub API. |
| timeout | No | The timeout to use when talking to the GitHub API (default is 10 seconds). |
| slugs | No | Return the results as slugs (a list of names and nothing else). |
| username | No | The GitHub username to list repositories for. (The authenticated user will be used if this is not supplied.) |

Additional Data Parameters

| Name | Required | Purpose |
| --- | --- | --- |
| get_branches | No | Add details about all branches to each repository. |
| get_contributors | No | Add details about all contributors to each repository. |
| get_languages | No | Add the list of identified languages for each repository. |
| get_releases | No | Add details about all releases to each repository. |
| get_tags | No | Add details about all tags to each repository. |
| get_topics | No | Add the list of defined topics to each repository. |
| get_workflows | No | Add details about all workflows to each repository. |

Filtering Parameters

| Name | Required | Purpose |
| --- | --- | --- |
| include_names | No | A list of repository names to include in the results. |
| ignore_names | No | A list of repository names to exclude from the results. |
| include_repos | No | A list of organisation names/repository names to include in the results. |
| ignore_repos | No | A list of organisation names/repository names to exclude from the results. |
| skip_private | No | Do not include private repositories; this applies to the authenticated user only. |

The include_repos and ignore_repos filters use the full name of the repository, which is the organisation name / repository name, e.g. GitHubToolbox/github-extractor-package.
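
To see how wildcard patterns might behave against full repository names, the sketch below applies include and ignore lists using Python's `fnmatch` module. It is an illustration only: the package's actual matching rules are not described here, so shell-style wildcard matching, the precedence of ignore over include, and the `filter_full_names` helper itself are all assumptions.

```python
from fnmatch import fnmatchcase


def filter_full_names(full_names, include=None, ignore=None):
    """Keep names matching any include pattern and no ignore pattern.

    Hypothetical helper: assumes shell-style wildcards and that ignore
    patterns take precedence over include patterns.
    """
    kept = []
    for name in full_names:
        if include and not any(fnmatchcase(name, p) for p in include):
            continue  # an include list was given and nothing matched
        if ignore and any(fnmatchcase(name, p) for p in ignore):
            continue  # explicitly excluded
        kept.append(name)
    return kept


# The second and third names are made up for the example.
names = [
    "GitHubToolbox/github-extractor-package",
    "GitHubToolbox/test-fixtures",
    "OtherOrg/github-extractor-package",
]
print(filter_full_names(names, include=["GitHubToolbox/*"], ignore=["*/test-*"]))
# ['GitHubToolbox/github-extractor-package']
```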

Listing Repositories by Organization

You can list repositories for a specific organization with optional filters:

from wolfsoftware.github_extractor import list_repositories_by_org

config = {
    "token": "your_github_token",
    "org_name": "your_organization",
    "ignore_repos": ["Test*"],
    "include_repos": ["Project*"]
}

repositories = list_repositories_by_org(config)
Parameters

| Name | Required | Purpose |
| --- | --- | --- |
| token | No | Authentication for the GitHub API. |
| timeout | No | The timeout to use when talking to the GitHub API (default is 10 seconds). |
| slugs | No | Return the results as slugs (a list of names and nothing else). |
| org_name | No | The GitHub organisation to list repositories for. |

Additional Data Parameters

| Name | Required | Purpose |
| --- | --- | --- |
| get_branches | No | Add details about all branches to each repository. |
| get_contributors | No | Add details about all contributors to each repository. |
| get_languages | No | Add the list of identified languages for each repository. |
| get_releases | No | Add details about all releases to each repository. |
| get_tags | No | Add details about all tags to each repository. |
| get_topics | No | Add the list of defined topics to each repository. |
| get_workflows | No | Add details about all workflows to each repository. |

Filtering Parameters

| Name | Required | Purpose |
| --- | --- | --- |
| include_names | No | A list of repository names to include in the results. |
| ignore_names | No | A list of repository names to exclude from the results. |
| include_repos | No | A list of organisation names/repository names to include in the results. |
| ignore_repos | No | A list of organisation names/repository names to exclude from the results. |
| skip_private | No | Do not include private repositories; this applies to the authenticated user only. |

The include_repos and ignore_repos filters use the full name of the repository, which is the organisation name / repository name, e.g. GitHubToolbox/github-extractor-package.
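
The additional data options can be combined in a single config. The following is a minimal sketch, assuming each `get_*` option is a boolean flag (the parameter descriptions above do not state the value type) and using a hypothetical `with_extra_data` helper that is not part of the package:

```python
EXTRA_DATA_KEYS = {
    "branches", "contributors", "languages",
    "releases", "tags", "topics", "workflows",
}


def with_extra_data(config, *extras):
    """Return a copy of config with the requested get_* flags enabled."""
    unknown = set(extras) - EXTRA_DATA_KEYS
    if unknown:
        raise ValueError(f"Unknown extra data option(s): {sorted(unknown)}")
    enriched = dict(config)  # leave the caller's dict untouched
    for extra in extras:
        enriched[f"get_{extra}"] = True  # assumed to be a boolean flag
    return enriched


config = with_extra_data(
    {"token": "your_github_token", "org_name": "your_organization"},
    "languages",
    "topics",
)
print(config)
```

The resulting dictionary would then be passed to `list_repositories_by_org(config)`. Each extra may require further API requests per repository, so enabling only what you need is likely to be kinder to the rate limit.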

Listing all Organisation Repositories

You can list all repositories for all organisations you're a member of.

from wolfsoftware.github_extractor import list_all_org_repositories

config = {
    "token": "your_github_token",
    "ignore_repos": ["Test*"],
    "include_repos": ["Project*"]
}

repositories = list_all_org_repositories(config)
Parameters

| Name | Required | Purpose |
| --- | --- | --- |
| token | Yes | Authentication for the GitHub API. |
| timeout | No | The timeout to use when talking to the GitHub API (default is 10 seconds). |
| slugs | No | Return the results as slugs (a list of names and nothing else). |

Additional Data Parameters

| Name | Required | Purpose |
| --- | --- | --- |
| get_branches | No | Add details about all branches to each repository. |
| get_contributors | No | Add details about all contributors to each repository. |
| get_languages | No | Add the list of identified languages for each repository. |
| get_releases | No | Add details about all releases to each repository. |
| get_tags | No | Add details about all tags to each repository. |
| get_topics | No | Add the list of defined topics to each repository. |
| get_workflows | No | Add details about all workflows to each repository. |

Filtering Parameters

| Name | Required | Purpose |
| --- | --- | --- |
| include_names | No | A list of repository names to include in the results. |
| ignore_names | No | A list of repository names to exclude from the results. |
| include_repos | No | A list of organisation names/repository names to include in the results. |
| ignore_repos | No | A list of organisation names/repository names to exclude from the results. |
| skip_private | No | Do not include private repositories; this applies to the authenticated user only. |

The include_repos and ignore_repos filters use the full name of the repository, which is the organisation name / repository name, e.g. GitHubToolbox/github-extractor-package.
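
When listing across several organisations it can help to regroup the results by organisation. The sketch below assumes the results are full names of the form organisation/repository (for example when `slugs` is enabled); the exact return shape is not documented here, so treat this as an illustration. The `group_by_organisation` helper and all sample names other than GitHubToolbox/github-extractor-package are hypothetical.

```python
from collections import defaultdict


def group_by_organisation(full_names):
    """Group 'organisation/repository' strings by organisation."""
    grouped = defaultdict(list)
    for full_name in full_names:
        # Split on the first "/" only: organisation before, repository after.
        org, _, repo = full_name.partition("/")
        grouped[org].append(repo)
    return dict(grouped)


# Stand-in for the value returned by list_all_org_repositories(config).
slugs = [
    "GitHubToolbox/github-extractor-package",
    "GitHubToolbox/another-repo",
    "OtherOrg/some-repo",
]
print(group_by_organisation(slugs))
```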

Listing all Visible Repositories

You can list repositories that you are able to access.

from wolfsoftware.github_extractor import list_all_visible_repositories

config = {
    "token": "your_github_token",
    "ignore_repos": ["Test*"],
    "include_repos": ["Project*"]
}

repositories = list_all_visible_repositories(config)
Parameters

| Name | Required | Purpose |
| --- | --- | --- |
| token | Yes | Authentication for the GitHub API. |
| timeout | No | The timeout to use when talking to the GitHub API (default is 10 seconds). |
| slugs | No | Return the results as slugs (a list of names and nothing else). |

Additional Data Parameters

| Name | Required | Purpose |
| --- | --- | --- |
| get_branches | No | Add details about all branches to each repository. |
| get_contributors | No | Add details about all contributors to each repository. |
| get_languages | No | Add the list of identified languages for each repository. |
| get_releases | No | Add details about all releases to each repository. |
| get_tags | No | Add details about all tags to each repository. |
| get_topics | No | Add the list of defined topics to each repository. |
| get_workflows | No | Add details about all workflows to each repository. |

Filtering Parameters

| Name | Required | Purpose |
| --- | --- | --- |
| include_names | No | A list of repository names to include in the results. |
| ignore_names | No | A list of repository names to exclude from the results. |
| include_repos | No | A list of organisation names/repository names to include in the results. |
| ignore_repos | No | A list of organisation names/repository names to exclude from the results. |
| skip_private | No | Do not include private repositories; this applies to the authenticated user only. |

The include_repos and ignore_repos filters use the full name of the repository, which is the organisation name / repository name, e.g. GitHubToolbox/github-extractor-package.

Exceptions

The following custom exceptions are used:

| Name | Purpose |
| --- | --- |
| AuthenticationError | Raised when authentication fails. This is caused by an invalid token. |
| MissingOrgNameError | Raised when the organization name is missing. |
| MissingTokenError | Raised when the GitHub API token is missing but is required. |
| NotFoundError | Raised when a requested resource is not found. This is caused by incorrect scope of the token. |
| RateLimitExceededError | Raised when the GitHub API rate limit is exceeded. |
| RequestError | Raised for general request errors. |
| RequestTimeoutError | Raised when a request times out. |
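
Some of these errors are transient (rate limits, timeouts) while others indicate a configuration problem that a retry will not fix. The sketch below shows one possible way to retry only the transient ones. It is not part of the package: `call_with_retry` is a hypothetical helper, and because the import path of the real exception classes is not shown in this document, the example defines local stand-ins with the same names so that it runs on its own.

```python
import time


class RateLimitExceededError(Exception):
    """Local stand-in; the real class comes from the package."""


class RequestTimeoutError(Exception):
    """Local stand-in; the real class comes from the package."""


def call_with_retry(func, config, retry_on, attempts=3, delay=1.0):
    """Call func(config), retrying when one of the retry_on errors is raised."""
    for attempt in range(1, attempts + 1):
        try:
            return func(config)
        except retry_on:
            if attempt == attempts:
                raise  # out of attempts: let the caller see the error
            time.sleep(delay * attempt)  # simple linear back-off


# Demonstration with a fake listing function that fails once, then succeeds.
calls = {"count": 0}


def flaky_listing(config):
    calls["count"] += 1
    if calls["count"] == 1:
        raise RequestTimeoutError("simulated timeout")
    return ["GitHubToolbox/github-extractor-package"]


result = call_with_retry(
    flaky_listing,
    {"token": "your_github_token"},
    retry_on=(RateLimitExceededError, RequestTimeoutError),
    delay=0,
)
print(result)
```

Errors such as AuthenticationError or MissingTokenError are deliberately left out of `retry_on`, since repeating the same request with the same config would be expected to fail again.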
