github.com/GitGuardian/src-fingerprint
The purpose of src-fingerprint is to provide an easy way to extract git-related information (namely all file SHAs of a repository) from your hosted source version control system.
The main command of this utility is collect, which collects source code fingerprints from a version control system or a local repository. It supports three main VCS providers: GitHub, GitLab, and Bitbucket.
If you're using Homebrew you can add GitGuardian's tap and then install src-fingerprint. Just run the following commands:
brew tap gitguardian/tap
brew install src-fingerprint
Deb and RPM packages are available on Cloudsmith.
Setup instructions:
Open a PowerShell prompt and run this command:
iwr -useb https://raw.githubusercontent.com/GitGuardian/src-fingerprint/main/scripts/windows-installer.ps1 | iex
The script asks for the installation directory. To install silently, use these commands instead:
iwr -useb https://raw.githubusercontent.com/GitGuardian/src-fingerprint/main/scripts/windows-installer.ps1 -Outfile install.ps1
.\install.ps1 C:\Destination\Dir
rm install.ps1
Note that src-fingerprint requires Unix commands such as bash to be available, so it runs better from a "Git Bash" prompt.
You can also download the archives directly from the releases page.
You need go installed and GOBIN in your PATH. Once that is done, run the command:
$ go get -u github.com/gitguardian/src-fingerprint/cmd/src-fingerprint
GitHub:
Click Generate a new token.
Check the repo box. This is the only scope we need.
Click Generate token. The token is only displayed at this time, so make sure you keep it in a safe place.
GitLab:
Go to Access Tokens.
Check the read_api box. This is the only scope we need. You can set an end date for the token validity if you want more security.
Click Create personal token. The token is only displayed at this time, so make sure you keep it in a safe place.
The output format can be chosen between jsonl, json, gzip-jsonl and gzip-json with the option --export-format.
The default format is gzip-jsonl to minimize the size of the output file.
The default output filepath is ./fingerprints.jsonl.gz. Use --output to override this behavior.
Also note that when downloading fingerprints for the repositories of a large organization, src-fingerprint processes at most 100 repositories by default. You can override this limit with the option --limit; a limit of 0 processes all repositories of the organization.
Note that if multiple organizations are passed, the limit is applied to each one independently.
There is no default timeout; one can be set with the option --timeout. Like the limit, it is applied to each source independently.
Here is an example of some lines of output in the jsonl format:
{"repository_name":"src-fingerprint","private":false,"sha":"a0c16efce5e767f04ba0c6988d121147099a17df","type":"blob","filepath":".env.example","size":"31"}
{"repository_name":"src-fingerprint","private":false,"sha":"d425eb0f8af66203dbeef50c921ea5bff0f2acba","type":"blob","filepath":".github/workflows/tag.yml","size":"882"}
{"repository_name":"src-fingerprint","private":false,"sha":"c7f341033d78474b125dd56d8adaa3f0fc47faf2","type":"blob","filepath":".github/workflows/test.yml","size":"899"}
{"repository_name":"src-fingerprint","private":false,"sha":"f4409d88950abd4585d8938571864726533a7fa5","type":"blob","filepath":".gitignore","size":"356"}
{"repository_name":"src-fingerprint","private":false,"sha":"f733f951ace2e032c270d2f3cf79c2efb8187b5b","type":"blob","filepath":".gitlab-ci.yml","size":"85"}
{"repository_name":"src-fingerprint","private":false,"sha":"d17ae66a017477bc65a2f433bf23d551ffc6bd75","type":"blob","filepath":".golangci.yml","size":"1196"}
{"repository_name":"src-fingerprint","private":false,"sha":"ee08a617cfb1c63c1c55fa4cb15e8bac0095346f","type":"blob","filepath":".goreleaser.yml","size":"2127"}
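As a rough illustration of how this output could be consumed, here is a minimal Python sketch that reads jsonl records (plain or gzip-compressed) and summarizes them. The function names are hypothetical and not part of src-fingerprint; the field names come from the sample lines above, where size is encoded as a string.

```python
import gzip
import io
import json
from collections import Counter

# Two records copied from the sample output above.
SAMPLE = (
    '{"repository_name":"src-fingerprint","private":false,'
    '"sha":"a0c16efce5e767f04ba0c6988d121147099a17df","type":"blob",'
    '"filepath":".env.example","size":"31"}\n'
    '{"repository_name":"src-fingerprint","private":false,'
    '"sha":"f733f951ace2e032c270d2f3cf79c2efb8187b5b","type":"blob",'
    '"filepath":".gitlab-ci.yml","size":"85"}\n'
)


def iter_fingerprints(stream):
    """Yield one dict per non-empty line of a jsonl text stream."""
    for line in stream:
        line = line.strip()
        if line:
            yield json.loads(line)


def read_gzip_jsonl(path):
    """Read a gzip-jsonl file such as ./fingerprints.jsonl.gz into a list."""
    with gzip.open(path, "rt", encoding="utf-8") as fh:
        return list(iter_fingerprints(fh))


def summarize(records):
    """Return (record count, total size in bytes, records per repository)."""
    count = 0
    total = 0
    per_repo = Counter()
    for rec in records:
        count += 1
        total += int(rec["size"])  # "size" is a string in the sample output
        per_repo[rec["repository_name"]] += 1
    return count, total, per_repo


if __name__ == "__main__":
    print(summarize(iter_fingerprints(io.StringIO(SAMPLE))))
```

For a real export, summarize(read_gzip_jsonl("./fingerprints.jsonl.gz")) would apply the same summary to the default output file.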
Note that by default, src-fingerprint excludes forked repositories from the fingerprint computation. For the GitHub provider, archived repositories and public repositories are also excluded by default. Use the flags --include-forked-repos, --include-archived-repos or --include-public-repos to change this behavior.
For all the following examples, we assume that the user is able to clone repositories using an HTTP URL with basic authentication. If for any reason this is not possible in the user's organization, src-fingerprint supports SSH cloning through the dedicated option --ssh-cloning. Note, though, that this option is not the standard configuration of the tool but rather a workaround for this type of edge case. In particular, it may cause issues if permissions differ between the token provided for API-based repository listing and the SSH keys used to clone these repositories.
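Collect fingerprints from two GitHub organizations into the default path ./fingerprints.jsonl.gz with logs:
env VCS_TOKEN="<token>" src-fingerprint -v collect --provider github --object ORG_1_NAME --object ORG_2_NAME
Collect from GitHub without --object, including public, forked and archived repositories, into the default path ./fingerprints.jsonl.gz:
env VCS_TOKEN="<token>" src-fingerprint -v collect --provider github --include-public-repos --include-forked-repos --include-archived-repos
Collect fingerprints from a GitLab group into the default path ./fingerprints.jsonl.gz with logs. Use --provider-url to specify the provider's URL; don't forget to include the scheme.
env VCS_TOKEN="<token>" src-fingerprint -v collect --provider gitlab --object "GitGuardian-dev-group"
Collect from GitLab without --object, including forked repositories, into the default path ./fingerprints.jsonl.gz with logs:
env VCS_TOKEN="<token>" src-fingerprint -v collect --provider gitlab --include-forked-repos
Collect fingerprints from a Bitbucket project into the default path ./fingerprints.jsonl.gz with logs. Use --provider-url to specify the provider's URL; don't forget to include the scheme.
env VCS_TOKEN="<token>" src-fingerprint -v collect --provider bitbucket --object "GitGuardian Project"
Collect from Bitbucket without --object into the default path ./fingerprints.jsonl.gz with logs:
env VCS_TOKEN="<token>" src-fingerprint -v collect --provider bitbucket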
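The example commands pass the token through the VCS_TOKEN environment variable rather than on the command line. A minimal Python sketch of scripting the same invocation might look like this; the helper names are hypothetical, and placing --limit and --output after collect is an assumption based on how the other options appear in the examples.

```python
import os
import subprocess


def build_collect_command(provider, objects=(), include_forked=False,
                          limit=None, output=None, verbose=True):
    """Build an argv list mirroring the example commands.

    Only flags mentioned in this README are used; their placement after
    `collect` is an assumption.
    """
    cmd = ["src-fingerprint"]
    if verbose:
        cmd.append("-v")
    cmd += ["collect", "--provider", provider]
    for obj in objects:
        cmd += ["--object", obj]
    if include_forked:
        cmd.append("--include-forked-repos")
    if limit is not None:
        cmd += ["--limit", str(limit)]
    if output is not None:
        cmd += ["--output", output]
    return cmd


def run_collect(cmd, token):
    """Run the command with the token set only in the child's environment."""
    env = dict(os.environ, VCS_TOKEN=token)
    return subprocess.run(cmd, env=env, check=True)


if __name__ == "__main__":
    # Print the command instead of running it, so no token is needed here.
    print(build_collect_command("github", ["ORG_1_NAME", "ORG_2_NAME"]))
```

Passing an argv list rather than a shell string avoids quoting problems with object names that contain spaces, such as "GitGuardian Project".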
The repository provider processes a single repository given a git clone URL or a local path; -u can be repeated to process several:
src-fingerprint collect -p repository -u 'git@github.com:GitGuardian/gg-shield.git'
src-fingerprint collect -p repository -u 'https://user:password@github.com/GitGuardian/gg-shield.git'
src-fingerprint collect -p repository -u 'https://github.com/GitGuardian/gg-shield.git'
src-fingerprint collect -p repository -u /projects/gitlab/src-fingerprint -u /projects/gitlab/internal-api
src-fingerprint collect -p repository -u .
By default, src-fingerprint processes each object (--object/-u) one by one. When an object (e.g. a GitHub organization) contains multiple repositories, they are processed in parallel by multiple cloners; the number of cloners is configurable with --cloners. Adding more cloners increases the memory usage of src-fingerprint. When extracting fingerprints from multiple sources (e.g. with multiple --object values), you can use the option --pool to configure the number of workers that process the objects in parallel. Each worker has --cloners cloners. Be cautious when increasing both --cloners and --pool, as memory usage may increase drastically.
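To see why raising both options together is costly, here is a small Python sketch of the upper bound on simultaneous cloners. It is a simplified model, not the tool's actual scheduler: it assumes each busy worker runs exactly --cloners cloners and that workers without an object to process stay idle.

```python
def max_concurrent_cloners(pool, cloners, num_objects):
    """Upper bound on cloners running at once under the simplified model.

    pool        -- value passed to --pool (number of workers)
    cloners     -- value passed to --cloners (cloners per worker)
    num_objects -- number of --object values being processed
    """
    # Assumption: a worker only spawns cloners while it holds an object.
    active_workers = min(pool, num_objects)
    return active_workers * cloners


if __name__ == "__main__":
    for pool, cloners in [(1, 4), (2, 4), (4, 8)]:
        print(pool, cloners, max_concurrent_cloners(pool, cloners, num_objects=4))
```

Because the bound is a product, doubling both values quadruples the number of cloners that may be active at once.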
GitGuardian src-fingerprint is MIT licensed.