Product
Introducing License Enforcement in Socket
Ensure open-source compliance with Socket’s License Enforcement Beta. Set up your License Policy and secure your software!
File identification library for Python.
Given a file (or some information about a file), return a set of standardized tags identifying what the file is.
pip install identify
If you have an actual file on disk, you can get the most information possible (a superset of all other methods):
>>> from identify import identify
>>> identify.tags_from_path('/path/to/file.py')
{'file', 'text', 'python', 'non-executable'}
>>> identify.tags_from_path('/path/to/file-with-shebang')
{'file', 'text', 'shell', 'bash', 'executable'}
>>> identify.tags_from_path('/bin/bash')
{'file', 'binary', 'executable'}
>>> identify.tags_from_path('/path/to/directory')
{'directory'}
>>> identify.tags_from_path('/path/to/symlink')
{'symlink'}
When using a file on disk, the checks performed are:
>>> identify.tags_from_filename('file.py')
{'text', 'python'}
>>> identify.tags_from_interpreter('python3.5')
{'python', 'python3'}
>>> identify.tags_from_interpreter('bash')
{'shell', 'bash'}
>>> identify.tags_from_interpreter('some-unrecognized-thing')
set()
$ identify-cli --help
usage: identify-cli [-h] [--filename-only] path
positional arguments:
path
optional arguments:
-h, --help show this help message and exit
--filename-only
$ identify-cli setup.py; echo $?
["file", "non-executable", "python", "text"]
0
$ identify-cli setup.py --filename-only; echo $?
["python", "text"]
0
$ identify-cli wat.wat; echo $?
wat.wat does not exist.
1
$ identify-cli wat.wat --filename-only; echo $?
1
identify
also has an api for determining what type of license is contained
in a file. This routine is roughly based on the approaches used by
licensee (the ruby gem that github uses to figure out the license for a
repo).
The approach that identify
uses is as follows:
To use the api, install via pip install identify[license]
>>> from identify import identify
>>> identify.license_id('LICENSE')
'MIT'
The return value of the license_id
function is an SPDX id. Currently
licenses are sourced from choosealicense.com.
A call to tags_from_path
does this:
By design, this means we don't need to partially read files where we recognize the file extension.
FAQs
File identification library for Python
We found that identify demonstrated a healthy version release cadence and project activity because the last version was released less than a year ago. It has 2 open source maintainers collaborating on the project.
Did you know?
Socket for GitHub automatically highlights issues in each pull request and monitors the health of all your open source dependencies. Discover the contents of your packages and block harmful activity before you install or update your dependencies.
Product
Ensure open-source compliance with Socket’s License Enforcement Beta. Set up your License Policy and secure your software!
Product
We're launching a new set of license analysis and compliance features for analyzing, managing, and complying with licenses across a range of supported languages and ecosystems.
Product
We're excited to introduce Socket Optimize, a powerful CLI command to secure open source dependencies with tested, optimized package overrides.