gl-query
is a tool to query the GitLab API for useful project and pipeline information.
Installation / Setup
pip install gl-query
[config]
# Must include "/api/v4/"
url = https://<GITLAB_HOST>/api/v4/
token = YOUR_GITLAB_PERSONAL_ACCESS_TOKEN
Usage
Positional Arguments
Positional arguments subject/verb commands for the CLI utility.
The format for position arguments is:
gl-query <VERB> <SUBJECT> --<OPTIONAL_FLAGS>
An example would be:
gl-query get projects
Verbs
Verbs are the first positional argument
Currently, the only supported verb is get
as the current purpose of this program is to retrieve information from the GitLab API and output it. For future, it may be worth adding a set
action in order to make modifications to the state of the GitLab instance. Some proposed actions would be to :
Subjects
Subjects are the second positional argument.
Optional Arguments (Flags)
There are multiple command-line flags compatibile with gl-query
.
--csv / -c
<OPTIONAL>
USAGE
: gl-query get projects --csv
This flag takes no arguments. It simply tells gl-query
to produce a CSV output of the query. When used without the --filename
flag it will write the CSV file to a default filename in this format: GL_PROJECT_ACTIVITY_SINCE_${DATE}.csv
--filename / -f
<OPTIONAL>
USAGE
: gl-query get projects --csv --filename all_projects.csv
The --filename
flag is simply the name of CSV file to write to, therefore, it is recommended to have a .csv
extension. Avoid using relative paths if you do choose to specify the path. If --filename
flag is used, must use the --csv
flag.
--language / -l
<OPTIONAL>
USAGE
: gl-query get projects --language Scala
Specify a programming language if you would like to filter GitLab projects by language. The first letter of the language must be capitalized. If flag is not used, all languages will be queried by default.
TODO
: Exception Handling.
--date-before
<OPTIONAL>
USAGE
: gl-query get projects --date-before 2019-03-04
Displays results before specified date. The argument must follow the YYYY-MM-DD format.
--date-after
<OPTIONAL>
USAGE
: gl-query get projects --date-after 2019-03-04
Displays results after specified date. The argument must follow the YYYY-MM-DD format.
--has-pipeline / -p
<OPTIONAL>
USAGE
: gl-query get projects -p
The flag takes no arguments and causes output to only contain results that have at least one associated pipeline. This is useful for viewing projects that have pipeline functionality.
--silent
<OPTIONAL>
USAGE
: gl-query get projects --lang java -c -f java.csv --silent
The flag blocks output to the console or standard out. This is good for security purposes (if this tool is being used in the GitLab CI/CD) or if the output is being written to CSV so you don't need console output.
--scan-type
<OPTIONAL>
USAGE
:
gl-query get scans --scan-type sast --date-after 2021-08-02 -l java
gl-query get scans --date-after 2021-08-02 --scan-type "sast,srcclear,dependency-scan,srcclr" -l scala
This flag must be used in conjunction with the get scans
action and outputs job information pertaining to the scan type(s). Currently there are three scan types supported for this query:
sast
(JS, Python, Go, Java, C++, sbt, Rust )dependency-scan
(JS, Python, Go, Java, sbt, Rust)srcclear
or srcclr
(JS, Python, Go, Java, C++, sbt, Rust )
If you choose to select multiple scan types, remember to comma-delimit them surround them in double quotes as in the example above.
--version
Obtains version of gl-query
utility. This should correspond with a Git Tag.
Usage Examples
gl-query get project --project-id 3430
gl-query get projects -l scala --date-filter 2021-07-30 --csv --filename test.csv -p
gl-query get pipelines --project-id 4748
gl-query get pipeline --project-id 3430 --pipeline-id 247249
Dev Guide
Useful curl
commands for troubleshooting
curl --header "Authorization: Bearer <your_access_token>" "https://gitlab.gaikai.com/api/v4/projects"
To get x-total-pages
:
curl -s -I --header "Authorization: Bearer <your_access_token>" "https://gitlab.gaikai.org/api/v4/projects/"
Future Work
-
Testing (CI) : Integrate gl-query_test.sh
script into pipeline or pre-commit
hook
-
Implement the conditional query param (date_before
versus date_after
dilemna) here
-
Implement GitLab Birthday in config file
-
For get scans
action, implement:
- Lookup by project-id / project-name
-
Implement custom exclude_projects functionality
-
API throttling -- backoff after a certain number of API Requests (need to find out GitLab API limits)
-
Minimize / consolidate number of API calls
-
Investigate more robust / centralized exception Handling
-
Flags
--top-level-group
--cgei-subgroup
--include-user-repositories
-
Implement search functionality: https://gitlab.gaikai.org/api/v4/projects?search=nexus
-
Create a global query options function and child functions for each type of object (projects, pipelines, etc.)
-
Integrate --csv
output for all actions
Developer Notes
Manually Deploying to PyPi
bumpversion --current-version <CURRENT_VERSION> minor setup.py gl-query/__init__.py --allow-dirty
python3 setup.py sdist bdist_wheel
twine upload dist/gl-query-<VERSION_NUMBER>*
twine upload dist/gl_query-<VERSION_NUMBER>*
Query Options / Parameters
-
Query parameters are generally of concern when executing listing operations (i.e. list projects, groups, users, pipelines, jobs, merge requests, etc...)
-
Query parameters are generally not used when dealing with a specific project or pipeline.
-
GLOBAL Query Parameters
-
Projects Query Parameters
archived=false
last_activity_after
last_activity_before
with_programming_language
order_by=last_activity_at
-
Pipelines
order_by=updated_at
yaml_errors=false
scope=finished
updated_after
updated_before
-
To implement pipeline_id
filter on get scans
:
- Iterate through pipeline pages for given project
- Piggy back off the get_pipeline method for job info
- Maintain running sum of target scans
Using Date Filters
tldr; Do not use both date_before
and date_after
in projects queries.
Scenario:
-
Today's date is August 3, 2021
-
Use Case: You are searching for sast
scans between July 15, 2021 and July 30, 2021
-
Project X has a pipeline which ran on July 15, 2021 with a sast
scan.
-
If you execute a gl-query
with --date-before 2021-07-30
and --date-after 2021-07-15
, this will only include projects that were last active
between July 15, 2021 and July 30, 2021 as it is using the &last_activity
query parameters.
-
If Project X, was updated after 2021-07-30
for any reason, it will not show up in the query results since it's not between the specified date range, therefore excluding a perfectly legitimate sast
scan from the results.
-
The solution to this is to not use both the &last_activity_before
and &updated_before
query parameters when making requests to projects
API endpoints respectively. Instead of using the date_before
param, we will use just the date_after
param. This will ensure that all projects and pipelines will be included in your query.
-
Although this avoids the false negative problem, it significantly increases query time since it will query ALL projects/pipelines AFTER the given date. It is possible to speed this up by implementing the following logic:
if (DATE_BEFORE - GL_BDAY ) > ( TODAY - DATE_AFTER):
Use date_after for `&last_activity_after` and `&updated_after` AND do not use date_before
else:
Use the date_before for `&last_activity_before` and `&updated_before`
- This approach has it's own risks as GL project/pipeline activity may be skewed such that activity was low for the first couple months / year, so keep this into account if you choose to implement this.
Useful Queries
All Scans & All Languages since a certain date
gl-query get scans --all-languages --scan-type "sast, dependency-scan, srcclr, srcclear" --date-after 2021-07-04 --csv -f all_scans_since_07_04_21.csv
Note, we query for both srcclear
and srcclr
since they are both used as job names in GitLab CI.
Java SAST Scans ( w/ --date-after
)
# Java SAST Scans (July 3 - August 3)
gl-query get scans --scan-type sast -l Java --date-after 2021-07-03
RESULT:
Total SAST Scan Jobs after 2021-07-03 : 576
ALL sast
Scans ( w/ --date-after
)
gl-query get scans --scan-type sast --all-languages --date-after 2021-07-03
ALL dependency-scan
( w/ --date-after
)
gl-query get scans --scan-type dependency-scan --all-languages --date-after 2021-07-03
ALL srcclr
or srcclear
( w/ --date-after
)
gl-query get scans --scan-type srcclr --all-languages --date-after 2021-07-03
gl-query get scans --scan-type srcclear --all-languages --date-after 2021-07-03
[DANGEROUS] ALL sast
scans EVER
# This will look through every project since GL_BDAY (JANUARY 1, 2019) which could generate tremendous load against the GitLab instance
TODAY=$(date '+%Y-%m-%d')
gl-query get scans --scan-type sast --all-languages --date-before $TODAY