Huge News!Announcing our $40M Series B led by Abstract Ventures.Learn More
Socket
Sign inDemoInstall
Socket

ahjo

Package Overview
Dependencies
Maintainers
5
Alerts
File Explorer

Advanced tools

Socket logo

Install Socket

Detect and block malicious and high-risk dependencies

Install

ahjo

Database deployment framework

  • 3.10.1
  • PyPI
  • Socket score

Maintainers
5

Ahjo Framework

PyPI - Python Version PyPI

Description

Ahjo is a database project framework and a deployment script. It is made to unify database project deployment and development practices and to give basic tooling for such projects.

Ahjo provides a base scripts for database deployment with simple commands ("actions"), and the possibility to define custom actions for project's special needs. The scripts are designed to reduce accidental operations to production environments. The actions and their parts are logged by Ahjo.

Database tooling is currently based on sqlalchemy/alembic and SQL scripts. Support for other backends than Microsoft SQL Server is currently limited.

Dependencies

Common

Extras

mssql

azure

visualizations

Install Guide

Install Guide 1 - PyPI

Install Ahjo (without platform-specific dependencies) from Python Package Index with the following command:

pip install ahjo

In order to use Ahjo with the database engine of your choice, install platform-specific dependencies using available tags. For example, if you use Ahjo with Microsoft SQL Server, use tag mssql to install required dependencies. See full list of available tags below.

pip install ahjo[mssql]

Install Guide 2 - Clone and install

  1. Clone Ahjo

    • Get your clone command from the repository.
  2. Install with pip

    • Use -e flag to install package in develop mode.
cd .\ahjo
pip install [-e] .[mssql,azure]

Available platform tags

  • mssql - Microsoft SQL Server
  • azure - Microsoft Azure SQL Database

Install Guide 3 - MSI installation package

Building an MSI installation package

This assumes you have cloned the source code repository and have it open in a shell.

Create a new, empty build venv and install build requirements into it.

Notice: at the time of writing this (11/2024), the latest stable cx_freeze version (7.2.5) does not support Python 3.13 yet. As soon as new cx_freeze version with Python 3.13 support is released, it should be taken in to use.

python -m venv venv_msi_build
.\venv_msi_build\Scripts\Activate.ps1 
pip install -r .\msi_build_requirements.txt

Notice: the last command installs ahjo with the most common options. You may need to update msi_build_requirements.txt to suit your needs.

Notice: if you make changes to the source code, you must install ahjo into build venv again to have those changes included in the next build. Editable pip install doesn't work here, so don't use it.

With the build venv active, build the MSI package with the following command.

python .\msi_build.py bdist_msi

Find the built MSI installation package under (automatically created) dist directory.

Installing the MSI package

MSI installation package installs everything that is needed to execute ahjo shell commands including the required parts of the Python runtime setup. In other words, the target environment doesn't need to have Python installed and there is no need to create separate venvs for ahjo.

  1. Run the msi installer with the default settings it offers
  2. Make sure to start a new shell instance (e.g. Windows PowerShell) after the installation
  3. After a successful installation the following CLI commands are available in the shell:
    • ahjo
    • ahjo-init-project
    • ahjo-multi-project-build
    • ahjo-upgrade
    • ahjo-scan
    • ahjo-install-git-hook
    • ahjo-config
  4. If a new shell instance can't find the executables, ensure that installation path is included in the PATH enviroment variable

Project Initialization

Create a new project by running the following command:

ahjo-init-project

This will start the project initialization command and you'll be asked to enter a name for your new project:

This is Ahjo project initialization command.
Enter project name:

Enter a name for your project and hit enter. Confirm, if project name and locations are correct.

Enter project name: project_1
You are about to initialize a new project project_1 to location C:\projects
Are you sure you want to proceed?
[Y/N] (N): Y
confirmed
[2019-06-04 08:46:23] Ahjo - Creating new project project_1
Project C:\projects\project_1 created.

Usage Example

Before running actions:

  1. Install Ahjo (see "Install Guide")
  2. Initialize project using ahjo-init-project (see "Project Initialization")
  3. Have your development database server running (SQL Server for the example)
  4. Fill database connection information to the config-file

To create a database without objects, run the following command in the project root:

ahjo init config_development.jsonc

After the command, there should be a empty database at the server, as configured in config_development.jsonc. This step is usually run only once, and is not required if the database already exists in the server.

After tables are defined using alembic (see alembic's documentation for creating new version scripts), the tables can be deployed using:

ahjo deploy config_development.jsonc

This command also runs all the SQL scripts that are defined in directories database/functions, database/procedures and database/views.

Conventionally scripts under database/data include row inserts for dimension tables, and database/testdata for mock data insertion scripts. To populate the dimension tables run:

ahjo data config_development.jsonc

To run test SQL on top of mock data:

ahjo testdata config_development.jsonc
ahjo test config_development.jsonc

To run all the previous commands at once, a single (multi-)action "complete-build" can be used:

ahjo complete-build config_development.jsonc

To deploy your project to production you need a new config-file. In production environment actions like "downgrade" can be quite hazard. To exclude such actions set "allowed_actions" to a list:

"allowed_actions": ["deploy", "data"]

Now running "downgrade" is not possible using production configuration.

ahjo downgrade config_production.jsonc
[2019-10-01 12:58:12] Starting to execute "downgrade"
Action downgrade is not permitted, allowed actions: deploy, data
------

To add your own actions (f.e. for more complex testing), modify ahjo_actions.py.

Scripts

Below is a list of available scripts and their descriptions.

ScriptDescription
ahjoMain script for running Ahjo actions.
ahjo-init-projectScript for initializing new Ahjo project.
ahjo-multi-project-buildScript for running actions from multiple projects at once.
ahjo-upgradeScript for running version upgrades.
ahjo-scanScript for searching matches with defined search rules from files in the working directory or git staging area.
ahjo-install-git-hookScript for installing Ahjo scan command as a pre-commit hook.
ahjo-configScript for converting config files between JSON/JSONC and YAML formats.

Actions

Running actions

Ahjo actions are run with ahjo command. The command syntax with positional arguments is:

ahjo <action> <config_filename>

where <action> is the name of the action to be run and <config_filename> is the path to the project-specific config file. The config parameter is optional if the path is defined in environment variable AHJO_CONFIG_PATH.

The rest of the optional parameters are listed below.

ArgumentShorthandDescriptionDefault Value
listList all available actions and their descriptions.
--non-interactive-niSkip confirmation for actions that affect the database.False
--filesList of files to be used in action.
--object_typeType of database object.
--skip-metadata-update-smSkip updating documented extended properties to database.False
--skip-alembic-update-saSkip running alembic migrations.False
--skip-git-update-sgSkip updating current git version to git version table.False

It is also possible to pass custom command-line arguments and their values to actions.

For example, to pass a custom argument --example-arg with values x and y to action example-action, use the following command:

ahjo example-action --example-arg x y

In the action, the values of the custom argument can be accessed from the context object:

example_arg_values = context.get_cli_arg("example-arg") # Returns a list of strings ["x", "y"]

The get_cli_arg method returns a string if the argument has only one value. If the argument has multiple values, it returns a list of strings. If only the argument name is given without values, the method returns True by default. This way, it is possible to pass boolean arguments to actions.

Pre-defined actions

init-config

Initializes local configuration file. This file is intended to hold configurations that should not be versioned.

init

Creates the database. Database is created with module create_db.py. Required configurations are target_database_name, target_server_hostname and sql_port. For optional configurations, see config file cheat sheet below.

structure

Creates database structure, that is schemas, tables and constraints. Primary method runs all SQL files under directories ./database/schemas, ./database/tables and ./database/constraints. Alternate method runs SQL script ./database/create_db_structure.sql. If structure can't be created with primary method (one of the listed directories does not exists etc.), alternate method is attempted.

deploy

Runs alembic migrations, creates views, procedures and functions. First, runs alembic upgrade head. Second, creates functions, views and procedures by executing all SQL files under directories ./database/functions, ./database/views and ./database/procedures. Third, attempts to update documented extended properties to database. Finally, updates current GIT version (git describe) to GIT version table.

deploy-files

Runs alembic upgrade head, creates database objects by executing all SQL files listed in --files argument and updates current GIT version (git describe) to GIT version table. Example usage: ahjo deploy-files .\config_development.jsonc --files ./database/procedures/dbo.procedure.sql ./database/functions/dbo.function.sql .

data

Runs data insertion scripts. Runs all SQL files under ./database/data.

testdata

Inserts data for testing into database. Runs all SQL files under ./database/data/testdata.

complete-build

Runs actions init, deploy, data, testdata and test in order.

create-db-roles

Runs SQL scripts defined in ./database/permissions/create_db_roles.sql to create database roles. To pass scripting variables to the SQL script, define them in the config file under db_permissions_variables key (optional).

grant-db-permissions

Runs SQL scripts defined in ./database/permissions/grant_db_permissions.sql to grant database permissions. To pass scripting variables to the SQL script, define them in the config file under db_permissions_variables key (optional).

create-db-users

Runs SQL scripts defined in ./database/permissions/create_db_users.sql to create database users. To pass scripting variables to the SQL script, define them in the config file under db_permissions_variables key (optional).

add-users-to-roles

Runs SQL scripts defined in ./database/permissions/add_users_to_db_roles.sql to add users to database roles. To pass scripting variables to the SQL script, define them in the config file under db_permissions_variables key (optional).

drop

Drops all views, procedures, functions, clr-procedures and assemblies that are listed in directory ./database. Drops are executed with TRY-CATCH.

drop-files

Drops database objects from locations that are listed in --files argument. Database objects can be views, procedures, functions or assemblies. Object type is read from --object_type argument. Acceptable --object_type parameters: view, procedure, function, assembly. Example usage: ahjo drop-files .\config_development.jsonc --files ./database/procedures/dbo.procedure_1.sql ./database/procedures/dbo.procedure_2.sql --object_type procedure .

downgrade

Reverts the database back to basic structure. First, drops all views, procedures, functions, clr-procedures and assemblies that are listed in directory ./database. Drops are executed with TRY-CATCH. Second, runs alembic downgrade base.

version

Prints the alembic- and git-version currently in the database. Alembic version is read from alembic_version_table. GIT version is read from git_table.

update-file-obj-prop

Write database objects and their extended properties (only SQL Server) to JSON files under ./docs/db_objects directory. Use metadata_allowed_schemas to define which schema-scoped objects are documented. The JSON files are named after the object type, for example procedures.json or views.json. Example of JSON file content:

{
    "dbo.procedure_1": {
        "description": "This is a procedure",
        "author": "John Doe"
    },
    "dbo.procedure_2": {
        "description": "This is another procedure",
        "author": "Jane Doe"
    }
}

Each object is represented as a dictionary where the key is the object name and the value is a dictionary of extended properties. The dictionary can contain any number of key-value pairs where the key is the name of the extended property and the value is the value of the extended property. If the object type is column, the object name should be in the format: <schema name>.<table or view name>.<column name>.

update-db-obj-prop

Update documented extended properties (only SQL Server) from JSON files under ./docs/db_objects directory to database. Use metadata_allowed_schemas to define which schema-scoped objects are documented. The format of the JSON files is the same as defined in update-file-obj-prop.

It is also possible to define default metadata for columns in ./docs/db_objects/columns_default.json file. So if a table has a column called id, and a default value for the Description for a column called id has been defined in docs/db_objects/columns_default.json, that value is used, unless there is a row for that specific table's id column in docs\db_objects\columns.json.

To define default metadata for columns, use the following format:

{
    "id": {
        "Description": "Surrogate key."
    },
    "Timestamp": {
        "Description": "Timestamp of the record creation."
    }
}

where the key is the column name and the value is a dictionary of extended properties.

test

Runs tests and returns the results. Runs all SQL scripts under ./database/tests. If save_test_results_to_db is set to true, saves the results to a table defined in test_table_name and test_table_schema. The columns of the result set should match the columns of the table where the results are saved (only the matching columns are saved to the database - the rest are ignored).

create_test_table

Creates a table for test action results. The table is created with the columns defined in a global variable DEFAULT_TEST_TABLE_COLS in ahjo.scripts.master_actions.py. The name and schema of the table are defined in test_table_name and test_table_schema.

create_test_view

Creates a view for displaying test action results. The columns of the view are the same as the columns of the table where the results are saved. The name and schema of the view are defined in test_view_name and test_view_schema.

plot-dependencies

Parse the SQL files in the project and create a dependency graph of the database objects. Plots the graph with plotly. Optionally, use --files to parameter define which files are used in the action.

list

You can view all available actions and their descriptions with command ahjo list.

ahjo list
-------------------------------
List of available actions
-------------------------------
'assembly': (MSSQL) Drop and deploy CLR-procedures and assemblies.
'complete-build': (MSSQL) Run 'init', 'deploy', 'data', 'testdata' and 'test' actions.
.
.
.

Config File

Ahjo requires config file to be JSON, JSONC (JSON with comments) or YAML format. Ahjo configs are located in BACKEND section of file. Below is an example of config file (in both JSONC and YAML format) and a cheat sheet for config file parameters.

JSONC config file

{
    "BACKEND": {
        "allowed_actions": "ALL",
        //Git repository and git version table information
        "url_of_remote_git_repository": "https:\\\\github.com\\user\\projectx\\",
        "git_table": "git_version",
        "git_table_schema": "dbo",
        //Database connection information
        "sql_port": 1433,
        "sql_driver": "ODBC Driver 18 for SQL Server",
        "target_database_name": "PROJECTX",
        "target_server_hostname": "localhost",
        // Database file location
        "database_data_path": "/var/opt/mssql/data/PROJECTX.mdf",
        "database_log_path": "/var/opt/mssql/data/PROJECTX.ldf",
        //Alembic
        //the table that alembic creates automatically for migration version handling
        "alembic_version_table": "alembic_version",
        "alembic_version_table_schema": "dbo",
    }
}

YAML config file

BACKEND:
  ## List of allowed Ahjo actions. If all actions are allowed use "ALL" option.
  allowed_actions: ALL
  ## Git repository url and git version table information.
  url_of_remote_git_repository: ''
  git_table: git_version
  git_table_schema: dbo
  ## Database connection information.
  sql_port: 14330
  sql_driver: ODBC Driver 18 for SQL Server
  sql_dialect: mssql+pyodbc
  target_server_hostname: localhost
  target_database_name: DB_NAME

Config file cheat sheet

ParameterRequiredDescriptionTypeDefault Value
alembic_version_tableNoName of Alembic version table. Table holds the current revision number.str"alembic_version"
alembic_version_table_schemaNoSchema of Alembic version table.str"dbo"
allowed_actionsYesList of actions Ahjo is allowed to execute. If all actions are allowed, use "ALL".str or list of str
skipped_actionsNoList of actions that are skipped.list of str[]
azure_authenticationNoAuthentication type to Azure AD. Possible values are "ActiveDirectoryPassword", "ActiveDirectoryInteractive", "ActiveDirectoryIntegrated" and "AzureIdentity".str
azure_identity_settingsNoA dictionary containing parameters for azure-identity library (used only if azure_authentication is "AzureIdentity"). Dictionary holds a key: "token_url" (str). Note: currently ahjo supports only AzureCliCredential authentication method.dict
database_collationNoCollation of database. If the defined collation is different from the database collation, a warning is logged.str"Latin1_General_CS_AS"
database_compatibility_levelNoCompatibility level of database.intDepends on server. SQL Server 2017 default is 140.
database_data_pathNoPath of database data file.str
database_file_growthNoThe size (MB) of how much database data file will grow when it runs out of space.int500
database_init_sizeNoInitial size (MB) of database data file.int100
database_log_pathNoPath of database log file.str
database_max_sizeNoMaximum size (MB) of database data file.int10000
enable_database_loggingNoEnable logging to database.booleanfalse
log_table_schemaNoSchema of log table.str"dbo"
log_tableNoName of ahjo log table.str"ahjo_log"
git_tableNoName of git hash table. Table holds current branch, commit hash and URL of remote repository.str"git_version"
git_table_schemaNoSchema of git hash table.str"dbo"
metadata_allowed_schemasNoList of schemas that extended properties will be written to JSON files and updated to database. If list left empty, nothing is documented or updated.list of str
password_fileNoPath of file where password will be stored. If no path given, credentials are asked everytime any database altering action is run.str
sql_dialectNoDialect used by SQL Alchemy.str"mssql+pyodbc"
sql_driverNoName of ODBC driver.str
sql_portYesPort number of target database server.int
target_database_nameYesName of target database.str
target_server_hostnameYesHost name of target database server.str
target_database_protectedNoAsks user to verify database name before running actions.booleanfalse
url_of_remote_git_repositoryNoURL of project's remote repository.str
username_fileNoPath of file where username will be stored. If no path given, credentials are asked everytime any database altering action is run.str
db_permissionsNoList of dictionaries containing file locations & scripting variables for setting up database permissions from sql file(s). Dictionary holds keys: "source" (str) and "variables" (dict).list of dict
db_permission_invoke_methodNoInvoke method for setting up database permissions. Available options: "sqlcmd" or "sqlalchemy" (default).str"sqlalchemy"
upgrade_actions_fileNoConfiguration file for upgrade actions.str"./upgrade_actions.jsonc"
catalog_collation_type_descNoCatalog collation setting of database. If the defined setting is different from the database setting, a warning is logged. Applies only to Azure SQL Databasestr"DATABASE_DEFAULT"
display_db_infoNoPrint database collation information to console before running actions.booleantrue
context_connectable_typeNoType of SQLAlchmey object returned by Context.get_connectable(). Possible values are "engine" and "connection".str"engine"
transaction_modeNoTransaction management style for ahjo actions. Applied only if context_connectable_type is "connection". Possible values are "begin_once" and "commit_as_you_go". If "begin_once", a transaction is started before running actions and committed after all actions are run. If "commit_as_you_go", a transaction is started before running actions but not committed automatically.str"begin_once"
git_version_info_pathNoPath to git version info file. Retrieve git commit information from this file if git repository is not available. JSON file format: {"repository": "<url>", "commit": "<commit hash>", "branch": "<branch name>"}str
windows_event_logNoLog Ahjo events to Windows Event Log.booleanfalse
ahjo_action_filesNoDefines the location and name of project-specific Ahjo actions files.list of dict
sqlalchemy.urlNoSQLAlchemy database URL. If defined, overrides the values of dialect, sql_port, sql_driver, target_server_hostname and target_database_name.str
sqlalchemy.*NoItems under sqlalchemy.* are passed as parameters to SQLAlchemy's create_engine function. For example sqlalchemy.pool_size: 10 is passed as pool_size=10 to create_engine function.dictIf dialect is mssql+pyodbc: "sqlalchemy.use_insertmanyvalues": false, "sqlalchemy.use_setinputsizes": false
sqla_url_query_mapNoA dictionary containing SQLAlchemy URL query connection parameters.dictIf ODBC Driver 18: {"Encrypt" : "yes", "LongAsMax": "Yes"}. If ODBC Driver 17 or older: {"Encrypt" : "no"}. Else: {}
enable_sqlalchemy_loggingNoEnable SQLAlchemy logging.booleanfalse
save_test_results_to_dbNoSave test action results to database table.booleanfalse
test_table_schemaNoSchema of the table where test action results are saved.str"dbo"
test_table_nameNoName of the table where test action results are saved.str"ahjo_tests"
create_test_table_if_not_existsNoCreate test table if it does not exist.booleantrue
test_view_nameNoName of the view that is used to display test action results.str"vwAhjoTests"
test_view_schemaNoSchema of the view that is used to display test action results.str"dbo"
connect_resilientlyNoTest database connection before running actions. If connection fails, retry connection for connect_retry_count times with connect_retry_interval seconds interval.booleantrue
connect_retry_countNoNumber of retries for database connection.int20
connect_retry_intervalNoInterval between connection retries in seconds.int10

Config conversion

Config file can be converted from JSON/JSONC to YAML or vice versa with ahjo-config command:

ahjo-config --convert-to <target_format> --config <config_file_path> --output <output_file_path>

Using Alembic with Ahjo

Alembic upgrade HEAD is used in deploy action, but for many use cases other alembic commands are needed. For these needs Ahjo comes with a env.py file that enables running Alembic commands without running Ahjo.

The env.py modifications provide logging integration to Ahjo, default naming schema and possibility to run alembic according to project configuration. The engines are created according to configuration, and there is no need for storing plain-text connection strings in the project.

Usage example:

alembic -x main_config=config_development.jsonc downgrade -1

The env.py is created in initialize-project command.

Alembic API commands

Ahjo supports running alembic commands with the following command:

alembic-api <arguments>

Alembic API can be used for example in a situation where Ahjo is installed with MSI package and one needs to run alembic commands without python environment.

Examples:

alembic-api revision -m "create table <table_name>"
alembic-api -x main_config=config_development.jsonc upgrade +1

Authentication

Depending on the configuration, the database credentials can be stored into files or be asked when needed, once per application run. The credential handling is shared with alembic with custom env.py file. The username and password files can be defined in the config file with the keys username_file and password_file. If no path is given, credentials are asked every time any database altering action is run. The password and username files are created automatically if they do not exist.

Microsoft Entra

Ahjo supports authentication with Microsoft Entra / Azure AD. The authentication type is defined in the config file with the key azure_authentication. The possible values are "ActiveDirectoryPassword", "ActiveDirectoryInteractive", "ActiveDirectoryIntegrated" and "AzureIdentity". The authentication type "AzureIdentity" is used for authentication with azure-identity library (see Azure-identity).

Azure-identity

Instructions for enabling azure identity authentication in ahjo:

  1. Install Azure CLI & azure-identity
  2. Set the config variable azure_authentication to AzureIdentity

Sign in interactively through your browser with the az login command. If the login is successful, ahjo will use Azure credentials for creating an engine that connects to an Azure SQL database.

Running actions from multiple projects

To run all selected actions from different projects at once, a single command "ahjo-multi-project-build" can be used:

ahjo-multi-project-build path/to/config_file.jsonc

Use -c or --confirm flag to enable confirmation messages for ahjo actions.
Below is an example of JSONC config file. With the following definition, multi-project-build command executes complete-build actions of three ahjo projects:

{
    "projects_path": "path/to/projects_folder",
    "connection_info": {
        "sql_port": 14330,
        "sql_driver": "ODBC Driver 18 for SQL Server",
        "target_server_hostname": "localhost"
    },
    "projects": {
        "ahjo_project_1_name": {
            "config": "path/to/projects_folder/ahjo_project_1_name/config_development.jsonc",
            "actions": [
                "complete-build"
            ]
        },
        "ahjo_project_2_name": {
            "config": "path/to/projects_folder/ahjo_project_2_name/config_development.jsonc",
            "actions": [
                "complete-build"
            ]        
        },
        "ahjo_project_3_name": {
            "config": "path/to/projects_folder/ahjo_project_3_name/config_development.jsonc",
            "actions": [
                "complete-build"
            ]        
        }
    }
}

Settings under connection_info contains database server definitions in the same format as in ahjo project config file (excluding target_database_name parameter, which is not used here).
Currently in this version ahjo projects should be located under the folder specified in projects_path setting.
Ahjo project names are listed under projects section in run order. In this example, the actions of project ahjo_project_1_name are executed first and the actions of project ahjo_project_3_name are executed last.
The following settings are defined under the name of the ahjo project(s):

config - File path to the project-specific config file
actions - List of project actions to be executed

Ahjo project upgrade

Database updates can be run with ahjo-upgrade command. The command detects automatically the latest installed git version and runs the required database version updates (in order). The upgrade actions are defined in a JSONC file and its location is defined in upgrade_actions_file setting in project config file. The ahjo actions required for version upgrades are defined with key-value pairs, where key is the git version tag and value is a list of actions. The list of actions can contain strings of action names or lists of action names and action parameters. If the action is defined with parameters, the action name is the first item in the list and the parameters are defined as a dictionary in the second item in the list. The dictionary contains the parameters of the action as key-value pairs, where key is the parameter name and value is the parameter value.

Below is an example of upgrade actions file:

{
	"v3.0.0": ["deploy", "data"],
	"v3.1.0": ["deploy", ["deploy-files", {"files": ["./database/procedures/dbo.procedure_1.sql", "./database/procedures/dbo.procedure_2.sql"]}]],
	"v3.1.1": ["deploy"]
}

To upgrade specific version, use -v or --version flag:

ahjo-upgrade -v v3.1.0

Ahjo scan

ahjo-scan command can be used to search for matches with defined search rules from files in the working directory or git staging area. The search results are printed to the console and logged to a file. The command can be used to search e.g. illegal database object modifications, sensitive information or custom regex patterns defined by the user.

ArgumentShorthandDescriptionRequiredDefault Value
--search-rules-rPath to YAML file that defines the search rules.No./ahjo_scan_rules.yaml
--stage-stScan files in git staging area instead of working directory.NoFalse
--ignore-config-igPath to YAML file that defines the ignore rules.No./ahjo_scan_ignore.yaml
--init-inInitialize config files for scan rules and ignored scan results.NoFalse
--add-results-to-ignore-aiAdd found scan results to ignore config file.NoFalse

The search rules are defined as a list of dictionaries. Each dictionary contains a search rule name, a list of file paths to be searched and parameters for the search rule. It is also possible to define a custom regex pattern for the search rule instead of using predefined search rules. The regex pattern is defined as a string in the pattern parameter.

Built-in search rules

Rule nameDescriptionAcceptable parameters
hetuFinnish Personal Identity Numberfilepath
emailEmail addressfilepath
sql_object_modificationDatabase object modification (SQL Server). The search rule searches for database object modifications from SQL files. The search rule can be configured to search for modifications of specific database object types (e.g. PROCEDURE) and/or specific database object schemas and/or specific database objects (e.g. table names). If no parameters are defined, the search rule searches for modifications of all database object types, all schemas and all objects.object_types, schemas, objects, filepath
alembic_table_modificationDatabase table modification (Alembic). The search rule searches for database table modifications from alembic migrations. If no parameters are defined, the search rule searches for modifications of all schemas and all tables.schemas, filepath
sql_insertDatabase insert (SQL Server). The search rule searches for database inserts from SQL files. The search rule can be configured to search for inserts to specific database object schemas and/or specific database tables. If no parameters are defined, the search rule searches for inserts to all schemas and all tables.schemas, tables, filepath

Rule parameters

ParameterDescriptionType
filepathList of file paths.list
patternCustom regex pattern.str
schemasList of database object schemas.list
object_typesList of database object types.list
objectsList of database objects.list
tablesList of database tables.list

Search rules file

The search rules are defined in a YAML file. The file can be initialized with --init or -in flag. By default, the search rules are defined in ./ahjo_scan_rules.yaml file. Use --search-rules or -r flag to define a different path for the search rules file.

Here is an example of search rules file:

- name: sql_object_modification
  filepath: 
    - database/
  object_types:
    - PROCEDURE
  schemas:
    - dummy
    - utils
- name: hetu
  filepath: 
    - database/
    - alembic/versions/
- name: select_star # Custom rule
  filepath: 
    - database/
  pattern: SELECT \* # This is a regular expression

Ignoring scan results

To filter out false positives, scan results can be ignored by defining ignore rules in a YAML file. The ignore rules are defined as a list of dictionaries. Each dictionary contains a file path and a list of matches or rules to be ignored. The file can be initialized with --init or -in flag. By default, the ignore rules are defined in ./ahjo_scan_ignore.yaml file. Use --ignore-config or -ig flag to define a custom path for the ignore results file.

Below is an example of ignore results file:

- file_path: database/data/persons.sql
  matches:
    - 010106A921P
    - 130202A904N
- file_path: database/data/addresses.sql
  rules:
    - sql_object_modification

Adding scan results to ignore file

Using --add-results-to-ignore or -ai flag, the found scan results can be added to the ignore file. The flag can be used to add all found scan results to the ignore file. If the found scan results already exist in the ignore file, they are not added again.

Scan as a pre-commit hook

Ahjo scan command can be used as a pre-commit hook to prevent committing files that contain e.g. sensitive information or illegal database object modifications. This can be accomplished by utilizing a Git pre-commit hook script that automatically executes ahjo-scan command on each commit and prevents the commit if the scan finds matches with the defined search rules. To use the hook, you need to have the ahjo-scan.exe accessible as a shell command. (e.g. the tool is installed from an MSI package, see Installation Guide 3).

Setting up the hook

To install the hook, run the following command in the project root directory:

ahjo-install-git-hook

The script creates a file named pre-commit to Git hooks directory. By default, Git hooks are located in the .git/hooks directory of the repository. This can be changed by setting the core.hooksPath configuration variable to the desired path. See Git documentation for more information.

Logging

Log files

Everything Ahjo prints to console, is also written into log file ahjo.log in the project root directory. The log files are created automatically if they do not exist.

Database log

Ahjo supports database logging. The logging can be enabled by setting enable_database_logging to true in the config file. The log entries are written into a log table. The name and schema of the log table can be defined in the config file with keys log_table and log_table_schema. The log table is created automatically if it does not exist.

Windows Event Log

Logging can be done to Windows Event Log by setting windows_event_log to true in config file. This feature can be utilized for Azure Monitor activities, for example.

SQLAlchemy log

SQL Alchemy logging can be enabled by setting enable_sqlalchemy_logging to true in config file. The logging is done to a file named sqlalchemy.log in the project root directory. The log files are created automatically if they do not exist.

Customization

Setting up custom action files

Every default Ahjo action and multiaction can be overwritten in project's Ahjo actions file. By default, the file is located in ahjo_actions.py, but the location can be changed in config file with key ahjo_action_files. It is possible to define multiple Ahjo actions files. This can be useful for example when you want to use different actions for different environments or separate actions that are compatible with MSI-installed ahjo from actions that are compatible with pip-installed ahjo. The action files are loaded in order, so if you define multiple actions for the same action name, the last loaded action file will be used. If a multi-action uses actions from different files, remember to load all the subactions before the multi-action. Here is an example of ahjo_action_files configuration:

"ahjo_action_files": [
    {
        "source_file": "ahjo_prod_actions.py", // The location is relative to project root directory.
        "name": "Example project Ahjo actions (prod)" // Name is used in logging.
    },
    {
        "source_file": "ahjo_dev_actions.py", 
        "name": "Example project Ahjo actions (dev)"
    }
]

Overwriting default Ahjo actions

In example below, 'init' and 'complete-build' actions are overwritten.

import ahjo.operations as op
from ahjo.action import action, create_multiaction


@action(affects_database=True)
def structure(context):
    """New structure action."""
    from models import Base
    Base.metadata.create_all(context.get_engine())

# overwrite Ahjo complete-build
create_multiaction("complete-build", ["init", "structure", "deploy", "data", "test"])

License

Copyright 2019 - 2024 ALM Partners Oy

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

Keywords

FAQs


Did you know?

Socket

Socket for GitHub automatically highlights issues in each pull request and monitors the health of all your open source dependencies. Discover the contents of your packages and block harmful activity before you install or update your dependencies.

Install

Related posts

SocketSocket SOC 2 Logo

Product

  • Package Alerts
  • Integrations
  • Docs
  • Pricing
  • FAQ
  • Roadmap
  • Changelog

Packages

npm

Stay in touch

Get open source security insights delivered straight into your inbox.


  • Terms
  • Privacy
  • Security

Made with ⚡️ by Socket Inc