A CLI tool to streamline getting started with Apache Airflow™ and managing multiple Airflow projects.
airflowctl is a command-line tool for managing Apache Airflow™ projects. It provides a set of commands to initialize, build, start, stop, and manage Airflow projects. With airflowctl, you can easily set up and manage your Airflow projects, install specific versions of Apache Airflow, and manage virtual environments.
The main goal of airflowctl is to let first-time Airflow users install and set up Airflow with a single command, and to let existing Airflow users manage multiple Airflow projects with different Airflow versions on the same machine.
pip install airflowctl
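Once installed, you can confirm the CLI is on your PATH by printing its help output (each command also supports a --help flag, as noted later in this README):
# Verify the installation
airflowctl --help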
To initialize a new Airflow project with the latest Airflow version, build a virtual environment for it, and run the project, run the following command:
airflowctl init my_airflow_project --build-start
This will start Airflow and display the logs in the terminal. You can access the Airflow UI at http://localhost:8080. To stop Airflow, press Ctrl+C.
To create a new Apache Airflow project, use the init command. This command sets up the basic project structure, including configuration files, directories, and sample DAGs.
airflowctl init <project_name> --airflow-version <version> --python-version <version>
Example:
airflowctl init my_airflow_project --airflow-version 2.6.3 --python-version 3.8
This creates a new project directory with the following structure:
my_airflow_project
├── .env
├── .gitignore
├── dags
│   └── example_dag_basic.py
├── plugins
├── requirements.txt
└── settings.yaml
Description of the files and directories:
- .env file contains the environment variables for the project.
- .gitignore file contains the default gitignore settings.
- dags directory contains the sample DAGs.
- plugins directory contains the sample plugins.
- requirements.txt file contains the project dependencies.
- settings.yaml file contains the project settings, including the project name, Airflow version, Python version, and virtual environment path.
In our example, the settings.yaml file would look like this:
# Airflow version to be installed
airflow_version: "2.6.3"
# Python version for the project
python_version: "3.8"
# Path to a virtual environment to be used for the project
mode:
  name: "uv"
  config:
    venv_path: "PROJECT_DIR/.venv"
# Airflow connections
connections:
  # Example connection
  # - conn_id: example
  #   conn_type: http
  #   host: http://example.com
  #   port: 80
  #   login: user
  #   password: pass
  #   schema: http
  #   extra:
  #     example_extra_field: example-value
# Airflow variables
variables:
  # Example variable
  # - key: example
  #   value: example-value
  #   description: example-description
Edit the settings.yaml file to customize the project settings.
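For illustration, here is a sketch of what the connections and variables sections might look like once the commented template above is filled in with its placeholder values; swap in your own connection details:
connections:
  - conn_id: example
    conn_type: http
    host: http://example.com
    port: 80
    login: user
    password: pass
    schema: http
    extra:
      example_extra_field: example-value
variables:
  - key: example
    value: example-value
    description: example-description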
The build command creates the virtual environment, installs the specified Apache Airflow version, and sets up the project dependencies.
Run the build command from the project directory:
cd my_airflow_project
airflowctl build
The CLI relies on one of uv or pyenv to download and install a Python version if the version is not already installed. For example, if you have Python 3.8 installed but specify Python 3.7 in the settings.yaml file, the CLI will first install Python 3.7 using uv or pyenv and create a virtual environment with Python 3.7.
Optionally, you can choose a custom virtual environment path in case you have already installed the apache-airflow package and other dependencies. Pass the existing virtualenv path using the --venv_path option to the init command or in the settings.yaml file, as shown in the sketch below. Make sure the existing virtualenv has the same Airflow and Python versions as your settings.yaml file states.
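A minimal sketch of reusing an existing environment, assuming the option is spelled --venv_path as mentioned above and that the path (hypothetical here) points at a virtualenv with a matching Airflow and Python version:
# Reuse an existing virtual environment instead of creating a new one
airflowctl init my_airflow_project --airflow-version 2.6.3 --python-version 3.8 --venv_path /path/to/existing/.venv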
To start Airflow services, use the start command. This command activates the virtual environment and launches the Airflow web server and scheduler.
Example:
airflowctl start my_airflow_project
You can also start Airflow in the background with the --background
flag:
airflowctl start my_airflow_project --background
To monitor logs from the background Airflow processes, use the logs command. This command displays live logs and provides options to filter logs for specific components.
Example:
airflowctl logs my_airflow_project
To filter logs for specific components:
# Filter logs for scheduler
airflowctl logs my_airflow_project -s
# Filter logs for webserver
airflowctl logs my_airflow_project -w
# Filter logs for triggerer
airflowctl logs my_airflow_project -t
# Filter logs for scheduler and webserver
airflowctl logs my_airflow_project -s -w
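Putting the two together, a typical background workflow combines the start and logs commands shown above:
# Start Airflow in the background, then follow the scheduler logs
airflowctl start my_airflow_project --background
airflowctl logs my_airflow_project -s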
To stop Airflow services if they are still running, use the stop command.
Example:
airflowctl stop my_airflow_project
To list all Airflow projects, use the list command.
Example:
airflowctl list
To show project info, use the info command.
Example:
# From the project directory
airflowctl info
# From outside the project directory
airflowctl info my_airflow_project
To run Airflow commands, use the airflowctl airflow command. All the commands after airflowctl airflow are passed to the Airflow CLI:
# From the project directory
airflowctl airflow <airflow_command>
Example:
$ airflowctl airflow version
2.6.3
You can also run airflowctl airflow --help to see the list of available commands.
$ airflowctl airflow --help
Usage: airflowctl airflow [OPTIONS] COMMAND [ARGS]...
Run Airflow commands.
Positional Arguments:
GROUP_OR_COMMAND
Groups:
celery Celery components
config View configuration
connections Manage connections
dags Manage DAGs
db Database operations
jobs Manage jobs
kubernetes Tools to help run the KubernetesExecutor
pools Manage pools
providers Display providers
roles Manage roles
tasks Manage tasks
users Manage users
variables Manage variables
Commands:
cheat-sheet Display cheat sheet
dag-processor Start a standalone Dag Processor instance
info Show information about current Airflow and environment
kerberos Start a kerberos ticket renewer
plugins Dump information about loaded plugins
rotate-fernet-key
Rotate encrypted connection credentials and variables
scheduler Start a scheduler instance
standalone Run an all-in-one copy of Airflow
sync-perm Update permissions for existing roles and optionally DAGs
triggerer Start a triggerer instance
version Show the version
webserver Start a Airflow webserver instance
Options:
-h, --help show this help message and exit
Example:
# Listing dags
$ airflowctl airflow dags list
dag_id | filepath | owner | paused
==================+======================+=========+=======
example_dag_basic | example_dag_basic.py | airflow | True
# Running standalone
$ airflowctl airflow standalone
Or you can activate the virtual environment first and then run the commands as shown below.
Example:
# From the project directory
source .venv/bin/activate
# Source all the environment variables
source .env
airflow version
To add a new DAG, add the DAG file to the dags directory. To edit an existing DAG, edit the DAG file in the dags directory. The changes will be reflected in the Airflow web server.
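As an illustration, a minimal DAG you might drop into the dags directory could look like the sketch below; the file name, dag_id, and task are made up for this example and use only the standard Airflow API:
# dags/my_new_dag.py
from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator

with DAG(
    dag_id="my_new_dag",
    start_date=datetime(2023, 1, 1),
    schedule=None,   # trigger manually from the UI
    catchup=False,
) as dag:
    # Single task that just echoes a message
    hello = BashOperator(task_id="hello", bash_command="echo hello from airflowctl")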
airflowctl by default uses SQLite as the backend database and SequentialExecutor as the executor. However, if you want to use other databases or executors, you can stop the project and either a) edit the airflow.cfg file or b) add environment variables to the .env file.
Example:
# Stop the project
airflowctl stop my_airflow_project
# Changing the executor to LocalExecutor
# Change the database to PostgreSQL if you already have it installed
echo "AIRFLOW__DATABASE__SQL_ALCHEMY_CONN=postgresql+psycopg2://airflow:airflow@localhost:5432/airflow" >> .env
echo "AIRFLOW__CORE__EXECUTOR=LocalExecutor" >> .env
# Start the project
airflowctl start my_airflow_project
Check the Airflow documentation for all the available Airflow configurations.
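Any other Airflow setting can be overridden the same way using Airflow's AIRFLOW__{SECTION}__{KEY} environment variable convention. For example, to hide Airflow's bundled example DAGs (a standard Airflow setting, not something specific to airflowctl):
# Stop loading Airflow's bundled example DAGs
echo "AIRFLOW__CORE__LOAD_EXAMPLES=False" >> .env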
For Airflow >= 2.6, you can run LocalExecutor with sqlite as the backend database by adding the following environment variables to the .env file:
_AIRFLOW__SKIP_DATABASE_EXECUTOR_COMPATIBILITY_CHECK=1
AIRFLOW__CORE__EXECUTOR=LocalExecutor
[!WARNING] SQLite is not recommended for production use. Use it for development and testing only.
For more information and options, you can use the --help flag with each command.
airflowctl can be used with other Airflow projects as long as the project structure is the same. airflowctl can be used with Astro CLI projects too.
While airflowctl is a tool that allows you to run Airflow locally using virtual environments, Astro CLI allows you to run Airflow locally using Docker. airflowctl can read the airflow_settings.yaml file generated by Astro CLI for reading connections and variables, and will then reuse it as the settings file for airflowctl.
For example, if you have an Astro CLI project:
- Run the airflowctl init . --build-start command to initialize airflowctl from the project directory.
- Press y to continue when prompted.
- If the Airflow version is not found in the airflow_settings.yaml file, you will be prompted for it; the Python version can likewise be set in the airflow_settings.yaml file in the python_version field.
# From the project directory
$ cd astro_project
$ airflowctl init . --build-start
Directory /Users/xyz/astro_project is not empty. Continue? [y/N]: y
Project /Users/xyz/astro_project added to tracking.
Airflow project initialized in /Users/xyz/astro_project
Detected Astro project. Using Astro settings file (/Users/xyz/astro_project/airflow_settings.yaml).
'airflow_version' not found in airflow_settings.yaml file. What is the Airflow version? [2.6.3]:
Virtual environment created at /Users/xyz/astro_project/.venv
...
...
If you see an error like the following, remove the airflow.cfg file from the project directory, remove AIRFLOW_HOME from the .env file if it exists, and try again.
Error: there might be a problem with your project starting up.
The webserver health check timed out after 1m0s but your project will continue trying to start.
Run 'astro dev logs --webserver | --scheduler' for details.
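A sketch of that cleanup, assuming you are in the Astro project directory (the sed invocation is just one way to drop the AIRFLOW_HOME line; edit .env by hand if you prefer):
# Remove the stale Airflow config
rm -f airflow.cfg
# Drop any AIRFLOW_HOME line from .env (GNU sed shown; edit the file manually on macOS/BSD)
sed -i '/^AIRFLOW_HOME/d' .env
# Then re-run: airflowctl init . --build-start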
This project is licensed under the terms of the Apache 2.0 License.