RAE - Rapid Analytics Engineering
Perfectly imperfect, for she was a Wildflower
RAE is the first opinionated framework purpose-built for the Analytics Engineering community, inspired by backend web frameworks like Django, Flask, and NestJS, but with a Data Engineering twist!
RAE:
- Empowers teams, solo devs, students and individual engineers to rapidly scaffold a modern analytics engineering stack with nothing more than a few responses to CLI prompts. From zero to fully containerized infrastructure in minutes.
- Lets users scaffold just one or two tools instead of an entire stack.
- Abstracts away the infrastructure, container, and server knowledge required to set these tools up.
All so you can focus on what matters most: modeling, orchestrating, and delivering data.
What RAE Does
Scaffold Tool Docker Configurations
Spin up a project with plug-and-play support for essential data tools:
- Data Storage: PostgreSQL or MySQL
- Data Modeling: dbt or SQL Mesh
- Orchestration: Airflow or Dagster
Auto-Generate settings.py
A clean and extensible settings file inspired by Django — making it easy to pass environment-specific values (ports, credentials, container names, etc.) to every component of your stack.
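The exact contents depend on your prompt answers and the RAE version. Purely as a hypothetical illustration of the kind of values such a file centralizes (every name below is illustrative, not the actual generated code):

```python
# Hypothetical illustration only: the file RAE actually generates will differ.
# The idea is that one module holds every environment-specific value the stack
# needs, so containers and scripts all read from a single source of truth.

DATA_STORAGE = {
    "container_name": "postgres",   # service name on the shared Docker network
    "port": 5432,                   # port published to the host
    "user": "analytics",            # credentials you supply
    "password": "change-me",
    "database": "warehouse",
}

DATA_ORCHESTRATION = {
    "container_name": "airflow-webserver",
    "web_port": 8080,
}
```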
Auto-Generate docker-compose.yml
- Connect all services via a shared Docker network.
Frameworks aren't just for web and mobile engineers anymore.
RAE gives Analytics Engineers the tools to build, connect and orchestrate their data stack with ease.
Build like a developer. Deploy like an engineer. Let RAE compose your analytics stack.
Who Is RAE For?
- Analytics Engineers who want to quickly scaffold their required infrastructure.
- Data Engineers who need to tie various tools together.
- Data Scientists who need a ready-made data tool stack.
- Individual developers and anyone learning to use analytics/data engineering tools.
- Teams that want standardization and clarity across their data stack.
How to use RAE
System Dependencies
Dependency | Version | Notes
--- | --- | ---
Python | 3.8+ | Required for the RAE CLI tool
Docker | Latest | Docker Desktop (macOS/Windows) or Docker Engine (Linux)
Shell | bash / zsh / PowerShell | Used to run CLI and Docker commands
Web Browser | Any modern browser | Google Chrome recommended for container-based UIs (e.g. Airflow)
CLI Setup Steps
1. Create a Virtual Environment
Platform | Command
--- | ---
macOS/Linux | `python3 -m venv local-env`
Windows | `py -m venv local-env`
2. Activate the Virtual Environment
Platform | Command
--- | ---
macOS/Linux (bash/zsh) | `source local-env/bin/activate`
Windows (CMD) | `local-env\Scripts\activate.bat`
Windows (PowerShell) | `local-env\Scripts\Activate.ps1`
3. Install RAE CLI:
pip install rae-cli
4. Initialize your project:
rae init
This will take you through a series of prompts and then generate the project_config.json and settings.py files.
After this command completes, you will be left with the following project structure:
├── rae
│   └── src
│       ├── airflow
│       │   └── airflow-init.sh
│       ├── dbt
│       │   ├── analyses
│       │   ├── macros
│       │   ├── models
│       │   ├── seeds
│       │   ├── snapshots
│       │   ├── tests
│       │   ├── dbt-init.sh
│       │   ├── dbt.sh
│       │   ├── dbt_project.yml
│       │   └── Dockerfile
│       ├── docker-compose.yml
│       ├── postgres
│       │   └── postgres-init.sh
│       └── settings
│           ├── project_config.json
│           └── settings.py
The tree above is just an example; it assumes you selected Postgres for data storage, dbt for data modeling, and Airflow for orchestration, with Postgres as the Airflow metastore.
5. Open your settings file: {project_name}/src/settings/settings.py
- You need to populate this file with your specific credentials and settings for:
  - `data_storage` (PostgreSQL or MySQL)
  - `data_modeling` (dbt or SQL Mesh)
  - `data_orchestration` (Airflow or Dagster)
- If you do not do this, the project will still be usable, but its containers will be built with default values and will NOT be production-ready or secure.
You are responsible for ensuring your project is secure, set up properly, and ready for deployment!
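If you want to keep real credentials out of version control, one common pattern is to have settings.py read sensitive values from environment variables instead of hardcoding them. The constant name below is hypothetical; adapt it to whatever the generated file actually exposes:

```python
import os

# Hypothetical pattern, not RAE-generated code: pull the database password
# from the environment and only fall back to a placeholder for local use.
POSTGRES_PASSWORD = os.environ.get("POSTGRES_PASSWORD", "change-me")
```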
6. Generate your docker compose file:
- cd into your project directory:
cd {project_name}
- generate your compose file:
rae generate-compose-file
- or generate the compose file without changing directories:
rae generate-compose-file --project-name {project_name}
7. Run your project's Docker containers:
Docker must be installed AND running on your host machine or this command will fail, so make sure Docker Desktop (macOS/Windows) or Docker Engine (Linux) is installed and actively running before continuing.
Navigate to the directory containing your docker-compose.yml:
cd {project_name}/src
Then simply start the containers:
docker-compose up -d
This will start the Docker container for each service and link them via a shared Docker network, so the containers can communicate with one another while each tool still runs in isolation.
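As a quick sanity check from the host machine, assuming you selected PostgreSQL and your generated compose file publishes it on localhost:5432 (both assumptions; verify against your settings.py and docker-compose.yml), a short script like the following can confirm the database container is reachable:

```python
# Hypothetical connectivity check against the Postgres container, run from the
# host. Assumes port 5432 is published and the credentials match settings.py.
# Requires the psycopg2-binary package: pip install psycopg2-binary
import psycopg2

conn = psycopg2.connect(
    host="localhost",
    port=5432,
    user="your-user",          # replace with your configured credentials
    password="your-password",
    dbname="your-database",
)
with conn.cursor() as cur:
    cur.execute("SELECT version();")
    print(cur.fetchone()[0])   # e.g. "PostgreSQL 16.x on ..."
conn.close()
```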
Current State of Project
Future Implementations
1. Add secondary test coverage to the project:
- src/cli.py
- src/data_modeling/dbt_modeling
- src/data_modeling/sql_mesh_modeling
- src/data_orchestration/airflow_orchestration.py
- src/data_orchestration/dagster_orchestration.py
- src/data_storage/mysql_storage.py
- src/data_storage/postgresql_storage.py
- src/generators/docker_compose_generator.py
2. Continue iterating on test coverage
- src/managers/data_modeling_manager.py
- src/managers/data_orchestration_manager.py
- src/managers/data_storage_manager.py
- src/managers/settings_manager.py
- src/utility/base_manager.py
- src/utility/base_tool.py
- src/utility/dockerfile_writer.py
- src/utility/indented_dumper.py
- src/utility/shell_script_writer.py
- src/utility/supported_tools.py
- src/main.py
3. Add support for additional data storage tools:
- Snowflake
- DuckDB?
- SQL Server?
- Databricks
- AWS S3
- Google Cloud Storage
- Azure Blob Storage
4. Add support for scaffolding a single application or a custom combination of tools
- Scenarios:
- user only needs a data modeling tool
- user only needs a data modeling tool and a data storage tool
- user only needs an orchestration tool
- etc
- Intent:
- To allow greater flexibility and support a wider range of use cases for the CLI