dbt-presto

The Presto adapter plugin for dbt (data build tool)

  • 0.21.1
  • PyPI

Documentation

For more information on using Presto with dbt, consult the dbt documentation:

  • Presto profile

Installation

This plugin can be installed via pip:

$ pip install dbt-presto

Configuring your profile

A dbt profile can be configured to run against Presto using the following configuration:

Option | Description | Required? | Example
------ | ----------- | --------- | -------
method | The Presto authentication method to use | Optional (default is none) | none or kerberos
user | Username for authentication | Required | drew
password | Password for authentication | Optional (required if method is ldap or kerberos) | none or abc123
http_headers | HTTP headers to send alongside requests to Presto, specified as a YAML dictionary of (header, value) pairs | Optional | X-Presto-Routing-Group: my-cluster
http_scheme | The HTTP scheme to use for requests to Presto | Optional (default is http, or https for method: kerberos and method: ldap) | https or http
database | Specify the database to build models into | Required | analytics
schema | Specify the schema to build models into. Note: it is not recommended to use upper or mixed case schema names | Required | dbt_drew
host | The hostname to connect to | Required | 127.0.0.1
port | The port to connect to the host on | Required | 8080
threads | How many threads dbt should use | Optional (default is 1) | 8

Example profiles.yml entry:

my-presto-db:
  target: dev
  outputs:
    dev:
      type: presto
      user: drew
      host: 127.0.0.1
      port: 8080
      database: analytics
      schema: dbt_drew
      threads: 8
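
The options listed above can be turned into a quick pre-flight check before running dbt. The following is a minimal sketch, not part of dbt or dbt-presto: validate_presto_target is a hypothetical helper that takes one target from profiles.yml as an already-parsed dictionary and reports which documented rules it breaks.

```python
# Hypothetical helper, not part of dbt or dbt-presto: checks one parsed
# profiles.yml target against the options documented above.
REQUIRED_OPTIONS = ("user", "database", "schema", "host", "port")


def validate_presto_target(target):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = [f"missing required option: {key}"
                for key in REQUIRED_OPTIONS if key not in target]

    # 'method' is optional and defaults to none (no authentication).
    method = target.get("method", "none")
    if method in ("ldap", "kerberos") and not target.get("password"):
        problems.append(f"password is required when method is {method}")

    # Upper or mixed case schema names are not recommended with this adapter.
    schema = target.get("schema", "")
    if schema != schema.lower():
        problems.append("schema name should be lower case")

    return problems


dev_target = {
    "type": "presto",
    "user": "drew",
    "host": "127.0.0.1",
    "port": 8080,
    "database": "analytics",
    "schema": "dbt_drew",
    "threads": 8,
}

print(validate_presto_target(dev_target))
print(validate_presto_target({**dev_target, "method": "ldap"}))
```

The first call prints an empty list for the example profile; the second reports the missing password, since the table marks password as required for ldap and kerberos.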

Usage Notes

Supported Functionality

Due to the nature of Presto, not all core dbt functionality is supported. The following features of dbt are not implemented on Presto:

  • Archival
  • Incremental models

Also, note that upper or mixed case schema names will cause catalog queries to fail. Please only use lower case schema names with this adapter.

If you are interested in helping to add support for this functionality in dbt on Presto, please open an issue!

Required configuration

dbt fundamentally works by dropping and creating tables and views in databases. As such, the following Presto configs must be set for dbt to work properly on Presto:

hive.metastore-cache-ttl=0s
hive.metastore-refresh-interval=5s
hive.allow-drop-table=true
hive.allow-rename-table=true
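
To confirm that a deployment carries these settings, a small script can scan the text of a properties file. This is a minimal sketch under stated assumptions: the helper names are hypothetical, the parser handles only simple key=value lines, and the sample text (including its connector.name line) is illustrative rather than taken from a real deployment.

```python
# Settings dbt needs, as listed above. Values are compared as plain strings.
REQUIRED_SETTINGS = {
    "hive.metastore-cache-ttl": "0s",
    "hive.metastore-refresh-interval": "5s",
    "hive.allow-drop-table": "true",
    "hive.allow-rename-table": "true",
}


def parse_properties(text):
    """Parse simple key=value lines, ignoring blanks and # comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


def missing_settings(text):
    """Return required settings that are absent or set to another value."""
    found = parse_properties(text)
    return {key: want for key, want in REQUIRED_SETTINGS.items()
            if found.get(key) != want}


# Illustrative sample only: one required setting is deliberately left out.
sample = """
connector.name=hive-hadoop2
hive.metastore-cache-ttl=0s
hive.metastore-refresh-interval = 5s
hive.allow-drop-table=true
"""
print(missing_settings(sample))
```

For the sample text, the script reports hive.allow-rename-table as the one setting still to be added.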

Use table properties to configure connector specifics

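Presto (and Trino) connectors use table properties to control connector-specific behavior, such as file format and partitioning. Pass them through the properties config of a model, as in the example that follows.

Check the Presto/Trino connector documentation for the properties each connector accepts.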

{{
  config(
    materialized='table',
    properties={
      "format": "'PARQUET'",
      "partitioning": "ARRAY['bucket(id, 2)']",
    }
  )
}}
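
The nested quoting in the example ("'PARQUET'") makes more sense once the values are seen as SQL expressions rather than plain strings. The sketch below is an illustration only and assumes each value is placed verbatim into a WITH (...) clause; render_properties is a hypothetical function, not the adapter's actual macro.

```python
def render_properties(properties):
    """Render a properties dict as a WITH (...) clause, values verbatim."""
    if not properties:
        return ""
    pairs = ",\n  ".join(f"{key} = {value}" for key, value in properties.items())
    return f"WITH (\n  {pairs}\n)"


clause = render_properties({
    "format": "'PARQUET'",
    "partitioning": "ARRAY['bucket(id, 2)']",
})
print(clause)
```

Under that assumption, the inner single quotes are what make PARQUET a SQL string literal in the rendered clause, while ARRAY[...] passes through as an ordinary SQL expression.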

Reporting bugs and contributing code

  • Want to report a bug or request a feature? Let us know on Slack, or open an issue.

Running tests

Build dbt container locally:

./docker/dbt/build.sh

Run a Presto server locally:

./docker/init.bash

If you see errors about "inconsistent state" while bringing up Presto, you may need to drop and re-create the public schema in the Hive metastore:

# Example error

Initialization script hive-schema-2.3.0.postgres.sql
Error: ERROR: relation "BUCKETING_COLS" already exists (state=42P07,code=0)
org.apache.hadoop.hive.metastore.HiveMetaException: Schema initialization FAILED! Metastore state would be inconsistent !!
Underlying cause: java.io.IOException : Schema script failed, errorcode 2
Use --verbose for detailed stacktrace.
*** schemaTool failed ***

Solution: Drop (or rename) the public schema to allow the init script to recreate the metastore from scratch. Only run this against a test Presto deployment. Do not run this in production!

-- run this against the hive metastore (port forwarded to 10005 by default)
-- DO NOT RUN THIS IN PRODUCTION!

drop schema public cascade;
create schema public;

You probably should be slightly less reckless than this.

Run tests against Presto:

./docker/run_tests.bash

Run the locally-built docker image (from docker/dbt/build.sh):

export DBT_PROJECT_DIR=$HOME/... # wherever the dbt project you want to run is
docker run -it \
  --mount "type=bind,source=$HOME/.dbt/,target=/home/dbt_user/.dbt" \
  --mount "type=bind,source=$DBT_PROJECT_DIR,target=/usr/app" \
  --network dbt-net \
  dbt-presto /bin/bash

Code of Conduct

Everyone interacting in the dbt project's codebases, issue trackers, chat rooms, and mailing lists is expected to follow the PyPA Code of Conduct.
