
The DAMN (Data Assets Metric Navigation) tool extracts and reports metrics about your data assets
████████▄ ▄████████ ▄▄▄▄███▄▄▄▄ ███▄▄▄▄
███ ▀███ ███ ███ ▄██▀▀▀███▀▀▀██▄ ███▀▀▀██▄
███ ███ ███ ███ ███ ███ ███ ███ ███
███ ███ ███ ███ ███ ███ ███ ███ ███
███ ███ ▀███████████ ███ ███ ███ ███ ███
███ ███ ███ ███ ███ ███ ███ ███ ███
███ ▄███ ███ ███ ███ ███ ███ ███ ███
████████▀ ███ █▀ ▀█ ███ █▀ ▀█ █▀
The DAMN tool extracts and reports metrics about your data assets.
It allows you to inspect your assets, lineage, and all sorts of metrics around materialization, usage, physical space usage and query performance. The objective of the DAMN tool is to give you a convenient command-line tool to track and report on the data assets you're working on.
To install the DAMN tool, run the following command:
pip install damn-tool
The DAMN tool leverages various connectors to interact with different service providers.
Configuring these connectors is done via a YAML file located at ~/.damn/connectors.yml. You can override the location of those connector configurations using the --configs-dir option.
See example configuration file here
The configuration file uses the following structure:
connector_type:
  service_provider:
    param1: value1
    param2: value2
The orchestrator connector is the default connector required by the DAMN tool. For now, Dagster is the only supported service provider for this connector. Here's an example configuration for an orchestrator connector with a dagster profile:
orchestrator:
  dagster:
    endpoint: https://your-dagster-instance.com/prod/graphql
    api_token: your-api-token
Your assets can be stored in storage services, which are configured through the io-manager connector. For now, AWS is the only supported storage service. Here's an example configuration:
io-manager:
  aws:
    credentials:
      access_key_id: "{{ env('AWS_ACCESS_KEY_ID') }}"
      secret_access_key: "{{ env('AWS_SECRET_ACCESS_KEY') }}"
      region: "us-east-1"
    bucket_name: "bucket-name"
    key_prefix: "asset-prefix"
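The {{ env('...') }} placeholders in the configuration above suggest that secrets are read from environment variables rather than stored in the file. How the DAMN tool resolves them internally is not shown here; the following is only a minimal sketch of one way such placeholders could be expanded, assuming the configuration has already been parsed into nested dicts. The names ENV_PATTERN, resolve_env and resolve_config are hypothetical and not part of damn-tool.

```python
import os
import re

# Matches placeholders of the form {{ env('NAME') }} (single or double quotes).
ENV_PATTERN = re.compile(
    r"\{\{\s*env\(\s*['\"]([A-Za-z_][A-Za-z0-9_]*)['\"]\s*\)\s*\}\}"
)


def resolve_env(value, environ=None):
    """Replace every {{ env('NAME') }} placeholder in a string.

    Hypothetical helper: raises KeyError when the variable is not set, so a
    missing secret fails loudly instead of becoming an empty string.
    """
    environ = os.environ if environ is None else environ
    return ENV_PATTERN.sub(lambda match: environ[match.group(1)], value)


def resolve_config(node, environ=None):
    """Walk nested dicts and lists, resolving placeholders in string values."""
    if isinstance(node, dict):
        return {key: resolve_config(val, environ) for key, val in node.items()}
    if isinstance(node, list):
        return [resolve_config(item, environ) for item in node]
    if isinstance(node, str):
        return resolve_env(node, environ)
    return node
```

Passing an explicit environ mapping makes the helper easy to exercise without touching the real process environment.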
Your assets can be materialized to a data warehouse, which is configured through the data-warehouse connector. For now, both Snowflake and BigQuery are supported. Here's an example configuration:
data-warehouse:
  snowflake:
    account: ab1234.us-east-1
    user: username
    password: "{{ env('SNOWFLAKE_PASSWORD') }}"
    role: my-role
    database: my-database
    warehouse: my-warehouse
    schema: analytics
  bigquery:
    keyfile: "{{ env('GOOGLE_APPLICATION_CREDENTIALS') }}"
    project: "XYZ"
The active service provider for each connector can be changed by specifying the service provider when running DAMN commands. By default, DAMN will use the first service provider configured for each connector.
Example usage:
damn ls --orchestrator dagster --io-manager aws --data-warehouse snowflake
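The selection rule described above (use the provider named on the command line, otherwise fall back to the first one configured) can be sketched in a few lines. This is an illustration under stated assumptions, not the tool's actual code: active_providers is a hypothetical helper, and the configuration is assumed to be a dict that preserves the order of the YAML file.

```python
def active_providers(config, overrides=None):
    """Pick one service provider per connector.

    config maps connector type -> {provider name -> parameters}.
    overrides maps connector type -> provider name chosen explicitly.
    """
    overrides = overrides or {}
    selected = {}
    for connector, providers in config.items():
        choice = overrides.get(connector)
        if choice is None:
            # Dicts keep insertion order, so this is the first provider listed.
            choice = next(iter(providers))
        elif choice not in providers:
            raise ValueError(f"{choice!r} is not configured for {connector!r}")
        selected[connector] = choice
    return selected


# Provider parameters are left empty here; only the names matter for selection.
config = {
    "orchestrator": {"dagster": {}},
    "io-manager": {"aws": {}},
    "data-warehouse": {"snowflake": {}, "bigquery": {}},
}

print(active_providers(config))
# {'orchestrator': 'dagster', 'io-manager': 'aws', 'data-warehouse': 'snowflake'}
print(active_providers(config, {"data-warehouse": "bigquery"}))
# {'orchestrator': 'dagster', 'io-manager': 'aws', 'data-warehouse': 'bigquery'}
```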
The DAMN tool is both a CLI tool and a Python library.
Note that in CLI mode, commands support an output option, which allows flexibility in how the DAMN tool is used:
- terminal: By default, the output of commands is printed to the terminal.
- json: You can also get the output as a JSON object, which is more useful if you want to use DAMN programmatically.
- copy: You can also copy the output to your clipboard, which is useful if you want to share an asset's metrics in a PR, for example.
In python...
from damn_tool.ls import list_assets
result = list_assets()
print(result)
From the command line...
foo@bar:~$ damn ls
- airbyte/protest_groupings
- data_warehouse/movements_dim
- data_warehouse/observations_fct
- gdelt/gdelt_gkg_articles
- gdelt/gdelt_mention_summaries
- hex/hex_main_dashboard_refresh
- semantic_definitions
List all assets for a specific key group
In python...
from damn_tool.ls import list_assets
result = list_assets(prefix='gdelt')
print(result)
From the command line...
foo@bar:~$ damn ls --prefix gdelt
- gdelt/gdelt_article_summaries
- gdelt/gdelt_articles_enhanced
- gdelt/gdelt_events
- gdelt/gdelt_gkg_articles
- gdelt/gdelt_mention_summaries
- gdelt/gdelt_mentions
- gdelt/gdelt_mentions_enhanced
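To make the idea of a key group concrete, here is a small sketch of prefix filtering over asset keys. It assumes that a prefix matches whole path components of the key (so gdelt matches gdelt/gdelt_events but gd does not); the exact matching rule used by the DAMN tool is not stated here, and filter_by_prefix is a hypothetical helper, not part of damn-tool.

```python
def filter_by_prefix(asset_keys, prefix):
    """Keep the asset keys whose leading path components equal the prefix."""
    parts = prefix.strip("/").split("/")
    return [key for key in asset_keys if key.split("/")[: len(parts)] == parts]


# Sample keys taken from the listing shown earlier.
asset_keys = [
    "airbyte/protest_groupings",
    "data_warehouse/movements_dim",
    "gdelt/gdelt_gkg_articles",
    "gdelt/gdelt_mention_summaries",
    "semantic_definitions",
]

print(filter_by_prefix(asset_keys, "gdelt"))
# ['gdelt/gdelt_gkg_articles', 'gdelt/gdelt_mention_summaries']
```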
In python...
from damn_tool.show import show_asset
result = show_asset('gdelt/gdelt_articles_enhanced')
print(result)
From the command line...
foo@bar:~$ damn show gdelt/data_warehouse/integration/int__events_actors
From orchestrator:
- description: dbt model int__events_actors
- computeKind: dbt
- policyType: LAZY
- maximumLagMinutes: 360.0
- cronSchedule: None
- isPartitioned: False
- dependedByKeys:
- data_warehouse
- events_actors_bridge
- dependencyKeys:
- data_warehouse
- integration
- int__events_observations
- data_warehouse
- integration
- int__actors
- metadataEntries:
- Execution Duration: 4.183706
From data warehouse:
- table_schema: analytics_integration
- table_type: base table
- created: 2023-07-05T08:36:40.935000-07:00
- last_altered: 2023-07-19T09:56:36.410000-07:00
In python...
from damn_tool.metrics import asset_metrics
result = asset_metrics('gdelt/gdelt_articles_enhanced')
print(result)
From the command line...
foo@bar:~$ damn metrics gdelt/gdelt_gkg_articles
From orchestrator:
- run_id: 03466ceb-1c51-43ab-9384-33b6472c3f24
- status: SUCCESS
- start_time: 2023-07-19 14:19:00
- end_time: 2023-07-19 14:19:02
- elapsed_time: 0:00:02.563292
- num_partitions: 4963
- num_materialized: 4963
- num_failed: 0
From IO manager:
- files: 4976
- size: 76.25 MB
- last_modified: 2023-07-19T18:19:03+00:00
From data warehouse:
- row_count: None
- bytes: N/A
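For programmatic use, the json output option is likely the better choice. If only the terminal text is available (for example, pasted into a PR), a report like the one above can still be turned into a nested dict. The sketch below assumes the layout shown in this example, with "From <source>:" section lines followed by "- key: value" lines; parse_report is a hypothetical helper and all values are kept as strings.

```python
def parse_report(text):
    """Parse terminal-style report text into {section: {key: value}}."""
    report = {}
    section = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("From ") and line.endswith(":"):
            # Section header such as "From orchestrator:".
            section = line[len("From "):-1]
            report[section] = {}
        elif line.startswith("- ") and section is not None:
            # Split on the first ": " only, since values may contain colons.
            key, sep, value = line[2:].partition(": ")
            if sep:
                report[section][key] = value
    return report


sample = """\
From orchestrator:
- status: SUCCESS
- start_time: 2023-07-19 14:19:00
From IO manager:
- files: 4976
- size: 76.25 MB
"""

report = parse_report(sample)
print(report["IO manager"]["files"])          # 4976
print(report["orchestrator"]["start_time"])   # 2023-07-19 14:19:00
```

Lines without a "key: value" shape, such as nested list entries, are skipped by this sketch rather than guessed at.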
Contributions to the DAMN tool are always welcome. Whether it's feature requests, bug fixes, or new features, your contribution is appreciated.
The DAMN tool is open-source software, licensed under MIT.
We found that damn-tool demonstrated a healthy version release cadence and project activity because the last version was released less than a year ago. It has 1 open source maintainer collaborating on the project.