@dataform/core
@dataform/core is a powerful tool for managing data workflows and transformations. It allows you to define data models, schedule data transformations, and manage dependencies between different data operations. It is particularly useful for teams working with large-scale data warehouses and ETL processes.
Defining Data Models
This feature lets you define data models as SQL queries. In a JavaScript file under a Dataform project's `definitions/` directory, the `publish` function creates a new dataset: you pass it a name and a configuration such as the dataset type, then attach the SQL that defines its content with `.query()`. Inside the query, `ctx.ref()` resolves another dataset's name and registers it as a dependency.
// publish() is available as a global in files under definitions/
publish('my_table', {
  type: 'table'
}).query(ctx => `
  SELECT *
  FROM ${ctx.ref('source_table')}
`);
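To make the idea of a query-defined model concrete, here is a toy sketch in plain Node.js. It is not the @dataform/core implementation: `defineModel`, the `models` registry, and the default `analytics` schema are all hypothetical names chosen for illustration. It only shows how a reference helper can both fill in a table name and record a dependency while the SQL is being built.

```javascript
// Toy registry for illustration only; NOT how @dataform/core is implemented.
const models = new Map();

// Hypothetical helper: registers a model and compiles its query text.
function defineModel(name, { type, schema = "analytics", query }) {
  const dependencies = new Set();
  // ref() returns a schema-qualified table name and, as a side effect,
  // records the target as a dependency of the model being defined.
  const ref = (target) => {
    dependencies.add(target);
    return `${schema}.${target}`;
  };
  const sql = query(ref).trim();
  const model = { name, type, sql, dependencies: [...dependencies] };
  models.set(name, model);
  return model;
}

const myTable = defineModel("my_table", {
  type: "table",
  query: (ref) => `
    SELECT *
    FROM ${ref("source_table")}
  `,
});

console.log(myTable.sql);          // compiled SQL with the qualified name
console.log(myTable.dependencies); // the one dependency recorded by ref()
```

Because the dependency is collected while the query is compiled, the model definition and its dependency list cannot drift apart in this sketch.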
Scheduling Data Transformations
This feature lets you schedule data transformations using cron syntax. In the example, a `schedule` call defines a scheduled task from a cron expression and a list of actions to perform; the expression `0 0 * * *` runs the task once a day at midnight.
const dataform = require('@dataform/core');
const { schedule } = dataform;
schedule('daily_update', {
  cron: '0 0 * * *',
  actions: [
    { name: 'update_table', type: 'operation', query: 'CALL update_table_procedure();' }
  ]
});
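The cron expression is the part of a schedule that is easiest to get wrong, so a small sketch of how its five fields are read may help. The `cronMatches` function below is a hypothetical helper, not part of @dataform/core, and it is deliberately minimal: it assumes UTC, accepts only `*` or a plain integer in each field, and requires every field to match (real cron also supports ranges, lists, steps, and special day-of-month/day-of-week handling).

```javascript
// Minimal cron matcher for illustration: supports only "*" and plain integers.
function cronMatches(expression, date) {
  const fields = expression.trim().split(/\s+/);
  if (fields.length !== 5) {
    throw new Error(`expected 5 cron fields, got ${fields.length}`);
  }
  // Field order: minute, hour, day of month, month, day of week.
  const actual = [
    date.getUTCMinutes(),
    date.getUTCHours(),
    date.getUTCDate(),
    date.getUTCMonth() + 1, // Date months are 0-based, cron months are 1-based
    date.getUTCDay(),       // 0 = Sunday, as in cron
  ];
  // Simplification: every field must match (wildcard or exact number).
  return fields.every((field, i) => field === "*" || Number(field) === actual[i]);
}

const midnight = new Date(Date.UTC(2024, 0, 15, 0, 0));
const noon = new Date(Date.UTC(2024, 0, 15, 12, 0));
console.log(cronMatches("0 0 * * *", midnight)); // true
console.log(cronMatches("0 0 * * *", noon));     // false
```

Read this way, `0 0 * * *` means minute 0 of hour 0 on any day of any month, which is once a day at midnight.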
Managing Dependencies
This feature lets you manage dependencies between data operations. Referencing another dataset with `ctx.ref()` inside a query both resolves its full name and adds it as a dependency, so the referenced dataset is built first. The `dependencies` option is for datasets or operations that must run first but are not referenced in the query; it takes their names as strings.
publish('dependent_table', {
  type: 'table',
  // explicit dependency by name; ctx.ref() below would also add it
  dependencies: ['base_table']
}).query(ctx => `
  SELECT *
  FROM ${ctx.ref('base_table')}
`);
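Once dependencies are declared, a runner has to execute the tables in an order that respects them. The sketch below shows one common way to derive such an order, a depth-first topological sort. It is an illustration under assumptions, not a description of how @dataform/core schedules work internally; `executionOrder` and the shape of the `graph` object are hypothetical.

```javascript
// Returns names in an order where every entry comes after its dependencies.
// graph maps a table name to the list of names it depends on.
function executionOrder(graph) {
  const order = [];
  const state = new Map(); // name -> "visiting" | "done"

  function visit(name, path) {
    if (state.get(name) === "done") return;
    if (state.get(name) === "visiting") {
      // Reaching a node that is still being visited means a cycle.
      throw new Error(`dependency cycle: ${[...path, name].join(" -> ")}`);
    }
    state.set(name, "visiting");
    // Names without an entry in graph (e.g. raw sources) have no dependencies.
    for (const dep of graph[name] ?? []) {
      visit(dep, [...path, name]);
    }
    state.set(name, "done");
    order.push(name);
  }

  for (const name of Object.keys(graph)) visit(name, []);
  return order;
}

const graph = {
  dependent_table: ["base_table"],
  base_table: [],
};
console.log(executionOrder(graph)); // base_table first, then dependent_table
```

A cycle such as two tables depending on each other has no valid order, so the sketch reports it as an error instead of returning a partial result.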
Similar Packages
dbt (data build tool) is a command-line tool that enables data analysts and engineers to transform data in their warehouse more effectively. It is similar to @dataform/core in that it allows you to define data models and manage dependencies, but it also includes features for testing and documentation.
Apache Airflow is an open-source platform to programmatically author, schedule, and monitor workflows. It is more general-purpose than @dataform/core, as it can be used for a wide range of workflow automation tasks beyond just data transformations.
Luigi is a Python module that helps you build complex pipelines of batch jobs. It handles dependency resolution, workflow management, visualization, and more. Like @dataform/core, it is designed for managing data workflows, but it is more focused on batch processing.
FAQs
Dataform core API.
The npm package @dataform/core receives a total of 4,604,750 weekly downloads, which places it in the popular category.
We found that @dataform/core demonstrated a healthy version release cadence and project activity because the last version was released less than a year ago. It has 0 open source maintainers collaborating on the project.