
@dataform/core
@dataform/core is the core library of Dataform, a framework for managing SQL-based data workflows and transformations in a data warehouse. It lets you define data models, declare dependencies between data operations, and compile them into a graph that can be run on a schedule. It is particularly useful for teams working with large-scale data warehouses and ETL processes.
Defining Data Models
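This feature allows you to define data models using SQL queries. The `publish` function creates a new dataset, taking its name and a config object (type, dependencies), and `.query()` supplies the SQL that defines its content. Inside files under `definitions/`, `publish` is provided as a global by the Dataform compiler, so no `require` call is needed.
// definitions/my_table.js
// publish() is a global made available by the Dataform compiler.
publish('my_table', {
  type: 'table',
  // Explicit dependencies are listed by name; ctx.ref() below also adds one.
  dependencies: ['source_table'],
}).query((ctx) => `
  SELECT *
  FROM ${ctx.ref('source_table')}
`);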
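To make the idea concrete, the following is a minimal, self-contained sketch of how a `ref`-style helper can both resolve a table name and record a dependency while the query string is being built. It does not use @dataform/core and is not how the library is implemented; `createRegistry`, `define`, and the `analytics` schema are hypothetical names chosen for illustration.

```javascript
// Toy model registry: NOT the @dataform/core implementation.
// createRegistry, define, and the "analytics" schema are hypothetical.
function createRegistry(schema) {
  const models = new Map();

  function define(name, buildQuery) {
    const dependencies = new Set();
    // ref() returns a qualified name and records the dependency as a side effect.
    const ref = (target) => {
      dependencies.add(target);
      return `\`${schema}.${target}\``;
    };
    const query = buildQuery({ ref }).trim();
    const model = { name, query, dependencies: [...dependencies] };
    models.set(name, model);
    return model;
  }

  return { define, models };
}

const registry = createRegistry('analytics');
const myTable = registry.define(
  'my_table',
  (ctx) => `SELECT * FROM ${ctx.ref('source_table')}`
);

console.log(myTable.query);
console.log(myTable.dependencies);
```

Because the dependency is captured at the moment the name is interpolated, the query text and the dependency list cannot drift apart, which is the main appeal of this style over maintaining a separate list by hand.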
Scheduling Data Transformations
Transformations are typically run on a schedule written in cron syntax; for example, `0 0 * * *` runs once a day at midnight. Scheduling itself is not part of the package's definition API: @dataform/core describes what runs, such as an operation declared with `operate`, and the cron trigger is configured in whichever service or scheduler executes the compiled project.
// definitions/update_table.js
// operate() is a global made available by the Dataform compiler.
// Intended to be triggered by an external daily schedule (cron: 0 0 * * *).
operate('update_table').queries('CALL update_table_procedure();');
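For readers unfamiliar with cron syntax, the sketch below splits a standard five-field expression such as `0 0 * * *` into named fields. `describeCron` is a hypothetical helper written for this illustration; it is not part of @dataform/core, and it only labels the fields without validating their values.

```javascript
// Splits a standard five-field cron expression into named fields.
// describeCron is a hypothetical helper, not part of @dataform/core.
const CRON_FIELDS = ['minute', 'hour', 'dayOfMonth', 'month', 'dayOfWeek'];

function describeCron(expression) {
  const parts = expression.trim().split(/\s+/);
  if (parts.length !== CRON_FIELDS.length) {
    throw new Error(`Expected 5 cron fields, got ${parts.length}`);
  }
  const fields = {};
  CRON_FIELDS.forEach((name, i) => {
    fields[name] = parts[i];
  });
  return fields;
}

// Minute 0 of hour 0, every day of every month: once a day at midnight.
const daily = describeCron('0 0 * * *');
console.log(daily);
```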
Managing Dependencies
This feature allows you to manage dependencies between different data operations. The `ref` function references another table or operation by name: it resolves to the fully qualified relation in the compiled SQL and registers the referenced action as a dependency, so it runs first.
// definitions/dependent_table.js
// publish() is a global made available by the Dataform compiler.
publish('dependent_table', { type: 'table' }).query((ctx) => `
  SELECT *
  FROM ${ctx.ref('base_table')}
`);
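Once dependencies are known, a runner has to execute each table after the tables it depends on. One common way to derive such an order is a topological sort; the sketch below uses a depth-first variant. It is an illustration only, not the algorithm @dataform/core is documented to use, and `executionOrder` and the graph shape (model name mapped to an array of dependency names) are assumptions.

```javascript
// Orders models so every model comes after the models it depends on.
// Depth-first topological sort; executionOrder is a hypothetical helper.
function executionOrder(graph) {
  const order = [];
  const state = new Map(); // name -> 'visiting' | 'done'

  function visit(name) {
    if (state.get(name) === 'done') return;
    if (state.get(name) === 'visiting') {
      throw new Error(`Circular dependency involving ${name}`);
    }
    state.set(name, 'visiting');
    for (const dep of graph[name] || []) visit(dep);
    state.set(name, 'done');
    order.push(name);
  }

  Object.keys(graph).forEach((name) => visit(name));
  return order;
}

const graph = {
  dependent_table: ['base_table'],
  base_table: [],
};
const order = executionOrder(graph);
console.log(order);
```

The `visiting` state lets the sort reject circular references instead of recursing forever, which is the failure mode a dependency manager most needs to surface early.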
dbt (data build tool) is a command-line tool that enables data analysts and engineers to transform data in their warehouse more effectively. It is similar to @dataform/core in that it allows you to define data models and manage dependencies, but it also includes features for testing and documentation.
Apache Airflow is an open-source platform to programmatically author, schedule, and monitor workflows. It is more general-purpose than @dataform/core, as it can be used for a wide range of workflow automation tasks beyond just data transformations.
Luigi is a Python module that helps you build complex pipelines of batch jobs. It handles dependency resolution, workflow management, visualization, and more. Like @dataform/core, it is designed for managing data workflows, but it is more focused on batch processing.
FAQs
Dataform core API.
We found that @dataform/core demonstrated a healthy version release cadence and project activity because the last version was released less than a year ago. It has 5 open source maintainers collaborating on the project.