Huge News!Announcing our $40M Series B led by Abstract Ventures.Learn More
Socket
Sign inDemoInstall
Socket

github.com/tsinghgill/dynamo-to-postgres-real-time-stream

Package Overview
Dependencies
Alerts
File Explorer
Socket logo

Install Socket

Detect and block malicious and high-risk dependencies

Install

github.com/tsinghgill/dynamo-to-postgres-real-time-stream

  • v0.0.0-20240917180850-511366e00564
  • Source
  • Go
  • Socket score

Version published
Created
Source

DynamoDB to PostgreSQL Pipeline

Pipeline Diagram

This pipeline streams data from DynamoDB to PostgreSQL. It runs three pipelines that stream data into three different Postgres tables. It uses processors to make sure that the data is correctly formatted for Postgres to consume. There is a Python script provided that can generate sample records in real time continuously for testing.

Setting up an AWS DynamoDB instance

  1. Set up a DynamoDB table.

  2. In the "Exports and streams" section in the Table settings, enable DynamoDB streams:

    • Check "Enable DynamoDB stream"
    • Select "New and old images" for the "View type"

    This setting enables Change Data Capture (CDC) for Conduit.

Useful AWS CLI commands for DynamoDB

    # Create a DynamoDB table
    aws dynamodb create-table \
        --table-name users \
        --attribute-definitions AttributeName=id,AttributeType=S \
        --key-schema AttributeName=id,KeyType=HASH \
        --provisioned-throughput ReadCapacityUnits=1,WriteCapacityUnits=1 \
        --region us-east-2

    # Delete a DynamoDB table
    aws dynamodb delete-table --table-name users --region us-east-2

    # Scan all items in a DynamoDB table
    aws dynamodb scan --table-name users --region us-east-2

    # Count the number of items in a DynamoDB table
    aws dynamodb scan --table-name users --select "COUNT" --region us-east-2

Setting up an Postgres instance

- Use https://diffuser.meroxa.io/ to setup an AWS RDS Postgres instance.

Useful PSQL commands for Postgres

    # Connect to the database
    psql -h <host> -U <username> -d <database_name>

    # Create the appearance_settings table
    CREATE TABLE appearance_settings (
        id UUID,
        mode TEXT,
        colorway TEXT,
        theme TEXT,
        deleted BOOLEAN,
        deleted_at TIMESTAMPTZ,
        profile_id UUID
    );

    ALTER TABLE appearance_settings
    ALTER COLUMN deleted_at DROP NOT NULL;

    # Create the notification_settings table
    CREATE TABLE notification_settings (
        id UUID,
        push_follows BOOLEAN,
        push_comments BOOLEAN,
        push_quotes BOOLEAN,
        push_likes BOOLEAN,
        push_mentions BOOLEAN,
        push_app_wide BOOLEAN,
        push_tp_invites BOOLEAN,
        push_tp_replies BOOLEAN,
        app_follows BOOLEAN,
        app_comments BOOLEAN,
        app_quotes BOOLEAN,
        app_likes BOOLEAN,
        app_mentions BOOLEAN,
        app_app_wide BOOLEAN,
        app_tp_invites BOOLEAN,
        app_tp_replies BOOLEAN,
        deleted BOOLEAN DEFAULT FALSE,
        deleted_at TIMESTAMPTZ DEFAULT NULL,
        profile_id UUID PRIMARY KEY -- Set profile_id as the primary key
    );

    # Create the profile table
    CREATE TABLE profile (
        id UUID,
        user_id UUID, -- Same as `id` from DynamoDB
        created_at TIMESTAMPTZ, -- Corresponds to `createdAt`
        name TEXT,
        handle TEXT,
        image_url TEXT,
        bio VARCHAR(210),
        location VARCHAR(48),
        rep_score REAL,
        website TEXT,
        birthdate TIMESTAMPTZ,
        deleted BOOLEAN DEFAULT FALSE, -- Default to FALSE
        deleted_at TIMESTAMPTZ -- Default to NULL
    );

    SELECT * FROM appearance_settings;
    SELECT * FROM notification_settings;
    SELECT * FROM profile;

    SELECT COUNT(*) AS total_profiles FROM profile;
    SELECT COUNT(*) AS total_appearance_settings FROM appearance_settings;
    SELECT COUNT(*) AS total_notification_settings FROM notification_settings;

    # Join the profile and appearance_settings tables using profile.id and appearance_settings.profile_id
    SELECT 
        p.*,  -- All fields from the profile table
        a.mode, 
        a.colorway, 
        a.theme, 
        a.deleted AS appearance_deleted, 
        a.deleted_at AS appearance_deleted_at
    FROM 
        profile p
    JOIN 
        appearance_settings a 
    ON 
        p.id = a.profile_id;  -- Join on profile.id and appearance_settings.profile_id

DynamoDB Sample Data

- Use the `scripts/insert_dynamo_records.py` script to insert sample data into the DynamoDB table.
    python scripts/insert_dynamo_records.py

Map DynamoDB Fields to PostgreSQL Tables

Mapped Fields:

DynamoDB users FieldPostgreSQL TablePostgreSQL Column
idprofileid (UUID)
createdAtprofilecreated_at (TIMESTAMPTZ)
nameprofilename (TEXT)
handleprofilehandle (TEXT)
imageUrlprofileimage_url (TEXT)
bioprofilebio (VARCHAR(210))
locationprofilelocation (VARCHAR(48))
repScoreprofilerep_score (REAL)
websiteprofilewebsite (TEXT)
birthdateprofilebirthdate (TIMESTAMPTZ)
deletedprofiledeleted (BOOLEAN)
deletedAtprofiledeleted_at (TIMESTAMPTZ)
appearance.modeappearance_settingsmode (TEXT)
appearance.colorwayappearance_settingscolorway (TEXT)
appearance.themeappearance_settingstheme (TEXT)
appearance.deletedappearance_settingsdeleted (BOOLEAN)
appearance.deletedAtappearance_settingsdeleted_at (TIMESTAMPTZ)
notifications.push_followsnotification_settingspush_follows (BOOLEAN)
notifications.push_commentsnotification_settingspush_comments (BOOLEAN)
notifications.push_quotesnotification_settingspush_quotes (BOOLEAN)
notifications.push_likesnotification_settingspush_likes (BOOLEAN)
notifications.push_mentionsnotification_settingspush_mentions (BOOLEAN)
notifications.push_app_widenotification_settingspush_app_wide (BOOLEAN)
notifications.push_tp_invitesnotification_settingspush_tp_invites (BOOLEAN)
notifications.push_tp_repliesnotification_settingspush_tp_replies (BOOLEAN)
notifications.deletednotification_settingsdeleted (BOOLEAN)
notifications.deletedAtnotification_settingsdeleted_at (TIMESTAMPTZ)

FAQs

Package last updated on 17 Sep 2024

Did you know?

Socket

Socket for GitHub automatically highlights issues in each pull request and monitors the health of all your open source dependencies. Discover the contents of your packages and block harmful activity before you install or update your dependencies.

Install

Related posts

SocketSocket SOC 2 Logo

Product

  • Package Alerts
  • Integrations
  • Docs
  • Pricing
  • FAQ
  • Roadmap
  • Changelog

Packages

npm

Stay in touch

Get open source security insights delivered straight into your inbox.


  • Terms
  • Privacy
  • Security

Made with ⚡️ by Socket Inc