Mockingbird-DB is a powerful and flexible Python tool designed to populate databases with synthetic data, based on user-defined configurations. It supports a variety of popular database platforms, such as PostgreSQL, MSSQL, and MySQL, and allows users to customize their data generation process through an intuitive YAML configuration file format. By defining table structures, column types, and data sources with different ratios, users can create a wide range of realistic, yet synthetic data scenarios tailored to their specific needs.
Featuring both single-process and multi-process workloads, Mockingbird-DB can effortlessly scale to accommodate databases of varying sizes and complexities. The tool also provides insightful metadata output, enabling users to analyze the distribution of data sources and the overall structure of the populated tables.
To install Mockingbird-DB from source, clone the repository and install it with pip:
git clone https://github.com/openraven/mockingbird-db.git
cd mockingbird-db
pip3 install .
Alternatively, install it directly from PyPI:
pip3 install mockingbird-db
To create a new workspace with an example configuration file and a small dataset containing 10 SSNs, run the following command:
mockingbird-db --workspace /path/to/your/workspace
This will create a new directory at the specified path, containing a boilerplate workspace with an example configuration file and a small dataset.
Once you have the configuration file, you can customize it to define the synthetic data generation process according to your requirements. The main components that can be edited in the configuration file are:
Database Information (db_info): This section allows you to set up the connection to your database. You can define:
- connection_type: Choose between "arguments" or "url" to specify how to connect to the database.
- platform: Specify the database platform, such as "postgres", "mssql", or "mysql".
- url: If using the "url" connection type, provide the full connection URL.
- arguments: If using the "arguments" connection type, provide the required connection arguments such as host, port, user, password, and database name.
Tables (tables): Define one or more tables to be populated with synthetic data. For each table, you can customize the following properties:
- table_name: The name of the table to be created or populated.
- rows: The number of rows to insert into the table.
- columns: Define the columns in the table and their respective properties:
  - column_name: The name of the column.
  - datasources: A list of data sources used to generate the column's synthetic data.
    - datasource_name: The name of the data source.
    - datasource_path: The path to the dataset file, "random_data" for generating random data, or "null" for null data.
    - datasource_ratio: The probability of using this data source for generating the column's data.
    - nested_json_name: The name of the nested JSON field, if applicable.
  - column_type: The data type of the column (e.g., "string", "json").
Feel free to adjust the configuration file according to your specific use case, specifying the desired database connection information, table structure, and synthetic data sources.
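As a rough illustration of the fields described above, the following Python sketch lays out a configuration as a dictionary and picks a data source for a row according to datasource_ratio. The exact nesting, the key names inside arguments (host, port, user, password, database), the dataset file name, the table and column names, and the helper pick_datasource are all assumptions made for illustration. The example configuration file generated in the workspace remains the reference for the real layout.

```python
import random

# Hypothetical configuration mirroring the fields described above.
# The nesting and the key names inside "arguments" are assumptions,
# not the tool's confirmed schema.
config = {
    "db_info": {
        "connection_type": "arguments",
        "platform": "postgres",
        "arguments": {
            "host": "localhost",
            "port": 5432,
            "user": "postgres",
            "password": "example-password",
            "database": "example_db",
        },
    },
    "tables": [
        {
            "table_name": "customers",
            "rows": 1000,
            "columns": [
                {
                    "column_name": "ssn",
                    "column_type": "string",
                    "datasources": [
                        {
                            "datasource_name": "ssn",
                            # Hypothetical dataset file name.
                            "datasource_path": "/path/to/your/workspace/ssn.txt",
                            "datasource_ratio": 0.2,
                        },
                        {
                            "datasource_name": "random",
                            "datasource_path": "random_data",
                            "datasource_ratio": 0.7,
                        },
                        {
                            "datasource_name": "null",
                            "datasource_path": "null",
                            "datasource_ratio": 0.1,
                        },
                    ],
                }
            ],
        }
    ],
}


def pick_datasource(column, rng):
    """Pick one data source name for a row, weighted by datasource_ratio.

    Illustrative helper only; it is not part of Mockingbird-DB.
    """
    sources = column["datasources"]
    names = [s["datasource_name"] for s in sources]
    weights = [s["datasource_ratio"] for s in sources]
    return rng.choices(names, weights=weights, k=1)[0]


column = config["tables"][0]["columns"][0]
rng = random.Random(0)  # seeded so repeated runs give the same picks
sample = [pick_datasource(column, rng) for _ in range(10)]
print(sample)
```

With these ratios, roughly 20% of rows would draw from the SSN dataset, 70% would be random data, and 10% would be null.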
To start populating your database tables with synthetic data, run the following command:
mockingbird-db --config-path /path/to/your/workspace/config.yaml
Optionally, you can specify the number of processes to use in parallel when flooding the database with rows by adding the -p or --processes flag:
mockingbird-db --config-path /path/to/your/workspace/config.yaml --processes 4
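To give a feel for what a multi-process workload involves, here is a minimal sketch of how a table's row count could be divided among worker processes. The function split_rows is hypothetical and is not taken from Mockingbird-DB; the tool's actual scheduling may differ.

```python
from typing import List


def split_rows(total_rows: int, processes: int) -> List[int]:
    """Divide total_rows as evenly as possible among the given processes.

    Hypothetical helper for illustration; not Mockingbird-DB's own code.
    """
    if processes < 1:
        raise ValueError("processes must be at least 1")
    base, extra = divmod(total_rows, processes)
    # The first `extra` workers each take one additional row.
    return [base + 1 if i < extra else base for i in range(processes)]


print(split_rows(1000, 4))  # [250, 250, 250, 250]
print(split_rows(10, 4))    # [3, 3, 2, 2]
```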
To save the metadata of the populated tables to a file, specify the -m or --metadata-path flag followed by the output file path:
mockingbird-db --config-path /path/to/your/workspace/config.yaml --metadata-path /path/to/your/output/metadata.json
This will generate a JSON file containing metadata information about the populated tables, such as the number of rows and the distribution of data sources for each column.
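When the tool is driven from a script, the flags above can be assembled programmatically. The sketch below builds the command line from the options documented here and runs it with the standard subprocess module. The helper names build_command and populate are hypothetical, and the sketch assumes the mockingbird-db executable is on the PATH.

```python
import subprocess
from typing import List, Optional


def build_command(
    config_path: str,
    processes: Optional[int] = None,
    metadata_path: Optional[str] = None,
) -> List[str]:
    """Assemble a mockingbird-db command using only the flags described above."""
    cmd = ["mockingbird-db", "--config-path", config_path]
    if processes is not None:
        cmd += ["--processes", str(processes)]
    if metadata_path is not None:
        cmd += ["--metadata-path", metadata_path]
    return cmd


def populate(
    config_path: str,
    processes: Optional[int] = None,
    metadata_path: Optional[str] = None,
) -> None:
    """Run the command; raises CalledProcessError on a non-zero exit code."""
    subprocess.run(build_command(config_path, processes, metadata_path), check=True)


# Example call (commented out because it requires a reachable database):
# populate("/path/to/your/workspace/config.yaml", processes=4,
#          metadata_path="/path/to/your/output/metadata.json")
```

Passing the arguments as a list avoids shell quoting issues with paths that contain spaces.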
Contributions to Mockingbird-DB are welcome! Feel free to open an issue or submit a pull request on GitHub to improve the project or to add new features.
FAQs
Mockingbird-DB: Populate databases with synthetic data
We found that mockingbird-db demonstrated a healthy version release cadence and project activity because the last version was released less than a year ago. It has 1 open source maintainer collaborating on the project.