Mock Data
Here are my tables
Load them [with data] for me
I don't care how
Mock-data is the result of a Pivotal internal hackathon in July 2017. The idea behind it is to allow users to test database queries with sets of fake data in any pre-defined table.
With Mock-data, users can
- Load their own tables, defined with any supported data types, with randomly generated data. You only need to provide the target table(s) and the number of rows to insert.
- Create a demo database
- Create n tables with n columns
- Custom fit data into the table
- Select realistic data to be loaded into the table
An ideal environment to make Mock-data work without any errors would be
- Tables with no constraints
- No custom data types
However, please DO MAKE SURE TO TAKE A BACKUP of your database before you mock data in it as it has not been tested extensively.
Check the "Known Issues" section below for more information about currently identified bugs.
Important & Disclaimer
The idea of Mock-data is to generate fake data in a new test cluster, and it is NOT TO BE USED IN PRODUCTION ENVIRONMENTS. Please ensure you have a backup of your database before running Mock-data in an environment you can't afford to lose.
Supported database engines & data types
Database Engine
- PostgreSQL
- Greenplum Database
Data types
- All data types listed on the postgres datatype website are supported
- As Greenplum is based on postgres, the supported postgres data types also apply to it
How it works
- PARSES the CLI arguments
- CHECKS if the database connection can be established
- BASED on the subcommand (database, tables or schema), it pulls / verifies the tables
- CREATES a backup of all constraints (PK, UK, CK, FK) and unique indexes (due to the cascading nature of dropping constraints)
- STORES this constraint / unique index information in memory and also saves it to a file under $HOME/mock
- REMOVES all the constraints on the table
- STARTS loading random data based on each column's data type
- READS all the constraints information from memory
- FIXES PK and UK initially
- FIXES FK
- CHECK constraints are ignored (coming soon?)
- LOADS the constraints that it had backed up (Mock-data can fail at this stage if it's not able to fix the constraint violations)
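The drop-and-restore part of the steps above can be sketched in a few lines of Go. This is a minimal illustration, not Mock-data's actual code: the `Constraint` type and the `dropSQL`, `restoreSQL` and `restoreOrder` helpers are hypothetical names, and the definition strings are assumed to look like what PostgreSQL's `pg_get_constraintdef` returns. Restoring PK and UK before FK reflects the fact that a foreign key must reference a primary key or unique constraint that already exists.

```go
package main

import (
	"fmt"
	"sort"
)

// Constraint is a hypothetical in-memory record of one backed-up constraint.
type Constraint struct {
	Table      string // schema-qualified table name
	Name       string // constraint name
	Kind       string // "PK", "UK", "FK" or "CK"
	Definition string // e.g. "PRIMARY KEY (id)"
}

// dropSQL returns the statement that removes the constraint before loading.
func dropSQL(c Constraint) string {
	return fmt.Sprintf("ALTER TABLE %s DROP CONSTRAINT %s CASCADE;", c.Table, c.Name)
}

// restoreSQL returns the statement that re-creates the constraint afterwards.
func restoreSQL(c Constraint) string {
	return fmt.Sprintf("ALTER TABLE %s ADD CONSTRAINT %s %s;", c.Table, c.Name, c.Definition)
}

// rank orders constraint kinds for restoring: PK/UK, then FK, then the rest.
func rank(kind string) int {
	switch kind {
	case "PK", "UK":
		return 0
	case "FK":
		return 1
	default:
		return 2
	}
}

// restoreOrder returns a sorted copy and leaves the backup slice untouched.
func restoreOrder(cs []Constraint) []Constraint {
	out := append([]Constraint(nil), cs...)
	sort.SliceStable(out, func(i, j int) bool { return rank(out[i].Kind) < rank(out[j].Kind) })
	return out
}

func main() {
	backup := []Constraint{
		{"public.orders", "orders_customer_fk", "FK", "FOREIGN KEY (customer_id) REFERENCES public.customers (id)"},
		{"public.customers", "customers_pkey", "PK", "PRIMARY KEY (id)"},
	}
	for _, c := range backup {
		fmt.Println(dropSQL(c))
	}
	fmt.Println("-- load random rows here")
	for _, c := range restoreOrder(backup) {
		fmt.Println(restoreSQL(c))
	}
}
```

The sketch only prints SQL; a real run would also have to repair the loaded rows between the two loops so that the restore statements do not fail.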
Usage
$ mock --help
This program generates fake data into a postgres database cluster.
PLEASE DO NOT run on a mission critical databases
Usage:
mock [flags]
mock [command]
Available Commands:
custom Controlled mocking of tables
database Mock at database level
help Help about any command
schema Mock at schema level
tables Mock at table level
Flags:
-a, --address string Hostname where the postgres database lives
-d, --database string Database to mock the data
-q, --dont-prompt Run without asking for confirmation
-h, --help help for mock
-i, --ignore Ignore checking and fixing constraints
-w, --password string Password for the user to connect to database
-p, --port int Port number of the postgres database
-r, --rows int Total rows to be faked or mocked (default 10)
--uri string Postgres connection URI, eg. postgres://user:pass@host:=port/db?sslmode=disable
-u, --username string Username to connect to the database
-v, --verbose Enable verbose or debug logging
--version version for mock
Use "mock [command] --help" for more information about a command.
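The `--uri` flag takes a standard postgres connection URI. If the password contains reserved characters such as `@` or `/`, they have to be percent-encoded, which is easy to get wrong by hand. The following Go sketch shows one way to assemble such a URI with the standard library; `buildURI` is a hypothetical helper written for this example and is not part of Mock-data.

```go
package main

import (
	"fmt"
	"net"
	"net/url"
	"strconv"
)

// buildURI assembles a postgres:// connection URI from individual settings.
// The user and password are percent-encoded by net/url where required.
func buildURI(user, password, host string, port int, database string) string {
	u := url.URL{
		Scheme:   "postgres",
		User:     url.UserPassword(user, password),
		Host:     net.JoinHostPort(host, strconv.Itoa(port)),
		Path:     "/" + database,
		RawQuery: "sslmode=disable",
	}
	return u.String()
}

func main() {
	// Placeholder credentials, matching the style of the examples in this README.
	fmt.Println(buildURI("pg_user", "mypassword", "myhost", 5432, "database_name"))
	// A password with reserved characters is escaped automatically.
	fmt.Println(buildURI("pg_user", "p@ss/word", "myhost", 5432, "database_name"))
}
```

The resulting string can be passed to the tool as `--uri="..."`; quote it in the shell so that `?` and `&` are not interpreted.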
Installation
Using Binary
Download the latest release for your OS & Architecture and you're ready to go!
[Optional] You can copy the mock program to a folder on your PATH, so that you can run mock from anywhere in the terminal, for example:
cp mock-darwin-amd64-v2.0 /usr/local/bin/mock
chmod +x /usr/local/bin/mock
provided /usr/local/bin is part of the $PATH environment variable.
Using Docker
- Pull the image & you are all set
docker pull ghcr.io/faisaltheparttimecoder/mock-data:latest
- [Optional] Add a tag for easy access
docker image tag ghcr.io/faisaltheparttimecoder/mock-data mock
- Create a local directory on the host to mount as a volume inside the container. It is needed to store files (e.g. the constraints list) or to pass configuration files to the mock data tool (such as those used by the custom subcommand)
mkdir /tmp/mock
- Now run the docker command
docker run -v /tmp/mock:/home/mock [docker-image-tag] [subcommand] <flags...>
For example:
docker run -v /tmp/mock:/home/mock mock database -f -u postgres -d demodb
- On a Mac, to connect to a database on the host (or localhost), use the address host.docker.internal, as shown in the command below
docker run -v /tmp/mock:/home/mock [docker-image-tag] [subcommand] -a host.docker.internal <flags...>
For example:
docker run -v /tmp/mock:/home/mock mock database -f -a host.docker.internal -u postgres -d demodb
- [Optional] You can also create an alias for the above command, for example in .zshrc
echo alias mock=\"docker run -it -v /tmp/mock:/home/mock ghcr.io/faisaltheparttimecoder/mock-data:latest\" >> ~/.zshrc
source ~/.zshrc
mock tables -t "public.gardens" --uri="postgres://pg_user:mypassword@myhost:5432/database_name?sslmode=disable"
Examples
Here is a simple demo of how the tool works: provide us your table and we will load the data for you.
For more examples of how to use the tool, please check out the wiki pages on topics such as
- Look here on how the database connection works
- For realistic & controlled data, read this section on how the subcommand custom works
- For mocking the whole database or creating a demo database, read this section on how the subcommand database works
- For mocking all the tables of a schema, read this section on how the subcommand schema works
- For creating fake tables and mocking selected tables, read this section on how the subcommand tables works
Known Issues
- Recreating constraints can fail, even though the tool tries to fix primary key, foreign key and unique key violations. There is no guarantee that it will fix all the constraints, and manual intervention is needed in some cases.
- If you have a composite unique index where one column is also part of a foreign key, the constraint creation may fail.
- Fixing CHECK constraints isn't supported due to complexity, so recreating check constraints may fail. Use the custom subcommand to control the data being inserted.
- On Greenplum Database, partition tables are not supported (due to the check constraint issue described above), so use the custom subcommand to define the data to be inserted into the columns with check constraints.
- Custom data types are not supported. Use the custom subcommand to control the data for those custom data types.
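To see why fixing a unique or primary key cannot always succeed, consider the following Go sketch. It is an illustration under stated assumptions, not the tool's real algorithm: `fixUnique` is a hypothetical helper that replaces duplicate values by drawing new ones from a generator and gives up after a fixed number of attempts. When the generator's range is smaller than the number of rows, no amount of retrying can help, and the constraint has to be handled manually or through controlled data.

```go
package main

import (
	"errors"
	"fmt"
)

// fixUnique replaces duplicate values by drawing new ones from gen.
// It gives up after maxAttempts draws for a single duplicate.
func fixUnique(values []int, gen func() int, maxAttempts int) ([]int, error) {
	seen := make(map[int]bool, len(values))
	out := make([]int, len(values))
	for i, v := range values {
		for attempt := 0; seen[v]; attempt++ {
			if attempt == maxAttempts {
				return nil, errors.New("could not find an unused value")
			}
			v = gen()
		}
		seen[v] = true
		out[i] = v
	}
	return out, nil
}

func main() {
	// A generator with a large range: duplicates are replaced successfully.
	next := 100
	counter := func() int { next++; return next }
	fixed, err := fixUnique([]int{1, 2, 2, 1}, counter, 10)
	fmt.Println(fixed, err)

	// A generator that can only ever return one value: the fix must fail.
	stuck := func() int { return 0 }
	_, err = fixUnique([]int{0, 1, 0}, stuck, 10)
	fmt.Println(err)
}
```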
Developers / Collaboration
You can submit issues or pull requests via GitHub and we will try our best to fix them.
To customize this repository, follow these steps
- Clone the git repository
- Export the GOPATH
export GOPATH=<path to the clone repository>
- Install all the dependencies.
go mod vendor
- Make sure you have a demo postgres database to connect to. If you are using a Mac, you can use
make install_postgres
make start_postgres
make stop_postgres
make uninstall_postgres
- You are all set, you can run it locally using
go run . <commands> <flags.........>
- [Recommended] Run the golang linter to analyze and fix programming errors, bugs, stylistic errors, and suspicious constructs in the source code.
golangci-lint run
To install golangci-lint, check here; the config file .golangci.yml has been provided with this repo.
- To run the tests, use
# Edit the database environment variables on the "Makefile"
make unit_tests
make integration_tests
make tests # Runs the above two tests together
- To build the package use
make build
--- HAPPY HACKING ---
License
The project is licensed under the MIT License.