graphile-migrate
Opinionated SQL-powered productive roll-forward migration tool for PostgreSQL.
Why?
- fast iteration speed — save a file and database is updated in milliseconds
- roll-forward only — maintaining rollbacks is a chore, and in 10 years of API
development I've never ran one in production
- familiar — no custom DSL to learn, just use PostgreSQL syntax
- fully functional — sending SQL commands directly to PostgreSQL means you can
use all of PostgreSQL's features
- complements PostGraphile — works with
any application, but PostGraphile's watch mode means that the GraphQL schema
is instantly regenerated (without server restart) whenever the database
changes
Crowd-funded open-source software
To advance and improve this software more rapidly, I need your support. Please
sponsor ongoing maintenance and development via
Patreon.
If you're already a Patreon sponsor, please let me know you're using this
software so I can justify allocating additional time to it.
Status
HIGHLY EXPERIMENTAL
The interface is raw and doesn't ask for confirmation (e.g. the
graphile-migrate reset
command will drop and re-create that database without
asking for confirmation).
There are no automated tests yet, and APIs may still change. (Pull requests to
add tests welcome!)
Using this for prototyping should be fine, but when it comes to shipping you
may want to
- help us write tests and finalise interfaces
- send us money to do the same
- use an alternative migration framework, such as:
Opinions
- Local iteration should be easy and fast
- Migrating should be fast
- Once deployed, databases should be identical (including subtleties such as column order)
- Migration software should not be tied to a particular application stack
- Migrations should be written in SQL
- Roll-forward only (production issues should be fixed via additional migrations, development can iterate current migration)
- Once a migration is signed off (deployable) it should never be edited
- Use PostgreSQL ;)
- Development databases are cheap; can run multiple
- Resetting development database is acceptable if absolutely necessary
- Production databases are critical - NEVER RESET
- Migrating data (as well as DDL) is acceptable, but should be kept to fast operations (or trigger a background job)
- Migrations should automatically be wrapped in transactions by default
- Migrations that require execution outside of a transaction (e.g. to enable augmenting non-DDL-safe things, such as
ENUM
s in PostgreSQL) should be explicitly marked - Migrations should not pollute PostgreSQL global settings (e.g. use
SET LOCAL
rather than SET
) - Roles should be managed outside of migrations (since they can be shared between databases)
- Certain schemas are managed by other tools and should not be interfered with; e.g.
graphile_jobs
Setup
graphile-migrations
requires two databases: the first is your main database
against which you perform development, the second is a "shadow" database
which is used by the system to apply migrations. You should never interact
with the "shadow" database directly. Further all members of your team should
run the same PostgreSQL version to ensure that the shadow dump matches for
everyone (one way of achieving this is through Docker, but that isn't
required).
Usage
graphile-migrate migrate [--shadow]
Runs any un-executed committed migrations. Does not run current.sql
. For use in production and development.
If --shadow
is specified, migrates the shadow database instead.
graphile-migrate watch [--shadow] [--once]
Runs any un-executed committed migrations and then runs and watches
current.sql
, re-running its contents on any change.
current.sql
should be idempotent (this is your responsibility, see
"Idempotency" below); i.e. it should be able to be ran multiple times and have
the same result.
If --shadow
is specified, changes will be applied against the shadow database instead.
If --once
is specified, current.sql
will be ran once and then the command will exit.
graphile-migrate commit
- reset the shadow database to the latest dump
- apply the current migration to the shadow database, and replace the dump
- move the current migration to committed migrations (adding a hash to prevent tampering)
graphile-migrate reset [--shadow]
Drop and re-create the database, and re-run all the committed migrations from the start. HIGHLY DESTRUCTIVE
If --shadow
is specified, the shadow database will be reset rather than the main database.
Library usage
It's possible to consume this module as a JavaScript library rather than via
the CLI. There's no documentation on this, but the CLI code in cli.ts
is very
approachable.
ALPHA WARNING: internals are likely to change a lot, so expect breakage if you
use library mode right now. CLI is more stable.
Configuration
Configuration goes in .gmrc
, which is a JSON file with the following keys:
connectionString
— optional, alternatively set DATABASE_URL
environment
variable, this is your main development database. If you run
graphile-migrate reset
this will be dropped without warning, so be careful.shadowConnectionString
— optional, alternatively set SHADOW_DATABASE_URL
environment variable. Should not already exist.rootConnectionString
— optional (defaults to "template1"), alternatively
set ROOT_DATABASE_URL
environment variable; this is used to connect to
the database with superuser privileges to drop and re-create the relevant
databases (via the reset
command directly, or via the commit
command for
the shadow database).pgSettings
— optional string-string key-value object defining settings to
set in PostgreSQL when migrating. Useful for setting search_path
for
example. Beware of changing this, a full reset will use the new values which
may lead to unexpected consequences.placeholders
— optional string-string key-value object defining placeholder
values to be replaced when encountered in any migration files. Placeholders
must begin with a colon and a capital letter, and then can continue with a
string of capital letters, numbers and underscores /^:[A-Z][A-Z0-9_]+$/
.
:DATABASE_NAME
and :DATABASE_OWNER
are automatically added to this
object. The value must be a valid in the place you use it (i.e. ensure you
escape the values) — graphile-migrate does not perform any escaping for you.
The special value !ENV
will tell graphile-migrate to
load the setting from the environment variable with the same name.afterReset
— optional list of actions to execute after the database has
been created but before the migrations run. String values are interpreted as
the name of a file in the migrations folder to execute once the database has
been reset; useful for setting default permissions, installing extensions,
and the like. Objects with a command
key specify shell actions (e.g.
installing a separately managed worker schema into the database).
{
"pgSettings": {
"search_path": "app,app_private,app_hidden,public"
},
"placeholders": {
":DATABASE_AUTHENTICATOR": "!ENV",
":DATABASE_VISITOR": "!ENV"
},
"afterReset": ["afterReset.sql", { "command": "graphile-worker --once" }]
}
Collaboration
The intention is that developers can work on different migrations in parallel,
and can switch between git
branches - idempotent migrations using CASCADE
when dropping should make it possible to do this with little issue (other than
the implicit data loss of dropping tables/columns/etc).
graphile-migrate commit
, on the other hand, should be linear - one way to
approach this is to only commit a migration immediately before it is merged to
master
. Another approach is to do the commit on master
itself. Non-linear
migration commits will result in errors, and may lead to you resetting your
development database.
Idempotency
graphile-migrate
is all about iteration; you write your database modification
commands in migrations/current.sql
and every time you save it is ran against
the database, generally taking under 100ms.
Because we run the same script over and over (on every save) and there's no
down migrations, you need to make your script idempotent. PostgreSQL has a number
of idempotent commands such as:
create or replace function...
drop table if exists ...
drop trigger if exists ...
When these aren't suitable you can start your migration with an explicit
rollback: commands that undo later actions. For example:
drop table if exists people;
create table people (
id serial primary key,
name text
);
When it comes time to commit your migration we will run it against a "shadow"
database to make sure it's valid.
It's often wise to use DROP ... CASCADE
so that if other migrations are
worked on in parallel no additional rollback
step is required. When you
DROP ... CASCADE
, be sure to add back any dropped dependents (triggers,
indexes, etc) once the dropped entity has been replaced. Reviewing the database
schema diff can help you spot these issues.
More examples of idempotent operations:
DROP SCHEMA IF EXISTS app CASCADE;
CREATE SCHEMA app;
DROP TABLE IF EXISTS foo CASCADE;
CREATE TABLE foo ...;
ALTER TABLE foo DROP COLUMN IF EXISTS bar CASCADE;
ALTER TABLE foo ADD COLUMN foo ...;
ALTER TABLE foo ALTER COLUMN foo SET NOT NULL;
ALTER TABLE foo ALTER COLUMN foo TYPE int USING foo::int;
CREATE OR REPLACE FUNCTION ...;
DROP FUNCTION IF EXISTS ... CASCADE;
CREATE OR REPLACE FUNCTION ...
TODO: