Security News
GitHub Removes Malicious Pull Requests Targeting Open Source Repositories
GitHub removed 27 malicious pull requests attempting to inject harmful code across multiple open source repositories, in another round of low-effort attacks.
graphile-worker
Advanced tools
Job queue for PostgreSQL running on Node.js. Can be used with any PostgreSQL-backed application. Pairs beautifully with PostGraphile.
runAllJobs
util)LISTEN
/NOTIFY
to be informed of jobs as they're inserted)SKIP LOCKED
to find jobs to execute, resulting in faster fetches)In exchange for the incredible freedom we give in how this software can be used (thanks to the permissive MIT license) we ask the individuals and businesses that use it to sponsor ongoing maintenance and development via sponsorship.
Seems stable and has good test suite, but needs real-world testing before it can be deemed production ready. Reach out on GitHub issues or Discord chat if you'd like to help with this: http://discord.gg/graphile
PostgreSQL 10+ and Node 10+.
If your database doesn't already include the pgcrypto
and uuid-ossp
extensions we'll automatically install them into the public schema for you. If you have them installed in a different schema (unlikely) you may face issues.
yarn add graphile-worker
graphile-worker
manages it's own database schema (graphile_worker
) just
point graphile-worker at your database and we handle our own migrations:
npx graphile-worker -c "postgres://localhost/mydb"
(npx
looks for the graphile-worker
binary locally; it's often better to
use the "scripts"
entry in package.json
instead.)
There's no point having a job queue if there's nothing to execute the jobs!
A task executor is a simple async JS function which receives as input the job payload and a collection of helpers. It does the work and then returns. If it returns then the job is deemed a success and is deleted from the queue. If it throws an error then the job is deemed a failure and the task is rescheduled using an exponential backoff algorithm.
IMPORTANT: your jobs should wait for all asynchronous work to be completed *before returning, otherwise we might mistakenly think they were successful.
IMPORTANT: we automatically retry the job if it fails, so it's often sensible to split large jobs into smaller jobs, this also allows them to run in parallel resulting in faster execution. This is particularly important for tasks that are not idempotent (i.e. running them a second time will have extra side effects) - for example sending emails.
Tasks are created in the tasks
folder in the directory from which you run
graphile-worker
; the name of the file (less the .js
suffix) is used as
the task identifier. Currently only .js
files that can be directly loaded
by Node.js are supported; if you are using Babel, TypeScript or similar you
will need to compile your tasks into the tasks
folder.
current directory
├── package.json
├── node_modules
└── tasks
├── task_1.js
└── task_2.js
// tasks/task_1.js
module.exports = async payload => {
await doMyLogicWith(payload);
};
// tasks/task_2.js
module.exports = async (payload, { debug }) => {
// async is optional, but best practice
debug(`Received ${JSON.stringify(payload)}`);
};
To delete the worker code and all the tasks from your database, just run this one SQL statement:
DROP SCHEMA graphile_worker CASCADE;
graphile-worker
is not intended to replace extremely high performance
dedicated job queues, it's intended to be a very easy way to get a job queue
up and running with Node.js and PostgreSQL. But this doesn't mean it's a
slouch by any means - it achieves an average latency from triggering a job in
one process to executing it in another of just 2ms, and each worker can
handle up to 731 jobs per second on modest hardware (2011 iMac).
graphile-worker
is horizontally scalable. Each instance has a customisable
worker pool, this pool defaults to size 1 (only one job at a time on this
worker) but depending on the nature of your tasks (i.e. assuming they're not
compute-heavy) you will likely want to set this higher to benefit from
Node.js' concurrency. If your tasks are compute heavy you may still wish to
set it higher and then using Node's child_process
(or Node v11's
worker_threads
) to share the compute load over multiple cores without
significantly impacting the main worker's runloop.
To test performance you can run yarn perfTest
. This reveals that on a 2011
iMac running both the worker and the database (and a bunch of other stuff)
starting the command, checking for jobs, and exiting takes about 0.40s and
running 20,000 trivial queued jobs across a
single worker pool of size 1 takes 27.35s (~731 jobs per second). Latencies
are also measured, from before the call to queue the job is fired until when
the job is actually executed. These latencies ranged from 1.39ms to 19.66ms
with an average of 1.90ms.
We currently use the formula exp(least(10, attempt))
to determine the
delays between attempts (the job must fail before the next attempt is
scheduled, so the total time elapsed may be greater depending on how long the
job runs for before it fails). This seems to handle temporary issues well,
after ~4 hours attempts will be made every ~6 hours until the maximum number
of attempts is achieved. The specific delays can be seen below:
select
attempt,
exp(least(10, attempt)) * interval '1 second' as delay,
sum(exp(least(10, attempt)) * interval '1 second') over (order by attempt asc) total_delay
from generate_series(1, 24) as attempt;
attempt | delay | total_delay
---------+-----------------+-----------------
1 | 00:00:02.718282 | 00:00:02.718282
2 | 00:00:07.389056 | 00:00:10.107338
3 | 00:00:20.085537 | 00:00:30.192875
4 | 00:00:54.598150 | 00:01:24.791025
5 | 00:02:28.413159 | 00:03:53.204184
6 | 00:06:43.428793 | 00:10:36.632977
7 | 00:18:16.633158 | 00:28:53.266135
8 | 00:49:40.957987 | 01:18:34.224122
9 | 02:15:03.083928 | 03:33:37.308050
10 | 06:07:06.465795 | 09:40:43.773845
11 | 06:07:06.465795 | 15:47:50.239640
12 | 06:07:06.465795 | 21:54:56.705435
13 | 06:07:06.465795 | 28:02:03.171230
14 | 06:07:06.465795 | 34:09:09.637025
15 | 06:07:06.465795 | 40:16:16.102820
16 | 06:07:06.465795 | 46:23:22.568615
17 | 06:07:06.465795 | 52:30:29.034410
18 | 06:07:06.465795 | 58:37:35.500205
19 | 06:07:06.465795 | 64:44:41.966000
20 | 06:07:06.465795 | 70:51:48.431795
21 | 06:07:06.465795 | 76:58:54.897590
22 | 06:07:06.465795 | 83:06:01.363385
23 | 06:07:06.465795 | 89:13:07.829180
24 | 06:07:06.465795 | 95:20:14.294975
yarn
yarn watch
In another terminal:
createdb graphile_worker_test
yarn test
FAQs
Job queue for PostgreSQL
The npm package graphile-worker receives a total of 33,447 weekly downloads. As such, graphile-worker popularity was classified as popular.
We found that graphile-worker demonstrated a healthy version release cadence and project activity because the last version was released less than a year ago. It has 1 open source maintainer collaborating on the project.
Did you know?
Socket for GitHub automatically highlights issues in each pull request and monitors the health of all your open source dependencies. Discover the contents of your packages and block harmful activity before you install or update your dependencies.
Security News
GitHub removed 27 malicious pull requests attempting to inject harmful code across multiple open source repositories, in another round of low-effort attacks.
Security News
RubyGems.org has added a new "maintainer" role that allows for publishing new versions of gems. This new permission type is aimed at improving security for gem owners and the service overall.
Security News
Node.js will be enforcing stricter semver-major PR policies a month before major releases to enhance stability and ensure reliable release candidates.