
Latest version: 0.14.0

glide-mq


High-performance message queue for Node.js with first-class AI orchestration. Built on Valkey/Redis Streams with a Rust NAPI core.

Completes and fetches the next job in a single server-side function call (1 RTT per job), hash-tags every key for zero-config clustering, and ships seven built-in primitives for LLM orchestration - cost tracking, token streaming, human-in-the-loop suspend/resume, model failover, TPM rate limiting, budget caps, and per-job lock tuning.

npm install glide-mq

General Usage

import { Queue, Worker } from 'glide-mq';

const connection = { addresses: [{ host: 'localhost', port: 6379 }] };
const queue = new Queue('tasks', { connection });

await queue.add('send-email', { to: 'user@example.com', subject: 'Welcome' });

const worker = new Worker(
  'tasks',
  async (job) => {
    // sendEmail: your application's email-sending function
    await sendEmail(job.data.to, job.data.subject);
    return { sent: true };
  },
  { connection, concurrency: 10 },
);

AI Usage

import { Queue, Worker } from 'glide-mq';

const connection = { addresses: [{ host: 'localhost', port: 6379 }] };
const queue = new Queue('ai', { connection });

await queue.add(
  'inference',
  { prompt: 'Explain message queues' },
  {
    fallbacks: [{ model: 'gpt-5.4-nano', provider: 'openai' }],
    lockDuration: 120000,
  },
);

const worker = new Worker(
  'ai',
  async (job) => {
    const result = await callLLM(job.data.prompt); // callLLM: your LLM client call
    await job.reportUsage({
      model: 'gpt-5.4',
      tokens: { input: 50, output: 200 },
      costs: { total: 0.003 },
    });
    await job.stream({ type: 'token', content: result });
    return result;
  },
  { connection, tokenLimiter: { maxTokens: 100000, duration: 60000 } },
);

When to use glide-mq

  • Background jobs and task processing - email, image processing, data pipelines, webhooks, any async work.
  • Scheduled and recurring work - cron jobs, interval tasks, bounded schedulers.
  • Distributed workflows - parent-child trees, DAGs, fan-in/fan-out, step jobs, dynamic children.
  • High-throughput queues over real networks - 1 RTT per job via Valkey Server Functions, up to 38% faster than BullMQ in head-to-head benchmarks.
  • LLM pipelines and model orchestration - cost tracking, token streaming, model failover, budget caps without external middleware.
  • Valkey/Redis clusters - hash-tagged keys out of the box with zero configuration.

How it's different

Aspect | glide-mq
Network per job | 1 RTT - complete + fetch next in a single FCALL
Client | Rust NAPI bindings via valkey-glide - no JS protocol parsing
Server logic | Persistent Valkey Function library (FUNCTION LOAD + FCALL) - no per-call EVAL
Cluster | Hash-tagged keys (glide:{queueName}:*) route to the same slot automatically
AI-native | Cost tracking, token streaming, suspend/resume, fallback chains, TPM limits, budget caps
Vector search | KNN similarity queries over job data via Valkey Search
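
Hash-tag routing follows the standard Valkey/Redis cluster key-slot rule: when a key contains a non-empty {...} section, only the substring inside it is hashed, so every key sharing a tag lands in the same slot. A self-contained sketch of that rule (the glide:{tasks}:* key names are illustrative, not a statement of glide-mq's exact key layout):

```typescript
// CRC16-XMODEM as used by the cluster key-slot algorithm (poly 0x1021, init 0).
function crc16(data: string): number {
  let crc = 0;
  for (let i = 0; i < data.length; i++) {
    crc ^= data.charCodeAt(i) << 8;
    for (let bit = 0; bit < 8; bit++) {
      crc = crc & 0x8000 ? ((crc << 1) ^ 0x1021) & 0xffff : (crc << 1) & 0xffff;
    }
  }
  return crc;
}

// Slot = CRC16(key) mod 16384; if the key has a non-empty {tag}, hash only the tag.
function clusterSlot(key: string): number {
  const open = key.indexOf('{');
  if (open !== -1) {
    const close = key.indexOf('}', open + 1);
    if (close > open + 1) {
      key = key.slice(open + 1, close); // hash only the tag content
    }
  }
  return crc16(key) % 16384;
}

// All keys for one queue share the tag {tasks}, so they map to the same slot.
console.log(clusterSlot('glide:{tasks}:wait') === clusterSlot('glide:{tasks}:active')); // true
```

Because every queue key shares one slot, multi-key server functions work unchanged in cluster mode without any client-side routing configuration.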

AI-native primitives

Seven primitives for LLM and agent workflows, built into the core API.

  • Cost tracking - job.reportUsage() records model, tokens, cost, latency per job. queue.getFlowUsage() aggregates across flows.
  • Token streaming - job.stream(chunk) pushes LLM output tokens in real time. queue.readStream(jobId) consumes them with optional long-polling.
  • Suspend/resume - job.suspend() pauses mid-processor for human approval or webhook callback. queue.signal(jobId, name, data) resumes with external input.
  • Fallback chains - ordered fallbacks array on job options. On failure, the next retry reads job.currentFallback for the alternate model/provider.
  • TPM rate limiting - tokenLimiter on worker options enforces tokens-per-minute caps. Combine with RPM limiter for dual-axis rate control.
  • Budget caps - FlowProducer.add(flow, { budget }) sets maxTotalTokens and maxTotalCost across all jobs in a flow. Jobs fail or pause when exceeded.
  • Per-job lock duration - override lockDuration per job for adaptive stall detection. Short for classifiers, long for multi-minute LLM calls.

See Usage - AI-native primitives for full examples.
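
To illustrate the TPM idea only (this is not glide-mq's internal implementation, which runs server-side in Valkey): a tokens-per-minute cap is a sliding window over reported token counts, refusing work that would push the window total over the limit.

```typescript
// Illustrative sliding-window TPM limiter; names and shape are assumptions.
class TokenLimiter {
  private events: { at: number; tokens: number }[] = [];
  constructor(private maxTokens: number, private duration: number) {}

  // Returns true if `tokens` more tokens may be spent right now.
  tryAcquire(tokens: number, now = Date.now()): boolean {
    // Drop entries that have slid out of the window.
    this.events = this.events.filter((e) => now - e.at < this.duration);
    const used = this.events.reduce((sum, e) => sum + e.tokens, 0);
    if (used + tokens > this.maxTokens) return false;
    this.events.push({ at: now, tokens });
    return true;
  }
}

const limiter = new TokenLimiter(100_000, 60_000); // 100k tokens per minute
console.log(limiter.tryAcquire(60_000)); // true
console.log(limiter.tryAcquire(60_000)); // false - would exceed the window
```

Pairing a cap like this with a plain requests-per-minute limiter gives the dual-axis control described above.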

Features

  • 1 RTT per job - complete current + fetch next in a single server-side function call
  • Cluster-native - hash-tagged keys, zero cluster configuration
  • Workflows - FlowProducer trees, DAGs with fan-in, chain/group/chord, step jobs, dynamic children
  • Scheduling - 5-field cron with timezone, fixed intervals, bounded schedulers
  • Retries - exponential, fixed, or custom backoff with dead-letter queues
  • Rate limiting - per-group sliding window, token bucket, global queue-wide limits
  • Broadcast - fan-out pub/sub with NATS-style subject filtering and independent subscriber retries
  • Batch processing - process multiple jobs at once for bulk I/O
  • Request-reply - queue.addAndWait() for synchronous RPC patterns
  • Deduplication - simple, throttle, and debounce modes
  • Compression - transparent gzip at the queue level
  • Serverless - lightweight Producer and ServerlessPool for Lambda/Edge
  • OpenTelemetry - automatic span emission with bring-your-own tracer
  • In-memory testing - TestQueue and TestWorker with zero Valkey dependency
  • Cross-language - HTTP proxy and wire protocol for non-Node.js services
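
For the retries bullet above, a sketch of how an exponential backoff schedule is typically computed (the function name, defaults, and jitter strategy are assumptions for illustration, not glide-mq's exact API):

```typescript
// Exponential backoff with an upper cap and optional full jitter.
function backoffDelay(attempt: number, baseMs = 1000, maxMs = 30_000, jitter = false): number {
  const exp = Math.min(baseMs * 2 ** (attempt - 1), maxMs);
  return jitter ? Math.random() * exp : exp; // full jitter spreads retry storms
}

// Deterministic schedule for attempts 1..5 with a 1s base:
console.log([1, 2, 3, 4, 5].map((a) => backoffDelay(a))); // [1000, 2000, 4000, 8000, 16000]
```

Once attempts are exhausted, a job would move to the dead-letter queue rather than retry forever.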

Performance

Benchmarked on AWS ElastiCache Valkey 8.2 (r7g.large) with TLS, EC2 client in the same region.

Concurrency | glide-mq | BullMQ | Delta
c=5 | 10,754 j/s | 9,866 j/s | +9%
c=10 | 18,218 j/s | 13,541 j/s | +35%
c=15 | 19,583 j/s | 14,162 j/s | +38%
c=20 | 19,408 j/s | 16,085 j/s | +21%

The advantage comes from completing and fetching the next job in a single FCALL. The savings compound over real network latency - conditions typical of production deployments. At high concurrency, both libraries converge toward the Valkey single-thread ceiling.
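
A back-of-envelope model shows why the round trip dominates: with n round trips per job, per-command RTT r, and server-side time s, one sequential worker lane tops out near 1000 / (n·r + s) jobs per second. The numbers below are illustrative, not the benchmark's measured parameters:

```typescript
// Throughput ceiling for one sequential worker lane (jobs/sec).
function ceilingJobsPerSec(rttMs: number, roundTripsPerJob: number, serverMs: number): number {
  return 1000 / (roundTripsPerJob * rttMs + serverMs);
}

const rtt = 0.5;     // assumed same-region RTT in ms
const server = 0.05; // assumed server-side work per job in ms
console.log(ceilingJobsPerSec(rtt, 1, server).toFixed(0)); // "1818" - 1 RTT per job
console.log(ceilingJobsPerSec(rtt, 2, server).toFixed(0)); // "952"  - 2 RTTs per job
```

Under these assumptions, halving round trips nearly doubles the per-lane ceiling, which is why the gap widens as network latency grows.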

Reproduce with npm run bench or npx tsx benchmarks/elasticache-head-to-head.ts against your own infrastructure.

Examples

27 runnable examples in examples/. Run any with npx tsx examples/<name>.ts.

Example | What it shows
usage-tracking.ts | Token and cost tracking across multi-step flows
token-streaming.ts | Real-time LLM token streaming to clients
human-approval.ts | Suspend/resume with editorial review gate
model-failover.ts | Fallback chains across providers
tpm-throttle.ts | Dual-axis RPM + TPM rate limiting
budget-cap.ts | Flow-level token and cost caps
vector-search.ts | KNN similarity search with pre-filters
with-langchain.ts | LangChain integration with token tracking
with-vercel-ai-sdk.ts | Vercel AI SDK integration with streaming
rag-pipeline.ts | RAG with embedding, indexing, retrieval
ai-agent-loop.ts | Autonomous agent loop with budget enforcement
testing-mode.ts | In-memory testing without Valkey
agent-budget-loop.ts | Agent loop with per-step budget tracking
multi-model-cost.ts | Cost breakdown across multiple models
fallback-usage.ts | Usage tracking through fallback chains
streaming-sse.ts | Server-sent events with token streaming
batch-embed-tpm.ts | Batch embeddings with TPM rate limiting
thinking-model.ts | Thinking/reasoning model token tracking
cost-breakdown.ts | Detailed per-category cost breakdown
budget-weighted.ts | Weighted budget allocation across flow steps
reasoning-stream.ts | Streaming reasoning/chain-of-thought tokens
adaptive-timeout.ts | Adaptive lock duration based on model complexity
broadcast-events.ts | Fan-out event publishing with subject filtering
agent-memory.ts | Multi-turn agent with persistent memory
search-dashboard.ts | Job search and monitoring dashboard
embedding-pipeline.ts | Batch document embedding with rate limiting
content-pipeline.ts | Content moderation with streaming and approval

When NOT to use glide-mq

  • You need a log-based event streaming platform. glide-mq is a job/task queue, not a partitioned event log. It does not provide Kafka-style topic partitions, consumer offset management, or event replay.
  • You need browser support. The Rust NAPI client requires a server-side runtime (Node.js 20+, Bun, or Deno with NAPI support).
  • You need exactly-once semantics. glide-mq provides at-least-once delivery. Duplicate processing is rare but possible - design processors to be idempotent.
  • You need to run without Valkey or Redis. Production use requires Valkey 7.0+ or Redis 7.0+. For dev/testing, TestQueue/TestWorker run fully in-memory.

Documentation

Guide | Topics
Usage | Queue, Worker, Producer, batch, request-reply, cluster mode
Workflows | FlowProducer, DAG, chain/group/chord, dynamic children
Advanced | Schedulers, rate limiting, dedup, compression, retries, DLQ
Broadcast | Pub/sub fan-out, subject filtering
Observability | OpenTelemetry, metrics, job logs, dashboard
Serverless | Producer, ServerlessPool, Lambda/Edge
Testing | In-memory TestQueue and TestWorker
Wire Protocol | Cross-language FCALL specs, Python/Go examples
Step Jobs | Step-job workflows with moveToDelayed
Durability | Durability guarantees, persistence, delivery semantics
Architecture | Internal architecture and design reference
Migration | Coming from BullMQ - API mapping guide

Ecosystem

Package | Description
@glidemq/speedkey | Valkey GLIDE client with native NAPI bindings
@glidemq/dashboard | Web UI for metrics, schedulers, job mutations
@glidemq/hono | Hono middleware
@glidemq/fastify | Fastify plugin
@glidemq/nestjs | NestJS module
@glidemq/hapi | Hapi plugin
glide-mq.dev | Full documentation site

Contributing

Bug reports, feature requests, and pull requests are welcome.

License

Apache-2.0

Package last updated on 28 Mar 2026
