
High-performance message queue for Node.js with first-class AI orchestration. Built on Valkey/Redis Streams with a Rust NAPI core.
Completes and fetches the next job in a single server-side function call (1 RTT per job), hash-tags every key for zero-config clustering, and ships seven built-in primitives for LLM orchestration: cost tracking, token streaming, human-in-the-loop, model failover, TPM rate limiting, budget caps, and vector search.
```sh
npm install glide-mq
```
```ts
import { Queue, Worker } from 'glide-mq';

const connection = { addresses: [{ host: 'localhost', port: 6379 }] };
const queue = new Queue('tasks', { connection });

await queue.add('send-email', { to: 'user@example.com', subject: 'Welcome' });

const worker = new Worker(
  'tasks',
  async (job) => {
    await sendEmail(job.data.to, job.data.subject);
    return { sent: true };
  },
  { connection, concurrency: 10 },
);
```
```ts
import { Queue, Worker } from 'glide-mq';

const queue = new Queue('ai', { connection });

await queue.add(
  'inference',
  { prompt: 'Explain message queues' },
  {
    fallbacks: [{ model: 'gpt-5.4-nano', provider: 'openai' }],
    lockDuration: 120000,
  },
);

const worker = new Worker(
  'ai',
  async (job) => {
    const result = await callLLM(job.data.prompt);
    await job.reportUsage({
      model: 'gpt-5.4',
      tokens: { input: 50, output: 200 },
      costs: { total: 0.003 },
    });
    await job.stream({ type: 'token', content: result });
    return result;
  },
  { connection, tokenLimiter: { maxTokens: 100000, duration: 60000 } },
);
```
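The `tokenLimiter` option above caps tokens per minute. The library enforces this server-side in Valkey, so the sketch below is only an illustration of the idea as a sliding-window counter, not glide-mq's actual implementation (`SlidingWindowTpm` and its methods are illustrative names):

```ts
// Standalone sliding-window tokens-per-minute limiter, illustrating the idea
// behind the worker's tokenLimiter option. Illustrative only; glide-mq
// enforces the cap server-side in Valkey, not in JS.
class SlidingWindowTpm {
  private events: { at: number; tokens: number }[] = [];

  constructor(
    private maxTokens: number, // e.g. 100_000 tokens
    private windowMs: number,  // e.g. 60_000 ms
  ) {}

  // Returns true if `tokens` fit under the cap right now, and records them.
  tryConsume(tokens: number, now = Date.now()): boolean {
    // Drop events that have aged out of the window.
    this.events = this.events.filter((e) => now - e.at < this.windowMs);
    const used = this.events.reduce((sum, e) => sum + e.tokens, 0);
    if (used + tokens > this.maxTokens) return false;
    this.events.push({ at: now, tokens });
    return true;
  }
}
```

A worker following this pattern would call `tryConsume` with the job's estimated token count before dispatching, and delay the job when it returns false.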
| Aspect | glide-mq |
|---|---|
| Network per job | 1 RTT - complete + fetch next in a single FCALL |
| Client | Rust NAPI bindings via valkey-glide - no JS protocol parsing |
| Server logic | Persistent Valkey Function library (FUNCTION LOAD + FCALL) - no per-call EVAL |
| Cluster | Hash-tagged keys (glide:{queueName}:*) route to the same slot automatically |
| AI-native | Cost tracking, token streaming, suspend/resume, fallback chains, TPM limits, budget caps |
| Vector search | KNN similarity queries over job data via Valkey Search |
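The zero-config clustering row relies on Redis/Valkey hash tags: when a key contains `{...}`, only the substring between the first `{` and the next `}` is hashed, so every `glide:{queueName}:*` key resolves to the same slot. A self-contained sketch of the slot calculation per the cluster spec (not glide-mq's internal code):

```ts
// CRC16-CCITT (XMODEM), the polynomial used by Redis/Valkey cluster.
function crc16(data: string): number {
  let crc = 0;
  for (let i = 0; i < data.length; i++) {
    crc ^= data.charCodeAt(i) << 8;
    for (let bit = 0; bit < 8; bit++) {
      crc = crc & 0x8000 ? ((crc << 1) ^ 0x1021) & 0xffff : (crc << 1) & 0xffff;
    }
  }
  return crc;
}

// Hash only the substring inside the first non-empty {tag}, per the
// Redis cluster spec; otherwise hash the whole key. 16384 slots total.
function hashSlot(key: string): number {
  const open = key.indexOf('{');
  if (open !== -1) {
    const close = key.indexOf('}', open + 1);
    if (close > open + 1) key = key.slice(open + 1, close);
  }
  return crc16(key) % 16384;
}
```

Both `glide:{tasks}:wait` and `glide:{tasks}:meta` hash only the tag `tasks`, so they land in the same slot and a single FCALL can touch them atomically.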
Seven primitives for LLM and agent workflows, built into the core API.
- **Cost tracking** - `job.reportUsage()` records model, tokens, cost, and latency per job; `queue.getFlowUsage()` aggregates across flows.
- **Token streaming** - `job.stream(chunk)` pushes LLM output tokens in real time; `queue.readStream(jobId)` consumes them with optional long-polling.
- **Human-in-the-loop** - `job.suspend()` pauses mid-processor for human approval or a webhook callback; `queue.signal(jobId, name, data)` resumes with external input.
- **Model failover** - a `fallbacks` array on job options; on failure, the next retry reads `job.currentFallback` for the alternate model/provider.
- **TPM rate limiting** - `tokenLimiter` on worker options enforces tokens-per-minute caps; combine with the RPM limiter for dual-axis rate control.
- **Budget caps** - `FlowProducer.add(flow, { budget })` sets `maxTotalTokens` and `maxTotalCost` across all jobs in a flow; jobs fail or pause when exceeded.
- **Adaptive timeouts** - `lockDuration` per job for adaptive stall detection: short for classifiers, long for multi-minute LLM calls.

See Usage - AI-native primitives for full examples.
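The failover order can be pictured with a small helper: attempt 0 runs the primary model, and each subsequent retry walks the `fallbacks` array. glide-mq resolves this for you and exposes the result as `job.currentFallback`; the `pickFallback` helper below is purely illustrative of the ordering, not library code:

```ts
interface Fallback {
  model: string;
  provider: string;
}

// Hypothetical helper mirroring how a fallback chain resolves: attempt 0
// uses the primary model, each retry advances through the fallbacks array,
// clamping to the last entry once the chain is exhausted.
function pickFallback(
  primary: Fallback,
  fallbacks: Fallback[],
  attempt: number,
): Fallback {
  if (attempt === 0 || fallbacks.length === 0) return primary;
  const i = Math.min(attempt - 1, fallbacks.length - 1);
  return fallbacks[i];
}
```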
- `queue.addAndWait()` for synchronous RPC patterns
- `Producer` and `ServerlessPool` for Lambda/Edge
- `TestQueue` and `TestWorker` with zero Valkey dependency

Benchmarked on AWS ElastiCache Valkey 8.2 (r7g.large) with TLS, EC2 client in the same region.
| Concurrency | glide-mq | BullMQ | Delta |
|---|---|---|---|
| c=5 | 10,754 j/s | 9,866 j/s | +9% |
| c=10 | 18,218 j/s | 13,541 j/s | +35% |
| c=15 | 19,583 j/s | 14,162 j/s | +38% |
| c=20 | 19,408 j/s | 16,085 j/s | +21% |
The advantage comes from completing and fetching the next job in a single FCALL. The savings compound over real network latency - exactly the conditions in every production deployment. At high concurrency both libraries converge toward the Valkey single-thread ceiling.
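A rough back-of-envelope model makes the compounding visible: per-job time is round-trips times RTT plus server work, so halving the round-trips nearly doubles latency-bound throughput. The numbers below are illustrative assumptions, not benchmark measurements:

```ts
// Naive latency-bound throughput model: jobs/s over all workers, assuming
// each worker processes jobs serially. Illustrative assumptions only.
function jobsPerSecond(
  concurrency: number,
  roundTrips: number,
  rttMs: number,
  serverWorkMs = 0.05, // assumed server-side execution time per job
): number {
  const perJobMs = roundTrips * rttMs + serverWorkMs;
  return (concurrency / perJobMs) * 1000;
}

// At an assumed 0.5 ms intra-region RTT with 10 concurrent workers:
const oneRtt = jobsPerSecond(10, 1, 0.5); // complete + fetch in one FCALL
const twoRtt = jobsPerSecond(10, 2, 0.5); // separate complete and fetch calls
```

The model ignores pipelining and server contention, which is why real numbers converge at high concurrency, but it shows why a second round-trip per job is costly at any realistic RTT.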
Reproduce with `npm run bench` or `npx tsx benchmarks/elasticache-head-to-head.ts` against your own infrastructure.
27 runnable examples in `examples/`. Run any with `npx tsx examples/<name>.ts`.
| Example | What it shows |
|---|---|
| usage-tracking.ts | Token and cost tracking across multi-step flows |
| token-streaming.ts | Real-time LLM token streaming to clients |
| human-approval.ts | Suspend/resume with editorial review gate |
| model-failover.ts | Fallback chains across providers |
| tpm-throttle.ts | Dual-axis RPM + TPM rate limiting |
| budget-cap.ts | Flow-level token and cost caps |
| vector-search.ts | KNN similarity search with pre-filters |
| with-langchain.ts | LangChain integration with token tracking |
| with-vercel-ai-sdk.ts | Vercel AI SDK integration with streaming |
| rag-pipeline.ts | RAG with embedding, indexing, retrieval |
| ai-agent-loop.ts | Autonomous agent loop with budget enforcement |
| testing-mode.ts | In-memory testing without Valkey |
| agent-budget-loop.ts | Agent loop with per-step budget tracking |
| multi-model-cost.ts | Cost breakdown across multiple models |
| fallback-usage.ts | Usage tracking through fallback chains |
| streaming-sse.ts | Server-sent events with token streaming |
| batch-embed-tpm.ts | Batch embeddings with TPM rate limiting |
| thinking-model.ts | Thinking/reasoning model token tracking |
| cost-breakdown.ts | Detailed per-category cost breakdown |
| budget-weighted.ts | Weighted budget allocation across flow steps |
| reasoning-stream.ts | Streaming reasoning/chain-of-thought tokens |
| adaptive-timeout.ts | Adaptive lock duration based on model complexity |
| broadcast-events.ts | Fan-out event publishing with subject filtering |
| agent-memory.ts | Multi-turn agent with persistent memory |
| search-dashboard.ts | Job search and monitoring dashboard |
| embedding-pipeline.ts | Batch document embedding with rate limiting |
| content-pipeline.ts | Content moderation with streaming and approval |
`TestQueue` and `TestWorker` run fully in-memory.

| Guide | Topics |
|---|---|
| Usage | Queue, Worker, Producer, batch, request-reply, cluster mode |
| Workflows | FlowProducer, DAG, chain/group/chord, dynamic children |
| Advanced | Schedulers, rate limiting, dedup, compression, retries, DLQ |
| Broadcast | Pub/sub fan-out, subject filtering |
| Observability | OpenTelemetry, metrics, job logs, dashboard |
| Serverless | Producer, ServerlessPool, Lambda/Edge |
| Testing | In-memory TestQueue and TestWorker |
| Wire Protocol | Cross-language FCALL specs, Python/Go examples |
| Step Jobs | Step-job workflows with moveToDelayed |
| Durability | Durability guarantees, persistence, delivery semantics |
| Architecture | Internal architecture and design reference |
| Migration | Coming from BullMQ - API mapping guide |
| Package | Description |
|---|---|
| @glidemq/speedkey | Valkey GLIDE client with native NAPI bindings |
| @glidemq/dashboard | Web UI for metrics, schedulers, job mutations |
| @glidemq/hono | Hono middleware |
| @glidemq/fastify | Fastify plugin |
| @glidemq/nestjs | NestJS module |
| @glidemq/hapi | Hapi plugin |
| glide-mq.dev | Full documentation site |
Bug reports, feature requests, and pull requests are welcome.
Apache-2.0