
A modular runtime and orchestration system for AI agents - works with Claude Code, OpenCode, and Codex CLI
19 plugins · 38 agents · 39 skills (across all repos) · 30k lines of lib code · 3,575 tests · 5 platforms
Plugins distributed as standalone repos under agent-sh org — agentsys is the marketplace & installer
Built for Claude Code · Codex CLI · OpenCode · Cursor · Kiro
AI models can write code. That's not the hard part anymore. The hard part is everything around it — task selection, branch management, code review, artifact cleanup, CI, PR comments, deployment. AgentSys is the runtime that orchestrates agents to handle all of it — structured pipelines, gated phases, specialized agents, and persistent state that survives session boundaries.
Building custom skills, agents, hooks, or MCP tools? agnix is the CLI + LSP linter that catches config errors before they fail silently - real-time IDE validation, auto suggestions, auto-fix, and 342 rules for Claude Code, Codex, OpenCode, Cursor, Kiro, Copilot, Gemini CLI, Cline, Windsurf, Roo Code, Amp, and more.
An agent orchestration system — 19 plugins, 38 agents, and 39 skills that compose into structured pipelines for software development. Each plugin lives in its own standalone repo under the agent-sh org. agentsys is the marketplace and installer that ties them together.
Each agent has a single responsibility, a specific model assignment, and defined inputs/outputs. Pipelines enforce phase gates so agents can't skip steps. State persists across sessions so work survives interruptions.
The system runs on Claude Code, OpenCode, Codex CLI, Cursor, and Kiro. Install via the marketplace or the npm installer, and the plugins are fetched automatically from their repos.
Code does code work. AI does AI work.
Certainty levels exist because not all findings are equal:
| Level | Meaning | Action |
|---|---|---|
| HIGH | Definitely a problem | Safe to auto-fix |
| MEDIUM | Probably a problem | Needs context |
| LOW | Might be a problem | Needs human judgment |
This came from testing on 1,000+ repositories.
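A minimal sketch of how a consumer might act on these levels. The names `actionFor` and `autoFixable` and the returned strings are illustrative assumptions, not AgentSys APIs; only the three levels and their meanings come from the table above.

```javascript
// Hypothetical mapping from certainty level to handling, mirroring the table above.
const ACTIONS = {
  HIGH: "auto-fix",        // definitely a problem: safe to fix automatically
  MEDIUM: "needs-context", // probably a problem: gather more context first
  LOW: "human-judgment",   // might be a problem: leave it to a person
};

function actionFor(certainty) {
  const action = ACTIONS[String(certainty).toUpperCase()];
  if (!action) throw new Error(`Unknown certainty level: ${certainty}`);
  return action;
}

// Keep only the findings that are safe to fix without review.
function autoFixable(findings) {
  return findings.filter((f) => actionFor(f.certainty) === "auto-fix");
}
```

Keeping the policy in one lookup table means a tool can report findings without deciding what happens to them.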
| Command | What it does |
|---|---|
| `/next-task` | Task workflow: discovery, implementation, PR, merge |
| `/agnix` | Lint agent configurations (342 rules) |
| `/ship` | PR creation, CI monitoring, merge |
| `/deslop` | Clean AI slop patterns |
| `/perf` | Performance investigation with baselines and profiling |
| `/drift-detect` | Compare plan vs implementation |
| `/audit-project` | Multi-agent iterative code review |
| `/enhance` | Plugin, agent, and prompt analyzers |
| `/repo-map` | AST-based repository map |
| `/sync-docs` | Sync documentation with code changes |
| `/learn` | Research topics, create learning guides |
| `/consult` | Cross-tool AI consultation |
| `/debate` | Structured debate between AI tools |
| `/web-ctl` | Browser automation for AI agents |
| `/release` | Versioned release with ecosystem detection |
| `/skillers` | Workflow pattern learning and automation |
| `/git-map` | Git history analysis: hotspots, coupling, ownership, bus factor |
| `/onboard` | Codebase orientation for newcomers |
| `/can-i-help` | Match contributor skills to project needs |
Each command works standalone. Together, they compose into end-to-end pipelines.
39 skills included across the plugins:
| Category | Skills |
|---|---|
| Workflow | discover-tasks, orchestrate-review, validate-delivery |
| Enhancement | enhance-agent-prompts, enhance-claude-memory, enhance-cross-file, enhance-docs, enhance-hooks, enhance-orchestrator, enhance-plugins, enhance-prompts, enhance-skills |
| Performance | baseline, benchmark, code-paths, investigation-logger, perf-analyzer, profile, theory-gatherer, theory-tester |
| Cleanup | deslop, sync-docs |
| Code Review | audit-project |
| AI Collaboration | consult, debate, learn, recommend, skillers-compact |
| Onboarding | can-i-help, onboard |
| Web | web-auth, web-browse |
| Release | release |
| Analysis | drift-analysis, git-mapping, repo-mapping |
| Other | glide-mq-migrate-bee, glide-mq-migrate-bullmq, glide-mq |
External skill plugins (standalone repos, installed separately):
| Category | Skills | Plugin |
|---|---|---|
| Message Queues | glide-mq, glide-mq-migrate-bullmq, glide-mq-migrate-bee | agent-sh/glidemq |
Skills are the reusable implementation units. Agents invoke skills; commands orchestrate agents. When you install a plugin, its skills become available to all agents in that session.
| Section | What's there |
|---|---|
| The Approach | Why it's built this way |
| Commands | All 19 commands overview |
| Skills | 39 skills across plugins |
| Command Details | Deep dive into each command |
| How Commands Work Together | Standalone vs integrated |
| Design Philosophy | The thinking behind the architecture |
| Installation | Get started |
| Research & Testing | What went into building this |
| Documentation | Links to detailed docs |
Purpose: Complete task-to-production automation.
What happens when you run it:
Phase 9 uses the orchestrate-review skill to spawn parallel reviewers (code quality, security, performance, test coverage) plus conditional specialists.
Agents involved:
| Agent | Model | Role |
|---|---|---|
| task-discoverer | sonnet | Finds and ranks tasks from your source |
| worktree-manager | haiku | Creates git worktrees and branches |
| exploration-agent | sonnet | Deep codebase analysis before planning |
| planning-agent | opus | Designs step-by-step implementation plan |
| implementation-agent | opus | Writes the actual code |
| test-coverage-checker | sonnet | Validates tests exist and are meaningful |
| delivery-validator | sonnet | Final checks before shipping |
| ci-monitor | haiku | Watches CI status |
| ci-fixer | sonnet | Fixes CI failures and review comments |
| simple-fixer | haiku | Executes mechanical edits |
Cross-plugin agents:
| Agent | Plugin | Role |
|---|---|---|
| deslop-agent | deslop | Removes AI artifacts before review |
| sync-docs-agent | sync-docs | Updates documentation |
Usage:
```bash
/next-task            # Start new workflow
/next-task --resume   # Resume interrupted workflow
/next-task --status   # Check current state
/next-task --abort    # Cancel and cleanup
```
Purpose: Lint agent configurations before they break your workflow. The first dedicated linter for AI agent configs.
agnix is a standalone open-source project that provides the validation engine. This plugin integrates it into your workflow.
The problem it solves:
Agent configurations are code. They affect behavior, security, and reliability. But unlike application code, they have no linting. You find out your SKILL.md is malformed when the agent fails. You discover your hooks have security issues when they're exploited. You realize your CLAUDE.md has conflicting rules when the AI behaves unexpectedly.
agnix catches these issues before they cause problems.
What it validates:
| Category | What It Checks |
|---|---|
| Structure | Required fields, valid YAML/JSON, proper frontmatter |
| Security | Prompt injection vectors, overpermissive tools, exposed secrets |
| Consistency | Conflicting rules, duplicate definitions, broken references |
| Best Practices | Tool restrictions, model selection, trigger phrase quality |
| Cross-Platform | Compatibility across Claude Code, Codex, OpenCode, Cursor, Kiro, Copilot, Gemini CLI, Cline, Windsurf, Roo Code, Amp, and more |
342 validation rules (102 auto-fixable) derived from:
Supported files:
| File Type | Examples |
|---|---|
| Skills | SKILL.md, */SKILL.md |
| Memory | CLAUDE.md, AGENTS.md, .github/CLAUDE.md |
| Hooks | .claude/settings.json, hooks configuration |
| MCP | *.mcp.json, MCP server configs |
| Cursor | .cursor/rules/*.mdc, .cursorrules |
| Copilot | .github/copilot-instructions.md |
| Kiro | .kiro/steering/**/*.md, .kiro/agents/*.json, .kiro/hooks/*.kiro.hook, POWER.md |
| Windsurf | .windsurf/rules/**/*.md, .windsurf/workflows/**/*.md, .windsurfrules |
| Roo Code | .roo/rules/*.md, .roo/rules-{mode}/*.md, .roomodes, .rooignore, .roorules |
| Gemini CLI | GEMINI.md, .gemini/settings.json, gemini-extension.json |
| OpenCode | opencode.json |
| Amp | .agents/checks/**/*.md, .amp/settings.json |
CI/CD Integration:
agnix outputs SARIF format for GitHub Code Scanning. Add it to your workflow:
```yaml
- name: Lint agent configs
  run: agnix --format sarif > results.sarif
- uses: github/codeql-action/upload-sarif@v3
  with:
    sarif_file: results.sarif
```
Usage:
```bash
/agnix                        # Validate current project
/agnix --fix                  # Auto-fix fixable issues
/agnix --strict               # Treat warnings as errors
/agnix --target claude-code   # Only Claude Code rules
/agnix --format sarif         # Output for GitHub Code Scanning
```
Agent: agnix-agent (sonnet model)
External tool: Requires agnix CLI
```bash
npm install -g agnix      # Install via npm
# or
cargo install agnix-cli   # Install via Cargo
# or
brew install agnix        # Install via Homebrew (macOS)
```
Why use agnix:
Purpose: Takes your current branch from "ready to commit" to "merged PR."
What happens when you run it:
Platform Detection:
| Type | Detected |
|---|---|
| CI | GitHub Actions, GitLab CI, CircleCI, Jenkins, Travis |
| Deploy | Railway, Vercel, Netlify, Fly.io, Render |
| Project | Node.js, Python, Rust, Go, Java |
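A minimal sketch of file-based detection for the CI row, assuming the conventional marker file for each system. The function name `detectCI` and the marker list are illustrative, not the plugin's actual implementation.

```javascript
// Conventional marker paths for each CI system (repo-relative, forward slashes).
const CI_MARKERS = [
  { name: "GitHub Actions", test: (p) => p.startsWith(".github/workflows/") },
  { name: "GitLab CI", test: (p) => p === ".gitlab-ci.yml" },
  { name: "CircleCI", test: (p) => p === ".circleci/config.yml" },
  { name: "Jenkins", test: (p) => p === "Jenkinsfile" },
  { name: "Travis", test: (p) => p === ".travis.yml" },
];

// Return the name of every CI system whose marker appears in `paths`.
function detectCI(paths) {
  return CI_MARKERS.filter((m) => paths.some((p) => m.test(p))).map((m) => m.name);
}
```

Because the check is a plain lookup over file paths, it is deterministic and needs no model call.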
Review Comment Handling:
Every comment gets addressed. No exceptions. The workflow categorizes comments and handles each:
If something can't be fixed, the workflow replies explaining why and resolves the thread.
Usage:
```bash
/ship                     # Full workflow
/ship --dry-run           # Preview without executing
/ship --strategy rebase   # Use rebase instead of squash
```
Purpose: Finds AI slop—debug statements, placeholder text, verbose comments, TODOs—and removes it.
How detection works:
Three phases run in sequence:
Phase 1: Regex Patterns (HIGH certainty)
- `console.log`, `print()`, `dbg!()`, `println!()`
- `// TODO`, `// FIXME`, `// HACK`

Phase 2: Multi-Pass Analyzers (MEDIUM certainty)

Phase 3: CLI Tools (LOW certainty, optional)
Languages supported: JavaScript/TypeScript, Python, Rust, Go, Java
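A rough sketch of what a Phase 1 pass could look like, using only the examples named above. The pattern list, the `scan` function, and the finding shape are assumptions for illustration; the real pattern set is not shown in this document.

```javascript
// Illustrative HIGH-certainty patterns built from the examples listed for Phase 1.
const PATTERNS = [
  { kind: "debug-statement", regex: /\bconsole\.log\s*\(|\bprint\s*\(|\bdbg!\s*\(|\bprintln!\s*\(/ },
  { kind: "todo-marker", regex: /\/\/\s*(TODO|FIXME|HACK)\b/ },
];

// Scan source text line by line and report each match with its 1-based line number.
function scan(source) {
  const findings = [];
  source.split("\n").forEach((text, i) => {
    for (const { kind, regex } of PATTERNS) {
      if (regex.test(text)) {
        findings.push({ line: i + 1, kind, certainty: "HIGH" });
      }
    }
  });
  return findings;
}
```

The word boundary before `print` keeps identifiers such as `fingerprint(` from being flagged.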
Usage:
```bash
/deslop                 # Report only (safe)
/deslop apply           # Fix HIGH certainty issues
/deslop apply src/ 10   # Fix 10 issues in src/
```
Thoroughness levels:
- `quick` - Phase 1 only (fastest)
- `normal` - Phase 1 + Phase 2 (default)
- `deep` - All phases if tools available

Purpose: Structured performance investigation with baselines, profiling, and evidence-backed decisions.
10-phase methodology (based on recorded real performance investigation sessions):
Agents and skills:
| Component | Role |
|---|---|
| perf-orchestrator | Coordinates all phases |
| perf-theory-gatherer | Generates hypotheses from git history and code |
| perf-theory-tester | Validates hypotheses with controlled experiments |
| perf-analyzer | Synthesizes findings into recommendations |
| perf-code-paths | Maps entrypoints and likely hot paths |
| perf-investigation-logger | Structured evidence logging |
Usage:
```bash
/perf            # Start new investigation
/perf --resume   # Resume previous investigation
```
Phase flags (advanced):
```bash
/perf --phase baseline --command "npm run bench" --version v1.2.0
/perf --phase breaking-point --param-min 1 --param-max 500
/perf --phase constraints --cpu 1 --memory 1GB
/perf --phase hypotheses --hypotheses-file perf-hypotheses.json
/perf --phase optimization --change "reduce allocations"
/perf --phase decision --verdict stop --rationale "no measurable improvement"
```
Purpose: Compares your documentation and plans to what's actually in the code.
The problem it solves:
Your roadmap says "user authentication: done." But is it actually implemented? Your GitHub issue says "add dark mode." Is it already in the codebase? Plans drift from reality. This command finds the drift.
How it works:
JavaScript collectors gather data (fast, token-efficient)
Single Opus call performs semantic analysis
auth/, login.js, session.ts)

Why this approach:
Multi-agent collection wastes tokens on coordination. JavaScript collectors are fast and deterministic. One well-prompted LLM call does the actual analysis. Result: 77% token reduction vs multi-agent approaches.
Tested on 1,000+ repositories before release.
Usage:
```bash
/drift-detect                 # Full analysis
/drift-detect --depth quick   # Quick scan
```
Purpose: Multi-agent code review that iterates until issues are resolved.
What happens when you run it:
Up to 10 specialized role-based agents run based on your project:
| Agent | When Active | Focus Area |
|---|---|---|
| code-quality-reviewer | Always | Code quality, error handling |
| security-expert | Always | Vulnerabilities, auth, secrets |
| performance-engineer | Always | N+1 queries, memory, blocking ops |
| test-quality-guardian | Always | Coverage, edge cases, mocking |
| architecture-reviewer | If 50+ files | Modularity, patterns, SOLID |
| database-specialist | If DB detected | Queries, indexes, transactions |
| api-designer | If API detected | REST, errors, pagination |
| frontend-specialist | If frontend detected | Components, state, UX |
| backend-specialist | If backend detected | Services, domain logic |
| devops-reviewer | If CI/CD detected | Pipelines, configs, secrets |
Findings are collected and categorized by severity (critical/high/medium/low). All non-false-positive issues get fixed automatically. The loop repeats until no open issues remain.
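The activation rules in the table can be sketched as a simple selection function. The project-signal fields (`fileCount`, `hasDatabase`, and so on) and the name `selectReviewers` are hypothetical; how the plugin actually detects a database, API, or frontend is not described here.

```javascript
// The four reviewers listed as "Always" active.
const ALWAYS = [
  "code-quality-reviewer",
  "security-expert",
  "performance-engineer",
  "test-quality-guardian",
];

// Conditional reviewers, each paired with a predicate over hypothetical project signals.
const CONDITIONAL = [
  ["architecture-reviewer", (p) => p.fileCount >= 50],
  ["database-specialist", (p) => p.hasDatabase],
  ["api-designer", (p) => p.hasApi],
  ["frontend-specialist", (p) => p.hasFrontend],
  ["backend-specialist", (p) => p.hasBackend],
  ["devops-reviewer", (p) => p.hasCiCd],
];

function selectReviewers(project) {
  const extra = CONDITIONAL
    .filter(([, when]) => Boolean(when(project)))
    .map(([name]) => name);
  return [...ALWAYS, ...extra];
}
```

A small repository with no detected signals gets the four baseline reviewers; a large full-stack project with CI gets all ten.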
Usage:
```bash
/audit-project                     # Full review
/audit-project --quick             # Single pass
/audit-project --resume            # Resume from queue file
/audit-project --domain security   # Security focus only
/audit-project --recent            # Only recent changes
```
Purpose: Analyzes your prompts, plugins, agents, docs, hooks, and skills for improvement opportunities.
Seven analyzers run in parallel:
| Analyzer | What it checks |
|---|---|
| plugin-enhancer | Plugin structure, MCP tool definitions, security patterns |
| agent-enhancer | Agent frontmatter, prompt quality |
| claudemd-enhancer | CLAUDE.md/AGENTS.md structure, token efficiency |
| docs-enhancer | Documentation readability, RAG optimization |
| prompt-enhancer | Prompt engineering patterns, clarity, examples |
| hooks-enhancer | Hook frontmatter, structure, safety |
| skills-enhancer | SKILL.md structure, trigger phrases |
Each finding includes:
Auto-learning: Detects obvious false positives (pattern docs, workflow gates) and saves them for future runs. Reduces noise over time without manual suppression files.
Usage:
```bash
/enhance                     # Run all analyzers
/enhance --focus=agent       # Just agent prompts
/enhance --apply             # Apply HIGH certainty fixes
/enhance --show-suppressed   # Show what's being filtered
/enhance --no-learn          # Analyze but don't save false positives
```
Purpose: Builds an AST-based map of symbols and imports for fast repo analysis.
What it generates:
Output is cached at {state-dir}/repo-map.json and exposed via the MCP repo_map tool.
Why it matters:
Tools like /drift-detect and planners can use the map instead of re-scanning the repo every time.
Usage:
```bash
/repo-map init     # First-time map generation
/repo-map update   # Incremental update
/repo-map status   # Check freshness
```
Required: ast-grep (sg) must be installed.
Purpose: Sync documentation with actual code changes—find outdated refs, update CHANGELOG, flag stale examples.
The problem it solves:
You refactor auth.js into auth/index.js. Your README still says import from './auth'. You rename a function. Three docs still reference the old name. You ship a feature. CHANGELOG doesn't mention it. Documentation drifts from code. This command finds the drift.
What it detects:
| Category | Examples |
|---|---|
| Broken references | Imports to moved/renamed files, deleted exports |
| Version mismatches | Doc says v2.0, package.json says v2.1 |
| Stale code examples | Import paths that no longer exist |
| Missing CHANGELOG | feat: and fix: commits without entries |
Auto-fixable vs flagged:
| Auto-fixable (apply mode) | Flagged for review |
|---|---|
| Version number updates | Removed exports referenced in docs |
| CHANGELOG entries for commits | Code examples needing context |
| Function renames | |
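A minimal sketch of the version-mismatch check ("Doc says v2.0, package.json says v2.1"). The function name, the regex, and the choice to compare on major.minor only are assumptions made for this example, not the plugin's documented behavior.

```javascript
// Match version strings such as "v2.0" or "2.1.3" in documentation text.
const VERSION_RE = /\bv?(\d+\.\d+(?:\.\d+)?)\b/g;

// Compare versions mentioned in `docText` against the package version on
// major.minor only, since docs often omit the patch number (an assumption).
function findVersionMismatches(docText, packageVersion) {
  const majorMinor = (v) => v.split(".").slice(0, 2).join(".");
  const expected = majorMinor(packageVersion);
  const mismatches = [];
  for (const match of docText.matchAll(VERSION_RE)) {
    if (majorMinor(match[1]) !== expected) {
      mismatches.push({ found: match[0], index: match.index, expected: packageVersion });
    }
  }
  return mismatches;
}
```

Each mismatch carries the character offset, so an apply mode could rewrite the stale string in place.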
Usage:
```bash
/sync-docs               # Check what docs need updates (safe)
/sync-docs apply         # Apply safe fixes
/sync-docs report src/   # Check docs related to src/
/sync-docs --all         # Full codebase scan
```
Purpose: Research any topic online and create a comprehensive learning guide with RAG-optimized indexes.
What it does:
Depth levels:
| Depth | Sources | Use Case |
|---|---|---|
| brief | 10 | Quick overview |
| medium | 20 | Default, balanced |
| deep | 40 | Comprehensive |
Output structure:
```
agent-knowledge/
  CLAUDE.md                    # Master index (updated each run)
  AGENTS.md                    # Index for OpenCode/Codex
  recursion.md                 # Topic-specific guide
  resources/
    recursion-sources.json     # Source metadata with quality scores
```
Usage:
```bash
/learn recursion                   # Default (20 sources)
/learn react hooks --depth=deep    # Comprehensive (40 sources)
/learn kubernetes --depth=brief    # Quick overview (10 sources)
/learn python async --no-enhance   # Skip enhancement pass
```
Agent: learn-agent (sonnet model)
Purpose: Get a second opinion from another AI CLI tool without leaving your current session.
What it does:
--continue)

Supported tools:
| Tool | Default Model (high) | Reasoning Control |
|---|---|---|
| Claude | claude-opus-4-6 | max-turns |
| Gemini | gemini-3.1-pro-preview | built-in |
| Codex | gpt-5.3-codex | model_reasoning_effort |
| OpenCode | (user-selected or default) | --variant |
| Copilot | (default) | none |
Usage:
```bash
/consult "Is this the right approach?" --tool=gemini --effort=high
/consult "Review for performance issues" --tool=codex
/consult "Suggest alternatives" --tool=claude --effort=max
/consult "Continue from where we left off" --continue
/consult "Explain this error" --context=diff --tool=gemini
```
Agent: consult-agent (sonnet model for orchestration)
Purpose: Stress-test ideas through structured multi-round debate between two AI CLI tools.
What it does:
Usage:
```bash
# Natural language
/debate codex vs gemini about microservices vs monolith
/debate with claude and codex about our auth implementation
/debate thoroughly gemini vs codex about database schema design
/debate codex vs gemini 3 rounds about event sourcing

# Explicit flags
/debate "Should we use event sourcing?" --tools=claude,gemini --rounds=3 --effort=high
/debate "Valkey vs PostgreSQL for caching" --tools=codex,opencode

# With codebase context
/debate "Is our current approach correct?" --tools=gemini,codex --context=diff
```
Options:
| Flag | Description |
|---|---|
| `--tools=TOOL1,TOOL2` | Proposer and challenger (comma-separated) |
| `--rounds=N` | Number of debate rounds, 1–5 (default: 2) |
| `--effort=low\|medium\|high\|max` | Reasoning depth per tool call |
| `--context=diff\|file=PATH\|none` | Codebase context passed to both tools |
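As a sketch, the explicit flags could be validated like this. `parseDebateFlags` is a hypothetical helper; the rounds range and default come from the table, while the `context: "none"` default and the error messages are assumptions.

```javascript
const EFFORTS = ["low", "medium", "high", "max"];

// Parse "--key=value" arguments into a validated options object.
// Defaults other than rounds=2 are assumptions for this sketch.
function parseDebateFlags(args) {
  const opts = { tools: [], rounds: 2, effort: undefined, context: "none" };
  for (const arg of args) {
    const match = /^--([a-z]+)=(.*)$/.exec(arg);
    if (!match) continue; // not a flag; the question text is handled elsewhere
    const [, key, value] = match;
    if (key === "tools") opts.tools = value.split(",");
    else if (key === "rounds") opts.rounds = Number(value);
    else if (key === "effort") opts.effort = value;
    else if (key === "context") opts.context = value;
  }
  if (opts.tools.length !== 2) throw new Error("--tools needs exactly two tools");
  if (!Number.isInteger(opts.rounds) || opts.rounds < 1 || opts.rounds > 5) {
    throw new Error("--rounds must be an integer from 1 to 5");
  }
  if (opts.effort !== undefined && !EFFORTS.includes(opts.effort)) {
    throw new Error(`--effort must be one of ${EFFORTS.join(", ")}`);
  }
  return opts;
}
```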
Agent: debate-orchestrator (opus model for orchestration)
Purpose: Browser automation for AI agents - navigate, authenticate, and interact with web pages.
How it works:
Each invocation is a single Node.js process using Playwright. No daemon, no MCP server. Session state persists via Chrome's userDataDir with AES-256-GCM encrypted storage.
```
Agent calls skill -> node scripts/web-ctl.js <args> -> Playwright API -> JSON result
```
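For readers unfamiliar with AES-256-GCM, the primitive can be sketched with Node's built-in `crypto` module. This shows only authenticated encryption of a string; how web-ctl derives or stores its key and lays out the encrypted profile is not described here, and the function names are illustrative.

```javascript
const nodeCrypto = require("node:crypto");

// Encrypt a UTF-8 string with AES-256-GCM. `key` must be 32 bytes.
// Returns the random IV, the auth tag, and the ciphertext; all three are needed to decrypt.
function encrypt(plaintext, key) {
  const iv = nodeCrypto.randomBytes(12); // 96-bit IV, the recommended size for GCM
  const cipher = nodeCrypto.createCipheriv("aes-256-gcm", key, iv);
  const ciphertext = Buffer.concat([cipher.update(plaintext, "utf8"), cipher.final()]);
  return { iv, tag: cipher.getAuthTag(), ciphertext };
}

// Decrypt and authenticate; throws if the ciphertext or tag was modified.
function decrypt({ iv, tag, ciphertext }, key) {
  const decipher = nodeCrypto.createDecipheriv("aes-256-gcm", key, iv);
  decipher.setAuthTag(tag);
  return Buffer.concat([decipher.update(ciphertext), decipher.final()]).toString("utf8");
}
```

GCM authenticates as well as encrypts, so tampered session data fails to decrypt instead of yielding corrupted cookies.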
Session lifecycle:
1. `session start <name>` - Create session (encrypted profile directory)
2. `session auth <name> --url <login-url>` - Opens headed Chrome for human login (2FA, CAPTCHAs). Polls for success URL/selector, encrypts cookies on completion
3. `run <name> <action>` - Headless actions using persisted cookies
4. `session end <name>` - Cleanup

Actions:
| Action | Description | Key flag |
|---|---|---|
| `goto <url>` | Navigate to URL | |
| `snapshot` | Get accessibility tree (primary page inspection) | |
| `click <sel>` | Click element | `--wait-stable` |
| `click-wait <sel>` | Click and wait for DOM + network stability | `--timeout <ms>` |
| `type <sel> <text>` | Type with human-like delays | |
| `read <sel>` | Read element text content | |
| `fill <sel> <value>` | Clear field and set value | |
| `wait <sel>` | Wait for element to appear | `--timeout <ms>` |
| `evaluate <js>` | Execute JS in page context | `--allow-evaluate` |
| `screenshot` | Full-page screenshot | `--path <file>` |
| `network` | Capture network requests | `--filter <pattern>` |
| `checkpoint` | Open headed browser for user (CAPTCHAs) | `--timeout <sec>` |
click-wait waits for network idle + no DOM mutations for 500ms before returning. Cuts SPA interactions from multiple agent turns to one.
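The timing rule can be modeled as a pure function over event timestamps. This is a sketch of the quiet-window idea only, not the Playwright-based implementation; `stableAt` and the 5000 ms default timeout are assumptions.

```javascript
// Given timestamps (ms) of observed DOM mutations and network events after a click,
// return the time at which the page first stayed quiet for `quietMs`,
// or null if that does not happen before the timeout.
function stableAt(clickTime, eventTimes, quietMs = 500, timeoutMs = 5000) {
  const deadline = clickTime + timeoutMs;
  let lastActivity = clickTime;
  const sorted = [...eventTimes].sort((a, b) => a - b);
  for (const time of sorted) {
    if (time - lastActivity >= quietMs) break; // a full quiet window already elapsed
    lastActivity = Math.max(lastActivity, time);
  }
  const settled = lastActivity + quietMs;
  return settled <= deadline ? settled : null;
}
```

With events at 100 ms and 300 ms, the page counts as stable at 800 ms; a later event at 1200 ms falls outside the wait and belongs to the next interaction.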
Error handling:
All errors return classified codes with actionable recovery suggestions:
| Code | Recovery suggestion |
|---|---|
| `element_not_found` | Snapshot included in response for selector discovery |
| `timeout` | Increase `--timeout` |
| `browser_closed` | `session start <name>` |
| `network_error` | Check URL; verify cookies with `session status` |
| `no_display` | Use `--vnc` flag |
| `session_expired` | Re-authenticate |
Security: Output sanitization (cookies/tokens redacted), prompt injection defense ([PAGE_CONTENT: ...] delimiters), AES-256-GCM encryption at rest, anti-bot measures (webdriver=false, random delays), read-only agent (no Write/Edit tools).
Selector syntax: role=button[name='Submit'], css=div.class, text=Click here, #id
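The four selector forms can be told apart with a small classifier. This is an illustrative sketch: `parseSelector` is a hypothetical name, and rejecting anything outside the four listed forms is an assumption about behavior this document does not specify.

```javascript
// Classify a selector string into { kind, value } following the four forms listed above.
function parseSelector(selector) {
  const prefixed = /^(role|css|text)=(.+)$/.exec(selector);
  if (prefixed) {
    const [, kind, value] = prefixed;
    if (kind !== "role") return { kind, value };
    // role=button[name='Submit'] -> role "button" with an optional accessible name
    const role = /^([a-z]+)(?:\[name='([^']*)'\])?$/.exec(value);
    if (!role) throw new Error(`Malformed role selector: ${selector}`);
    return { kind: "role", value: role[1], name: role[2] ?? null };
  }
  if (selector.startsWith("#")) return { kind: "id", value: selector.slice(1) };
  throw new Error(`Unrecognized selector: ${selector}`);
}
```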
Usage:
```bash
/web-ctl goto https://example.com
/web-ctl auth twitter --url https://x.com/i/flow/login
/web-ctl   # describe what you want to do, agent orchestrates it
```
Install:
```bash
agentsys install web-ctl
npm install playwright
npx playwright install chromium
```
Agent: web-session (sonnet model)
Skills: web-auth (human-in-the-loop auth), web-browse (headless actions)
Versioned release with automatic ecosystem and tooling detection
```bash
/release                   # Patch release (auto-discovers how this repo releases)
/release minor             # Minor version bump
/release major --dry-run   # Preview what would happen
```
The release agent discovers how your repo releases before executing:
release: target, npm release script, scripts/release.*

Supports 12+ ecosystems: npm, cargo, python, go, maven, gradle, ruby, nuget, dart, hex, packagist, swift.
Agent: release-agent (sonnet model)
Skill: release (generic fallback workflow)
Learn from your workflow patterns and suggest automations
```bash
/skillers show                # Display current config and knowledge stats
/skillers compact             # Analyze recent transcripts, extract patterns
/skillers compact --days=14   # Analyze older transcripts
/skillers recommend           # Get automation suggestions from accumulated knowledge
```
Reads your Claude Code conversation transcripts, identifies recurring patterns (pain points, repeated workflows, wishes), clusters them into weighted themes, and suggests skills, hooks, or agents to automate them.
No per-turn overhead - it reads transcripts that Claude Code already saves.
Agents: skillers-compactor (sonnet), skillers-recommender (opus)
Skills: skillers-compact, recommend
Purpose: Analyze git history to surface hotspots, coupling, ownership, bus factor, bugspots, area health, and AI attribution.
How it works:
The plugin wraps the agent-analyzer Rust binary. Run init once to scan git history and cache the result as repo-intel.json. Then run queries instantly.
20 query types:
| Category | Queries |
|---|---|
| Activity | hotspots, coldspots, file-history |
| Quality | bugspots, test-gaps, diff-risk |
| People | ownership, contributors, bus-factor |
| Coupling | coupling |
| Standards | norms, conventions |
| Health | areas, health, release-info |
| AI | ai-ratio, recent-ai |
| Guidance | onboard, can-i-help |
| Docs | doc-drift |
9 plugins consume git-map data automatically - deslop, sync-docs, drift-detect, audit-project, next-task, enhance, ship, onboard, can-i-help.
Usage:
```bash
/git-map init                   # First-time scan
/git-map update                 # Add new commits
/git-map query hotspots         # Most active files
/git-map query ownership src/   # Who owns a path
/git-map query bus-factor       # Knowledge risk
```
Purpose: Get oriented in any codebase in under 3 minutes.
What happens when you run it:
74% fewer tokens than manual onboarding. Validated on 100 repos across JS/TS, Rust, Go, Python, C/C++, Java, and Deno.
Depth levels:
| Level | Time | Data |
|---|---|---|
| quick | ~2s | Manifest + README + structure |
| normal | ~5s | + CLAUDE.md/AGENTS.md + CI + repo-intel |
| deep | ~15s | + repo-map AST symbols |
Supported manifests: package.json, Cargo.toml, go.mod, pyproject.toml, deno.json, CMakeLists.txt, meson.build, setup.py, pom.xml, build.gradle. Detects monorepos (npm/pnpm/lerna/Cargo workspaces, Python libs/, Deno workspaces).
Usage:
```bash
/onboard                 # Current repo
/onboard /path/to/repo   # Specific repo
/onboard --depth=deep    # Include AST data
```
Agent: onboard-agent (opus model)
Purpose: Match a contributor's skills to specific areas where they can help.
What happens when you run it:
Matching:
| Developer profile | Gets recommended |
|---|---|
| New to stack | Good-first areas with clear patterns |
| Experienced | Hard problems in pain-point areas |
| Test-focused | Test gaps in frequently-changed files |
| Bug-focused | Bugspot files + relevant open issues |
| Docs-focused | Stale documentation with code examples |
Usage:
```bash
/can-i-help                 # Current repo
/can-i-help /path/to/repo   # Specific repo
/can-i-help --depth=deep    # Include AST data
```
Agent: can-i-help-agent (opus model)
Plugins that provide skills without a / command. Installed alongside agentsys; skills become available to all agents.
Purpose: Build message queues, background jobs, and workflow orchestration with glide-mq - high-performance Node.js queue on Valkey/Redis.
Skills:
| Skill | What it does |
|---|---|
| `glide-mq` | Greenfield queue development - queues, workers, ordering, rate limiting, flows, broadcast, step jobs |
| `glide-mq-migrate-bullmq` | Migrate from BullMQ to glide-mq - API mapping, breaking changes, feature comparison |
| `glide-mq-migrate-bee` | Migrate from Bee-Queue to glide-mq - API mapping, pattern conversion |
Key features covered: per-key ordering, group concurrency, runtime group rate limiting (job.rateLimitGroup()), token bucket, DAG workflows, broadcast pub/sub, step jobs, deduplication, serverless producers.
Standalone use:
```bash
/deslop apply    # Just clean up your code
/sync-docs       # Just check if docs need updates
/ship            # Just ship this branch
/audit-project   # Just review the codebase
```
Integrated workflow:
When you run /next-task, it orchestrates everything:
```
/next-task picks task → explores codebase → plans implementation
         ↓
implementation-agent writes code
         ↓
deslop-agent cleans AI artifacts
         ↓
Phase 9 review loop iterates until approved
         ↓
delivery-validator checks requirements
         ↓
sync-docs-agent syncs documentation
         ↓
/ship creates PR → monitors CI → merges
```
The workflow tracks state so you can resume from any point.
Frontier models write good code. That's solved. What's not solved:
1. One agent, one job, done extremely well
Same principle as good code: single responsibility. The exploration-agent explores. The implementation-agent implements. Phase 9 spawns multiple focused reviewers. No agent tries to do everything. Specialized agents, each with narrow scope and clear success criteria.
2. Pipeline with gates, not a monolith
Same principle as DevOps. Each step must pass before the next begins. Can't push before review. Can't merge before CI passes. Hooks enforce this—agents literally cannot skip phases.
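The ordering rule can be sketched as a loop that refuses to advance past a failed gate. The `{ name, run, gate }` phase shape and `runPipeline` are hypothetical; the text above says hooks do the real enforcement, and this models only the sequencing.

```javascript
// Run phases in order; each phase must pass its gate before the next one starts.
// `phases` is an array of { name, run, gate }, a hypothetical shape for this sketch.
function runPipeline(phases, state = {}) {
  const completed = [];
  for (const phase of phases) {
    const result = phase.run(state);
    if (!phase.gate(result)) {
      // Stop here: later phases never execute.
      return { ok: false, failedAt: phase.name, completed };
    }
    state[phase.name] = result;
    completed.push(phase.name);
  }
  return { ok: true, failedAt: null, completed };
}
```

If the CI phase fails its gate, the merge phase is never invoked, which is the property the gates exist to guarantee.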
3. Tools do tool work, agents do agent work
If static analysis, regex, or a shell command can do it, don't ask an LLM. Pattern detection uses pre-indexed regex. File discovery uses glob. Platform detection uses file existence checks. The LLM only handles what requires judgment.
4. Agents don't need to know how tools work
The slop detector returns findings with certainty levels. The agent doesn't need to understand the three-phase pipeline, the regex patterns, or the analyzer heuristics. Good tool design means the consumer doesn't need implementation details.
5. Build tools where tools don't exist
Many tasks lack existing tools. JavaScript collectors for drift-detect. Multi-pass analyzers for slop detection. The result: agents receive structured data, not raw problems to figure out.
6. Research-backed prompt engineering
Documented techniques that measurably improve results:
7. Validate plan and results, not every step
Approve the plan. See the results. The middle is automated. One plan approval unlocks autonomous execution through implementation, review, cleanup, and shipping.
8. Right model for the task
Match model capability to task complexity:
Quality compounds. Poor exploration → poor plan → poor implementation → review cycles. Early phases deserve the best model.
9. Persistent state survives sessions
Two JSON files track everything: what task, what phase. Sessions can die and resume. Multiple sessions run in parallel on different tasks using separate worktrees.
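A minimal sketch of resumable state on disk, using a single file for brevity. The file name `workflow-state.json`, the field names, and both helper functions are hypothetical; the document says only that JSON files record the task and the phase.

```javascript
const fs = require("node:fs");
const path = require("node:path");

// Persist workflow state so a later session can pick up where this one stopped.
// The file name and fields are assumptions made for this sketch.
function saveState(dir, state) {
  fs.mkdirSync(dir, { recursive: true });
  const file = path.join(dir, "workflow-state.json");
  const tmp = `${file}.tmp`;
  fs.writeFileSync(tmp, JSON.stringify(state, null, 2));
  fs.renameSync(tmp, file); // write then rename, so readers do not see a partial file
  return file;
}

// Return the saved state, or null when no workflow is in progress.
function loadState(dir) {
  const file = path.join(dir, "workflow-state.json");
  if (!fs.existsSync(file)) return null;
  return JSON.parse(fs.readFileSync(file, "utf8"));
}
```

A new session would call `loadState` first and resume from the recorded phase instead of starting over.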
10. Delegate everything automatable
Agents don't just write code. They:
If it can be specified, it can be delegated.
11. Orchestrator stays high-level
The main workflow orchestrator doesn't read files, search code, or write implementations. It launches specialized agents and receives their outputs. Keeps the orchestrator's context window available for coordination rather than filled with file contents.
12. Composable, not monolithic
Every command works standalone. /deslop cleans code without needing /next-task. /ship merges PRs without needing the full workflow. Pieces compose together, but each piece is useful on its own.
```bash
/plugin marketplace add agent-sh/agentsys
/plugin install next-task@agentsys
/plugin install ship@agentsys
```

```bash
npm install -g agentsys && agentsys
```
Interactive installer for Claude Code, OpenCode, Codex CLI, Cursor, and Kiro.
```bash
# Non-interactive install
agentsys --tool claude              # Single tool
agentsys --tool cursor              # Cursor (project-scoped skills + commands)
agentsys --tool kiro                # Kiro (project-scoped steering + skills + agents)
agentsys --tools "claude,opencode"  # Multiple tools
agentsys --development              # Dev mode (bypasses marketplace)
```
Required:
For GitHub workflows:
- GitHub CLI (`gh`) authenticated

For GitLab workflows:
- GitLab CLI (`glab`) authenticated

For /repo-map:
- ast-grep (`sg`) installed

For /agnix:
- agnix CLI (`npm install -g agnix`, `cargo install agnix-cli`, or `brew install agnix`)

Local diagnostics (optional):
```bash
npm run detect   # Platform detection (CI, deploy, project type)
npm run verify   # Tool availability + versions
```
The system is built on research, not guesswork.
Knowledge base (agent-docs/): 8,000 lines of curated documentation from Anthropic, OpenAI, Google, and Microsoft covering:
Testing:
Methodology:
/perf investigation phases based on recorded real performance investigation sessions

| Topic | Link |
|---|---|
| Installation | docs/INSTALLATION.md |
| Cross-Platform Setup | docs/CROSS_PLATFORM.md |
| Usage Examples | docs/USAGE.md |
| Architecture | docs/ARCHITECTURE.md |
| Workflow | Link |
|---|---|
| /next-task Flow | docs/workflows/NEXT-TASK.md |
| /ship Flow | docs/workflows/SHIP.md |
| Topic | Link |
|---|---|
| Slop Patterns | docs/reference/SLOP-PATTERNS.md |
| Agent Reference | docs/reference/AGENTS.md |
MIT License | Made by Avi Fenesh