AGI Agent Kit
Stop hallucinating. Start executing.

AGI Agent Kit is the enterprise-grade scaffolding that turns any AI coding assistant into a deterministic production machine. LLMs are probabilistic: 90% accuracy per step compounds to about 59% over 5 steps. This framework forces them through a 3-Layer Architecture (Intent → Orchestration → Execution) where business logic lives in tested scripts, not hallucinated code.
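The 59% figure is plain compounding. A quick check, assuming every step succeeds independently with the same probability (a simplification; `chain_success` is an illustrative name, not part of the kit):

```python
def chain_success(per_step: float, steps: int) -> float:
    """Probability that every step in a chain succeeds, assuming independence."""
    if not 0.0 <= per_step <= 1.0:
        raise ValueError("per_step must be a probability between 0 and 1")
    if steps < 0:
        raise ValueError("steps must be non-negative")
    return per_step ** steps


if __name__ == "__main__":
    for n in (1, 5, 10):
        print(f"{n:>2} steps at 90% each -> {chain_success(0.9, n):.1%}")
```

The longer the chain, the faster reliability decays, which is the argument for moving multi-step logic into scripts that behave the same on every run.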
Why this exists
Most AI coding setups give you a prompt and hope for the best. AGI Agent Kit gives you:
- Hybrid Memory: Qdrant vectors + BM25 keywords. Semantic similarity for concepts, exact matching for error codes and IDs (90-100% token savings)
- 19 Specialist Agents: Domain-bounded experts (Frontend, Backend, Security, Mobile, Game Dev...) with enforced file ownership
- 853 Curated Skills: 4 core + 75 professional + 774 community skills across 16 domain categories
- Verification Gates: No task completes without evidence. TDD enforcement. Two-stage code review.
- 9 Platforms, One Config: Write once, run on Claude Code, Gemini CLI, Codex CLI, Cursor, Copilot, OpenCode, AdaL CLI, Antigravity IDE, OpenClaw
npx @techwavedev/agi-agent-kit init
If this project helps you, consider supporting it here or simply star the repo.
Quick Start
Scaffold a new agent workspace in seconds:
npx @techwavedev/agi-agent-kit init
You'll be prompted to choose a pack:
- core: 4 essential skills (webcrawler, pdf-reader, qdrant-memory, documentation)
- medium: Core + 75 specialized skills in 16 categories + .agent/ structure (API, Security, Design, Architecture)
- full: The complete suite. Medium + 774 community skills from antigravity-awesome-skills (853 total)
After installation, run the one-shot setup wizard to auto-configure your environment:
python3 skills/plugin-discovery/scripts/platform_setup.py --project-dir .
This detects your platform, scans the project stack, and configures everything with a single confirmation.
Then boot the memory system for automatic token savings:
python3 execution/session_boot.py --auto-fix
This checks Qdrant, Ollama, embedding models, and collections, auto-fixing any issues it finds.
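For a sense of what such a boot check involves, here is a minimal reachability probe. It is a sketch, not the contents of `session_boot.py`: it assumes Qdrant listens on port 6333 (the port used by the docker command in this README) and Ollama on its default port 11434, and it only tests that the ports accept connections. `port_open` and `check_services` are illustrative names.

```python
import socket


def port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolvable hosts.
        return False


# Assumed local defaults: Qdrant REST on 6333, Ollama on 11434.
SERVICES = {"qdrant": ("localhost", 6333), "ollama": ("localhost", 11434)}


def check_services(services=SERVICES) -> dict:
    """Map each service name to whether its port accepted a connection."""
    return {name: port_open(host, port) for name, (host, port) in services.items()}


if __name__ == "__main__":
    for name, ok in check_services().items():
        print(f"{name}: {'reachable' if ok else 'NOT reachable'}")
```

A real boot script would go further (checking that the embedding model is pulled and the collections exist), but a port probe is usually the first thing to rule out.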
Key Features
| Feature | Description |
|---|---|
| Deterministic Execution | Separates business logic (Python scripts) from AI reasoning (Directives) |
| Modular Skill System | 853 plug-and-play skills across 3 tiers, organized in 16 domain categories |
| Structured Plan Execution | Batch or subagent-driven execution with two-stage review (spec + quality) |
| TDD Enforcement | Iron-law RED-GREEN-REFACTOR cycle: no production code without a failing test |
| Verification Gates | Evidence before claims: no completion without fresh verification output |
| Platform-Adaptive | Auto-detects Claude Code, Gemini CLI, Codex CLI, Cursor, Copilot, OpenCode, AdaL, Antigravity |
| Multi-Agent Orchestration | Agent Teams, subagents, Powers, or sequential personas, depending on the platform |
| Hybrid Memory | Qdrant vectors + BM25 keywords with weighted score merge (95% token savings) |
| Self-Healing Workflows | Agents read error logs, patch scripts, and update directives automatically |
| One-Shot Setup | Platform detection + project stack scan + auto-configuration in one command |
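The Verification Gates row can be pictured as a small wrapper that refuses to report success without captured command output. This is a sketch of the idea only; `Evidence`, `run_gate`, and `complete_task` are hypothetical names, and the kit's own gate logic may work differently.

```python
import subprocess
import sys
from dataclasses import dataclass


@dataclass
class Evidence:
    command: list
    exit_code: int
    output: str

    @property
    def passed(self) -> bool:
        return self.exit_code == 0


def run_gate(command: list, timeout: float = 60.0) -> Evidence:
    """Run a verification command and capture its output as evidence."""
    proc = subprocess.run(command, capture_output=True, text=True, timeout=timeout)
    return Evidence(command, proc.returncode, proc.stdout + proc.stderr)


def complete_task(name: str, evidence: Evidence) -> str:
    """Refuse to mark a task complete unless the gate produced passing evidence."""
    if not evidence.passed:
        raise RuntimeError(f"{name}: verification failed\n{evidence.output}")
    return f"{name}: complete (exit {evidence.exit_code})"


if __name__ == "__main__":
    # Stand-in for a real check such as the project's test runner.
    ev = run_gate([sys.executable, "-c", "print('2 passed')"])
    print(complete_task("demo-task", ev))
```

The point of the design is that the completion claim is derived from a fresh exit code and output, not from the agent's own summary.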
How This Compares to Superpowers
The AGI framework adopts the best patterns from obra/superpowers and extends them with capabilities superpowers does not have:
| Capability | obra/superpowers | AGI Agent Kit |
|---|---|---|
| TDD Enforcement | ✅ | ✅ Adapted |
| Plan Execution + Review | ✅ | ✅ Adapted + platform-adaptive |
| Systematic Debugging | ✅ | ✅ Adapted + debugger agent |
| Verification Gates | ✅ | ✅ Adapted + 12 audit scripts |
| Two-Stage Code Review | ✅ | ✅ Adapted into orchestrator |
| Multi-Platform Orchestration | ❌ Claude only | ✅ 4 platforms |
| Semantic Memory (Qdrant) | ❌ | ✅ 90-100% token savings |
| 19 Specialist Agents | ❌ | ✅ Domain boundaries |
| Agent Boundary Enforcement | ❌ | ✅ File-type ownership |
| Dynamic Question Generation | ❌ | ✅ Trade-offs + priorities |
| Memory-First Protocol | ❌ | ✅ Auto cache-hit |
| Skill Creator + Catalog | ❌ | ✅ 853 composable skills |
| Platform Setup Wizard | ❌ | ✅ One-shot config |
| Multi-Platform Symlinks | ❌ Claude only | ✅ 8 platforms |
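File-type ownership, as listed under Agent Boundary Enforcement, amounts to checking a path against per-agent patterns before allowing an edit. The ownership map below is hypothetical (the kit's real rules are not shown in this README), and `may_edit` / `owners_of` are illustrative names.

```python
from fnmatch import fnmatch

# Hypothetical ownership map; the kit's actual boundaries may differ.
OWNERSHIP = {
    "frontend-specialist": ["*.tsx", "*.jsx", "*.css"],
    "backend-specialist": ["*.py", "*.sql"],
    "security-auditor": ["*.policy.json"],
}


def may_edit(agent: str, path: str, ownership=OWNERSHIP) -> bool:
    """Return True if the agent owns a pattern matching the file name."""
    filename = path.rsplit("/", 1)[-1]
    return any(fnmatch(filename, pattern) for pattern in ownership.get(agent, []))


def owners_of(path: str, ownership=OWNERSHIP) -> list:
    """List every agent allowed to edit the given path."""
    return sorted(agent for agent in ownership if may_edit(agent, path, ownership))


if __name__ == "__main__":
    for path in ("src/App.tsx", "api/server.py", "README.md"):
        print(path, "->", owners_of(path) or "no owner")
```

A path with no owner is a useful signal in itself: the orchestrator can route it to a human or a generalist instead of letting any specialist touch it.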
Real Benchmark: Subagents vs Agent Teams
The framework supports two orchestration modes. Here are real test results from execution/benchmark_modes.py running on local infrastructure (Qdrant + Ollama nomic-embed-text, zero cloud API calls):
MODE A: SUBAGENTS (independent, fire-and-forget)
  Explore Auth Patterns → ✅ stored in cache + memory (127ms)
  Query Performance → ❌ FAILED (timeout, fault tolerant)
  Scan CVEs → ✅ stored in cache + memory (14ms)
  Summary: 2/3 completed, 1 failed, 0 cross-references
MODE B: AGENT TEAMS (shared context, coordinated)
  Backend Specialist → ✅ stored in shared memory (14ms)
  Database Specialist → ✅ stored in shared memory (13ms)
  Frontend Specialist → Read Backend + Database output first
    ✅ Got context from team-backend: "API contract: POST /api/messages..."
    ✅ Got context from team-database: "Schema: users(id UUID PK, name..."
    → ✅ stored in shared memory (14ms)
  Summary: 3/3 completed, 0 failed, 2 cross-references
2nd run (cache warm): All queries hit cache at score 1.000, reducing total time from 314ms to 76ms (Subagents) and from 292ms to 130ms (Agent Teams).
| Metric | Subagents | Agent Teams |
|---|---|---|
| Execution model | Fire-and-forget (isolated) | Shared context (coordinated) |
| Tasks completed | 2/3 (fault tolerant) | 3/3 |
| Cross-references | 0 (not supported) | 2 (peers read each other's work) |
| Context sharing | ❌ Each agent isolated | ✅ Peer-to-peer via Qdrant |
| Two-stage review | ❌ | ✅ Spec + Quality |
| Cache hits (2nd run) | 5/5 | 5/5 |
| Embedding provider | Ollama local (nomic-embed-text 137M) | Ollama local (nomic-embed-text 137M) |
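The difference between the two modes can be modelled in a few lines. This is a toy model, not `benchmark_modes.py`: a plain dict stands in for the Qdrant-backed shared memory, and `run_subagents` / `run_team` are hypothetical names.

```python
def run_subagents(tasks):
    """Mode A: each task runs in isolation; one failure does not stop the rest."""
    results, failed = {}, []
    for name, fn in tasks:
        try:
            results[name] = fn({})  # empty context: nothing is shared
        except Exception:
            failed.append(name)
    return results, failed


def run_team(tasks):
    """Mode B: tasks run in order and can read what earlier peers stored."""
    shared, cross_refs = {}, 0
    for name, reads, fn in tasks:
        context = {peer: shared[peer] for peer in reads if peer in shared}
        cross_refs += len(context)  # one cross-reference per peer output read
        shared[name] = fn(context)
    return shared, cross_refs


if __name__ == "__main__":
    team = [
        ("backend", [], lambda ctx: "API contract: POST /api/messages"),
        ("database", [], lambda ctx: "Schema: users(id UUID PK)"),
        ("frontend", ["backend", "database"], lambda ctx: f"UI built from {sorted(ctx)}"),
    ]
    shared, refs = run_team(team)
    print(f"{len(shared)}/3 completed, {refs} cross-references")
```

In this model the frontend task reads two peer outputs, which is what the "2 cross-references" line in the benchmark summary counts.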
Try it yourself:
docker run -d -p 6333:6333 -v qdrant_storage:/qdrant/storage qdrant/qdrant
ollama serve & ollama pull nomic-embed-text
python3 execution/session_boot.py --auto-fix
python3 execution/benchmark_modes.py --verbose
python3 execution/memory_manager.py store \
--content "Chose PostgreSQL for relational data" \
--type decision --project myapp
python3 execution/memory_manager.py auto \
--query "what database did we choose?"
python3 execution/memory_manager.py cache-store \
--query "how to set up auth?" \
--response "Use JWT with 24h expiry, refresh tokens in httpOnly cookies"
python3 execution/memory_manager.py auto \
--query "how to set up auth?"
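The cache-store / auto pair above behaves like a similarity-keyed cache: a stored query is embedded, and a later query returns the stored response if it is close enough. The sketch below shows that flow with a toy word-count "embedding" in place of nomic-embed-text and an in-memory list in place of Qdrant. The 0.9 threshold and the `SemanticCache` class are assumptions for illustration.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: lowercase word counts."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


class SemanticCache:
    def __init__(self, threshold: float = 0.9):  # threshold is an assumed value
        self.threshold = threshold
        self.entries = []  # list of (vector, response)

    def store(self, query: str, response: str) -> None:
        self.entries.append((embed(query), response))

    def lookup(self, query: str):
        """Return (response, score) on a hit, or (None, best_score) on a miss."""
        vec = embed(query)
        best_score, best_response = 0.0, None
        for stored_vec, response in self.entries:
            score = cosine(vec, stored_vec)
            if score > best_score:
                best_score, best_response = score, response
        if best_score >= self.threshold:
            return best_response, best_score
        return None, best_score


if __name__ == "__main__":
    cache = SemanticCache()
    cache.store("how to set up auth?", "Use JWT with 24h expiry")
    print(cache.lookup("how to set up auth?"))
    print(cache.lookup("what database did we choose?"))
```

Repeating a query verbatim yields a similarity of essentially 1.0, which matches the "score 1.000" cache hits reported for the warm second run.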
Platform Support
The framework automatically detects your AI coding environment and activates the best available features.
Skills are installed to the canonical skills/ directory and symlinked to each platform's expected path:
| Platform | Skills Path | Instruction File | Orchestration |
|---|---|---|---|
| Claude Code | .claude/skills/ | CLAUDE.md | Agent Teams (parallel) or Subagents |
| Gemini CLI | .gemini/skills/ | GEMINI.md | Sequential personas via @agent |
| Codex CLI | .codex/skills/ | AGENTS.md | Sequential via prompts |
| Antigravity IDE | .agent/skills/ | AGENTS.md | Full agentic orchestration |
| Cursor | .cursor/skills/ | AGENTS.md | Chat-based via @skill |
| GitHub Copilot | N/A (paste) | COPILOT.md | Manual paste into context |
| OpenCode | .agent/skills/ | OPENCODE.md | Sequential personas via @agent |
| AdaL CLI | .adal/skills/ | AGENTS.md | Auto-load on demand |
Run /setup to auto-detect and configure your platform, or use the setup script directly:
python3 skills/plugin-discovery/scripts/platform_setup.py --project-dir .
python3 skills/plugin-discovery/scripts/platform_setup.py --project-dir . --auto
python3 skills/plugin-discovery/scripts/platform_setup.py --project-dir . --dry-run
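One plausible detection signal is the presence of a platform's dot-directory in the project. The sketch below uses the skills paths from the table above; treating directory presence as the signal is an assumption, and `platform_setup.py` may rely on different or additional checks. `.agent/` is left out because two platforms share it.

```python
import tempfile
from pathlib import Path

# Marker directories taken from the platform table; using their presence
# as the detection signal is an assumption of this sketch.
MARKERS = {
    ".claude": "Claude Code",
    ".gemini": "Gemini CLI",
    ".codex": "Codex CLI",
    ".cursor": "Cursor",
    ".adal": "AdaL CLI",
}


def detect_platforms(project_dir) -> list:
    """Return the platforms whose marker directory exists under project_dir."""
    root = Path(project_dir)
    return sorted(name for marker, name in MARKERS.items() if (root / marker).is_dir())


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        (Path(tmp) / ".claude").mkdir()
        (Path(tmp) / ".cursor").mkdir()
        print(detect_platforms(tmp))
```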
What You Get
your-project/
├── AGENTS.md                  # Master instruction file
├── GEMINI.md → AGENTS.md      # Platform symlinks
├── CLAUDE.md → AGENTS.md
├── OPENCODE.md → AGENTS.md
├── COPILOT.md → AGENTS.md
├── skills/                    # Up to 853 skills (depends on pack)
│   ├── webcrawler/            # Documentation harvesting
│   ├── qdrant-memory/         # Semantic caching & memory
│   └── ...                    # 851 more skills in full pack
├── .claude/skills → skills/   # Platform-specific symlinks
├── .gemini/skills → skills/
├── .codex/skills → skills/
├── .cursor/skills → skills/
├── .adal/skills → skills/
├── directives/                # SOPs in Markdown
├── execution/                 # Deterministic Python scripts
│   ├── session_boot.py        # Session startup (Qdrant + Ollama check)
│   └── memory_manager.py      # Store/retrieve/cache operations
├── skill-creator/             # Tools to create new skills
└── .agent/                    # (medium/full) Agents, workflows, rules
    └── workflows/             # /setup, /deploy, /test, /debug, etc.
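The platform-specific symlinks in this layout can be reproduced with the standard library. This is a sketch of the idea, not the installer's code: `link_skills` is a hypothetical helper, the relative `../skills` target is a design choice of the sketch, and on Windows creating symlinks may require elevated privileges.

```python
import os
import tempfile
from pathlib import Path

PLATFORM_DIRS = [".claude", ".gemini", ".codex", ".cursor", ".adal"]


def link_skills(project_dir, platform_dirs=PLATFORM_DIRS) -> list:
    """Create <platform>/skills -> ../skills symlinks; skip any that already exist."""
    root = Path(project_dir)
    (root / "skills").mkdir(parents=True, exist_ok=True)
    created = []
    for name in platform_dirs:
        link = root / name / "skills"
        link.parent.mkdir(exist_ok=True)
        if link.is_symlink() or link.exists():
            continue
        # A relative target keeps the links valid if the project is moved.
        os.symlink(os.path.join("..", "skills"), link, target_is_directory=True)
        created.append(link)
    return created


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        for link in link_skills(tmp):
            print(link.relative_to(tmp), "->", os.readlink(link))
```

Because every platform path points at the same canonical directory, a skill added under `skills/` is visible to all platforms without copying.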
Architecture
The system operates on three layers:
┌─────────────────────────────────────────────────────────┐
│  Layer 1: DIRECTIVES (Intent)                           │
│  └─ SOPs written in Markdown (directives/)              │
├─────────────────────────────────────────────────────────┤
│  Layer 2: ORCHESTRATION (Agent)                         │
│  ├─ LLM reads directive, decides which tool to call     │
│  └─ Platform-adaptive: Teams, Subagents, or Personas    │
├─────────────────────────────────────────────────────────┤
│  Layer 3: EXECUTION (Code)                              │
│  └─ Pure Python scripts (execution/) do the actual work │
└─────────────────────────────────────────────────────────┘
Why? LLMs are probabilistic. 90% accuracy per step = 59% success over 5 steps. By pushing complexity into deterministic scripts, we achieve reliable execution.
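A compact way to picture the three layers: the directive states intent, the orchestrator only chooses among registered tools, and the tools themselves are ordinary tested functions. Everything here is illustrative. The `tool:` line is a made-up directive convention, and `choose_tool` replaces the LLM's reasoning with a trivial parser.

```python
# Layer 3: deterministic, individually testable functions.
def word_count(text: str) -> int:
    return len(text.split())


def to_upper(text: str) -> str:
    return text.upper()


TOOLS = {"word_count": word_count, "to_upper": to_upper}

# Layer 1: a directive is plain text that states intent and names a tool.
# The "tool:" line is a hypothetical convention for this sketch.
DIRECTIVE = """\
# Count words
tool: word_count
"""


def choose_tool(directive: str) -> str:
    """Layer 2 stand-in: a real agent would reason here; this parses a 'tool:' line."""
    for line in directive.splitlines():
        if line.startswith("tool:"):
            return line.split(":", 1)[1].strip()
    raise ValueError("directive names no tool")


def execute(directive: str, payload: str):
    """Dispatch to a registered tool; unknown names are rejected, never improvised."""
    name = choose_tool(directive)
    if name not in TOOLS:
        raise KeyError(f"unknown tool: {name}")
    return TOOLS[name](payload)


if __name__ == "__main__":
    print(execute(DIRECTIVE, "push complexity into deterministic scripts"))
```

The probabilistic part is confined to picking a name from a fixed registry; once picked, the outcome is as reliable as the function behind it.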
Hybrid Memory (BM25 + Vector)
Dual-engine retrieval: Qdrant vector similarity for semantic concepts + SQLite FTS5 BM25 for exact keyword matching. Automatically merges results with configurable weights.
| Scenario | Without Memory | With Memory | Savings |
|---|---|---|---|
| Repeated question | ~2000 tokens | 0 tokens | 100% |
| Similar architecture | ~5000 tokens | ~500 tokens | 90% |
| Past error resolution | ~3000 tokens | ~300 tokens | 90% |
| Exact ID/code lookup | ~3000 tokens | ~200 tokens | 93% |
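Merging the two result sets with configurable weights can be sketched as follows. The 0.7 / 0.3 default weights and the max-normalisation step are assumptions chosen for illustration, since this README does not state the kit's actual values; `merge_scores` is a hypothetical name.

```python
def merge_scores(vector_hits: dict, keyword_hits: dict,
                 vector_weight: float = 0.7, keyword_weight: float = 0.3) -> list:
    """Combine two {doc_id: score} maps into one ranked list of (doc_id, score)."""
    def normalise(hits: dict) -> dict:
        # Scale each engine's scores to [0, 1] so the weights are comparable.
        top = max(hits.values(), default=0.0)
        return {doc: (score / top if top else 0.0) for doc, score in hits.items()}

    vec, kw = normalise(vector_hits), normalise(keyword_hits)
    merged = {
        doc: vector_weight * vec.get(doc, 0.0) + keyword_weight * kw.get(doc, 0.0)
        for doc in set(vec) | set(kw)
    }
    # Highest score first; ties broken by doc id for stable output.
    return sorted(merged.items(), key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    vector_hits = {"adr-db": 0.90, "runbook": 0.45}
    keyword_hits = {"runbook": 12.0, "incident-42": 6.0}
    for doc, score in merge_scores(vector_hits, keyword_hits):
        print(f"{doc}: {score:.2f}")
```

Normalising first matters because cosine similarities and BM25 scores live on different scales; without it the keyword engine's larger raw numbers would dominate regardless of the weights.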
Setup (requires Qdrant + Ollama):
docker run -d -p 6333:6333 -v qdrant_storage:/qdrant/storage qdrant/qdrant
ollama serve &
ollama pull nomic-embed-text
python3 execution/session_boot.py --auto-fix
Agents automatically run session_boot.py at session start (first instruction in AGENTS.md). Memory operations:
python3 execution/memory_manager.py auto --query "your task summary"
python3 execution/memory_manager.py store --content "what was decided" --type decision
python3 execution/memory_manager.py health
python3 execution/memory_manager.py bm25-sync
Hybrid search modes (via hybrid_search.py):
python3 scripts/hybrid_search.py --query "ImagePullBackOff error" --mode hybrid
python3 scripts/hybrid_search.py --query "database architecture" --mode vector
python3 scripts/hybrid_search.py --query "sg-018f20ea63e82eeb5" --mode keyword
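Keyword mode suits identifiers like `sg-018f20ea63e82eeb5` because BM25 rewards exact, rare tokens. The kit uses SQLite FTS5 for this; the block below is an independent, minimal Okapi BM25 written from the standard formula, with whitespace tokenisation and the common defaults `k1=1.5`, `b=0.75` as assumptions.

```python
import math
from collections import Counter


def bm25_scores(query: str, docs: list, k1: float = 1.5, b: float = 0.75) -> list:
    """Score each document against the query with the Okapi BM25 formula."""
    if not docs:
        return []
    tokenised = [doc.lower().split() for doc in docs]
    n = len(tokenised)
    avg_len = (sum(len(d) for d in tokenised) / n) or 1.0  # guard all-empty docs
    scores = []
    for tokens in tokenised:
        counts = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            df = sum(1 for d in tokenised if term in d)  # document frequency
            if df == 0:
                continue
            idf = math.log((n - df + 0.5) / (df + 0.5) + 1.0)
            tf = counts[term]
            score += idf * tf * (k1 + 1) / (tf + k1 * (1 - b + b * len(tokens) / avg_len))
        scores.append(score)
    return scores


if __name__ == "__main__":
    docs = [
        "security group sg-018f20ea63e82eeb5 blocks port 443",
        "database architecture uses PostgreSQL",
        "pod stuck in ImagePullBackOff error",
    ]
    for doc, score in zip(docs, bm25_scores("sg-018f20ea63e82eeb5", docs)):
        print(f"{score:.3f}  {doc}")
```

Only the document containing the exact ID scores above zero, whereas an embedding model may place an opaque ID close to unrelated strings.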
Prerequisites
The npx init command automatically creates a .venv and installs all dependencies. Just activate it:
source .venv/bin/activate
If you need to reinstall or update dependencies:
.venv/bin/pip install -r requirements.txt
Commands
Initialize a new project
npx @techwavedev/agi-agent-kit init --pack=full
Auto-detect platform and configure environment
python3 skills/plugin-discovery/scripts/platform_setup.py --project-dir .
Update to latest version
npx @techwavedev/agi-agent-kit@latest init --pack=full
python3 skills/self-update/scripts/update_kit.py
Boot memory system
python3 execution/session_boot.py --auto-fix
System health check
python3 execution/system_checkup.py --verbose
Create a new skill
python3 skill-creator/scripts/init_skill.py my-skill --path skills/
Update skills catalog
python3 skill-creator/scripts/update_catalog.py --skills-dir skills/
Activation Reference
Use these keywords, commands, and phrases to trigger specific capabilities:
Slash Commands (Workflows)
| Command | What It Does |
|---|---|
| /setup | Auto-detect platform and configure environment |
| /setup-memory | Initialize Qdrant + Ollama memory system |
| /create | Start interactive app builder dialogue |
| /plan | Create a structured project plan (no code) |
| /enhance | Add or update features in existing app |
| /debug | Activate systematic debugging mode |
| /test | Generate and run tests |
| /deploy | Pre-flight checks + deployment |
| /orchestrate | Multi-agent coordination for complex tasks |
| /brainstorm | Structured brainstorming with multiple options |
| /preview | Start/stop local dev server |
| /status | Show project progress and status board |
| /update | Update AGI Agent Kit to latest version |
| /checkup | Verify agents, workflows, skills, and core files |
Agent Mentions (@agent)
| Mention | Specialist | When to Use |
|---|---|---|
| @orchestrator | Multi-agent coordinator | Complex multi-domain tasks |
| @project-planner | Planning specialist | Roadmaps, task breakdowns, phase planning |
| @frontend-specialist | UI/UX architect | Web interfaces, React, Next.js |
| @mobile-developer | Mobile specialist | iOS, Android, React Native, Flutter |
| @backend-specialist | API/DB engineer | Server-side, databases, APIs |
| @security-auditor | Security expert | Vulnerability scanning, audits, hardening |
| @debugger | Debug specialist | Complex bug investigation |
| @game-developer | Game dev specialist | 2D/3D games, multiplayer, VR/AR |
Skill Trigger Keywords (Natural Language)
| Category | Trigger Phrases | Skill Activated |
|---|---|---|
| Memory | "don't use cache", "no cache", "skip memory", "fresh" | Memory opt-out |
| Research | "research my docs", "check my notebooks", "deep search", "@notebooklm" | notebooklm-rag |
| Documentation | "update docs", "regenerate catalog", "sync documentation" | documentation |
| Quality | "lint", "format", "check", "validate", "static analysis" | lint-and-validate |
| Testing | "write tests", "run tests", "TDD", "test coverage" | testing-patterns / tdd-workflow |
| TDD | "test first", "red green refactor", "failing test" | test-driven-development |
| Plan Execution | "execute plan", "run the plan", "batch execution" | executing-plans |
| Verification | "verify", "prove it works", "evidence", "show me the output" | verification-before-completion |
| Debugging | "debug", "root cause", "investigate", "why is this failing" | systematic-debugging |
| Architecture | "design system", "architecture decision", "ADR", "trade-off" | architecture |
| Security | "security scan", "vulnerability", "audit", "OWASP" | red-team-tactics |
| Performance | "lighthouse", "bundle size", "core web vitals", "profiling" | performance-profiling |
| Design | "design UI", "color scheme", "typography", "layout" | frontend-design |
| Deployment | "deploy", "rollback", "release", "CI/CD" | deployment-procedures |
| API | "REST API", "GraphQL", "tRPC", "API design" | api-patterns |
| Database | "schema design", "migration", "query optimization" | database-design |
| Planning | "plan this", "break down", "task list", "requirements" | plan-writing |
| Brainstorming | "explore options", "what are the approaches", "pros and cons" | brainstorming |
| Code Review | "review this", "code quality", "best practices" | code-review-checklist |
| i18n | "translate", "localization", "RTL", "locale" | i18n-localization |
| AWS | "terraform", "EKS", "Lambda", "S3", "CloudFront" | aws / aws-terraform |
| Infrastructure | "Consul", "service mesh", "OpenSearch" | consul / opensearch |
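Trigger phrases like these can be modelled as a phrase-to-skill lookup. The sketch below copies a few rows from the table and matches by case-insensitive substring. That matching rule is an assumption (agents may well interpret prompts more loosely), and `match_skills` / `memory_enabled` are illustrative names.

```python
# A few rows from the trigger table; substring matching is an assumption
# about how triggers are evaluated.
TRIGGERS = {
    "systematic-debugging": ["debug", "root cause", "investigate", "why is this failing"],
    "test-driven-development": ["test first", "red green refactor", "failing test"],
    "database-design": ["schema design", "migration", "query optimization"],
}
MEMORY_OPT_OUT = ["don't use cache", "no cache", "skip memory", "fresh"]


def match_skills(prompt: str) -> list:
    """Return the skills whose trigger phrases appear in the prompt."""
    text = prompt.lower()
    return sorted(skill for skill, phrases in TRIGGERS.items()
                  if any(phrase in text for phrase in phrases))


def memory_enabled(prompt: str) -> bool:
    """Return False when the prompt contains a memory opt-out phrase."""
    text = prompt.lower()
    return not any(phrase in text for phrase in MEMORY_OPT_OUT)


if __name__ == "__main__":
    prompt = "Find the root cause of the failing test, no cache please"
    print(match_skills(prompt), "memory:", memory_enabled(prompt))
```

Plain substring matching is deliberately naive: "fresh" would also fire on "refresh", so a production matcher would want word boundaries at the least.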
Memory System Commands
| Action | Command |
|---|---|
| Boot memory | python3 execution/session_boot.py --auto-fix |
| Check before a task | python3 execution/memory_manager.py auto --query "..." |
| Store a decision | python3 execution/memory_manager.py store --content "..." --type decision |
| Cache a response | python3 execution/memory_manager.py cache-store --query "..." --response "..." |
| Health check | python3 execution/memory_manager.py health |
| Skip cache for this task | Say "fresh", "no cache", or "skip memory" in your prompt |
Documentation
The Full tier includes 774 community skills adapted from the Antigravity Awesome Skills project (v5.4.0) by @sickn33, distributed under the MIT License.
This collection aggregates skills from 50+ open-source contributors and organizations including Anthropic, Microsoft, Vercel Labs, Supabase, Trail of Bits, Expo, Sentry, Neon, fal.ai, and many more. For the complete attribution ledger, see SOURCES.md.
Each community skill has been adapted for the AGI framework with:
- Qdrant Memory Integration: Semantic caching and context retrieval
- Agent Team Collaboration: Orchestrator-driven invocation and shared memory
- Local LLM Support: Ollama-based embeddings for local-first operation
If these community skills help you, consider starring the original repo or supporting the author.
Roadmap
| Feature | Status | Description |
|---|---|---|
| Federated Agent Memory | Design | Cross-agent knowledge sharing via project-scoped Qdrant collections. Agents working on the same project read each other's decisions, errors, and patterns, building collective intelligence across sessions and platforms. |
| Blockchain-Authenticated Memory | Design | Cryptographic trust layer for shared memory using enterprise blockchains (Hyperledger Fabric, MultiChain, or Quorum): self-hosted, no fees, no cryptocurrency. Agent writes are signed, content hashes are anchored on-chain, and access is token-gated per project. |
| Event-Driven Agent Streaming | Design | Real-time agent communication via Kafka/Flink. Agents publish decisions and observations to topics, enabling reactive workflows. For example, a security agent triggers remediation when a vulnerability scan agent publishes findings. |
| Workflow Engine | Planned | Execute data/workflows.json playbooks as guided multi-skill sequences with progress tracking and branching logic. |
Security
This package includes a pre-flight security scanner that checks for private terms before publishing. All templates are sanitized for public use.
Support
If the AGI Agent Kit helps you build better AI-powered workflows, consider supporting the project.
License
Apache-2.0 © Elton Machado @TechWaveDev
Community skills in the Full tier are licensed under the MIT License. See THIRD-PARTY-LICENSES.md for details.