AI Tools · 16 min read · October 11, 2025

State of Vibe Coding: Emerging Tools and Ways of Working (2025)

Six months ago, "vibe coding" meant prototyping with AI and praying the code worked. Today, it means shipping production features in 30 minutes that used to take 3 days—with better test coverage than your manual code. The shift happened fast. Here's what changed: mandatory TDD for AI, orchestration that actually works, and the realization that parallel agents are infrastructure, not a hack.

🔎 TL;DR

  • Superpowers gives Claude mandatory skills for TDD, debugging, and structured workflows
  • Conductor orchestrates multi-agent workflows with observability and human approval gates
  • Happy Engineering runs parallel Claude sessions from your phone with voice control
  • Winning pattern: spec-first + failing tests + parallel agents + mandatory code review

1) New Superpowers: Emerging Tools

Four tools have crossed the threshold from "interesting experiment" to "I'm annoyed when I don't have them":

  • Conductor — orchestration for agentic workflows: triggers, DAG-style steps, retries, observability, and human approval gates. Explore the platform at conductor.build.
  • Superpowers — a comprehensive skills library for Claude Code that gives AI agents proven techniques, patterns, and tools. It includes slash commands (/brainstorm, /write-plan, /execute-plan), systematic testing and debugging skills, and a framework for creating and sharing new skills. See the original write-up "Superpowers" (Oct 9, 2025) and the companion repo obra/superpowers.
  • Claude Code Templates — a growing catalog of task-specific templates and plugins that dramatically reduce setup time. Browse community-driven templates at claude-code-templates.
  • Happy Engineering — an open-source mobile client that lets you run multiple Claude Code sessions in parallel from your phone, desktop, or web. Features voice commands, end-to-end encryption, and smart notifications. Free and MIT-licensed. Install with npm i -g happy-coder && happy or visit happy.engineering.

Deep Dive: How "Superpowers" Works

Superpowers solves a problem that drove me nuts for months: Claude would commit to a disciplined workflow, write beautiful code, then immediately violate that workflow by skipping tests on the next iteration. Superpowers fixes this by making TDD and code review mandatory—not through nagging, but through a skills system that Claude can't bypass. Each skill is a markdown file containing battle-tested patterns. When a skill exists for a task, Claude must use it. No negotiations, no "I'll add tests later." This one change—mandatory skills—turned chaotic agent loops into predictable workflows.

Core Components

  • Skills Library: TDD, debugging, collaboration, and meta-skills in markdown format.
  • Slash Commands: /brainstorm, /write-plan, /execute-plan for structured workflows.
  • Skills Discovery: Search tools to find relevant skills before starting tasks.
  • Personal Skills: Custom skills in ~/.config/superpowers/ that override core skills.
  • Git Worktrees: Automatic parallel development branches for concurrent work.

Key Workflows

  • Brainstorm → Plan → Implement: Structured design refinement before coding.
  • RED/GREEN TDD: Write failing tests first, implement minimal code to pass.
  • Subagent Dispatch: Parallel agents for different aspects of implementation.
  • Code Review: Automated review checkpoints before merging work.
  • Skills Testing: Pressure-test skills with realistic scenarios using subagents.

The system works by injecting skills context into every Claude session, making it mandatory to search for and use relevant skills. When you start a task, Claude automatically looks for applicable skills and follows their instructions. Skills are tested using subagents in pressure scenarios to ensure they work in real-world conditions. The framework also includes tools for creating new skills, sharing them with the community, and maintaining a personal skills repository.
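The injection step can be sketched in a few lines of Python. This is an illustrative model, not the actual Superpowers implementation: `load_skills`, `find_relevant_skills`, and `build_prompt` are hypothetical names, and real skill matching is richer than keyword overlap.

```python
from pathlib import Path

def load_skills(skills_dir):
    """Read every markdown skill file into a {name: text} map."""
    return {p.stem: p.read_text() for p in sorted(Path(skills_dir).glob("*.md"))}

def find_relevant_skills(task, skills):
    """Naive keyword match: a skill applies if a word in its name appears in the task."""
    task_words = set(task.lower().split())
    return {name: text for name, text in skills.items()
            if set(name.lower().split("-")) & task_words}

def build_prompt(task, skills_dir="~/.config/superpowers/skills"):
    """Inject matching skills ahead of the task so the agent sees them first."""
    skills = load_skills(Path(skills_dir).expanduser())
    relevant = find_relevant_skills(task, skills)
    sections = [f"## Skill: {name}\n{text}" for name, text in relevant.items()]
    return "\n\n".join(sections + [f"## Task\n{task}"])
```

The essential idea is precedence and mandatoriness: the skill text arrives in the context before the task does, so the agent has no path to the task that skips the instructions.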

Deep Dive: Conductor for Orchestration

Conductor helps you wire together agentic steps, tools, and external systems into observable workflows. Think of it as the "runtime + control plane" for your superpowers.

  • Triggers: webhooks, schedules, or manual runs kick off workflows.
  • DAG-style steps: parallel branches, fan-out/fan-in, and retries with backoff.
  • Secrets & connectors: safely bring in GitHub, Slack, cloud APIs, and data sources.
  • Observability: run history, logs, artifacts, and per-step metrics.
  • Human approvals: pause for review before merging, deploying, or notifying.
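The retry and fan-out/fan-in behavior in that list is easy to model. A minimal sketch in Python, assuming steps are plain callables (real Conductor steps carry connectors, secrets, and run metadata):

```python
import time
from concurrent.futures import ThreadPoolExecutor

def with_retries(step, max_attempts=3, base_delay=0.1):
    """Run a step, retrying failures with exponential backoff."""
    for attempt in range(max_attempts):
        try:
            return step()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # exhausted retries: surface the failure to the run
            time.sleep(base_delay * 2 ** attempt)

def fan_out(steps):
    """Run independent steps as parallel branches, then fan in their results."""
    with ThreadPoolExecutor() as pool:
        futures = [pool.submit(with_retries, step) for step in steps]
        return [f.result() for f in futures]  # fan-in: collect in branch order
```

A DAG engine is essentially this plus dependency tracking: a step becomes eligible only when the branches it fans in from have succeeded.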

💡 Real Workflow: Automated Issue Triage

When a GitHub issue arrives, Conductor triggers a workflow:

  1. Label & prioritize using an LLM agent that analyzes issue content
  2. Draft a spec in parallel with a research agent searching similar past issues
  3. Open draft PR with skeleton code and tests (via Superpowers TDD skill)
  4. Run tests in parallel (unit, integration, type-check) with per-step retries
  5. Human approval gate: Slack notification with approve/reject buttons
  6. Merge or request changes based on approval, post final status to Slack
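Stubbing the connectors as plain callables, the pipeline above reduces to a short sketch (all names hypothetical; a real workflow would block on the Slack approval asynchronously rather than via a synchronous call):

```python
def triage_issue(issue, llm_label, open_draft_pr, run_tests, request_approval, merge):
    """Minimal issue-triage pipeline with a human approval gate.
    Each argument is a callable standing in for a real connector."""
    labels = llm_label(issue)                 # step 1: label & prioritize
    pr = open_draft_pr(issue, labels)         # steps 2-3: spec + draft PR
    results = run_tests(pr)                   # step 4: parallel test suites
    if not all(results.values()):
        return {"status": "tests_failed", "pr": pr}
    if request_approval(pr):                  # step 5: approval gate
        merge(pr)                             # step 6: merge on approval
        return {"status": "merged", "pr": pr}
    return {"status": "changes_requested", "pr": pr}
```

The approval gate is the load-bearing line: everything before it is read-mostly and safe to automate aggressively; everything after it is a write operation guarded by a human.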

Result: 80% of routine issues get spec + draft PR within 5 minutes, with full audit trail.

Deep Dive: Claude Code Plugins

Claude Code plugins extend the IDE assistant with purpose-built capabilities—so the agent can use familiar tools directly from your editor. Combine plugins with templates to cut setup time and improve reliability.

Common Categories

  • Code quality and testing (linters, unit/integration runners)
  • Repo/devops (GitHub, PR helpers, release notes)
  • Docs and knowledge (summaries, search, generators)
  • Cloud/data (secrets-aware APIs, DB explorers, analytics)

Adoption Tips

  • Start with read-only and low-risk actions; expand to writes after trust.
  • Pair plugins with verifiers—tests, type checks, CI gates.
  • Document plugin usage in your CONTRIBUTING.md to make workflows team-friendly.

2) Ways of Vibe Coding (That Actually Ship)

Spec-First, Test-First

Write a crisp spec and failing tests before the agent touches code. Seriously—RED tests first, or you'll spend tomorrow debugging vibes. The failing test is your contract. When it goes GREEN, you're done. When it stays RED after 3 loops, your spec was wrong. This discipline turns "it works on my machine" into "it passes tests I wrote before the code existed."
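Here is the loop in miniature, using a hypothetical `slugify` function: the tests come first and fail (RED) because `slugify` doesn't exist yet; the minimal implementation below them turns the suite GREEN.

```python
# RED: written first, against a slugify() that does not exist yet.
def test_slugify_lowercases_and_hyphenates():
    assert slugify("Hello World") == "hello-world"

def test_slugify_strips_punctuation():
    assert slugify("Ship it, now!") == "ship-it-now"

# GREEN: the minimal implementation that makes both tests pass.
import re

def slugify(title):
    words = re.findall(r"[a-z0-9]+", title.lower())
    return "-".join(words)
```

The tests are the contract: they encode the spec ("lowercase, hyphen-separated, punctuation stripped") before any implementation exists, so GREEN means done rather than "looks plausible."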

Parallel Agent Loops

Run multiple narrow agents concurrently (parser, UI glue, tests) and reconcile via review. Yes, they'll create merge conflicts. That's still faster than doing it yourself sequentially. Embrace the chaos.

Template-Driven Starts

Start from a trusted template/plugin, not a blank page. You inherit guardrails and structure.

Review-As-A-Discipline

Treat agent output like a junior's PR: diff, test, benchmark, and iterate. Ship only what you can own. Except this junior writes code at 3am without coffee breaks and never gets defensive in code review. Use that to your advantage.

💡 Real Workflow: Adding a New API Endpoint

Task: Add POST /api/users endpoint with validation and rate limiting.

  1. Write spec: "Accept {name, email}, validate email format, return 201 or 400, rate limit 10/min per IP"
  2. Write failing tests: Use Superpowers TDD skill to generate RED tests first (invalid email → 400, valid → 201, 11th request → 429)
  3. Launch parallel agents: Agent A implements validation logic, Agent B adds rate limiting, Agent C writes integration tests
  4. Review & reconcile: Human reviews diffs, runs full test suite, checks for conflicts
  5. Iterate: Tests fail? Agent loop continues with error context until GREEN
  6. Ship: Deploy to staging, monitor error rates, promote to prod with feature flag
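The validation and rate-limiting logic from steps 1-3 fits in a short framework-free sketch. Handler name and in-memory store are illustrative, and rejected requests also count against the limit, which is a deliberate design choice:

```python
import re
import time
from collections import defaultdict, deque

EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")
WINDOW_SECONDS = 60
MAX_REQUESTS = 10
_requests = defaultdict(deque)  # ip -> timestamps of requests in the window

def create_user(body, ip, now=None):
    """Handle POST /api/users: rate limit 10/min per IP, then validate input."""
    now = time.monotonic() if now is None else now
    window = _requests[ip]
    while window and now - window[0] >= WINDOW_SECONDS:
        window.popleft()                      # evict requests older than 60s
    if len(window) >= MAX_REQUESTS:
        return 429, {"error": "rate limit exceeded"}
    window.append(now)
    if not body.get("name") or not EMAIL_RE.match(body.get("email", "")):
        return 400, {"error": "invalid name or email"}
    return 201, {"name": body["name"], "email": body["email"]}
```

This is exactly the surface the RED tests in step 2 pin down: invalid email yields 400, valid input yields 201, and the 11th request in a minute yields 429.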

Result: Spec to production in <30 minutes, with test coverage baked in from the start.

3) Finding the Right Plugins and Templates

The fastest path to useful vibes is discovering the right starting points. For Claude Code users, the curated directory at claude-code-templates is a practical way to explore community templates you can adopt or adapt. Combine these with repo-based recipes like superpowers to build repeatable, auditable workflows.

4) Your Vibe Engineering Stack

The shift from "vibe coding" to "vibe engineering" isn't about new AI models—it's about building disciplined systems around them. Here's the stack that's working in production teams today:

The Three-Layer Stack

🧠 Skills Layer (Superpowers)

Mandatory skills ensure every agent follows your team's TDD, debugging, and code review practices. Start with the core skills library, then add custom skills for your domain (e.g., "security review for payment endpoints"). Your agents get smarter over time as you capture what works.

🔀 Orchestration Layer (Conductor)

Wire agents into repeatable workflows with observability and human gates. Map your existing processes (issue → spec → PR → review → deploy) into Conductor workflows. Start with read-only automation (labeling, summarizing), then gradually add write operations with approval gates.

⚡ Execution Layer (Claude Code + Happy)

Run the actual code generation and testing. Use Happy Engineering to run 3-5 parallel agents on different aspects of a feature (API, tests, docs, UI). Review and reconcile their work in a single PR. This turns hours of sequential work into minutes of parallel execution.

Getting Started (Your First Week)

  1. Day 1-2: Add Superpowers to Claude Code. Pick a small feature you understand cold—maybe a form validator or a config parser. Run /brainstorm, then /write-plan, then /execute-plan. The workflow will feel bureaucratic. Do it anyway. By day 2, you'll see why: the plan catches design mistakes before Claude writes 300 lines of wrong code. That's the point.
  2. Day 3-4: Set up Happy Engineering. Run npm i -g happy-coder && happy, try parallel agents on your phone. Start with 2 agents on a simple task (one for implementation, one for tests). Watch them work in parallel—it's oddly satisfying, like having a very organized anxiety attack. When they both finish and you reconcile their work in under 10 minutes, you'll get it.
  3. Day 5-7: Build your first Conductor workflow. Automate something read-only first (e.g., "label new issues based on content"). Add observability. Once it's reliable, add a write operation with a human approval gate. The key is starting small—don't try to automate your entire CI/CD pipeline on day one. You'll regret it.

⚠️ What Not To Do

  • Don't skip tests. Vibes without verification create technical debt faster than manual coding.
  • Don't run one giant agent. Narrow agents with clear interfaces are easier to review and debug.
  • Don't merge without human review. Agents are junior developers—treat their PRs accordingly.
  • Don't automate processes you don't understand. Build manually first, then automate the proven path.

💥 What I Learned the Hard Way

  • Week 1: Gave an agent "improve the auth system" with no spec. It refactored 40 files. None compiled. Spent 6 hours untangling the mess.
  • Week 2: Ran 5 parallel agents on the same codebase without branches. Merge conflicts looked like a git log from hell. Lost an entire afternoon to conflict resolution.
  • Week 3: Started writing specs and failing tests first. Finished a 3-feature sprint in 2 days. The difference was night and day.
  • Week 4: Realized the AI is infrastructure, not magic. Infrastructure needs guardrails. Added mandatory code review. Haven't merged broken code since.
  • Week 5: Discovered that naming agents helps manage them. "Parser agent" is easier to debug than "agent-3." Small things matter.