Assess and improve codebase readiness for AI coding agents across style, tests, docs, architecture, CI, observability, security, and dev setup. Produces a scored report, prioritizes remediation, then executes the highest-impact fixes. Use when: "agent readiness", "is this codebase agent-ready", "readiness report", "make this codebase agent-friendly", "agent-ready assessment", "readiness audit", "prepare for agents". Trigger: /agent-readiness, /readiness.
Pick browser automation for web/Electron: CI E2E, scripted flows, scraping, visual regression, exploratory QA, persona walks, monitoring, or browser agents. Deterministic Playwright first; harden exploratory findings into repeatable tests. Use for "automate the browser", "test this web app", "test this electron app", "Playwright or Stagehand", "scrape this site", "browser agent", "visual regression", "E2E tests". Trigger: /browser.
Audit CI gates, strengthen weak coverage, then drive green. Dagger owns the canonical pipeline; missing Dagger is auto-scaffolded. Acts directly on mechanical fixes and never returns red without structured diagnosis. Use when: "run ci", "check ci", "fix ci", "audit ci", "is ci passing", "run the gates", "dagger check", "why is ci failing", "strengthen ci", "tighten ci", "ci is red", "gates failing". Trigger: /ci, /gates.
Generate a repository-local skill from live repo discovery and user intent. Use when: "create QA skill", "generate repo skill", "make a local skill", "persona acceptance skill", "value proposition QA", "bespoke repo QA", "scaffold local skill". Creates concrete `.agents/skills/<name>/` guidance with harness-specific bridges when useful. Trigger: /create-repo-skill, /repo-skill, /create-qa-skill.
Run one targeted, read-only architecture or quality critique through a named lens from the shared rubric. Use when: "critique this module", "run an Ousterhout pass", "lens critique", "architecture critique". Trigger: /critique.
Lightweight evidence-backed retro and catch-up reports for a current repo, branch, PR, backlog slice, or recent agent session. Use when the user asks for a debrief, catch me up, what changed, why it matters, product implications, end-user implications, developer experience implications, current app state, backlog state, workspace state, alternatives considered, or context rebuild after losing the thread. Trigger: /debrief.
Inner-loop composer for one backlog item to merge-ready code. Composes /shape, /implement, /code-review, /ci, /refactor, and /qa; stops before push, merge, or deploy. Emits receipt.json plus operator brief + /reflect. Use for "deliver this", "make it merge-ready", shaped-ticket builds, and `--polish-only <branch|PR>` for existing branch/PR cleanup. Trigger: /deliver.
Ship merged code to one deploy target. Thin router: detect target, run the platform recipe, capture receipt (sha, version, URL, rollback handle), stop when healthy. Does not monitor, triage, or decide whether to deploy. Use when: "deploy", "deploy to prod", "release", "push to staging", "deploy this branch", "release cut". Trigger: /deploy, /release.
Investigate, audit, triage, and fix. Systematic debugging, incident lifecycle, domain auditing, and issue logging. Feedback-loop-first protocol: reproduce or replay before root cause, pattern analysis, hypothesis test, and fix. Use for: any bug, test failure, production incident, error spikes, audit, triage, postmortem, "diagnose", "why is this broken", "debug this", "production down", "is production ok", "audit stripe", "log issues". Trigger: /diagnose.
Always-on backlog grooming. Tidy, brainstorm, interrogate, investigate, research, and simplify in a single loop. Tidy is not a mode — it happens every time. Strategic-layer work fans out parallel interrogation, design-critique, technical-review, and research lanes. Use when: "groom", "what should we build", "rethink this", "biggest opportunity", "backlog", "prioritize", "backlog session", "audit skills", "skill quality audit". Trigger: /groom, /groom audit, /backlog, /rethink, /moonshot, /scaffold.
Atomic TDD build skill. Takes a context packet (shaped ticket) and produces code + tests on a feature branch. Red → Green → Refactor. Does not shape, review, QA, or ship — single concern: spec to green tests. Use when: "implement this spec", "build this", "TDD this", "code this up", "write the code for this ticket", after /shape has produced a context packet. Trigger: /implement, /build (alias).
Four LLM-agent guardrails: surface assumptions, prefer simplicity, make surgical changes, and drive by verifiable goals. Reference for scope, simplicity, assumptions, or success-criteria judgment calls. Use when: "am I overcomplicating this", "what are my assumptions", "how do I verify success", "/karpathy", "/principles". Trigger: /karpathy, /principles.
Verify the running thing works. Browser walks for web, request replay for APIs, shell smoke for CLIs, consumer builds for libraries, tool-call replay for MCP. "Tests pass" is not QA. Use when: "run QA", "verify the feature", "test this", "check the app", "exploratory test", "QA this PR", "smoke test", "manual testing", "capture evidence". For generated repo QA skills, use /create-repo-skill qa. Trigger: /qa.
Branch-aware simplification. On feature branches, compare against base and simplify the diff before merge. On primary branch, scan for the highest impact deletion, consolidation, or boundary improvement. Use when: "refactor this", "simplify this diff", "clean this up", "reduce complexity", "pay down tech debt", "make this easier to maintain", "make this more elegant", "reduce the number of states", "clarify naming". Trigger: /refactor.
DEPRECATED redirect. /settle, /pr-fix, and /pr-polish now route to /deliver --polish-only <branch|PR> — the single owner of "existing branch -> merge-ready" (backlog 080 collapsed settle into deliver). Slated for deletion next release. Use when: muscle-memory "polish this", "address PR reviews", "get this merge-ready" — then run /deliver --polish-only. Trigger: /settle, /pr-fix, /pr-polish.
Shape a raw idea into something buildable. Product + technical exploration. Spec, design, critique, plan. Output is a context packet. Use when: "shape this", "write a spec", "design this feature", "plan this", "spec out", "context packet", "technical design". Trigger: /shape, /spec, /plan, /cp.
Final mile from merge-ready branch to shipped: squash-merge, archive backlog tickets with trailers, update touched docs, run /reflect, apply outputs. Assumes /deliver or /deliver --polish-only already made the branch ready. Use when: "ship it", "merge and close out", "final mile", "land and reflect", "finish this ticket". Trigger: /ship.
Turn proven agent-session patterns into first-party Harness Kit skills. Use when: "skillify this conversation", "make this into a skill", "generate a skill from current transcript", "extract reusable workflow". Trigger: /skillify.
Capture agent-session work records as local JSONL audit evidence. Links a backlog/spec, branch, commits, review verdicts, QA/demo evidence, transcript refs, and shipped ref without storing raw private transcripts. Use when: "trace this work", "write work record", "agent session trace", "journal this delivery", "link transcript evidence". Trigger: /trace, /journal.
End-to-end "ship it to the remote": read worktree, classify changes, tidy debris, split semantic commits, and push. Judgment layer over git, not a wrapper; decides what belongs and how reviewers should read the diff. Use when: "yeet", "yeet this", "commit and push", "ship local changes", "tidy and commit", "wrap this up and push", "get this off my machine". Trigger: /yeet, /ship-local (alias).