Four LLM-agent guardrails: surface assumptions, prefer simplicity, make surgical changes, and drive by verifiable goals. Reference for scope, simplicity, assumptions, or success-criteria judgment calls. Use when: "am I overcomplicating this", "what are my assumptions", "how do I verify success", "/karpathy", "/principles". Trigger: /karpathy, /principles.
Four LLM-agent guardrails: surface assumptions, prefer simplicity, make surgical changes, and drive by verifiable goals. Reference for scope, simplicity, assumptions, or success-criteria judgment calls. Use when: "am I overcomplicating this", "what are my assumptions", "how do I verify success", "/karpathy", "/principles". Trigger: /karpathy, /principles.
Workflow role
Support primitive in the Harness Kit operating loop.
Source contract preview
This generated excerpt gives readers the beginning of the live primitive contract before they jump to GitHub.
Four principles for avoiding the specific LLM-agent failure modes
Karpathy has repeatedly called out on Twitter/X: silent
assumption-picking, premature abstraction, scope creep into adjacent
code, and vague success criteria. Use as a self-check before acting,
or dispatch when an agent seems to be drifting.
**Tradeoff.** These bias toward caution over speed. For mechanical
tasks (renames, find/replace, dep bumps) they're overhead — use
judgment. The threshold is "does this need a judgment call?", same
as for delegation.
## Delegation Floor
This is a reference/self-check skill, so reading it does not satisfy the roster
floor. When the underlying task is substantive, use the caller's Delegation
Floor and commission specialized lanes for assumptions, simplicity, surgical
scope, and verification risk as needed. Direct solo is fine for mechanical
tasks where these guidelines are only a quick checklist.
Local lane guidance: If dispatched, use one narrowly scoped critic lane against
the live artifact and the relevant principle; do not turn the four principles
into a standing review bench.
## 1. Think before coding
**Don't assume. Don't hide confusion. Surface tradeoffs.**
Before implementing:
- State your assumptions explicitly. If uncertain, ask.
- If multiple interpretations exist, present them — don't pick
silently.
- If a simpler approach exists, say so. Push back when warranted.
- If something is unclear, stop. Name what's confusing. Ask.
The failure this prevents: the model picks one interpretation of a
vague request, runs three-hundred lines, and the user only
discovers the mismatch after reviewing the diff. Surface the fork
*before* the work, not after.
**Example.** Ticket: "add validation to the signup form."
Wrong: silently pick "validate email format + password length,"
...
What to verify
The source file exists and carries valid frontmatter.
The primitive has one generated reference page.
Claims about behavior can be traced back to the linked source.