AI-native craft

The seven best practices

There's a small set of habits that separates an AI agent you can trust from one that quietly breaks. In an agent-run company, the right answer to "which of these should we use?" is all of them. We adopt each one across our own agents and flows, and we teach it here, because we don't hoard the toolbox.

Each practice has two faces: adopt it in your own work, and learn it as a craft. The lessons land one by one; cards marked "Lesson soon" are on the way.

01 · Evals

Test your agent

A small set of "golden" cases you re-run after every prompt change or model swap — so quality is measured, not eyeballed. The biggest gap most people have.

Adopt: golden set for the classifier & drafts, run on change. Learn: "How to test an AI agent (without being a programmer)."
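
A minimal sketch of what a golden set can look like, assuming the agent is a mail classifier that returns one label per message. The labels, the example messages, and the `keyword_classifier` stand-in are all hypothetical; in practice you would pass in the function that calls your real agent.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class GoldenCase:
    text: str      # input handed to the agent
    expected: str  # the answer a human has signed off on


# Hypothetical golden set; real cases would come from your own inbox.
GOLDEN_SET = [
    GoldenCase("Please send me the invoice for March", "billing"),
    GoldenCase("I can't log in to my account", "support"),
    GoldenCase("Do you offer a team discount?", "sales"),
]


def run_evals(classify: Callable[[str], str], cases: list[GoldenCase]) -> dict:
    """Run every golden case and report the pass count plus each miss."""
    failures = []
    for case in cases:
        got = classify(case.text)
        if got != case.expected:
            failures.append({"text": case.text, "expected": case.expected, "got": got})
    passed = len(cases) - len(failures)
    return {"passed": passed, "total": len(cases), "failures": failures}


def keyword_classifier(text: str) -> str:
    """Stand-in for the real agent call, so the sketch runs offline."""
    lowered = text.lower()
    if "invoice" in lowered:
        return "billing"
    if "log in" in lowered:
        return "support"
    return "other"


report = run_evals(keyword_classifier, GOLDEN_SET)
print(f"{report['passed']}/{report['total']} golden cases passed")
for miss in report["failures"]:
    print(f"MISS: {miss['text']!r} expected {miss['expected']}, got {miss['got']}")
```

Re-running the same script after a prompt change or model swap turns "did it get worse?" into a number you can compare.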

Lesson soon
02 · Observability

See what it actually did

Every run writes one structured row — input, output, cost, latency, error — to a board you can read. No more guessing what the agent was up to.

Adopt: a shared run-log every flow calls. Learn: "See what your agent actually did."
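
One way to sketch the shared run-log, assuming a JSON Lines file stands in for the board and that the agent returns its output together with a cost figure. The `logged_run` helper, the field names, and the `shout` and `broken` stand-in agents are illustrative, not a fixed schema.

```python
import json
import tempfile
import time
from pathlib import Path
from typing import Callable


def logged_run(flow: str, agent: Callable[[str], tuple[str, float]],
               user_input: str, log_path: Path) -> str:
    """Call the agent and append exactly one structured row, success or failure."""
    started = time.perf_counter()
    row = {"flow": flow, "input": user_input, "output": None,
           "cost_usd": None, "latency_ms": None, "error": None}
    try:
        output, cost_usd = agent(user_input)
        row["output"] = output
        row["cost_usd"] = cost_usd
        return output
    except Exception as exc:
        row["error"] = f"{type(exc).__name__}: {exc}"
        raise  # the caller still sees the failure
    finally:
        # Runs on both paths, so a crashed run is logged too.
        row["latency_ms"] = round((time.perf_counter() - started) * 1000, 2)
        with log_path.open("a", encoding="utf-8") as fh:
            fh.write(json.dumps(row) + "\n")


def shout(text: str) -> tuple[str, float]:
    """Stand-in agent: returns an output and a made-up cost."""
    return text.upper(), 0.0004


def broken(text: str) -> tuple[str, float]:
    """Stand-in agent that always fails."""
    raise ValueError("model timed out")


with tempfile.TemporaryDirectory() as tmp:
    log_path = Path(tmp) / "runs.jsonl"
    logged_run("demo", shout, "hello", log_path)
    try:
        logged_run("demo", broken, "hello", log_path)
    except ValueError:
        pass  # the row was written before the error propagated
    rows = [json.loads(line) for line in log_path.read_text(encoding="utf-8").splitlines()]

for row in rows:
    print(row)
```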

Lesson soon
03 · Dry-run by default

Let it rehearse before it acts

Anything that touches the real world — mail, money, publishing — rehearses first behind a guard and a kill-switch. The default, not the exception.

Adopt: the dry-run + guard + kill-switch pattern, retrofitted. Learn: "Let it rehearse before it acts."
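
A rough sketch of the dry-run, guard, and kill-switch pattern for outgoing mail. Everything named here is an assumption made for illustration: the `Outbox` class, the `AGENT_KILL_SWITCH` environment variable, and the domain allow-list used as the guard. The point is the order of the checks and that `live` defaults to off.

```python
import os
from dataclasses import dataclass, field

KILL_SWITCH_ENV = "AGENT_KILL_SWITCH"  # hypothetical variable name


@dataclass
class Outbox:
    live: bool = False  # dry-run unless someone explicitly switches it on
    allowed_domains: frozenset = frozenset({"example.com"})
    sent: list = field(default_factory=list)
    rehearsed: list = field(default_factory=list)

    def send(self, to: str, subject: str) -> str:
        # 1. Kill-switch: one setting stops every real-world action.
        if os.environ.get(KILL_SWITCH_ENV) == "1":
            return "blocked: kill-switch"
        # 2. Guard: refuse recipients outside the allow-list.
        if to.rsplit("@", 1)[-1] not in self.allowed_domains:
            return "blocked: guard"
        # 3. Dry-run: record what would have happened, touch nothing.
        if not self.live:
            self.rehearsed.append((to, subject))
            return "dry-run"
        self.sent.append((to, subject))  # a real mail call would go here
        return "sent"


os.environ.pop(KILL_SWITCH_ENV, None)  # start from a known state

rehearsal = Outbox()
r1 = rehearsal.send("ana@example.com", "Hello")
r2 = rehearsal.send("bob@elsewhere.org", "Hello")

live = Outbox(live=True)
r3 = live.send("ana@example.com", "Hello")

os.environ[KILL_SWITCH_ENV] = "1"
r4 = live.send("ana@example.com", "Hello again")
os.environ.pop(KILL_SWITCH_ENV, None)

print(r1, r2, r3, r4, sep=" | ")
```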

Lesson soon
04 · Cost guardrails

Keep it from running up a bill

A token budget, a cap per run, and graceful back-off when you're throttled. The same instinct as the "Models, limits & cost" lesson, baked into your flows.

Adopt: a standard circuit-breaker snippet. Learn: "Keep your agent from running up a bill."
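
A possible shape for such a snippet, assuming token counts are estimated before each call and that the model client signals throttling with an exception. The class names, the numbers, and the `flaky_model_call` stand-in are hypothetical; the back-off is plain exponential doubling.

```python
import time
from typing import Callable


class BudgetExceeded(RuntimeError):
    """Raised when a run would break the per-run cap or the overall budget."""


class Throttled(RuntimeError):
    """Stand-in for whatever your model client raises when rate-limited."""


class TokenBudget:
    def __init__(self, total_tokens: int, per_run_cap: int) -> None:
        self.remaining = total_tokens
        self.per_run_cap = per_run_cap

    def charge(self, tokens: int) -> None:
        """Reserve tokens for one run, or trip the breaker before any money is spent."""
        if tokens > self.per_run_cap:
            raise BudgetExceeded(f"run wants {tokens} tokens, cap is {self.per_run_cap}")
        if tokens > self.remaining:
            raise BudgetExceeded(f"run wants {tokens} tokens, only {self.remaining} left")
        self.remaining -= tokens


def call_with_backoff(call: Callable[[], str], retries: int = 3,
                      base_delay: float = 0.5,
                      sleep: Callable[[float], None] = time.sleep) -> str:
    """Retry a throttled call, doubling the wait each time, then give up."""
    for attempt in range(retries + 1):
        try:
            return call()
        except Throttled:
            if attempt == retries:
                raise
            sleep(base_delay * 2 ** attempt)
    raise AssertionError("unreachable")


attempts = {"n": 0}


def flaky_model_call() -> str:
    """Stand-in that is throttled twice, then succeeds."""
    attempts["n"] += 1
    if attempts["n"] <= 2:
        raise Throttled("rate limited")
    return "ok"


delays = []  # record the waits instead of actually sleeping
budget = TokenBudget(total_tokens=1000, per_run_cap=400)
budget.charge(300)
result = call_with_backoff(flaky_model_call, sleep=delays.append)

tripped = False
try:
    budget.charge(500)  # over the per-run cap
except BudgetExceeded:
    tripped = True

print(result, delays, budget.remaining, tripped)
```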

Related lesson →
05 · Least privilege

The smallest key that works

Scoped tokens per use, rotated on a schedule, and one registry of who and what can reach what. Critical the moment you add roles or other people's mailboxes.

Adopt: a capability registry + scheduled rotation. Learn: "Give the agent the smallest key that works."
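
A small sketch of a capability registry, assuming each agent holds one grant with a set of scopes and an issue date. The agent names, the scope strings such as `mail:read`, and the 90-day rotation window are placeholders chosen for the example.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class Grant:
    agent: str
    scopes: frozenset       # the only things this agent may do
    issued: date            # when its token was last rotated
    max_age_days: int = 90  # hypothetical rotation schedule

    def needs_rotation(self, today: date) -> bool:
        return today - self.issued > timedelta(days=self.max_age_days)


# One place that answers "who can reach what".
REGISTRY = {
    "mail-classifier": Grant("mail-classifier", frozenset({"mail:read"}), date(2024, 1, 10)),
    "draft-writer": Grant("draft-writer", frozenset({"mail:read", "mail:draft"}), date(2024, 5, 1)),
}


def is_allowed(agent: str, scope: str) -> bool:
    """Deny by default: unknown agents and unlisted scopes get nothing."""
    grant = REGISTRY.get(agent)
    return grant is not None and scope in grant.scopes


def due_for_rotation(today: date) -> list:
    """Names of agents whose token is older than its rotation window."""
    return sorted(name for name, grant in REGISTRY.items() if grant.needs_rotation(today))


today = date(2024, 6, 1)
print(is_allowed("mail-classifier", "mail:read"))
print(is_allowed("mail-classifier", "mail:send"))
print(due_for_rotation(today))
```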

Lesson soon
06 · Decision log

Write down the why

Short "why we chose this" notes on the bigger calls. They survive handoffs — and survive you. The record that stops a team re-litigating the same question.

Adopt: a docs/decisions/ folder with a tiny template. Learn: "Write down the why, not just the what."
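
To make "tiny template" concrete, here is one guess at what it could hold, written as a small helper that renders a note and proposes a file name. The three headings, the `.md` extension, the date-prefixed file name, and the example decision are all assumptions; only the docs/decisions/ folder comes from the practice itself.

```python
import re
from datetime import date

# Hypothetical template: context, decision, and the reasoning behind it.
TEMPLATE = """# {title}

Date: {day}

## Context
{context}

## Decision
{decision}

## Why
{why}
"""


def new_decision(title: str, context: str, decision: str, why: str, day: date) -> tuple[str, str]:
    """Return (path, body) for a decision note; writing the file is left to the caller."""
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")
    path = f"docs/decisions/{day.isoformat()}-{slug}.md"
    body = TEMPLATE.format(title=title, day=day.isoformat(),
                           context=context, decision=decision, why=why)
    return path, body


path, body = new_decision(
    title="Use JSONL for the run-log",
    context="Every flow needs to write one row per run.",
    decision="Append rows to a JSON Lines file.",
    why="Readable without tooling and easy to diff.",
    day=date(2024, 6, 1),
)
print(path)
print(body)
```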

Lesson soon
07 · Versioned prompts

Treat instructions like code

Prompts and knowledge live in git: diffable, reviewable, roll-back-able. A bad prompt becomes a one-line revert instead of a mystery.

Adopt: prompts/KB in git with a rollback runbook. Learn: "Treat your instructions like code."
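
Git handles the diffing and the revert; what the agent code can add is a way to tell which prompt version a given run used. A sketch under that assumption: load the prompt from a tracked file and attach a short content fingerprint to each run. The `load_prompt` helper and the file name are hypothetical, and the temporary folder stands in for a prompts directory under version control.

```python
import hashlib
import tempfile
from pathlib import Path


def load_prompt(path: Path) -> tuple[str, str]:
    """Read a prompt file and return (text, 12-character content fingerprint)."""
    text = path.read_text(encoding="utf-8")
    fingerprint = hashlib.sha256(text.encode("utf-8")).hexdigest()[:12]
    return text, fingerprint


with tempfile.TemporaryDirectory() as tmp:
    prompt_file = Path(tmp) / "classifier.txt"  # stand-in for a file tracked in git

    prompt_file.write_text("Label each mail as billing, support, or sales.\n", encoding="utf-8")
    _, fp_good = load_prompt(prompt_file)

    # A risky edit lands...
    prompt_file.write_text("Label each mail however seems best.\n", encoding="utf-8")
    _, fp_bad = load_prompt(prompt_file)

    # ...and is rolled back (in a real repo, a git revert of that commit).
    prompt_file.write_text("Label each mail as billing, support, or sales.\n", encoding="utf-8")
    _, fp_restored = load_prompt(prompt_file)

print(fp_good, fp_bad, fp_restored)
```

Because the fingerprint depends only on the file's content, a reverted prompt gets its old fingerprint back, so rows logged before and after the bad edit line up again.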

Lesson soon
These seven are the spine of the AI-native craft track. They pair with "why we do this" and the model-management instinct in "Models, limits & cost."