I like AI systems more when the human side is boring. The less time I spend remembering where a prompt lives, the more time I have for actual judgment.
What lives in the workbench
The stack stays small on purpose:
- Local files for notes, specs, and draft outputs
- One or two model providers, not five
- A visible review step before anything ships
- Simple scripts for repeatable tasks
A narrow workflow
export const workbench = {
  intake: "capture the problem in plain language",
  draft: "let the model propose a first pass",
  review: "check the result against a short rubric",
  ship: "publish only after the review passes",
} as const;
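One way to read those four stages is as a single function where review acts as a gate in front of ship. What follows is a minimal sketch, not how the workbench is actually wired: `runWorkbench`, `RubricItem`, `RunResult`, and the `Drafter` stand-in for a model call are all hypothetical names I made up for illustration.

```typescript
// Hypothetical names throughout; nothing here comes from a real library.
type Stage = "intake" | "draft" | "review" | "ship";

interface RubricItem {
  label: string;
  check: (draft: string) => boolean;
}

interface RunResult {
  reached: Stage;           // the stage where the run stopped
  failed: string[];         // labels of checks that did not pass
  published: string | null; // the draft, only if it shipped
}

// Stand-in for a model call; a real setup would call a provider here.
type Drafter = (problem: string) => string;

function runWorkbench(
  problem: string,
  draftWith: Drafter,
  rubric: RubricItem[],
): RunResult {
  // intake: capture the problem in plain language
  const intake = problem.trim();
  if (intake.length === 0) {
    return { reached: "intake", failed: ["empty problem"], published: null };
  }

  // draft: let the model propose a first pass
  const draft = draftWith(intake);

  // review: check the result against a short rubric
  const failed = rubric
    .filter((item) => !item.check(draft))
    .map((item) => item.label);
  if (failed.length > 0) {
    return { reached: "review", failed, published: null };
  }

  // ship: publish only after the review passes
  return { reached: "ship", failed: [], published: draft };
}

// Usage with a fake drafter and a two-item rubric.
const rubric: RubricItem[] = [
  { label: "non-empty", check: (d) => d.length > 0 },
  { label: "under 200 chars", check: (d) => d.length <= 200 },
];

const shipped = runWorkbench(
  "  Summarize the release notes ",
  (p) => `Draft for: ${p}`,
  rubric,
);
const blocked = runWorkbench("Summarize the release notes", () => "", rubric);
```

The point of returning `failed` instead of throwing is that a blocked run stays readable: I can see which rubric line stopped it without digging through a stack trace.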
Why this stays usable
The workbench is not trying to be an operating system. It is just enough structure to keep the work legible:
- The prompt is visible.
- The schema is visible.
- The output is visible.
- The handoff is visible.
That makes agent work calmer. It also makes debugging far less dramatic, which I appreciate more than I probably should.
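To make "visible" concrete, here is a rough sketch of a run record that keeps the prompt, schema, output, and handoff in one plain object that can be written to a local file. The `RunRecord` shape, its field names, and the `schemaMismatches` helper are assumptions for illustration, and the schema is deliberately tiny (field name to primitive type) rather than a real schema language.

```typescript
// Hypothetical shape; field names are illustrative only.
type FieldType = "string" | "number" | "boolean";

interface RunRecord {
  prompt: string;                    // exact text sent to the model
  schema: Record<string, FieldType>; // fields the output is expected to have
  output: Record<string, unknown>;   // what the model actually returned
  handoff: {
    reviewer: string;
    status: "pending" | "approved" | "rejected";
  };
}

// Lists every way the output disagrees with the schema.
// An empty array means the output matches.
function schemaMismatches(record: RunRecord): string[] {
  const problems: string[] = [];
  for (const [field, expected] of Object.entries(record.schema)) {
    if (!(field in record.output)) {
      problems.push(`missing field: ${field}`);
    } else if (typeof record.output[field] !== expected) {
      problems.push(`wrong type for ${field}: expected ${expected}`);
    }
  }
  return problems;
}

const record: RunRecord = {
  prompt: "Summarize the release notes in one sentence.",
  schema: { summary: "string", confident: "boolean" },
  output: { summary: "Two bug fixes and one new flag.", confident: true },
  handoff: { reviewer: "me", status: "pending" },
};

// Pretty-printed JSON is easy to read and diff in a local file.
const serialized = JSON.stringify(record, null, 2);
const mismatches = schemaMismatches(record);
```

When something goes wrong, the record is the whole story in one place, which is most of what keeps debugging undramatic.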
The real constraint
The hard part is not model choice. It is deciding where human judgment still matters. Once that boundary is clear, the rest becomes a composition problem.
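As a small illustration of that boundary, each step can carry an explicit flag saying whether a person has to be involved, and the composer simply stops when it reaches one. This is only a sketch under assumptions: `Step`, `Progress`, and `runUntilHuman` are hypothetical names, and real steps would do more than transform strings.

```typescript
// Hypothetical names; illustrative only.
interface Step {
  name: string;
  needsHuman: boolean;            // the boundary, written down explicitly
  run: (input: string) => string;
}

interface Progress {
  value: string;                  // result of the automated steps so far
  waitingOn: string | null;       // name of the step that needs a person, if any
}

// Runs steps in order and stops before the first one that needs a person.
function runUntilHuman(steps: Step[], input: string): Progress {
  let value = input;
  for (const step of steps) {
    if (step.needsHuman) {
      return { value, waitingOn: step.name };
    }
    value = step.run(value);
  }
  return { value, waitingOn: null };
}

const steps: Step[] = [
  { name: "normalize", needsHuman: false, run: (s) => s.trim() },
  { name: "draft", needsHuman: false, run: (s) => `draft: ${s}` },
  { name: "approve", needsHuman: true, run: (s) => s },
];

const progress = runUntilHuman(steps, "  fix login bug ");
```

Moving the boundary is then a one-line change to a flag, not a rewrite of the pipeline.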
