The Agent Promise vs. the Agent Reality

Every AI agent demo looks the same. The agent gets a prompt, calls a tool, returns a result. The audience applauds. Then someone tries to build a real workflow (one that handles messy PDFs, evaluates whether the extraction is trustworthy, generates a report when it is, and flags a human when it isn't) and the demo falls apart. The hard part of agent-base