See the agent work an invoice
Walk through the agent's architecture and decision trace on four sample invoices, each of which forces a different path.
Try the live demo →
The demo is hosted for free, so give it a moment to wake (30 to 60 seconds on first load). The public demo runs in simulation mode: it replays a scripted decision path, which keeps it free to run and safe to leave open. It's a walkthrough of the agent's architecture and decision trace, not live model reasoning. The real agent runs the live model.
Most "AI" features are retrieval pipelines: the steps are fixed by the designer. This is different. It's an agent. I gave a language model a set of accounting tools and ran it in a loop, and the model itself decides which tool to call next based on what it just learned, iterating toward a goal.
The task is accounts-payable invoice processing. Hand it an invoice and it works out its own path: parse the invoice, validate the vendor, pull and match the purchase order, flag discrepancies, code the GL, then stop at a human-approval checkpoint, because in finance a machine prepares and a person authorizes. Every decision is recorded in an inspectable audit trail.
I designed it and directed Claude Code to build it. It's the agent counterpart to my Construction-Accounting RAG. Together they show two tools for two different problems.
What makes it an agent
The difference between an agent and a fixed pipeline is who decides the order of operations. In a pipeline, the designer hard-codes the sequence. In an agent, the model is given tools and run in a loop, and it decides which tool to call next based on what the last step revealed, iterating toward a goal. Nobody scripts the path. A different invoice produces a different path. That autonomy is the point.
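The loop described above can be sketched in a few lines of Python. This is a minimal illustration, not the project's code: `choose_next_tool` is a hypothetical rule-based stand-in for the language model's decision, and the tool names, invoice fields, and return values are assumptions made for the example.

```python
from typing import Callable, Optional

# Hypothetical tools: each reads the shared state and returns new facts.
def extract_fields(state: dict) -> dict:
    return {"po_number": state["invoice"].get("po_number")}

def pull_po(state: dict) -> dict:
    return {"po": {"number": state["po_number"]}}

def code_gl(state: dict) -> dict:
    return {"gl_account": "6100"}  # placeholder account code

TOOLS: dict[str, Callable[[dict], dict]] = {
    "extract_fields": extract_fields,
    "pull_po": pull_po,
    "code_gl": code_gl,
}

def choose_next_tool(state: dict) -> Optional[str]:
    """Stand-in for the model: pick the next tool from what is known so far."""
    if "po_number" not in state:
        return "extract_fields"
    if state["po_number"] and "po" not in state:
        return "pull_po"
    if not state["po_number"] and "gl_account" not in state:
        return "code_gl"
    return None  # nothing left to do: hand off to a human

def run_agent(invoice: dict, max_steps: int = 10) -> list[str]:
    """Run the loop and return the path of tools that was taken."""
    state: dict = {"invoice": invoice}
    path: list[str] = []
    for _ in range(max_steps):  # hard cap so the loop always terminates
        tool = choose_next_tool(state)
        if tool is None:
            break
        state.update(TOOLS[tool](state))
        path.append(tool)
    return path
```

With these stubs, `run_agent({"po_number": "PO-1"})` takes the path `extract_fields`, `pull_po`, while `run_agent({})` takes `extract_fields`, `code_gl`. The loop is the same in both cases; only what each step reveals changes the route.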
Given a vendor invoice, the agent chooses its tools in whatever order fits the situation:
- Extract the invoice fields.
- Validate the vendor (active? W-9 on file? on hold?).
- Pull a referenced purchase order when the invoice cites one.
- Match invoice against PO and flag any discrepancies.
- Code lines to GL accounts when there's no PO to match against.
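As one example of what a single tool might look like, here is a hedged sketch of the PO-matching step. The `Line` record, the `match_invoice_to_po` name, and the one-cent price tolerance are all assumptions for illustration; the actual tool may compare different fields or apply different rules.

```python
from dataclasses import dataclass
from decimal import Decimal

@dataclass
class Line:
    sku: str
    quantity: int
    unit_price: Decimal

def match_invoice_to_po(invoice_lines, po_lines, price_tolerance=Decimal("0.01")):
    """Return a list of discrepancies; an empty list means a clean match."""
    po_by_sku = {line.sku: line for line in po_lines}
    issues = []
    for inv in invoice_lines:
        po = po_by_sku.get(inv.sku)
        if po is None:
            issues.append(f"{inv.sku}: not on purchase order")
            continue
        if inv.quantity > po.quantity:
            issues.append(f"{inv.sku}: billed {inv.quantity}, ordered {po.quantity}")
        if abs(inv.unit_price - po.unit_price) > price_tolerance:
            issues.append(
                f"{inv.sku}: price {inv.unit_price:.2f} vs PO {po.unit_price:.2f}"
            )
    return issues
```

Returning the discrepancies as data, rather than raising on the first one, gives the loop something to reason about on its next step and gives the reviewer the full list.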
It never posts anything itself. Its final move always hands off to a human-approval checkpoint, mirroring real AP controls. And every step it takes is written to a visible audit trail. That inspectable reasoning is the differentiator, and it comes straight from accounting: in finance, you don't trust a number you can't trace.
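The two controls described above, an append-only trail and a proposal that stays pending until a person signs off, might be modeled roughly like this. The class names, fields, and status strings are hypothetical and chosen only to illustrate the idea.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass(frozen=True)
class AuditEntry:
    step: int
    tool: str
    reasoning: str
    result: str
    at: str  # ISO 8601 timestamp, UTC

@dataclass
class Proposal:
    action: str                       # e.g. "post" or "escalate"
    status: str = "pending_approval"  # the agent never changes this itself
    trail: list = field(default_factory=list)

    def record(self, tool: str, reasoning: str, result: str) -> None:
        """Append one step to the trail; entries are frozen once written."""
        self.trail.append(
            AuditEntry(
                step=len(self.trail) + 1,
                tool=tool,
                reasoning=reasoning,
                result=result,
                at=datetime.now(timezone.utc).isoformat(),
            )
        )

def approve(proposal: Proposal, approver: str) -> Proposal:
    """Only a named human moves a proposal out of the pending state."""
    if not approver:
        raise ValueError("approval requires a named human approver")
    proposal.status = f"approved_by:{approver}"
    return proposal
```

Keeping `approve` outside the agent's tool set is the design choice that matters here: the model can prepare a proposal, but nothing it can call will authorize one.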
Four invoices, four paths
The demo includes four sample invoices. Each one forces the agent down a different path, which is how you can tell it's deciding rather than following a script.
The decision trace
The agent shows its work. Each tool call and the reasoning behind it is laid out step by step, ending in either a proposed posting or an escalation, always pending a human's approval.
RAG vs Agent
This sits next to my Construction-Accounting RAG demo, which answers plain-language questions grounded in a construction-accounting knowledge base. The two solve different problems, and the difference is worth being precise about.
RAG grounds the model's answers in retrieved knowledge. An agent lets the model choose and sequence its own actions in a loop. They compose (an agent can call RAG as one of its tools), but they are not the same thing.
| | RAG | Agent |
|---|---|---|
| Control flow | Fixed by the designer: retrieve, then generate | The model chooses its own next action |
| Best for | Grounded answers from a knowledge base | Multi-step tasks with branching decisions |
| Human role | Reads the answer | Approves the agent's proposed action |
| They compose | — | An agent can call RAG as one of its tools |
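The control-flow row and the composition row can be made concrete with a small sketch. Everything here is a stand-in: the two-entry knowledge base, the keyword `retrieve`, and the `generate` stub (which would be a model call in a real system) are assumptions for illustration, not either demo's implementation.

```python
# Tiny sample knowledge base, invented for this example.
KNOWLEDGE_BASE = {
    "retainage": "Retainage is a portion of payment withheld until the work is complete.",
    "change order": "A change order amends the contract scope or price.",
}

def retrieve(question: str) -> list[str]:
    """Naive keyword retrieval; a real system would likely use embeddings."""
    q = question.lower()
    return [text for term, text in KNOWLEDGE_BASE.items() if term in q]

def generate(question: str, passages: list[str]) -> str:
    """Stand-in for the model call: answer only from retrieved passages."""
    return passages[0] if passages else "No grounded answer found."

def rag_answer(question: str) -> str:
    # Fixed control flow: always retrieve, then generate.
    return generate(question, retrieve(question))

# Composition: an agent's tool registry can list RAG next to its other tools,
# and the agent decides at run time whether to call it.
AGENT_TOOLS = {"ask_knowledge_base": rag_answer}
```

`rag_answer` runs the same two steps for every question, which is the fixed pipeline. Registered in `AGENT_TOOLS`, it becomes one option among several that an agent may or may not choose.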
Two demos, two tools, two different problems. Try them both:
- AP Invoice Agent: the agent (this project).
- Construction-Accounting RAG: the retrieval demo.
Let's talk
If you've got a repetitive, rules-heavy process that still needs a human's judgment at the end, this is the shape of problem an agent fits. Reach out and I'll tell you whether it's a good candidate.
Get in touch →