The three hard metrics that actually matter
AI ROI doesn’t live on marketing multipliers. It lives on three numbers any finance lead understands:
- Time saved per unit of work. How long a human took to draft the email, classify the ticket or summarise the case, and how long it takes now — including human review when there is one. The average is measured across hundreds of cases, not the Friday afternoon demo.
- Cost per unit of work. Fully loaded cost of the process before and after: prorated salary, AI licences, infrastructure, maintenance and review hours. If cost per unit drops while volume stays the same or grows, the business case holds.
- Error rate. Share of outputs that need correction, retraction or rework. A faster AI with a higher error rate than the manual process isn’t a saving: it’s hidden debt that surfaces later as rework.
These three are measured before you start (the famous baseline) and re-measured at weeks four, eight and twelve. No baseline, no ROI: just opinion.
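To make the three numbers concrete, here is a minimal sketch of how a baseline and a later checkpoint could be compared. The `Measurement` class, its field names and all figures are our own illustrative assumptions, not data from a real project; adapt the units to whatever your process counts.

```python
from dataclasses import dataclass


@dataclass
class Measurement:
    """One measurement window for a process (baseline or a later checkpoint)."""
    units: int             # units of work completed (emails, tickets, cases)
    total_minutes: float   # human time spent, including review of AI output
    total_cost: float      # fully loaded cost for the window, in euros
    units_reworked: int    # outputs that needed correction, retraction or rework

    @property
    def minutes_per_unit(self) -> float:
        return self.total_minutes / self.units

    @property
    def cost_per_unit(self) -> float:
        return self.total_cost / self.units

    @property
    def error_rate(self) -> float:
        return self.units_reworked / self.units


def compare(baseline: Measurement, checkpoint: Measurement) -> dict[str, float]:
    """Change in each hard metric; positive values mean an improvement."""
    return {
        "minutes_saved_per_unit": baseline.minutes_per_unit - checkpoint.minutes_per_unit,
        "cost_saved_per_unit": baseline.cost_per_unit - checkpoint.cost_per_unit,
        "error_rate_reduction": baseline.error_rate - checkpoint.error_rate,
    }


# Hypothetical figures for illustration only.
baseline = Measurement(units=500, total_minutes=6000, total_cost=5000, units_reworked=40)
week_12 = Measurement(units=500, total_minutes=2000, total_cost=3000, units_reworked=25)

result = compare(baseline, week_12)
print(result)
```

Running the same comparison at weeks four, eight and twelve gives a trend instead of a single snapshot. A negative `error_rate_reduction` is the warning sign described above: faster, but producing more rework.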
The soft metrics that don’t show up in the spreadsheet
Some improvements don’t fit in an Excel column but move the business anyway. Worth listing them explicitly so the case isn’t lopsided:
- Consistency. The agent answers with the same quality at 2 pm and at 2 am, in low season and high. Human variance drops.
- Customer response speed. Going from twenty-four hours to thirty seconds changes how the service feels, even if the conversation lands in the same place.
- Freed-up team capacity. The hours AI returns aren’t always billable, but they get reinvested into higher-value work — consultative selling, process improvement, premium service. You have to decide explicitly where they go.
- Team satisfaction. Removing the repetitive part of the job lowers attrition. Hard to quantify in the short term but it shows up in internal surveys and in the twelve-month retention curve.
How to build the business case without inflating ROI
An honest business case survives production and the first finance review. Three rules we recommend:
- Count the full cost, not just implementation. Model, integrations, observability, monthly maintenance, updates when the provider changes the model and internal governance hours. A project that only counts the upfront phase shows spectacular ROI in year one and pain from year two onwards.
- Use ranges, not points. Instead of “€120,000 savings/year”, use “€80,000 to €140,000/year depending on adoption”. It’s honest and, paradoxically, more credible in front of the steering committee.
- Attribute carefully. If the process was already going to improve for other reasons — a new CRM, recent training, a shift change — don’t attribute the whole delta to AI. The clean portion is the one that holds up under scrutiny.
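The three rules can be combined in a few lines of arithmetic. The sketch below reuses the €80,000 to €140,000 range from the example above; the cost breakdown, the 75% attribution share and the `roi_range` helper are hypothetical values we chose for illustration.

```python
def roi_range(saving_low: float, saving_high: float,
              ai_share: float, annual_cost: float) -> tuple[float, float]:
    """Net annual ROI as a (low, high) pair of ratios.

    ai_share is the fraction of the saving attributed to AI rather than to
    other changes (new CRM, training, shift change); it is an estimate.
    """
    if not 0.0 <= ai_share <= 1.0:
        raise ValueError("ai_share must be between 0 and 1")
    if annual_cost <= 0:
        raise ValueError("annual_cost must be positive")
    low = (saving_low * ai_share - annual_cost) / annual_cost
    high = (saving_high * ai_share - annual_cost) / annual_cost
    return low, high


# Rule 1: full annual cost, not just implementation (hypothetical figures, in euros).
annual_cost_items = {
    "model_and_licences": 18_000,
    "integrations": 10_000,
    "observability": 4_000,
    "maintenance": 9_000,
    "provider_model_updates": 5_000,
    "governance_hours": 4_000,
}
annual_cost = sum(annual_cost_items.values())

# Rule 2: a range instead of a point. Rule 3: only the attributable share counts.
low, high = roi_range(80_000, 140_000, ai_share=0.75, annual_cost=annual_cost)
print(f"Net ROI between {low:.0%} and {high:.0%} on {annual_cost:,} EUR/year")
```

Presenting the pair rather than the midpoint lets the steering committee see how much of the result depends on adoption, and the attribution share gives finance a single assumption to challenge.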
Common mistakes when measuring AI ROI
The four failures we see most, in order of severity:
- Not counting maintenance. The model changes, the API is updated, integrations break. If maintenance cost isn’t in the case, year-two ROI will disappoint.
- Ignoring human-in-the-loop (HITL) cost. If a person reviews every output, that time is real cost. Add it in, even if it can be reduced later as the agent earns trust.
- Crediting all savings to AI. When the agent ships alongside a redesigned process, part of the saving comes from the redesign, not the model. Splitting the two avoids disappointment.
- Measuring only at three months. Ninety-day ROI captures the learning curve but not the structural cost. We ask for a twelve-month checkpoint at minimum before talking about mature ROI.
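A rough way to see how much the first, second and fourth mistakes distort the picture is to compute cumulative net value with and without the recurring costs, at month three and at month twelve. This is a deliberately simple sketch: the `cumulative_net` helper and every figure are hypothetical, and it assumes a flat monthly saving with no ramp-up.

```python
def cumulative_net(months: int, upfront_cost: float, monthly_gross_saving: float,
                   monthly_maintenance: float = 0.0,
                   review_hours_per_month: float = 0.0,
                   reviewer_hourly_cost: float = 0.0) -> float:
    """Cumulative net value after `months`, in euros.

    Recurring costs (maintenance and human review) are subtracted from the
    gross saving each month; the upfront implementation cost is paid once.
    """
    monthly_review = review_hours_per_month * reviewer_hourly_cost
    monthly_net = monthly_gross_saving - monthly_maintenance - monthly_review
    return months * monthly_net - upfront_cost


# Hypothetical project: 30,000 upfront, 10,000 gross saving per month.
recurring = dict(monthly_maintenance=1_500,
                 review_hours_per_month=40,
                 reviewer_hourly_cost=50)

naive_12 = cumulative_net(12, 30_000, 10_000)               # ignores recurring costs
full_3 = cumulative_net(3, 30_000, 10_000, **recurring)     # ninety-day view
full_12 = cumulative_net(12, 30_000, 10_000, **recurring)   # twelve-month view

print(f"Naive 12-month net: {naive_12:,.0f}")
print(f"Full-cost 3-month net: {full_3:,.0f}")
print(f"Full-cost 12-month net: {full_12:,.0f}")
print(f"Overstatement from ignoring recurring costs: {naive_12 - full_12:,.0f}")
```

With these assumed figures the project is still underwater at ninety days and positive at twelve months, while the version without maintenance and review overstates the result by a wide margin. The share of the gross saving that belongs to a process redesign would be removed separately before this calculation.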
The goal isn’t to present the highest number possible. It’s to present the number that holds when the project has been running for a year and someone reopens the spreadsheet.