Voice AI board reporting: the metrics directors want

Your voice AI programme can be working — containment climbing, cost-per-call falling, CSAT holding — and still get a hostile reception in the boardroom. Not because the numbers are bad, but because they are the wrong numbers. The team walks in with the operations dashboard: forty tiles, latency percentiles, a containment-rate sparkline, an intent-recognition heatmap. The board wanted four things on one page, and got none of them. By the third slide a non-executive director has stopped reading and started asking the question nobody prepared for: "Remind me what we are actually getting for this, and what could go wrong?"

Board reporting for an AI programme is a distinct discipline, and most teams never learn it because nobody owns it. The operations lead reports throughput. The CFO's team reports return against the total programme economics of AI voice. Neither produces the upward view a board governs from — value captured against plan, the risk posture in a single colour, regulatory exposure as one defensible line, and the explicit decision the board is being asked to take. This post is the template for that one page: what goes on it, what stays off it, the cadence it runs on, and the failure modes that quietly turn a working programme into a "let's pause and review" agenda item.

This guide comes from the team behind Dilr Voice (enterprise voice AI live in 40+ countries) and is informed by how directors actually read programme reporting. For the governance layer this sits inside, see DATS, our five-stage AI consulting system.

Three audiences, three dashboards — and only one goes to the board

The first mistake is assuming one report serves everyone. It does not. A voice AI programme produces three legitimate views, each with a different reader, a different question, and a different metric set. Confuse them and you either drown the board in operational noise or starve the operations team of the signal it needs to tune the agent.

The operations view answers "is the agent performing right now?" Its reader is the programme lead and the conversation designers. It is dense, real-time, and tactical: containment rate, P95 latency, barge-in handling, intent accuracy, escalation triggers, transcription confidence. This is the layer covered by the enterprise KPI set for AI voice programmes — and it is exactly what should not appear in a board pack.

The finance view answers "is it paying back, and can we defend the number?" Its reader is the CFO's team. It is the attribution model (which savings and revenue are genuinely caused by the agent versus merely coincident with it), built into a defensible credit stack. We cover those mechanics in the voice AI ROI attribution guide. The board cares about the output of this model, value captured against plan, and not its internal workings.

The board view answers a different question entirely: "is this creating enterprise value, is it under control, and what do you need from us?" Its reader is a director who spends ninety seconds on your item and governs a dozen others the same morning. They are not auditing the programme; they are discharging a duty — value capture, risk oversight, and capital allocation. The board view is therefore not a smaller operations dashboard. It is a different artefact, written in the language of value, risk, and decision. Everything else rolls up into it.

The discipline is subtraction. Of the forty things your operations dashboard tracks, perhaps two earn a place on the board page — and only as a one-line "is the service healthy" signal, never as the substance. If a metric does not change a board decision, it does not belong in the board pack. It belongs in the appendix, or nowhere.

The one-page board view: the five things a director actually asks

A good board page resolves five questions before a director has to ask them aloud. Structure the page in this order — value first, because that is what justifies the programme's existence; the ask last, because that is the action the meeting exists to take.

Question 1: What value are we capturing?

Value captured year-to-date in pounds, against the plan the board approved. One number, one comparison, one trend arrow. Not cost-per-call: the board approved a business case, so report against that business case.

Question 2: Is it under control?

Risk posture as a single RAG colour, with the top three risks named and their direction of travel. A board governs risk; give them the posture, not the register.

Question 3: Are we compliant?

Regulatory exposure as one defensible line: which obligations apply, and the readiness status against each deadline. The board needs to know it will not be surprised by a regulator.

Questions 4 and 5: Is it helping the brand, and what do you need from us?

One customer-outcome signal (is the agent improving or eroding experience), and the explicit decision: approve, fund, pause, or note. A board pack with no ask is a status update, not a governance item.

Notice what is absent. No latency chart. No model-version history. No intent taxonomy. Those are the evidence base behind the page, available on request and in the appendix — but the page itself carries only what a director needs to govern. The art of board reporting is holding your nerve on that subtraction when the team that built the programme wants to show its work.

The board metric set: what rolls up onto the page

Below is the metric set that belongs on the board page, mapped to the question it answers and the threshold that turns it red. Each board metric is a roll-up of operational and financial detail — the director sees the headline; the appendix carries the calculation. This is the discriminator between a board view and an operations dashboard: a board metric exists to trigger a decision or a question, and carries an explicit red-flag threshold so the board knows when to act.

Board metric	Answers	Healthy	Red flag (board acts)
Value captured vs plan (£, YTD)	Value	≥ 85% of plan	< 70% of plan for two periods
Net programme cost (£, run-rate)	Value	Within approved envelope	Forecast overrun > 15%
Risk posture (RAG)	Control	Green / stable amber	Any red, or amber worsening
Open high/critical risks (count)	Control	0 critical	≥ 1 critical unmitigated
Material incidents (period)	Control	0 customer-impacting	Any reportable or near-miss escalation
Regulatory readiness (status vs deadlines)	Compliance	On track to every applicable date	Any obligation at risk of being missed
Customer outcome signal (CSAT / complaint trend)	Brand	At or above human baseline	Sustained drop below baseline
Coverage / scale (journeys live vs planned)	Value	On roadmap	Stalled > 1 quarter
The ask (decision requested)	Decision	Clear, single, time-bound	— (absence is the failure)

Nine lines. A director can read that table in under two minutes and know whether to approve, probe, or escalate. The two operational metrics that survive the cut — customer outcome and coverage — survive only because they map directly to brand risk and value realisation, the two things a board is on the hook for. Everything else has been pushed down into the governance framework that sits beneath the board line, where the working group reviews it at operational cadence.
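The value-versus-plan thresholds in the table can be expressed as a small rule. The following is a minimal sketch, not a prescribed implementation: the function name, the "watch" label for the unnamed band between 70% and 85%, and the reading of "two periods" as two consecutive periods are all assumptions made for illustration.

```python
HEALTHY_FLOOR = 0.85  # table: healthy at >= 85% of plan
RED_CEILING = 0.70    # table: red flag below 70% of plan...
RED_PERIODS = 2       # ...for two periods (assumed to mean consecutive)


def value_status(ratios: list[float]) -> str:
    """Classify value captured vs plan from per-period ratios, oldest first.

    Expects at least one period in `ratios`.
    """
    recent = ratios[-RED_PERIODS:]
    if len(recent) == RED_PERIODS and all(r < RED_CEILING for r in recent):
        return "red"
    if ratios[-1] >= HEALTHY_FLOOR:
        return "healthy"
    # Assumption: the table does not name the band between the two
    # thresholds, so this sketch calls it "watch".
    return "watch"


# Illustrative figures used in this post: £2.4m captured against a £2.9m plan.
ratio = 2.4 / 2.9
print(f"{ratio:.0%}", value_status([ratio]))  # prints "83% watch"
```

On those figures the programme sits just under the healthy floor without being a red flag, which is the kind of position a board would expect the narrative to explain.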

Reporting risk: the register a director reads in one colour

Risk is where AI board reporting most often fails, in both directions. Report too little and the first incident looks like negligence — the board was never told the agent could hallucinate a policy figure or mishandle a vulnerable caller. Report too much and the board glazes over a forty-row register and stops engaging with risk at all. The fix is a two-layer structure: a posture on the board page, a register in the appendix.

On the page, risk is a single RAG colour plus the top three risks named, each with an owner, a mitigation status, and a direction-of-travel arrow. That is enough for a board to discharge its oversight duty: it can see the posture, see the worst three things, and see whether they are getting better or worse. A director who wants more turns to the appendix; a director who does not has still governed.
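One way to picture the two-layer structure is as a roll-up function over the appendix register. This is a sketch under stated assumptions: the field names, the severity scale, the example owners, and the colour rules (red for any unmitigated critical risk, amber for any high or critical risk, green otherwise) are hypothetical choices, and a real programme would set its own.

```python
from dataclasses import dataclass

SEVERITY_RANK = {"critical": 3, "high": 2, "medium": 1, "low": 0}


@dataclass
class Risk:
    name: str
    owner: str
    severity: str    # "critical", "high", "medium" or "low"
    mitigated: bool  # True once a tested mitigation is live
    trend: str       # "improving", "stable" or "worsening"


def board_posture(register: list[Risk]) -> tuple[str, list[Risk]]:
    """Roll a full register up into one colour plus the top three risks."""
    # sorted() is stable, so risks of equal severity keep register order.
    ranked = sorted(register, key=lambda r: SEVERITY_RANK[r.severity], reverse=True)
    if any(r.severity == "critical" and not r.mitigated for r in register):
        colour = "red"
    elif any(r.severity in ("critical", "high") for r in register):
        colour = "amber"
    else:
        colour = "green"
    return colour, ranked[:3]


# Hypothetical register entries for illustration only.
register = [
    Risk("Telephony failover", "Head of Platform", "medium", True, "stable"),
    Risk("Vulnerable-caller handling", "Head of Conduct", "high", True, "improving"),
    Risk("Single model vendor", "CTO", "medium", False, "stable"),
    Risk("Transcript retention", "DPO", "low", True, "stable"),
]
colour, top_three = board_posture(register)
print(colour, [r.name for r in top_three])
```

The page shows only the returned colour and three names with their owner, mitigation status, and trend; the full list stays in the appendix.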

Three categories of risk belong in the named top three for almost every voice AI programme:

Conduct and customer-harm risk — the agent giving wrong information, mishandling a vulnerable caller, or failing to escalate. This is the risk that becomes a regulatory and reputational event, so it almost always ranks first.
Continuity and resilience risk — what happens when the agent, a model provider, or a telephony layer fails mid-volume. The board's question is not "will it break" but "is there a tested plan for when it does." That plan is the incident response runbook, and its existence — or absence — is itself a board-level fact.
Concentration and vendor risk — dependence on a single platform or model whose pricing, ownership, or availability you do not control. In a consolidating market, this is a live governance concern, not a hypothetical.

The single most important risk-reporting discipline is the near-miss. A board that only hears about risk when something has already gone wrong is being managed, not informed. Reporting the near-miss — the escalation that caught a bad answer before it reached a customer, the load spike the fallback absorbed — is what earns a board's trust that the green is real. Surprises destroy that trust faster than incidents do.

The regulatory exposure line: one defensible sentence

Boards have become acutely sensitive to AI regulatory exposure, and rightly so — the obligations are now concrete, dated, and enforceable. But a board does not want a compliance lecture; it wants one defensible line that says "here is what applies to us, and here is our readiness against each deadline." The job is to compress a genuinely complex landscape into a status a director can govern.

For a UK or EU enterprise running customer-facing voice AI, the exposure line typically tracks three to four obligations: the EU AI Act obligations for voice AI (transparency and disclosure being the most immediate), the FCA's conduct expectations where the deployment touches financial services, the ICO's data-protection and automated-decision expectations, and any cross-border data-transfer exposure where call data leaves the jurisdiction. Each gets a status — ready, on track, or at risk — and a date. That is the whole line.
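The exposure line is simple enough to sketch as data plus one formatting function. This is an illustrative sketch only: the class and function names are assumptions, the statuses assigned below are invented, and the dates are placeholders rather than actual regulatory deadlines.

```python
from dataclasses import dataclass
from datetime import date

STATUSES = ("ready", "on track", "at risk")  # ordered best to worst


@dataclass
class Obligation:
    name: str
    status: str    # one of STATUSES
    deadline: date


def exposure_line(obligations: list[Obligation]) -> tuple[str, str]:
    """Return (worst status across obligations, one-line summary)."""
    overall = max((o.status for o in obligations), key=STATUSES.index)
    parts = [
        f"{o.name}: {o.status} (due {o.deadline.isoformat()})"
        for o in obligations
    ]
    return overall, "; ".join(parts)


# Placeholder date: NOT a real deadline, substitute your own.
PLACEHOLDER = date(2030, 1, 1)
obligations = [
    Obligation("EU AI Act disclosure", "on track", PLACEHOLDER),
    Obligation("FCA conduct expectations", "ready", PLACEHOLDER),
    Obligation("ICO data protection", "on track", PLACEHOLDER),
    Obligation("Cross-border data transfer", "at risk", PLACEHOLDER),
]
overall, line = exposure_line(obligations)
print(overall)
print(line)
```

Taking the worst status as the overall status mirrors the red-flag rule in the metric table, where any single obligation at risk of being missed is enough for the board to act.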

88% of enterprises use AI, but the board's question is the next number
6% capture material EBIT impact: the value-capture gap the board page must close
2.5× more EBIT for AI leaders vs peers: why the board funds scale, not pilots

The framing matters here. Roughly 88% of enterprises use AI but only about 6% capture material EBIT impact, per McKinsey's State of AI 2025, and AI leaders earn around 2.5× more EBIT than peers, as BCG's analysis of the widening AI value gap sets out. A board reading those numbers is not asking whether to do AI; it is asking whether this programme is on the value-capturing side of that gap. Your one-page view either answers that or invites the doubt.

The board narrative: the 150 words a director actually reads

Every board page needs a narrative — the short paragraph a director reads first and remembers longest. Numbers without narrative get misread; narrative without numbers gets distrusted. The board narrative is not a summary of the page; it is the judgement the page supports, written in plain English and owned by a named executive. Here is the shape it should take.

Sample board narrative · voice AI programme

The voice AI programme is delivering, slightly behind plan. We have captured £2.4m of value year-to-date against a £2.9m plan (83%); the shortfall is timing, not performance, as two journeys launched a quarter late. Risk posture is amber: one open high risk — vulnerable-caller handling on the collections journey — with a tested mitigation live and the residual rating expected to fall to green next period. Regulatory readiness is on track against every applicable deadline, including EU AI Act disclosure. Customer outcome is at or above the human baseline on every live journey, with no material complaints.

The ask: approve extension to three further journeys, which the value model shows recovering the plan gap within two quarters. No new funding is required; the extension fits within the approved envelope.

That is under 150 words. It leads with value, names the one risk that matters and what is being done about it, closes the regulatory question in a sentence, and ends with a clear, single, time-bound ask tied back to the value number. A director can govern from that paragraph alone, which is the point. The table and the RAG posture are there to let them verify it, not to replace it.

Cadence: what goes to the board, and how often

Reporting cadence is a governance design choice, not an administrative one. Report too often and the board micro-manages an operational programme; too rarely and it cannot discharge its oversight duty. The workable pattern for most enterprise voice AI programmes separates three rhythms.

Cadence	Goes to the board	Stays below the line
Monthly / management	Exception report only — nothing unless a board-level threshold is breached (a red risk, a material incident, a missed regulatory date, a value shortfall beyond tolerance).	Full operational and financial dashboards, reviewed by the working group.
Quarterly / board	The full one-page view: value vs plan, RAG posture and top-three risks, regulatory line, customer signal, and the ask.	The appendix — risk register, attribution model, KPI detail — available on request.
Annual / strategic	Programme thesis review: is the value case still true, is the operating model right, what is the next horizon of scale.	Roadmap mechanics and procurement detail.

The discipline that makes this work is the exception report. Between quarterly meetings, the board hears nothing unless a defined threshold is crossed — and when it is, it hears immediately, not at the next scheduled slot. This is what lets a board trust a quarterly cadence on a fast-moving programme: not that nothing changes between meetings, but that they will be told the moment something does. It is the reporting equivalent of the near-miss discipline, applied to time rather than risk.
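The exception rule can be sketched as a check that returns nothing unless a threshold is breached. This is a minimal illustration, assuming a hypothetical monthly snapshot structure; the value tolerance of 70% is borrowed from the red-flag threshold in the metric table and is an assumption, since the tolerance itself is for each board to set.

```python
from dataclasses import dataclass

VALUE_TOLERANCE = 0.70  # assumption: reuse the metric table's red-flag level


@dataclass
class MonthlySnapshot:
    red_risks: int
    material_incidents: int
    missed_regulatory_dates: int
    value_vs_plan: float  # value captured / plan, e.g. 0.83


def exception_report(s: MonthlySnapshot) -> list[str]:
    """List breached board-level thresholds; an empty list means no report."""
    checks = [
        (s.red_risks > 0, "red risk"),
        (s.material_incidents > 0, "material incident"),
        (s.missed_regulatory_dates > 0, "missed regulatory date"),
        (s.value_vs_plan < VALUE_TOLERANCE, "value shortfall beyond tolerance"),
    ]
    return [label for breached, label in checks if breached]


quiet_month = MonthlySnapshot(0, 0, 0, 0.83)
bad_month = MonthlySnapshot(1, 0, 0, 0.65)
print(exception_report(quiet_month))  # prints []
print(exception_report(bad_month))
```

An empty result is the normal case and means the board receives nothing that month; any non-empty result should go out immediately rather than wait for the quarterly slot.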

Building this layer for the first time? You can see the platform that produces the underlying data live, book an AI placement diagnostic to scope where voice AI earns board attention first, or read about our deployment approach for placing AI where the P&L actually moves.

Five board-reporting failures that stall working programmes

The programmes that lose board confidence rarely lose it on performance. They lose it on reporting. Five failure modes recur often enough to name.

Vanity metrics in place of value. Reporting "1.2 million calls handled" instead of "£2.4m captured against £2.9m plan." Volume is an activity number; the board funds outcomes. A big activity number with no value translation reads as effort without result.
No ask. A board pack that ends with a status and no decision wastes the only governance leverage the meeting has. If you are not asking the board to approve, fund, pause, or formally note something, you are reporting to the wrong forum.
Surprise risk. A risk that appears on the page for the first time as a red, with no prior amber and no near-miss history, tells the board your risk reporting is reactive. The first time the board hears about a risk should almost never be the period it materialised.
Regulatory hand-waving. "We are monitoring the regulatory landscape" is not a status; it is an admission that you do not have one. Boards now expect named obligations, dated deadlines, and a readiness colour against each.
The operations dashboard in disguise. Forty tiles shrunk to fit one slide is not a board page; it is an unreadable operations page. The board cannot govern from latency percentiles, and presenting them signals you have not done the work of deciding what matters at board level.

Each of these is a translation failure, not a performance failure — which is why they are so frustrating and so fixable. The programme is working; the reporting is speaking the wrong language. The one-page view exists precisely to force the translation: value, control, compliance, brand, decision. If your board pack cannot be reduced to those five lines, the problem is the pack, not the programme.

Frequently asked questions

How is a voice AI board report different from the operational KPI dashboard?

The operational dashboard answers "is the agent performing right now" for the programme team — containment, latency, intent accuracy. The board report answers "is this creating enterprise value, is it under control, and what do you need from us" for directors. The board view is a one-page upward report built on value captured against plan, risk posture as a single RAG colour, regulatory readiness, a customer-outcome signal, and an explicit decision. Operational metrics roll up into it but do not appear on it.

What metrics should be on a voice AI board page?

Nine at most: value captured vs plan, net programme cost vs envelope, risk posture (RAG), open high/critical risks, material incidents, regulatory readiness against deadlines, a customer-outcome signal, coverage against roadmap, and the decision being requested. Each carries a red-flag threshold so the board knows when to act. Anything that does not change a board decision belongs in the appendix.

How often should a voice AI programme report to the board?

Quarterly for the full one-page view, with a monthly exception report that escalates only when a board-level threshold is breached — a red risk, a material incident, a missed regulatory date, or a value shortfall beyond tolerance — and an annual strategic review of the programme thesis. The exception discipline is what makes a quarterly cadence safe on a fast-moving programme.

How should regulatory exposure be reported to the board?

As one defensible line: the specific obligations that apply (typically EU AI Act disclosure, FCA conduct expectations where relevant, ICO data-protection and automated-decision expectations, and cross-border transfer exposure), each with a readiness status and a date. Boards expect named obligations and dated deadlines, not "we are monitoring the landscape."

Written by the Dilr.ai engineering team: practitioners who ship enterprise AI in production and sit in the rooms where it gets governed.