On 12 May 2026, PolyAI announced a Toronto technology hub, four senior executive hires, and a new headline number: 200+ enterprise customers, thousands of live deployments, 75 languages, 25 countries. Within the same 48 hours, Vapi closed a $50M Series B at a $500M valuation on the back of winning Amazon Ring against 40 rival vendors. Two scale-narrative events, one news window — and most of the coverage read them as vendor milestones.
That is the wrong frame for a buyer. If you are evaluating enterprise voice AI on a 36-month contract, the question is no longer "can the vendor handle scale?" — every credible vendor in this category can now point to live volume. The question is which scaling model survives the contract: capacity scaling, orchestration scaling, or platform scaling. PolyAI just told you which model it is investing in. So did Vapi. Knowing how to read the signal is what separates a procurement decision from a press-release purchase.
This post is the buyer's framework. Not vendor PR.
This guide is shipped by the team behind Dilr Voice — enterprise voice AI live in 40+ countries on a shared platform stack. For pre-procurement evaluation, see our DATS five-stage methodology.
Vendor scale evidence is a necessary condition, not the procurement decision. What matters on a 36-month contract is which scaling model the vendor uses to grow, and whether that model fits how your organisation actually consumes voice AI. Capacity scaling (PolyAI), orchestration scaling (Vapi), and platform scaling (DILR.AI's approach) each carry different contract risks, different switching costs, and different EBIT profiles. The same 200+ customer count means different things under each model.
A reader at this stage usually has one of three voice AI buying motions in flight — a CX modernisation programme, an outbound automation play, or a regulated-industry compliance build. The scaling model question lands differently in each, but the framework is shared. The reference point worth keeping open is our voice AI procurement framework reading the May 2026 vendor map, which puts the cap-table moves into a four-axis grid. For buyers anchoring on cost, the companion piece on voice AI TCO and the hidden enterprise costs is the right next read — total ownership cost behaves very differently under each scaling model.
What PolyAI's Canada move actually tells you
Read the announcement as an operator. PolyAI did three things in one press release. First, it hired senior executives in customer success, AI delivery, and engineering — functions that scale linearly with deployed accounts. Second, it opened a Toronto hub of ~20 people focused on agent design, AI deployments, and business development. Third, it surfaced its current customer base: 200+ enterprises, thousands of live deployments, 75 languages, 25 countries (up from 100+ customers, 2,000+ deployments, 45 languages at the Series D close).
This is the textbook fingerprint of capacity scaling: more customers require more agent designers, more AI engineers, and more delivery managers, in new time zones. The model works: PolyAI was named fastest-growing AI company in Europe by the FT, and revenue grew ~10× in a year per the company's own filings (PolyAI press release, 12 May 2026). But capacity scaling has a procurement consequence buyers should price in: as the vendor's customer base grows, each individual account's share of senior-staff attention falls. That matters when an outage hits, when a regulator request lands, or when you need a new language live for a market launch.
The 200+ enterprise count is the headline, but the deployment-to-customer ratio is what you actually want to model. Thousands of deployments across 200+ customers implies roughly 10–20 live agents per logo, if "thousands" is read as 2,000 to 4,000. That is consistent with PolyAI's bespoke posture: multiple voice agents per brand, each tuned to a specific contact-centre workflow. Compare that against an orchestrator like Vapi, which reports 1B+ calls handled and 1–5M calls per day across its enterprise base. The volume per logo is structurally higher because the model is API-thin (TechCrunch, Vapi $500M valuation, 12 May 2026).
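The per-logo arithmetic above can be sketched in a few lines. This is a minimal illustration, not a model of PolyAI's actual book: the 200 customer figure is the floor of the reported "200+", and the 2,000 and 4,000 deployment bounds are assumptions standing in for "thousands". The function name `agents_per_logo` is hypothetical.

```python
def agents_per_logo(deployments: int, customers: int) -> float:
    """Average number of live deployments per customer (logo)."""
    if customers <= 0:
        raise ValueError("customers must be positive")
    return deployments / customers


CUSTOMERS = 200           # floor of the reported "200+" enterprise customers
LOW_DEPLOYMENTS = 2_000   # assumption: lower reading of "thousands"
HIGH_DEPLOYMENTS = 4_000  # assumption: upper reading of "thousands"

low = agents_per_logo(LOW_DEPLOYMENTS, CUSTOMERS)
high = agents_per_logo(HIGH_DEPLOYMENTS, CUSTOMERS)
print(f"Roughly {low:.0f} to {high:.0f} live agents per logo")
```

Because the customer count is reported as a floor, the true ratio may sit below this range. The point is the order of magnitude, not the exact figure.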
The same logic shapes how DATS thinks about deployment sequencing. Our AI placement diagnostic — a fixed-fee assessment used before any procurement commitment — explicitly maps deployment-to-customer density into a vendor risk score. A buyer evaluating PolyAI today should expect different responsiveness than a buyer who signed at customer #20.
The three scaling models, side by side
For a deeper read on the orchestration-vs-platform split, see our breakdown of voice AI orchestration vs platform architecture — it covers how each model behaves under change requests, latency budgets, and contract renegotiation.
How buyers should read scale signals on a 36-month contract
Here is the framework. Every voice AI vendor announcement maps to one of the scaling models in the table below: the three core models (capacity, orchestration, platform) plus a full-stack vertical variant. The model dictates the procurement risk profile. The customer count, valuation, and language coverage are inputs, but they only become decision-relevant once you know which model the vendor is running.
| Scaling model | Vendor archetype | Scale evidence | Portability | 36-month contract risk |
|---|---|---|---|---|
| Capacity | PolyAI | 200+ customers, hubs, headcount growth | Low — bespoke build per logo | Senior-staff dilution as base grows |
| Orchestration | Vapi, Bland, Retell | 1B+ calls, 1–5M/day, API throughput | Medium — components swap, glue stays | Buyer carries the integration burden |
| Full-stack vertical | ElevenLabs (voice agents arm) | Owned STT + LLM + TTS + telephony | Low — proprietary at every layer | Lock-in across the full call stack |
| Platform | DILR.AI | Shared stack across Voice, Studio, Academy, DATS | High — one stack, four surfaces | Feature-parity tradeoffs vs bespoke |
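
For teams that keep their shortlist in a script or notebook, the table can be encoded as a small lookup. This is a sketch only: the field values are copied from the table above, while the `ScalingModel` class, the `MODELS` dictionary keys, and the `risk_to_price_in` helper are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScalingModel:
    name: str
    archetype: str
    portability: str    # "low", "medium", or "high", as in the table
    contract_risk: str  # the 36-month contract risk column


# Values transcribed from the table; the dictionary keys are illustrative.
MODELS = {
    "capacity": ScalingModel(
        "Capacity", "PolyAI", "low",
        "Senior-staff dilution as base grows"),
    "orchestration": ScalingModel(
        "Orchestration", "Vapi, Bland, Retell", "medium",
        "Buyer carries the integration burden"),
    "full_stack_vertical": ScalingModel(
        "Full-stack vertical", "ElevenLabs (voice agents arm)", "low",
        "Lock-in across the full call stack"),
    "platform": ScalingModel(
        "Platform", "DILR.AI", "high",
        "Feature-parity tradeoffs vs bespoke"),
}


def risk_to_price_in(model_key: str) -> str:
    """Return a one-line summary of the contract risk for a scaling model."""
    model = MODELS[model_key.strip().lower()]
    return (f"{model.name} ({model.archetype}): "
            f"portability {model.portability}; risk: {model.contract_risk}")


print(risk_to_price_in("capacity"))
```

Tagging each vendor on a shortlist with one of these keys keeps the model question in front of the customer count and valuation figures.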
Capacity scaling: when it fits, and when it bites
Capacity scaling fits buyers who want a vendor-managed outcome — the team designs your agents, tunes them per campaign, and owns the production loop. PolyAI's hospitality and banking case studies sit here. The bite comes at year two: the vendor's customer count has doubled, your account is no longer in the top-20 cohort, and the senior staff who closed your deal are now on three other accounts. This is not a flaw in the model — it is the model's growth tax. Buyers should write it into the contract via named-staff continuity clauses, escalation SLAs that tighten over time, and a contractual right to a senior delivery review every six months.
Orchestration scaling: when it fits, and when it bites
Orchestration scaling fits buyers with strong internal engineering who want to own the conversational logic and treat the vendor as an API surface. Amazon Ring choosing Vapi over 40 rivals is the proof point. The bite is that "best-of-breed APIs" still requires somebody to integrate them — and when the LLM, the TTS, and the telephony layer each have their own pricing model, latency budget, and compliance posture, the buyer is doing the systems integration work. Our piece on what enterprises should take from the Vapi–Amazon Ring deal unpacks the four-gate methodology Ring used to evaluate this tradeoff.
Platform scaling: the DILR.AI argument
Platform scaling — the model we run at DILR.AI — assumes voice AI is one expression of a shared infrastructure stack that also produces content (Studio), training (Academy), and consulting outputs (DATS). The 200+ customer signal becomes less decisive because the unit economics live at the stack layer, not the customer-success layer. Buyers get portability across surfaces: a voice agent's brand-voice profile can feed a Studio asset library; a transcript can feed an Academy module. The honest tradeoff is less bespoke depth per logo than PolyAI offers at the top end — and we tell buyers that in the diagnostic, not after the contract is signed. If you want the architecture comparison in detail, our Dilr Voice product page and the broader DILR.AI approach page walk through the trade matrix. Buyers running a head-to-head should also pilot Dilr Voice agents against their incumbent for a single workflow before committing.
What this means for the next 36 months
Two scaling narratives landed in one 48-hour news window. The investor read is "category-maturing." The buyer read should be "category-bifurcating." Vendors have made structural choices about how they will grow over the contract life you are signing into. PolyAI has chosen capacity. Vapi has chosen orchestration. ElevenLabs has chosen full-stack control. DILR.AI has chosen platform. The signals will keep coming: funding rounds, headcount, language coverage, valuation marks. Read every one of them through the model question first.
For deeper context on how valuation moves themselves carry buyer risk, see our analysis of voice AI funding signals and what buyers should infer from them, and for the related buyer-side de-risking around platform consolidation, our piece on voice AI vendor consolidation risk. If you want a 30-minute conversation about how your current shortlist maps onto these three models, book a scoping call with the operators — we will tell you whether DILR.AI fits, and where it does not.
Sign the right scaling model — not the loudest one.
We will map your voice AI shortlist onto the three scaling models and tell you where each one breaks.
Written by the Dilr.ai engineering team, practitioners who ship enterprise AI in production.