Voice AI and DSARs: When a Caller Asks for the Recording

The call ended two weeks ago. Today, an email lands in your DPO inbox: "I would like a copy of all personal data you hold about me, including any recordings of telephone conversations."

Your voice AI programme has just received a data subject access request. And it is harder to fulfil than any DSAR your compliance team has handled before.

Not because the law is different. GDPR Article 15 is the same right of access that applies to email threads and CRM records: available to any UK or EEA resident, without reason, from any controller processing their data. The difficulty is that voice AI creates personal data in at least five distinct forms, spread across multiple sub-processors, with third-party voices embedded in audio that the requester has no right to receive in full. Mapping all of it, locating it across your data estate, and returning only what the individual is entitled to, all within one month, is an operational challenge most programmes have not engineered for.

This guide gives you the architecture: what is in scope, what the clock looks like, how to handle the third-party problem, and what to build before the first request arrives.

This guide is published by the team behind Dilr Voice — enterprise voice AI deployed across regulated industries in 40+ countries. For the full compliance architecture, see our AI operating model consulting service.

1 month: Statutory response window (extendable to 3 months)

5 types: Distinct personal data categories created by voice AI

£17.5m: Maximum ICO fine for DSAR non-compliance under UK GDPR (or 4% of global annual turnover, if higher)

Sub-processors typically holding voice AI call data

What counts as personal data in a voice AI programme?

The common mistake is treating a DSAR for call data as a request for the audio file. The ICO's position — confirmed in its guidance on automated decision-making and in enforcement notices — is that all data derived from or linked to an identified individual is in scope. For a voice AI call, that means five distinct data types, not one.

Type 1 — Call recording. The raw audio file, typically stored by your telephony layer or a dedicated call recording platform. This is the obvious one. What teams miss is that the recording may exist in multiple formats across multiple systems: the raw .wav file from the telephony platform, a compressed version in your call archive, and potentially a copy retained by a sub-processor for quality or training purposes. All three are in scope.

Type 2 — Transcript. The text rendering of the call, generated by your speech-to-text (STT) engine. Transcripts are the data layer that powers the rest of voice AI — sentiment analysis, search, coaching, CRM update. They also constitute separate personal data in their own right. A transcript of a call in which the data subject discussed a health condition, a financial position, or a family situation is personal data with a high sensitivity level, regardless of whether it is accompanied by the recording.

Type 3 — AI-derived analytical data. This is the category most voice AI programmes have not mapped to their DSAR process. When your agent or your quality-assurance layer analyses a call, it generates derived outputs: sentiment scores (positive/negative/neutral or numeric), emotion labels (calm, frustrated, confused), intent classifications (payment query, complaint, cancellation intent), topic tags, call summary notes, escalation reason codes. Each of these is personal data — it is information about an identified caller, derived from their voice. It does not matter that it is a score or a label rather than a verbatim statement. The ICO's guidance on automated decision-making makes this explicit: outputs of automated processing that relate to an individual are personal data.

Type 4 — System metadata. Call duration, timestamp, CLI (the number the caller called from, or was called on), IVR path taken, agent version that handled the call, escalation triggers fired, disposition codes, QA flags. This data may be held in your telephony CDR (call detail records), your CRM, or your analytics database. It is all in scope.

Type 5 — CRM and downstream updates. Any updates written to the customer's CRM record as a result of the call — notes added by the AI agent, status changes, follow-up tasks created — are in scope as part of the overall personal data held about the individual. The DSAR obligates you to provide a complete picture, not just the raw call artefacts.

Key obligation

Under GDPR Article 15, the data subject is entitled to a copy of all personal data undergoing processing — not just data the controller considers "primary." A voice AI programme that returns only the recording and ignores transcripts, sentiment data, and metadata is providing an incomplete and non-compliant DSAR response.

The third-party voice problem

Call recordings create a specific DSAR complication that email threads and CRM records do not: other people's voices.

In many enterprise voice AI scenarios, calls involve more than one party whose voice is captured. An outbound debt collection call may have a third party present in the household. An inbound customer service call to a utilities company may be handled partly by an AI agent and then transferred to a human agent whose voice is also in the recording. A recruitment screening call may involve two candidates being assessed in sequence. A compliance call at a financial services firm may involve the caller's adviser.

The third-party voices in your recording are personal data belonging to those third parties, not to the requester. You cannot provide that data to the requester without breaching the privacy of the third party, unless the third party has consented or it is reasonable in all the circumstances to disclose it (Data Protection Act 2018, Schedule 2, Part 3, paragraph 16).

In practice, this means your DSAR fulfilment process for voice AI must include a redaction step for recordings that involve third-party voices. The mechanics are:

Identify calls in which voices other than the data subject and automated systems are present.
Assess whether the third party is identifiable (a named agent, a household member identified in the conversation, another customer).
Decide whether disclosure is reasonable — typically it is not, for identifiable third parties.
Redact the third-party audio before providing the recording to the requester.

Transcripts have the same problem. A transcript that contains the name or identifying details of a third party may need to be partially redacted before it can be supplied. Where the third party's contribution to the conversation is so interleaved with the requester's that clean redaction is not possible, you may need to provide a summary rather than the full document.

This redaction step is not optional, and it is not trivial. If your voice AI programme has not built a redaction workflow into its DSAR process, the first time you receive a request will expose that gap.
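
The redaction mechanics can be sketched in a few lines. This is a minimal illustration, not a production tool: it assumes an upstream step has already produced speaker-labelled segments, and the Segment type, the speaker labels, and the DISCLOSABLE set are all hypothetical names. Whether a human agent's voice belongs in the disclosable set is a case-by-case decision this sketch does not make for you.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    speaker: str    # hypothetical diarisation label, e.g. "caller", "ai_agent"
    start: float    # seconds from the start of the recording
    end: float
    text: str = ""  # transcript text for this segment, if available


# Assumption: only the requester and automated systems are disclosable by default.
DISCLOSABLE = frozenset({"caller", "ai_agent"})


def mute_ranges(segments, disclosable=DISCLOSABLE):
    """Return merged (start, end) time ranges whose audio should be muted."""
    ranges = sorted((s.start, s.end) for s in segments if s.speaker not in disclosable)
    merged = []
    for start, end in ranges:
        if merged and start <= merged[-1][1]:
            # Overlapping or touching ranges collapse into one.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def redacted_transcript(segments, disclosable=DISCLOSABLE):
    """Return transcript lines with third-party speech replaced by a marker."""
    lines = []
    for s in sorted(segments, key=lambda seg: seg.start):
        if s.speaker in disclosable:
            lines.append(f"{s.speaker}: {s.text}")
        else:
            lines.append("third_party: [REDACTED: third party]")
    return lines
```

The ranges returned by mute_ranges would be handed to whatever audio tool you use to silence those spans. The transcript function deliberately hides the third party's diarisation label as well as their words.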

AI-derived data: the forgotten scope

Sentiment analysis is the capability that makes voice AI analytically powerful and GDPR-challenging at the same time. When your platform analyses a call and assigns a sentiment score, it is processing the caller's voice to infer their emotional state. That output — the score, the label, the summary — is personal data.

More significantly: if your voice AI platform uses tone-of-voice analysis, pitch analysis, or speech-rhythm analysis to infer emotional state, it may be processing biometric data as defined in GDPR Article 4(14), which becomes special category data under Article 9 where it is used to uniquely identify a person. The Article 9 obligations around biometric voice data are materially higher than for standard personal data: they require an Article 9(2) condition on top of an Article 6 lawful basis and, in many cases, a Data Protection Impact Assessment.

The DSAR implications are the same: AI-derived data is in scope. If your requester has called you twenty times and your system has generated twenty sentiment scores and twenty call summaries, all forty records are in scope for the DSAR. You must locate them, collate them, and include them in your response.

Most voice AI platforms do not make this data easy to export in a per-caller format. Analytics dashboards are designed for aggregate programme management, not individual data subject responses. Before you have a functioning DSAR process, you need to confirm that your platform can:

Query analytical records by caller CLI or customer ID
Export those records in a readable format (JSON, CSV, or a plain-English summary)
Tie them to the correct call records

If it cannot, you have an architecture gap that needs to close before DSARs start arriving.
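
To make the three capabilities concrete, here is a minimal sketch of a per-caller export. It assumes analytical records are available as dictionaries carrying customer_id and call_id keys; those field names and the function name are illustrative, not any platform's actual export API.

```python
import json


def export_analytics_for_caller(records, customer_id):
    """Filter analytical records to one customer and group them by call ID.

    `records` is assumed to be an iterable of dicts that each carry
    "customer_id" and "call_id" keys (hypothetical field names).
    """
    by_call = {}
    for rec in records:
        if rec.get("customer_id") != customer_id:
            continue
        payload = {k: v for k, v in rec.items() if k not in ("customer_id", "call_id")}
        by_call.setdefault(rec["call_id"], []).append(payload)
    # JSON keeps the export readable and ties each record to its call.
    return json.dumps({"customer_id": customer_id, "calls": by_call}, indent=2, sort_keys=True)
```

If your analytics layer cannot produce input in roughly this shape, keyed by customer and call, that is the architecture gap described above.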

The one-month clock

Article 12 of UK GDPR requires a controller to respond to a DSAR without undue delay and in any event within one month of receipt of the request. The month runs from the day the request is received, not from when the controller decides the request is valid or gets round to reviewing it.

You may extend by a further two months (three months total) where the request is complex or you have received a number of requests. But you must notify the data subject of the extension, and the reasons for it, within the first month. Silence is not an option.
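
The date arithmetic is worth automating, because calendar months are not a fixed number of days. The sketch below uses the corresponding calendar date in the following month and falls back to the last day of that month where no such date exists. It does not roll deadlines forward for weekends or public holidays, and the function names are illustrative.

```python
import calendar
from datetime import date


def add_months(d, months):
    """Return the corresponding calendar date `months` later, clamped to month end."""
    month_index = d.month - 1 + months
    year = d.year + month_index // 12
    month = month_index % 12 + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(d.day, last_day))


def dsar_deadlines(received):
    """Key dates for a DSAR whose clock started on `received`."""
    one_month = add_months(received, 1)
    return {
        "respond_by": one_month,
        # Any extension must be notified, with reasons, within the first month.
        "notify_extension_by": one_month,
        "extended_respond_by": add_months(received, 3),
    }
```

A request received on 31 January therefore falls due on the last day of February, not on a non-existent 31st.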

What counts as complex? The ICO accepts complexity where the sheer volume of data, the need for redaction of third-party information, or the involvement of multiple systems requiring manual extraction makes one month genuinely insufficient. A voice AI programme handling thousands of calls per customer over years is likely to generate DSARs that meet this threshold. But you must document your reasoning and communicate clearly.

Identity verification is not a way to buy time. If you need to verify the identity of the requester, or need them to clarify the request, you must ask promptly. ICO guidance allows the time you spend waiting for the requester's reply to be left out of the one-month count, but only if your request for information was genuinely necessary and proportionate, not an attempt to delay.

The practical implication: your DSAR process must start the moment the request is received, not when your DPO gets around to reviewing it. That means automated receipt acknowledgement, immediate logging to a DSAR register, and a triage step to assess complexity and whether identity verification is needed.

For context on how long voice AI retains data in the first place — which directly affects DSAR scope — see our guide on voice AI data retention and GDPR obligations. A retention policy that deletes recordings after 60 days, for example, means a DSAR received on day 90 cannot be fulfilled for that call period — but your programme must still provide all other data types that are still within retention windows.
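
The interaction between retention and DSAR scope can be expressed as a simple check. The retention periods below are placeholders chosen to mirror the 60-day example above, not recommendations, and the data type names are illustrative.

```python
from datetime import date

# Hypothetical retention periods in days, per data type.
RETENTION_DAYS = {
    "recording": 60,
    "transcript": 365,
    "ai_derived": 365,
    "metadata": 730,
    "crm_updates": 2190,
}


def retrievable_types(call_date, request_date, retention_days=RETENTION_DAYS):
    """Return the data types still inside their retention window on the request date."""
    age_days = (request_date - call_date).days
    return sorted(t for t, limit in retention_days.items() if age_days <= limit)
```

For a call made 90 days before the request, the recording drops out of scope under these placeholder periods, but every other data type must still be located and returned.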

Sub-processor disclosure: the hidden complexity

Voice AI call data does not live in one system. In a typical enterprise voice AI deployment, personal data flows through at least three distinct sub-processors:

Typical sub-processor data estate

System: personal data held
Telephony platform: Raw recording, CDR metadata, CLI
STT / transcription engine: Transcript, word-level timestamps, confidence scores
LLM / analytics layer: Sentiment scores, intent labels, call summaries, entity extractions
CRM / case management: Call notes, status updates, follow-up tasks, agent logs
Call archive / compliance store: Long-term recording copies, QA review flags, audit logs

Your DPA (Data Processing Agreement) with each sub-processor should obligate them to assist you in fulfilling data subject rights requests — including providing copies of data they hold about the requester, within a timeframe that lets you meet your one-month window.

In practice, sub-processor DSAR assistance is uneven. Telephony platforms typically have a "download call recordings by date and CLI" function. Analytics platforms may have nothing — their dashboards are built for programme managers, not data subject response. Your DPA might say they will assist, but the mechanism may be a manual data extraction request that takes two weeks of engineering time.

The AI tool inventory that regulators increasingly expect is the starting point for mapping where your voice AI data actually lives. Every system in the inventory needs a corresponding DSAR extraction process documented before DSARs start arriving. Your voice AI architecture for regulated industries must treat DSAR extractability as a first-class design requirement — the same way it treats data residency or encryption at rest.
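
One way to keep that mapping honest is to hold the inventory as data and check it for gaps. The sketch below is illustrative only: the system names, extraction mechanisms, and turnaround figures are invented, and the 10-day target is an assumption you would replace with whatever your DPAs specify.

```python
# Hypothetical inventory entries; every value here is illustrative.
INVENTORY = [
    {"system": "telephony_platform", "holds": ["recording", "cdr_metadata", "cli"],
     "extraction": "api_export", "turnaround_days": 2},
    {"system": "stt_engine", "holds": ["transcript"],
     "extraction": "secure_file_transfer", "turnaround_days": 7},
    {"system": "analytics_layer", "holds": ["sentiment", "intent", "summary"],
     "extraction": None, "turnaround_days": None},
    {"system": "crm", "holds": ["call_notes", "follow_up_tasks"],
     "extraction": "manual_process", "turnaround_days": 14},
]


def extraction_gaps(inventory, max_turnaround_days=10):
    """Return {system: reason} for systems that cannot support the DSAR clock."""
    gaps = {}
    for entry in inventory:
        if not entry["extraction"]:
            gaps[entry["system"]] = "no documented extraction process"
        elif entry["turnaround_days"] is None or entry["turnaround_days"] > max_turnaround_days:
            gaps[entry["system"]] = "turnaround exceeds target"
    return gaps
```

Running a check like this whenever the inventory changes surfaces systems such as the analytics layer before a request exposes them.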

DSAR fulfilment: the process flow

This sequencing assumes sub-processor turnaround of 5–10 days, which requires your DPAs to mandate it and your sub-processors to have built it. Where sub-processor turnaround is longer, the one-month window becomes very tight.

ICO enforcement context

The ICO has been increasingly active on DSAR failures. The right of access is not a formality — it is an enforceable individual right, and complainants can go directly to the ICO when controllers fail to respond within the statutory window or provide incomplete responses.

In 2024 and 2025, the ICO issued formal reprimands and, in some cases, enforcement notices to organisations that either failed to respond to DSARs on time or provided materially incomplete responses. Voice AI programmes are not yet a common feature in enforcement decisions — but only because the technology is still reaching scale. As voice AI becomes standard in financial services, healthcare, and public-sector contact centres, DSAR failures in those programmes will become a live enforcement category.

The specific risks for voice AI programmes are:

Incomplete responses — providing the recording but not the AI-derived data. The ICO's guidance is clear that derived outputs are in scope. An organisation that provides the recording but withholds sentiment scores or call summaries is providing an incomplete response and has not discharged its Article 15 obligation.

Failure to redact — providing third-party voice data to the requester. This is not only a DSAR failure; it is a breach of the third party's personal data rights and may constitute a personal data breach reportable to the ICO under Article 33.

Late responses — exceeding the one-month window without an extension notification. The ICO's enforcement data shows that late responses are the most common DSAR failure and are treated seriously even where the data eventually provided is complete.

No awareness of the request — a DSAR received by a customer-facing team, forwarded internally, and lost in transit before reaching the DPO. A DSAR received anywhere in your organisation triggers the clock. Your staff training and intake process must ensure every employee knows what a DSAR looks like and where to route it immediately.

Building DSAR-ready architecture into your voice AI programme

The organisations that handle voice AI DSARs well are not the ones with the most resourced legal teams — they are the ones that built extractability into their architecture from the start. The checklist below covers the design decisions that make DSAR fulfilment operationally manageable rather than a crisis.

Step 01 — Data inventory per call. Every call should generate a structured record that links all five data types (recording, transcript, AI-derived data, metadata, CRM updates) under a single customer or caller identifier. That identifier should be indexed in every sub-processor system so that data can be retrieved by customer ID, not by hunting through date ranges and call logs. See the voice AI auditability requirements for how audit logging and data indexing interact — DSAR extractability and audit trail are the same infrastructure requirement.
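
A per-call inventory record might look like the sketch below. The class name, field names, and the "system:reference" location format are assumptions made for illustration; the point is only that all five data types hang off one call ID and one customer ID.

```python
from dataclasses import dataclass, field

DATA_TYPES = ("recording", "transcript", "ai_derived", "metadata", "crm_updates")


@dataclass
class CallInventory:
    call_id: str
    customer_id: str
    # Maps each data type to the places it is stored, as "system:reference" strings.
    locations: dict = field(default_factory=dict)

    def add(self, data_type, system, reference):
        if data_type not in DATA_TYPES:
            raise ValueError(f"unknown data type: {data_type}")
        self.locations.setdefault(data_type, []).append(f"{system}:{reference}")

    def missing_types(self):
        """Data types with no recorded location, worth checking before responding."""
        return [t for t in DATA_TYPES if not self.locations.get(t)]


def calls_for_customer(inventories, customer_id):
    """Retrieve every call record for one customer, without scanning date ranges."""
    return [inv for inv in inventories if inv.customer_id == customer_id]
```

A missing type is not automatically a failure, since the data may have aged out of retention, but it should be explained rather than silently omitted from the response.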

Step 02 — Sub-processor DSAR clause. Every DPA with every sub-processor must contain an explicit obligation to assist with DSAR fulfilment within a defined timeframe (recommend: 10 business days from request). It must specify the mechanism — API export, secure file transfer, or a documented manual process — and the format.

Step 03 — Third-party voice detection. If your calls involve any possibility of third-party voices (conference calls, household members, transfers to human agents), your archive process needs a flag for calls that contain identifiable third-party audio. This does not need to be automated — a review step in the DSAR process is sufficient — but it must exist.

Step 04 — Redaction capability. Either a software tool capable of removing audio segments from a recording file (preserving the data subject's voice while muting the third party's), or a documented manual process for producing a written transcript with third-party sections omitted. The transcript-only route is acceptable where redaction of the audio is technically disproportionate — the ICO accepts alternative formats where the primary format is impractical.

Step 05 — DSAR register and intake. A documented process for logging, acknowledging, triaging, and tracking every DSAR, including a mechanism to trigger sub-processor requests immediately on receipt. The register should record: date received, data subject name and CLI, assigned handler, complexity assessment, extension decision (if applicable), sub-processor requests dispatched, redaction review outcome, and response date.
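
The register fields listed in Step 05 map naturally onto a single record type. This is a sketch under assumed names, not a prescribed schema; the outstanding method simply reports which steps have not yet been evidenced.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Optional


@dataclass
class DsarRegisterEntry:
    date_received: date
    subject_name: str
    subject_cli: str
    assigned_handler: Optional[str] = None
    complexity: Optional[str] = None               # e.g. "standard" or "complex"
    extension_notified_on: Optional[date] = None   # set only if an extension is taken
    subprocessor_requests: dict = field(default_factory=dict)  # system -> date dispatched
    redaction_review_outcome: Optional[str] = None
    response_date: Optional[date] = None

    def outstanding(self):
        """Return the steps that have not yet been recorded for this request."""
        checks = [
            ("assign handler", self.assigned_handler is not None),
            ("assess complexity", self.complexity is not None),
            ("dispatch sub-processor requests", bool(self.subprocessor_requests)),
            ("complete redaction review", self.redaction_review_outcome is not None),
            ("send response", self.response_date is not None),
        ]
        return [label for label, done in checks if not done]
```

Because the entry is created at intake with only the receipt date and subject details, every later step shows up as outstanding until someone records it.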

Step 06 — Staff training on recognition. A caller who asks "can I get a copy of my call?" is making a DSAR. A customer who writes to your complaints team asking "what information do you hold about me?" is making a DSAR. Any member of staff who receives that request must know immediately that it is a DSAR and route it to the DPO without delay. Training needs to cover this explicitly for contact centre staff, who are the most common point of first contact.

The consent record crossover

There is a specific DSAR scenario that creates a double obligation: a caller asks for a copy of their data and, in the same request or shortly after, asks whether you have consent to record their calls.

Your consent capture architecture for GDPR-compliant voice AI must maintain a durable, queryable record of consent — not just the audio statement of consent captured on the call, but a database record tied to the caller's identity and call ID. When a DSAR asks about consent, you need to provide:

Evidence that consent was obtained (or a statement of the alternative lawful basis used)
The date and form of consent (e.g., verbal consent captured in call ID #12345, 2026-03-14)
What the consent covered (recording, processing, AI analysis)
Whether consent has since been withdrawn and what action was taken
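
A queryable consent record covering those four items could be as small as the sketch below. The field names and the consent_disclosure helper are hypothetical, and the lawful_basis field allows for cases where consent is not the basis relied on.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional, Tuple


@dataclass(frozen=True)
class ConsentRecord:
    customer_id: str
    call_id: str
    lawful_basis: str                  # "consent", or the alternative basis relied on
    captured_on: date
    form: str                          # e.g. "verbal consent"
    scope: Tuple[str, ...]             # what the consent covered
    withdrawn_on: Optional[date] = None
    withdrawal_action: Optional[str] = None


def consent_disclosure(rec):
    """Assemble the consent evidence to include in a DSAR response."""
    return {
        "lawful_basis": rec.lawful_basis,
        "captured": f"{rec.form}, call ID {rec.call_id}, {rec.captured_on.isoformat()}",
        "scope": list(rec.scope),
        "withdrawn": rec.withdrawn_on.isoformat() if rec.withdrawn_on else "no",
        "withdrawal_action": rec.withdrawal_action,
    }
```

Storing this alongside the audio statement means the question "do you have my consent?" can be answered from a database lookup rather than by replaying recordings.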

Your multi-jurisdiction recording consent architecture will determine which consent records you hold and in what format. If you are operating across UK, EU, and US markets, your consent evidence will differ by jurisdiction — and your DSAR response may need to present it differently for different caller populations.

The specific case of DSARs triggered by automated decisions

If your voice AI agent makes a decision that has a legal or similarly significant effect on the data subject (a credit decision, a claim acceptance or rejection, an account restriction), GDPR Article 22, read with Article 15(1)(h), gives the data subject the right to:

Obtain human review of the decision
Express their point of view
Receive meaningful information about the logic involved

A DSAR that arrives in the context of a disputed automated decision is more complex than a standard call-data request. It requires you to provide not just the call data but the decision logic — the rules, thresholds, or model outputs that resulted in the outcome — and to document that human review is available.

This requirement is why the decision logic of your AI agent, not just the call data it generates, must be auditable and explainable at the per-call level. Where AI agent behaviour influenced a significant outcome, the explanation of that behaviour is part of the DSAR response.
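
As a rough illustration of per-call explainability, the record below captures what produced a decision and renders a plain-English explanation from it. All names are hypothetical, and a real explanation would need to reflect how your agent actually reaches its outcomes.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DecisionRecord:
    call_id: str
    decision: str                  # e.g. "claim_rejected"
    agent_version: str
    logic: str                     # the rule or model that produced the outcome
    score: Optional[float] = None
    threshold: Optional[float] = None
    human_review_available: bool = True


def explain(rec):
    """Render a short explanation suitable for inclusion in a DSAR response."""
    parts = [
        f"Decision '{rec.decision}' on call {rec.call_id} was produced by "
        f"{rec.logic} (agent version {rec.agent_version})."
    ]
    if rec.score is not None and rec.threshold is not None:
        parts.append(f"The score was {rec.score} against a threshold of {rec.threshold}.")
    if rec.human_review_available:
        parts.append("You can request human review of this decision.")
    return " ".join(parts)
```

Writing one such record at decision time is far cheaper than reconstructing the logic weeks later from logs.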

Need to build this into your programme now? Try Dilr Voice with built-in audit logging, book an AI placement diagnostic to map your existing data estate, or read how we approach AI operating model design for regulated industries.


Written by the Dilr.ai engineering team, practitioners who ship enterprise voice AI in regulated industries.
