PAN & KYC Verification Over Voice: Secure Architecture for Indian Banks

Collecting a PAN over a voice call takes four seconds. Securing that collection against recording leaks, AI pipeline exposure, regulatory violations, and replay attacks requires an architecture that most voice AI vendors have not built.

The gap is specific: when a customer speaks their PAN aloud, the audio enters the speech-to-text (STT) pipeline, passes through intent classification and entity extraction, gets logged in conversation transcripts, and is retained in call recordings. That is five exposure points for a 10-character alphanumeric string that links to the customer’s entire tax and financial identity. DTMF capture at the telecom network layer eliminates all five.

This guide walks through the full security architecture for voice-based PAN verification and KYC data collection, maps each component to RBI, DPDP Act, Aadhaar Act, and IT Act requirements, and provides a capability framework for evaluating vendors.

Why voice-based KYC is harder than it looks

Banks run KYC across two distinct journeys.

Inbound: A customer calls to update their address, and the agent needs to verify identity before making changes.
Outbound: The bank calls customers approaching re-KYC deadlines (every two years for high-risk, every eight years for medium-risk, and every 10 years for low-risk customers, per the RBI Master Direction on KYC) to collect updated documents.

Both journeys require the same verification steps: confirm the customer’s identity, validate their PAN against NSDL records, and capture consent under the DPDP Act. The difference is operational context. Inbound calls have a motivated customer who initiated contact. Outbound calls reach customers who may be distracted, suspicious, or unaware of the re-KYC requirement.

A voicebot handling either journey needs three capabilities that most platforms treat as separate features rather than an integrated security architecture:

Secure data capture: How sensitive numbers enter the system
Real-time verification: How those numbers get validated against government databases
Audit-ready logging: How every step gets recorded for RBI inspection

The five-layer security architecture

A production-grade voice KYC system operates across five layers. Each layer addresses a specific attack surface.

Layer 1: Network-layer DTMF capture

When the customer presses digits on their keypad, the DTMF signal travels through the telecom network. A licensed operator captures these tones at the network infrastructure layer before they reach the AI processing pipeline. The STT engine receives masked tones, not the original DTMF signals, so the customer’s PAN never enters the speech recognition system, the transcript, or the call recording.

Layer 2: Encrypted API verification

The captured PAN travels over TLS 1.3 to the NSDL verification API, which returns the PAN holder’s name and validity status in under 200 milliseconds. For Aadhaar-based verification, an OTP is triggered via the UIDAI API and the customer enters it via DTMF.
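
A cheap structural check before the API call can reject obviously malformed input without spending a verification request. The sketch below assumes the standard PAN shape (five letters, four digits, one letter); the helper names `is_well_formed_pan` and `mask_pan` are hypothetical and not part of any NSDL interface. A well-formed string is not necessarily a valid PAN; only the verification API can establish that.

```python
import re

# Structural shape of a PAN: five letters, four digits, one letter.
_PAN_PATTERN = re.compile(r"[A-Z]{5}[0-9]{4}[A-Z]")


def is_well_formed_pan(value: str) -> bool:
    """Return True if value has the 10-character PAN shape (hypothetical helper)."""
    return _PAN_PATTERN.fullmatch(value.strip().upper()) is not None


def mask_pan(value: str) -> str:
    """Keep only the last four characters, for display or logging."""
    return "X" * (len(value) - 4) + value[-4:]
```

Only values that pass the shape check would go on to the encrypted API call, and anything written to logs would use the masked form.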

Layer 3: Consent capture

The DPDP Act requires explicit, informed consent before processing. The voicebot plays a consent disclosure, records the customer’s response, and timestamps the event against the DLT ledger.
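
The consent step can be modelled as a timestamped record plus a gate that stops the flow when consent is missing. This is a minimal sketch under stated assumptions: `ConsentRecord`, `record_consent`, `require_consent`, and the accepted responses ("yes" or the keypad digit "1") are all hypothetical, and writing the timestamp to a DLT ledger is out of scope here.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class ConsentRecord:
    session_id: str
    disclosure_version: str   # which disclosure text was played
    affirmative: bool
    recorded_at: datetime     # UTC timestamp of the response


def record_consent(session_id: str, disclosure_version: str,
                   customer_response: str) -> ConsentRecord:
    """Build a consent record; accepted responses are an assumption."""
    affirmative = customer_response.strip().lower() in {"yes", "1"}
    return ConsentRecord(session_id, disclosure_version, affirmative,
                         datetime.now(timezone.utc))


def require_consent(record: ConsentRecord) -> None:
    """Raise before any personal data is processed without consent."""
    if not record.affirmative:
        raise PermissionError("No affirmative consent recorded")
```

Calling `require_consent` before the capture and verification steps keeps the ordering explicit: no consent, no processing.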

Layer 4: Temporary storage with auto-purge

Section 29 of the Aadhaar Act prohibits permanent storage of Aadhaar numbers outside an Aadhaar Data Vault. The architecture uses an encrypted temporary cache (e.g., Redis with TTL) for the duration of the verification transaction, then purges it. Only the verification result and a tokenised reference persist.
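
The store-then-purge pattern can be sketched with an in-memory stand-in for the cache. Everything here is illustrative: `TemporaryVault` is a hypothetical class, it performs no encryption, and a production system would rely on the key expiry of its cache rather than this dictionary. The injectable `clock` parameter exists only so expiry can be tested without waiting.

```python
import secrets
import time


class TemporaryVault:
    """In-memory stand-in for an encrypted cache with a per-entry TTL."""

    def __init__(self, ttl_seconds: float, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # token -> (expires_at, value)

    def put(self, value: str) -> str:
        """Store a raw value and return an opaque token that references it."""
        token = secrets.token_hex(16)
        self._entries[token] = (self._clock() + self._ttl, value)
        return token

    def get(self, token: str):
        """Return the value, or None if it is unknown or has expired."""
        entry = self._entries.get(token)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._entries[token]  # lazy purge on expiry
            return None
        return value

    def purge(self, token: str) -> bool:
        """Explicitly delete a value once verification has completed."""
        return self._entries.pop(token, None) is not None
```

The token, not the raw number, is what the KYC record would keep. Calling `purge` as soon as verification returns shortens the exposure window further than waiting for the TTL.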

Layer 5: Encrypted audit log

Every event (DTMF timestamp, API call/response, consent, data purge) is written to an encrypted, append-only audit log. This log is RBI-auditable without exposing raw PAN or Aadhaar data.
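
One common way to make an append-only log tamper-evident is hash chaining, where each entry commits to the hash of the previous one. The sketch below uses that technique as an assumption; the article does not specify how the log is protected, and encryption is omitted. The function names and event fields are hypothetical, and `detail` is expected to hold only tokenised references, never raw PAN or Aadhaar values.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # placeholder "previous hash" for the first entry


def _digest(body: dict) -> str:
    """SHA-256 over a canonical JSON encoding of the entry body."""
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode("utf-8")).hexdigest()


def append_event(log: list, event_type: str, detail: dict) -> dict:
    """Append an entry that commits to the hash of the previous entry."""
    prev_hash = log[-1]["hash"] if log else GENESIS_HASH
    body = {"seq": len(log), "type": event_type, "detail": detail, "prev": prev_hash}
    entry = dict(body, hash=_digest(body))
    log.append(entry)
    return entry


def verify_chain(log: list) -> bool:
    """Recompute every hash; any edited or reordered entry breaks the chain."""
    prev_hash = GENESIS_HASH
    for seq, entry in enumerate(log):
        body = {key: entry[key] for key in ("seq", "type", "detail", "prev")}
        if entry["seq"] != seq or entry["prev"] != prev_hash or entry["hash"] != _digest(body):
            return False
        prev_hash = entry["hash"]
    return True
```

An auditor can run `verify_chain` over an exported log to confirm that no entry was altered after the fact, without needing access to the underlying identity data.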

Architecture flow: Customer call → DTMF capture (network layer) → encrypted API call (NSDL/UIDAI) → KYC database write (tokenised) → encrypted audit log

DTMF masking: why the capture layer matters

The security difference between DTMF capture and voice capture is not incremental. It is architectural.

When a customer speaks their PAN, the audio passes through every stage of the voice AI pipeline: STT transcription, intent classification, entity extraction, and response generation. At each stage, the raw PAN string exists in memory and logs, and potentially in training data. Call recordings also retain the spoken PAN.

In contrast, DTMF keypad entry is intercepted at the telecom network layer. DTMF masking replaces the original tones with flat audio before the stream reaches the AI pipeline. The STT engine receives silence or a beep; the conversation transcript says “[masked input]”. The call recording contains no recoverable PAN data.
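
The split between what the AI pipeline sees and what the verification service receives can be modelled at the data level. This is only an illustration of the data flow: real masking operates on the audio stream at the network layer, and the `Segment` type and both function names are hypothetical.

```python
from dataclasses import dataclass

MASK_TEXT = "[masked input]"


@dataclass(frozen=True)
class Segment:
    kind: str      # "speech" or "dtmf"
    payload: str   # recognised text, or the keyed input


def downstream_view(segments):
    """What the AI pipeline and transcript see: DTMF payloads are masked."""
    return [
        Segment(s.kind, MASK_TEXT) if s.kind == "dtmf" else s
        for s in segments
    ]


def secure_channel(segments):
    """What the verification service receives: only the keyed input."""
    return "".join(s.payload for s in segments if s.kind == "dtmf")
```

The point of the split is that the two consumers never share a representation: the transcript path cannot leak a value it was never given.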

This is compliance-critical. The same principle underlies PCI DSS scope reduction for card data: sensitive data must never enter systems that process voice or store recordings. An AI platform that processes a spoken PAN cannot achieve this kind of scope reduction, regardless of downstream encryption.

Platforms relying on third-party carriers cannot intercept DTMF at the network layer because they do not control the telecom infrastructure.

Real-time verification: NSDL and UIDAI integration

A voicebot that collects a PAN but verifies it hours later in batch introduces fraud risk. Real-time verification closes this window.

PAN verification via NSDL: The real-time API returns the PAN holder’s name, last update date, and validity in under 200 milliseconds. The voicebot confirms the name while the customer is still on the call and flags mismatches immediately.
Aadhaar verification via UIDAI: Voice-only Aadhaar verification is not approved. The customer instead enters an OTP via DTMF, which is validated through UIDAI Authentication API 2.5. The system receives and verifies the OTP within five minutes.
Voice biometric authentication: Proven at national banks, but best used as an additional layer alongside DTMF and OTP. Modern voice biometrics achieve 95–99% accuracy but are susceptible to deepfake voice attacks, so apply them as one factor in multi-factor authentication.
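
The on-call decision after a PAN lookup reduces to two checks: is the PAN valid, and does the returned name match the name on record. The sketch below assumes a response dictionary with `status` and `name` fields; that shape is hypothetical and does not reflect the actual NSDL response schema. Exact matching after normalisation is also a simplification, since real deployments often need tolerance for initials and spelling variants.

```python
def normalise_name(name: str) -> str:
    """Upper-case and collapse whitespace so trivial differences do not count."""
    return " ".join(name.upper().split())


def evaluate_verification(response: dict, name_on_record: str) -> str:
    """Map a (hypothetical) verification response to an on-call outcome."""
    if response.get("status") != "VALID":
        return "invalid_pan"
    if normalise_name(response.get("name", "")) != normalise_name(name_on_record):
        return "name_mismatch"
    return "verified"
```

A `name_mismatch` outcome is what the voicebot would flag while the customer is still on the line, rather than discovering it in a later batch run.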

Compliance mapping: four regulatory frameworks

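Voice-based KYC in Indian banking intersects four regulatory frameworks: the RBI Master Directions, the DPDP Act, the Aadhaar Act, and the IT Act. The mapping below connects platform controls to each requirement and adds the TRAI rules that govern outbound calls.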

Regulatory framework	Requirement	Platform control
RBI Master Direction on KYC (updated June 2025)	Customer consent recorded audibly and securely, in auditable and alteration-proof manner	Inline voice consent capture with DLT ledger timestamp
RBI Master Direction on KYC	Re-KYC cycles: 2 years (high-risk), 8 years (medium), 10 years (low)	Outbound campaign automation with re-KYC schedule triggers
RBI Master Direction on IT Governance (April 2024)	Board-approved BCP/DR with half-yearly DR drills	Geographic failover with documented RTO/RPO
DPDP Act 2023	Explicit, informed consent before processing personal data	Consent disclosure playback with affirmative response recording
DPDP Act 2023	Right to erasure; 48-hour advance notice before deletion	Auto-purge with customer notification workflow
DPDP Act 2023	Full compliance deadline: 13 May 2027	Consent manager service with audit trail
Aadhaar Act, Section 29	No permanent storage of Aadhaar numbers outside Aadhaar Data Vault	Temporary cache (TTL) with tokenisation; raw Aadhaar purged post-verification
Aadhaar Act, Section 29	Core biometric info cannot be shared for any reason	Voice biometric voiceprints stored separately from Aadhaar data; no biometric data sent to UIDAI
IT Act, Section 43A	Reasonable security practices for sensitive personal data	AES-256 encryption at rest, TLS 1.3 in transit, role-based access control
TRAI TCCCPR 2018	Outbound calls on 140/160-series logged on the DLT platform	Automatic DLT logging with consent and call outcome
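
The re-KYC schedule trigger in the table is date arithmetic over the risk-based cycles. A minimal sketch, assuming the cycle is counted from the date of the last completed KYC; the function names and the idea of a look-ahead window in days are illustrative choices, not something the RBI direction prescribes.

```python
from datetime import date

# Re-KYC periodicity in years, by customer risk category.
REKYC_YEARS = {"high": 2, "medium": 8, "low": 10}


def add_years(d: date, years: int) -> date:
    """Add whole years, falling back to 28 February for leap-day dates."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:  # 29 February in a non-leap target year
        return d.replace(year=d.year + years, day=28)


def next_rekyc_due(last_kyc: date, risk: str) -> date:
    """Date on which the next re-KYC falls due for this risk category."""
    return add_years(last_kyc, REKYC_YEARS[risk])


def is_due_within(last_kyc: date, risk: str, today: date, window_days: int) -> bool:
    """True if re-KYC is overdue or falls due within the look-ahead window."""
    return (next_rekyc_due(last_kyc, risk) - today).days <= window_days
```

An outbound campaign could run `is_due_within` nightly over the customer base and queue calls for everyone inside the window.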

In FY 2024–25, RBI imposed approximately ₹54.78 crore in penalties across 353 regulated entities for KYC/AML violations — the top violation category.

Reusable authentication modules: one build, two journeys

Technical debt builds up when inbound and outbound flows are separate. Reusable modules fix this.

Authentication orchestrator: A single service handles authentication sequences for any voice journey. It determines verification level and invokes the required modules.

PAN verification module: Accepts DTMF input, applies masking, calls NSDL API, returns result. No code duplication.
Aadhaar OTP module: Triggers OTP via UIDAI API, captures OTP via DTMF, validates, and auto-purges Aadhaar as required.
Voice biometric module: Captures/enrolls voiceprint, compares live voice with stored print, returns confidence score and liveness result.

Example workflows:

Inbound: Customer calls → IVR routing → authentication orchestrator → PAN module (DTMF) → NSDL verification → account access.
Outbound: Auto dialer connects → customer answers → authentication orchestrator → voice biometric (quick identity confirm) → PAN module (if transaction requires) → Aadhaar OTP (if high-value) → re-KYC completion.

The orchestrator maintains session state in a distributed cache and logs every event. Adding new verification methods (face biometrics, digital signature, etc.) becomes modular and scalable.
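
The orchestrator pattern described above can be sketched as a planner that picks modules per journey and a runner that stops at the first failure. This is an illustration under stated assumptions: the class, the module keys, and the `needs_pan` and `high_value` flags are hypothetical, each module is reduced to a callable returning pass or fail, and the event list stands in for the distributed cache and audit log.

```python
from typing import Callable, Dict, List, Tuple

Module = Callable[[dict], bool]  # takes session state, returns pass/fail


class AuthOrchestrator:
    def __init__(self, modules: Dict[str, Module]) -> None:
        self._modules = modules
        self.events: List[Tuple[str, str, bool]] = []  # (session id, module, passed)

    def plan(self, journey: str, needs_pan: bool = True,
             high_value: bool = False) -> List[str]:
        """Choose the module sequence for a journey."""
        if journey == "inbound":
            return ["pan"]
        if journey == "outbound":
            steps = ["voice_biometric"]
            if needs_pan:
                steps.append("pan")
            if high_value:
                steps.append("aadhaar_otp")
            return steps
        raise ValueError(f"unknown journey: {journey}")

    def run(self, journey: str, session: dict, needs_pan: bool = True,
            high_value: bool = False) -> bool:
        """Run each planned module in order, logging every result."""
        for name in self.plan(journey, needs_pan, high_value):
            passed = self._modules[name](session)
            self.events.append((session["id"], name, passed))
            if not passed:
                return False  # stop at the first failed factor
        return True
```

Because both journeys draw from the same module table, adding a new verification method means registering one more callable and extending `plan`, not duplicating a flow.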

Vendor capability framework

When evaluating voice AI platforms for KYC automation, seven capabilities determine compliance readiness:

Capability	What to verify	Why it matters
DTMF network-layer capture	Does the vendor process DTMF at the telecom network layer, or does the AI pipeline handle it?	Network-layer capture prevents PAN/Aadhaar data from entering STT, transcripts, and recordings
Licensed telecom operator status	Does the vendor hold a UL-VNO or equivalent, or rely on third-party SIP trunking?	Only licensed operators control the network layer for DTMF masking
Real-time API integration	Sub-200ms NSDL/UIDAI API calls during live conversation, or batch verification?	Batch verification creates exposure windows for fraud
Section 29 compliance	Temporary storage with auto-purge, or persistent Aadhaar storage?	Permanent Aadhaar storage violates the Aadhaar Act
Consent capture inline	Voice consent recorded and timestamped during call, or separate consent workflow?	DPDP Act requires explicit consent before processing
Reusable module architecture	Same authentication modules for both journey types?	Separate flows create control drift and audit complexity
Audit log completeness	Every DTMF, API call, consent, and purge event logged?	Incomplete logs fail RBI inspection

A platform that scores “yes” across all seven is architecturally ready for national-scale voice-based KYC. A platform relying on voice recognition for sensitive capture or third-party telecom infrastructure creates gaps no amount of downstream encryption can close.

Sources

NSDL e-Gov PAN verification API documentation
UIDAI Aadhaar Authentication API 2.5 (Revision 1, January 2022)
RBI Master Direction on KYC (updated June 2025)
RBI Master Direction on IT Governance, Risk, Controls, and Assurance Practices (April 2024)
Digital Personal Data Protection Act 2023
Aadhaar (Targeted Delivery of Financial and Other Subsidies, Benefits and Services) Act 2016, Section 29
Information Technology Act 2000, Section 43A
TRAI TCCCPR 2018 regulations
RBI enforcement action data FY 2024–25
ICICI Bank voice biometric deployment data
Indian Bank mobile voice biometric integration
PCI DSS v4.0 scope reduction requirements for DTMF masking

Shiva Tripathi

Shiva is Head of Digital Marketing & Developer Network at Exotel, a growing community of builders working with voice, messaging, and AI-powered communication APIs. He has spent 13+ years helping B2B SaaS companies grow through data-driven marketing, and today he's equally focused on helping developers discover, adopt, and get more out of Exotel's platform. He writes about developer ecosystems, voice AI trends, and what it takes to build great CX infrastructure.
