Voice, the darling of mobile telecommunications service providers and one of their two cash cows alongside SMS (short message service), has been under siege for years from the growth of OTT (over-the-top) service providers. As a channel of communication it has seen little innovation, and its use has been pigeonholed into the peer-to-peer use case.
A key characteristic of the human voice is that it carries a unique signature: every person differs in intonation, accent, pronunciation and other vocal qualities, which makes voice an ideal candidate for authentication and verification. With good growth in the fintech space, driven by the opportunity that financial inclusion for the underbanked presents, it is curious that we have not seen voice biometrics adopted as a service differentiator. Currently, leading fintech products in the microcredit and transaction processing space require access to a personal terminal such as a phone or card to transact.
In March of this year Google opened up its Cloud Speech API, which covers over 80 languages, to third-party developers. This brings it head to head with companies such as Nuance, which has close ties to IBM and has been at the core of many popular voice-driven services for OEMs and car manufacturers.
My imagined application of this technology goes beyond transcription and processing. Mashed up with the audio fingerprinting technology that powers services such as Shazam and SoundHound, voice recognition APIs can change the service experience for money transfer, banking and commerce as we know it.
The simplified step-by-step play would look like this. First, the mobile consumer would be asked to opt in and sign up for the new voice-based identity-as-a-service offering, which promises flexibility and added security when performing critical financial transactions. Part of the onboarding process would be a simple sampling of the user’s voice as they read out a phrase or set of numbers in a language of their choosing. Second, after onboarding, day-to-day conversations would be randomly sampled in clips of no more than 5 seconds each to continuously build out a finely tuned signature that is encrypted in storage. Third, the checkout or fulfilment process can take one of three forms: a stored PIN, a one-time transaction code presented at the point of sale, or a random phrase personalized to the language preferences captured during onboarding.
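To make the second step concrete, here is a minimal sketch of how a signature could be refined from many short samples. It rests on assumptions that are not part of the proposal itself: each clip has already been reduced to a fixed-length feature vector by some speaker-modelling component (not shown), the signature is simply the running mean of those vectors, and `VoiceProfile` and `add_sample` are hypothetical names. Encryption at rest is left out.

```python
from dataclasses import dataclass, field
from typing import List

MAX_SAMPLE_SECONDS = 5.0  # cap on clip length, taken from the text above


@dataclass
class VoiceProfile:
    """Hypothetical per-user signature: a running mean of feature vectors."""

    language: str                      # preference captured at onboarding
    mean: List[float] = field(default_factory=list)
    count: int = 0                     # number of clips folded in so far

    def add_sample(self, features: List[float], duration_s: float) -> bool:
        """Fold one clip into the signature; skip clips longer than the cap."""
        if duration_s > MAX_SAMPLE_SECONDS:
            return False
        if self.count == 0:
            self.mean = list(features)
        else:
            if len(features) != len(self.mean):
                raise ValueError("feature vectors must have the same length")
            n = self.count
            # Incremental mean: no need to store the raw clips themselves.
            self.mean = [(m * n + x) / (n + 1) for m, x in zip(self.mean, features)]
        self.count += 1
        return True
```

Keeping only a running mean means the raw audio can be discarded once each clip has been processed, which fits the idea of very short, randomly chosen samples. A production system would almost certainly use a richer speaker model than an average.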
For the stored PIN, the verification call to the consumer will simply ask them to repeat it. For the one-time code, which is presented on screen at a POS terminal or on the vendor’s mobile phone, the verification call will ask for it to be read out. For the random phrase, the consumer may be asked to repeat a word or number set. The verification call does not need to be received on the consumer’s own handset, so imagine scenarios where your phone is out of charge or has just been stolen, perhaps alongside your wallet or purse. Here is the underpinning value: even if you knew my PIN, one-time access code or the phrase that I have been served, it would be near impossible to replicate my exact voice print, with the nuances of accent and the other qualities mentioned earlier.
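The verification logic described above amounts to two independent checks that must both pass: what was said, and who said it. The sketch below illustrates that decision under stated assumptions: the call audio has already been transcribed and reduced to a feature vector by components not shown here, voices are compared with cosine similarity, and the 0.85 threshold is an illustrative placeholder rather than a tuned value. All function names are hypothetical.

```python
import hmac
import math
from typing import Sequence

SIMILARITY_THRESHOLD = 0.85  # illustrative only; a real system would tune this


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length feature vectors."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def normalize(text: str) -> str:
    """Lower-case and collapse whitespace so spacing does not cause a mismatch."""
    return " ".join(text.lower().split())


def verify(expected: str, transcript: str,
           enrolled: Sequence[float], call: Sequence[float],
           threshold: float = SIMILARITY_THRESHOLD) -> bool:
    """Approve only if the spoken content AND the voice both match."""
    # Works the same for a PIN, a one-time code or a random phrase.
    content_ok = hmac.compare_digest(normalize(expected).encode(),
                                     normalize(transcript).encode())
    voice_ok = cosine_similarity(enrolled, call) >= threshold
    return content_ok and voice_ok
```

Because the handset never enters the decision, the same check works whether the call is taken on the consumer’s own phone or on a vendor’s. Someone who has learned the PIN or code passes the content check but still fails the voice check.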
With the dynamics of authentication and verification streamlined and unshackled from personal terminals and devices, we can now start to re-imagine financial services that are truly universally accessible.