Over the last year, access to artificial intelligence (AI) has increased substantially, putting world-altering technologies in the palms of more people's hands. With widespread consumer adoption and nearly 92 percent of leading businesses investing in AI, it has never been more important for high-quality voice recognition technology to keep personal information secure. This is especially true considering that, since 2010, the number of AI incidents and controversies has increased more than 26-fold, according to the AIAAIC database, which tracks incidents related to the ethical misuse of AI.
Voice biometrics have long been trusted to authenticate individuals based on their unique vocal characteristics, such as when calling a bank or insurance company. But the advancement of synthetic voice generation, combined with the accessibility of open-source applications, has put the integrity of voice authentication systems, and people's privacy, in jeopardy, making it more critical than ever to safeguard voice biometrics technologies.
Synthetic Voice Evolution
Text-to-speech (TTS) technology has evolved significantly in the last ten years. TTS models were once daunting and costly to create, and easy to identify by their robotic tones. With recent advancements, it is now far simpler to create custom voices that closely resemble human speech, complete with expression, emotion, and versatility. While this is exciting and has many positive, far-reaching implications for the customer service industry, it has also opened the door to misuse, raising concerns about data privacy and cybersecurity.
Disinformation and Security Concerns
Scammers are now able to impersonate individuals with remarkable accuracy, putting people's privacy and security at risk. A deepfake can be created from just a short recording of a person's voice, and it will likely be able to bypass traditional voice biometric systems.
This not only poses a significant security risk but also undermines trust in all forms of media, from television news to social media. Notable examples include viral deepfakes of Tom Cruise, Ukrainian President Volodymyr Zelenskyy, and many U.S. political figures.
As the line between reality and artificial intelligence blurs, deepfakes across media platforms and industries will have serious consequences. So how can systems verify that the person speaking is in fact who they claim to be?
Detection – Voice Biometrics Protection in Action
Robust text-to-speech detection algorithms can dissect the new complexities of synthetic voices. By analyzing both human and synthetic voices in day-to-day interactions, these detection models can be continually refined. In such systems, every caller's voice is thoroughly examined to confirm authenticity, and by reassessing the voice time and time again, the system gains confidence and accuracy in detecting deepfakes. Some modern systems can even distinguish between real and artificial voices more accurately than a human listener.
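To make the idea of "examining a voice for authenticity" concrete, here is a minimal, heavily simplified sketch in Python. It uses a single hand-crafted signal statistic (spectral flatness) to separate an overly clean, tonal signal from a noisy, variable one; the function names and the scoring scheme are illustrative assumptions, not any vendor's real detection pipeline, which would combine many learned features from real speech.

```python
import numpy as np

def spectral_flatness(signal, eps=1e-10):
    """Ratio of the geometric mean to the arithmetic mean of the
    power spectrum. Very tonal audio scores near 0; noise-like audio
    scores closer to 1. Real detectors use many such features."""
    spectrum = np.abs(np.fft.rfft(signal)) ** 2 + eps
    geometric = np.exp(np.mean(np.log(spectrum)))
    arithmetic = np.mean(spectrum)
    return geometric / arithmetic

def toy_liveness_score(signal):
    """Hypothetical score in [0, 1]; higher means 'more human-like'
    under this toy model. Purely illustrative."""
    return float(np.clip(spectral_flatness(signal), 0.0, 1.0))

# A pure tone stands in (crudely) for overly clean synthetic audio...
t = np.linspace(0, 1, 16000, endpoint=False)
tone = np.sin(2 * np.pi * 440 * t)
# ...versus broadband noise standing in for natural acoustic variability.
rng = np.random.default_rng(0)
noise = rng.standard_normal(16000)

# The tone scores near 0, the noise well above it.
print(toy_liveness_score(tone), toy_liveness_score(noise))
```

A production system would, of course, train a classifier on labeled human and synthetic recordings rather than threshold a single statistic, but the shape of the decision is the same: extract features from the caller's audio, score them, and reassess on every interaction.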
Additionally, there is a fine balance to strike between security and accessibility. While safeguarding voice biometrics against fraudulent activity is important, there must also be a commitment to accommodating legitimate users who rely on TTS technology due to a disability. This is where a system that enables seamless enrollment and verification for users who require TTS assistance, and allows them to whitelist their preferred TTS software, is key.
As this landscape continues to evolve, the risk to voice biometric security will only increase. However, through continued innovation and an understanding of the technology behind synthetic voices, people and companies can become equipped to detect and combat these threats. This is the only way to maintain the safety and privacy of personal information, and trust in the authenticity of human voices.