Confidential Speech Recognition

Shashank Admane
Published: Jun 15, 2022
Reading Time: 6 Minutes

AI (artificial intelligence) speech platforms such as speech-to-text (STT) and text-to-speech (TTS) are emerging at the forefront of enabling technologies, making an impact on businesses across sectors.

STT and TTS technologies have made it easy to automate data gathering, extraction, and analysis quickly and efficiently. They have become pivotal in the healthcare, banking, finance, insurance, automotive manufacturing, travel and tourism, e-learning, retail, and telecommunications industries.

The speech-to-text market is projected to grow at a 19.2% CAGR to reach USD 5.4 billion by 2026, and the text-to-speech market is projected to grow at a 14.6% CAGR to reach USD 5.0 billion.

These speech platforms have seen increased adoption in virtual assistants and voice bots, along with some effective implementations in the healthcare industry. One example is the use of speech recognition for more efficient clinical documentation, which has also considerably reduced the burden on clinicians.

The Protected Health Information (PHI) and Electronic Health Records (EHRs) that these solutions produce are highly sensitive and must be protected under privacy rules and regulations such as HIPAA. Data breaches have both direct and indirect impacts, affecting organizations, clients, businesses, and all other stakeholders.

Data breaches are costly: organizations must pay hefty fines for each breach, and damage to brand reputation and share price comes on top of that. Even so, industries are still seeing data breaches increase year over year. The following data shows the scale of healthcare data breaches:

HIPAA Reported Healthcare Data Breaches

Year     Number of Data Breaches     Exposed Records in Millions     Cost Per Record
2010     199                         5.530                           $294
2011     200                         13.150                          $240
2012     217                         2.800                           $233
2013     278                         6.950                           $296
2014     314                         17.450                          $359
2015     269                         113.270                         $363
2016     327                         16.400                          $355
2017     359                         5.100                           $380
2018     365                         33.200                          $408
2019     505                         41.200                          $429
Total    3033                        255.18

Source: National Center for Biotechnology Information, U.S.
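As a rough check on the trend in the HIPAA table above, the following sketch computes how much the yearly breach count grew between 2010 and 2019, which years saw an increase over the previous year, and which year exposed the most records. The figures are copied from the table; the names `BREACHES`, `percent_change`, and `years_with_increase` are illustrative and not from the source.

```python
# Figures copied from the HIPAA table: year -> (breaches, exposed records in millions)
BREACHES = {
    2010: (199, 5.530),
    2011: (200, 13.150),
    2012: (217, 2.800),
    2013: (278, 6.950),
    2014: (314, 17.450),
    2015: (269, 113.270),
    2016: (327, 16.400),
    2017: (359, 5.100),
    2018: (365, 33.200),
    2019: (505, 41.200),
}


def percent_change(first, last):
    """Percentage change from the first value to the last value."""
    return (last - first) / first * 100.0


def years_with_increase(data):
    """Years whose breach count exceeded the previous year's count."""
    years = sorted(data)
    return [cur for prev, cur in zip(years, years[1:]) if data[cur][0] > data[prev][0]]


growth = percent_change(BREACHES[2010][0], BREACHES[2019][0])
rising = years_with_increase(BREACHES)
worst_year = max(BREACHES, key=lambda year: BREACHES[year][1])

print(f"Breach count change 2010-2019: {growth:.1f}%")
print(f"Years with more breaches than the year before: {rising}")
print(f"Year with the most exposed records: {worst_year}")
```

The same helpers can be pointed at any other year range or column of the table.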

Reported Healthcare Data Breaches

Year     Number of Data Breaches     Individuals Affected in Millions
2010     207                         5.400
2011     236                         11.410
2012     222                         3.270
2013     294                         8.170
2014     277                         21.340
2015     289                         110.700
2016     334                         14.570
2017     385                         5.740
2018
2019
Total    2244                        108.80

Source: National Center for Biotechnology Information, U.S.

The following graph indicates that the healthcare industry is a preferred target of attackers because of the high commercial value of EHRs.

[Figure: Data breach statistics in the healthcare industry]

Source: National Center for Biotechnology Information, U.S.

The story is the same in other industries such as banking, finance, and insurance. It is now up to organizations to protect and secure their data against attacks and theft.

As standard practice, industries rely on the following measures to protect their data and, in turn, their business:

  • Privacy
      • Access control policies
  • Security
      • Encryption of data stores
      • Encrypted transit
      • Authorization and authentication
  • Standards and Regulations
      • Govt. rules and regulations

These practices help to some extent. But did you know there is still a blind spot where your data can be breached or stolen? Any guesses?

Encryption takes care of data security when data is at rest and when it is in transit. But what about data in use, that is, when the data is decrypted and sitting in RAM for computation? Is it safe? Have you ever thought about the security of data while it is in use?

The answer: data in use is not safe and is vulnerable to theft and attack. Cross-site injection attacks, memory-scraping malware, and similar threats can easily expose data that is in use.

On top of that, the most common threat of all is an insider attack: a trusted administrator who misuses privileges can access data in use.

In the era of the cloud, this gets even worse, because you must place your trust in cloud vendors, their infrastructure, and their administrators. This leaves the data encryption cycle incomplete.
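To make the data-in-use gap concrete, here is a minimal sketch of a record that is encrypted at rest but has to be decrypted into ordinary process memory before any computation can run on it. The cipher is a toy XOR stream built from SHA-256, used only so the example runs with the Python standard library; it is an assumption for illustration, not a vetted algorithm and not anything Fortanix uses. The sample record and all function names are hypothetical.

```python
import hashlib
import secrets


def keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy SHA-256 counter-mode keystream. Illustration only, not a vetted cipher."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def toy_encrypt(key: bytes, plaintext: bytes) -> bytes:
    """Return a 16-byte random nonce followed by the XOR-encrypted body."""
    nonce = secrets.token_bytes(16)
    stream = keystream(key, nonce, len(plaintext))
    return nonce + bytes(p ^ s for p, s in zip(plaintext, stream))


def toy_decrypt(key: bytes, blob: bytes) -> bytes:
    """Split off the nonce and XOR the body with the same keystream."""
    nonce, body = blob[:16], blob[16:]
    stream = keystream(key, nonce, len(body))
    return bytes(c ^ s for c, s in zip(body, stream))


key = secrets.token_bytes(32)
record = b"patient: Jane Doe; diagnosis: hypertension"  # hypothetical sample record

# At rest (and likewise in transit): only ciphertext is stored or sent.
stored = toy_encrypt(key, record)

# In use: the application must decrypt before it can compute on the data,
# so the plaintext now sits in regular process memory.
in_memory = toy_decrypt(key, stored)
word_count = len(in_memory.split())
```

Outside a secure enclave, anything able to read this process's memory while `in_memory` exists, such as memory-scraping malware or a privileged administrator, sees the plaintext record even though the stored copy is encrypted.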

[Figure: Incomplete data encryption cycle]

So how do you protect your business from this kind of security breach? How do you employ zero-trust solutions? How do you adhere to strict regulations?

This is where Fortanix comes to the rescue with its Confidential AI offering. Fortanix offers an end-to-end data security solution that includes runtime data protection, which is the focus of this blog.

Fortanix Confidential Computing technology enables you to secure your data in use by running your application within a secure enclave.

Secure enclaves ensure that your application runs in a trusted execution environment by keeping data encrypted while it is in use. Any data that attackers or malicious administrators manage to access is encrypted and therefore of no use to them.

Now you have secured the missing part of your data journey: encryption of data while in use.

[Figure: Complete data encryption cycle]

The image below shows an example in which each endpoint in a solution encrypts data in use. The example is a speech recognition application used for natural language understanding within confidential computing.

Each endpoint (STT/NLP/ML/TTS) of the following solution runs inside a secure enclave, where data is encrypted at runtime. The speech from the actor is processed by the STT application within the secure Enclave OS.

STT then leverages Fortanix DSM (Data Security Manager) to tokenize the PHI generated by medical transcription. The NLP application receives the tokenized PHI as input from the STT application. While running in a confidential environment, the NLP application talks to Fortanix DSM to de-tokenize the PHI and generates consumable output for the downstream AI/ML application.

The AI/ML application also performs inference confidentially and pushes encrypted results to the TTS application. The TTS application decrypts the results at runtime in a confidential environment and speaks the result to the end user. This is how you can secure your data and application pipeline by running your solution within a confidential computing environment.
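The tokenization hand-off between the STT and NLP stages described above can be sketched roughly as follows. `ToyTokenVault` is a hypothetical in-memory stand-in for a tokenization service and does not reflect the Fortanix DSM API. The function names, the token format, and the sample transcript are likewise assumptions, and detecting which terms count as PHI is taken as given.

```python
import re
import secrets


class ToyTokenVault:
    """Hypothetical in-memory tokenization service (not the Fortanix DSM API)."""

    def __init__(self):
        self._token_to_value = {}
        self._value_to_token = {}

    def tokenize(self, value: str) -> str:
        """Return a stable random token for the value, creating one if needed."""
        if value not in self._value_to_token:
            token = "tok_" + secrets.token_hex(8)  # "tok_" plus 16 hex characters
            self._value_to_token[value] = token
            self._token_to_value[token] = value
        return self._value_to_token[value]

    def detokenize(self, token: str) -> str:
        """Look up the original value for a token."""
        return self._token_to_value[token]


def tokenize_transcript(transcript: str, phi_terms: list, vault: ToyTokenVault) -> str:
    """STT side: replace every known PHI term with its token in a single pass."""
    if not phi_terms:
        return transcript
    pattern = re.compile("|".join(re.escape(term) for term in phi_terms))
    return pattern.sub(lambda m: vault.tokenize(m.group(0)), transcript)


def detokenize_transcript(tokenized: str, vault: ToyTokenVault) -> str:
    """NLP side (assumed to run inside an enclave): restore PHI from tokens."""
    return re.sub(r"tok_[0-9a-f]{16}", lambda m: vault.detokenize(m.group(0)), tokenized)


vault = ToyTokenVault()
transcript = "Patient Jane Doe reports chest pain, MRN 12345."  # hypothetical sample
phi_terms = ["Jane Doe", "12345"]

tokenized = tokenize_transcript(transcript, phi_terms, vault)
restored = detokenize_transcript(tokenized, vault)
```

Only `tokenized` would travel between the stages, so the PHI itself appears in the clear only inside the stage that de-tokenizes it.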

[Figure: Fortanix DSM architecture]

With the advent of speech recognition technology, STT/TTS SaaS, API, and solution offerings are growing in number. Protecting speech recognition throughout its data journey is the need of the hour.

Running this technology within confidential computing can give businesses an edge and help reduce data breaches.

Confidential speech recognition is going to be the norm in the future. Why wait when the platform is ready? For more details on adopting Confidential Computing for your offerings, contact Fortanix. We would love to help you make your business confidential.

Reference - https://www.ncbi.nlm.nih.gov/ 
