Amazon Transcribe is an Amazon Web Services (AWS) service that converts speech to text using machine learning. Typical uses include transcribing customer care calls and voicemail messages and generating subtitles for videos. Because some customers entrust the service with sensitive and confidential data, such as personal health information and payment card details, Amazon Transcribe includes strong security measures for data in transit and at rest.
These measures adhere to the AWS shared responsibility model, which differentiates the security of the cloud, handled by AWS, from the security in the cloud, which is the customers’ responsibility. AWS protects the global infrastructure of the cloud, while customers are responsible for the security configuration and management tasks for the AWS services they use.
Data in transit between an application and Amazon Transcribe is encrypted to keep it confidential. The service operates in one of two modes: streaming transcription for real-time audio and batch transcription for asynchronous jobs. In streaming mode, the application opens a bidirectional streaming connection over Transport Layer Security (TLS), a widely accepted cryptographic protocol. In batch mode, the application uploads an audio file to an Amazon Simple Storage Service (Amazon S3) bucket and then creates a batch transcription job in Amazon Transcribe.
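The batch flow described above can be sketched in Python with boto3. This is a minimal illustration, not a reference implementation: the helper names, the bucket and key values, and the default language code are assumptions made for the example. boto3 clients use HTTPS endpoints by default, so both the upload and the job request travel over TLS.

```python
def build_batch_job_request(job_name, bucket, key, language_code="en-US",
                            output_bucket=None, kms_key_id=None):
    """Build the keyword arguments for StartTranscriptionJob (hypothetical helper)."""
    # To my understanding, a KMS key for output encryption is only valid
    # together with a customer-owned output bucket, so reject it otherwise.
    if kms_key_id and not output_bucket:
        raise ValueError("kms_key_id requires output_bucket")

    request = {
        "TranscriptionJobName": job_name,
        "LanguageCode": language_code,
        # The media file must already be in S3 when the job is created.
        "Media": {"MediaFileUri": f"s3://{bucket}/{key}"},
    }
    if output_bucket:
        request["OutputBucketName"] = output_bucket
    if kms_key_id:
        request["OutputEncryptionKMSKeyId"] = kms_key_id
    return request


def start_batch_job(local_path, bucket, key, job_name, **options):
    """Upload the audio file to S3, then create the batch transcription job."""
    import boto3  # imported lazily so the request builder works without boto3

    boto3.client("s3").upload_file(local_path, bucket, key)
    transcribe = boto3.client("transcribe")
    return transcribe.start_transcription_job(
        **build_batch_job_request(job_name, bucket, key, **options)
    )
```

Keeping the request builder separate from the AWS calls makes it possible to inspect or unit-test the job parameters without credentials.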
To protect data at rest, Amazon Transcribe uses encrypted Amazon Elastic Block Store (Amazon EBS) volumes to temporarily store customer data during media processing. The data is cleaned up once processing completes, whether it succeeded or failed.
Amazon Transcribe also lets customers communicate over a private network path through interface VPC endpoints powered by AWS PrivateLink, which meets the security requirements of applications that are not connected to the internet. Another key feature is the ability to detect and redact personally identifiable information (PII) from transcripts and audio files, which is useful for customers working toward Payment Card Industry (PCI) compliance.
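Both features are configured through request parameters, which the following Python sketch assembles as plain dictionaries. The helper names and all identifiers passed to them (VPC, subnet, and security group IDs) are hypothetical; the dictionary keys follow the boto3 `start_transcription_job` and `create_vpc_endpoint` parameters as I understand them.

```python
def redaction_settings(include_unredacted=False, entity_types=None):
    """Build a ContentRedaction value for a batch transcription job."""
    settings = {
        "RedactionType": "PII",
        # "redacted" returns only the redacted transcript;
        # "redacted_and_unredacted" returns both versions.
        "RedactionOutput": (
            "redacted_and_unredacted" if include_unredacted else "redacted"
        ),
    }
    if entity_types:
        # Optional restriction, e.g. ["CREDIT_DEBIT_NUMBER", "CREDIT_DEBIT_CVV"].
        settings["PiiEntityTypes"] = list(entity_types)
    return settings


def interface_endpoint_request(region, vpc_id, subnet_ids, security_group_ids,
                               streaming=False):
    """Build create_vpc_endpoint arguments for an Amazon Transcribe endpoint."""
    # Batch and streaming APIs are exposed as separate PrivateLink services.
    service = "transcribestreaming" if streaming else "transcribe"
    return {
        "VpcEndpointType": "Interface",
        "ServiceName": f"com.amazonaws.{region}.{service}",
        "VpcId": vpc_id,
        "SubnetIds": list(subnet_ids),
        "SecurityGroupIds": list(security_group_ids),
        "PrivateDnsEnabled": True,
    }
```

The first dictionary would be passed as `ContentRedaction=...` when starting a transcription job, and the second unpacked into `boto3.client("ec2").create_vpc_endpoint(**request)`.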
In addition to these features, AWS recommends several security best practices for Amazon Transcribe. These include using IAM roles, tag-based access control, AWS monitoring tools, and AWS Config. These are general guidelines and may need to be supplemented depending on the specific environment.
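As an illustration of tag-based access control, the sketch below builds an IAM policy document that allows reading a transcription job only when the job carries a matching tag. The helper names, the tag key and value, and the choice of action are assumptions for the example; a real policy should be scoped to the environment's own resources and actions.

```python
import json


def tag_scoped_policy(tag_key, tag_value,
                      actions=("transcribe:GetTranscriptionJob",)):
    """Return an IAM policy dict gated on a resource tag (hypothetical helper)."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": list(actions),
                "Resource": "*",
                # Access is granted only if the resource's tag matches.
                "Condition": {
                    "StringEquals": {f"aws:ResourceTag/{tag_key}": tag_value}
                },
            }
        ],
    }


def policy_json(tag_key, tag_value):
    """Serialize the policy so it can be attached to an IAM role."""
    return json.dumps(tag_scoped_policy(tag_key, tag_value), indent=2)
```

For example, `policy_json("Department", "Sales")` produces a document that could be attached to an IAM role used by a sales team's application.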
Applications built on AWS can draw on AWS compliance programs such as SOC, PCI, FedRAMP, and HIPAA for compliance validation. Third-party auditors evaluate AWS services against these programs, and the audit reports can be downloaded from AWS Artifact.
Through these extensive security features and recommendations, Amazon Transcribe offers a robust solution to convert speech to text while maintaining high levels of data security and privacy.