The General Data Protection Regulation (GDPR) right to be forgotten gives individuals the power to request the deletion of their personally identifiable information (PII) from the systems of organisations, including third parties with whom the data was shared. This has implications for the use of artificial intelligence (AI) frameworks in data handling, particularly Retrieval Augmented Generation (RAG) architectures such as those built on Amazon Bedrock.
Amazon Bedrock is a managed service that offers foundation models from leading AI companies, including Amazon, for various uses. These models are trained on large quantities of data, which lets them answer a wide range of questions. However, to have such a model answer questions about data stored in your Amazon Simple Storage Service (S3) bucket, a technique known as RAG is needed.
Knowledge Bases for Amazon Bedrock is a fully managed RAG feature that allows foundation model responses to be customised with company-specific data. It automates the end-to-end RAG process, eliminating the need for custom code to integrate data sources and manage queries, which is particularly useful for organisations using RAG-based designs for generative AI systems.
However, honouring GDPR’s right to be forgotten within a RAG framework raises challenges. This article discusses how to construct a GDPR-compliant RAG architecture using Knowledge Bases for Amazon Bedrock and offers best practices for dealing with right to be forgotten requests.
GDPR applies to all organisations that are based in the EU or that handle the personal data of EU residents. A typical RAG architecture consists of three stages: data pre-processing, generation of embeddings using an embedding model, and storage of the embeddings in a vector store. Knowledge Bases for Amazon Bedrock mitigates the challenges of these steps with its fully managed RAG solution, consolidating the complex processes into one system and making it easier to comply with the GDPR right to be forgotten.
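To make the three stages concrete, the following is a minimal, purely illustrative sketch of a RAG ingestion pipeline and of what erasure means for it. It is not how Knowledge Bases for Amazon Bedrock is implemented. The names `chunk_text`, `toy_embedding`, and `InMemoryVectorStore` are hypothetical, and the "embedding" is a hash-based placeholder standing in for a real embedding model.

```python
import hashlib
from dataclasses import dataclass, field


def chunk_text(text: str, max_words: int = 50) -> list[str]:
    """Stage 1 (pre-processing): split a document into fixed-size word chunks."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def toy_embedding(chunk: str, dims: int = 8) -> list[float]:
    """Stage 2 (embedding): placeholder vector derived from a hash.

    Assumption: a real pipeline would call an embedding model here. This
    stand-in only keeps the example runnable and has no semantic meaning.
    """
    digest = hashlib.sha256(chunk.encode("utf-8")).digest()
    return [byte / 255.0 for byte in digest[:dims]]


@dataclass
class InMemoryVectorStore:
    """Stage 3 (storage): vectors tagged with the document they came from."""

    records: list[dict] = field(default_factory=list)

    def add_document(self, source_id: str, text: str) -> int:
        """Chunk, embed, and store one document; return the number of chunks."""
        chunks = chunk_text(text)
        for chunk in chunks:
            self.records.append(
                {"source_id": source_id, "text": chunk, "vector": toy_embedding(chunk)}
            )
        return len(chunks)

    def delete_source(self, source_id: str) -> int:
        """Remove every chunk derived from one source document."""
        before = len(self.records)
        self.records = [r for r in self.records if r["source_id"] != source_id]
        return before - len(self.records)
```

The point of tagging each record with its `source_id` is that an erasure request must remove every derived chunk and vector, not only the original file. Without that link between source and embeddings, deleted personal data could still surface in retrieval results.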
The article then provides a step-by-step guide to implementing a simplified RAG architecture using Knowledge Bases for Amazon Bedrock, which involves data preparation, S3 bucket configuration and the creation of a knowledge base. It also addresses how to delete customer information from the system in compliance with GDPR.
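As a rough sketch of the deletion step, one plausible flow is to delete the customer's source object from the S3 bucket and then start a knowledge base sync so that the index is reconciled with the bucket contents. The function name `forget_document` and all identifiers passed to it are hypothetical. The two client arguments are assumed to be boto3 `s3` and `bedrock-agent` clients, and the sketch uses only their `delete_object` and `start_ingestion_job` operations.

```python
def forget_document(s3_client, bedrock_agent_client, bucket, key,
                    knowledge_base_id, data_source_id):
    """Delete one source object, then ask the knowledge base to re-sync.

    Returns the ID of the ingestion job that was started.
    """
    # Remove the source file. Note: on a versioned bucket this call only adds
    # a delete marker, so earlier versions would need to be deleted separately.
    s3_client.delete_object(Bucket=bucket, Key=key)

    # Start an ingestion (sync) job so the knowledge base picks up the removal.
    # Assumption: the data source identified here points at the same bucket.
    response = bedrock_agent_client.start_ingestion_job(
        knowledgeBaseId=knowledge_base_id,
        dataSourceId=data_source_id,
    )
    return response["ingestionJob"]["ingestionJobId"]
```

In practice the clients would typically be created with `boto3.client("s3")` and `boto3.client("bedrock-agent")`. Passing them in as arguments keeps the function easy to exercise with stubs, and the returned job ID can be recorded as evidence that the erasure request was processed. Copies of the data elsewhere, such as backups or logs, would still need separate handling.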
In addition, several considerations are addressed, including audit tracking, data discoverability, backup and restoration of data, communication processes, security controls, and GDPR compliance aids provided by AWS. For complete GDPR compliance, the implementation should be carried out in consultation with the organisation’s privacy officer or legal counsel.