Retrieval Augmented Generation (RAG) is a cutting-edge approach to building question answering systems that combines the strengths of information retrieval and foundation models (FMs). A RAG system first retrieves relevant information from a large corpus of text and then uses an FM to synthesize an answer based on the retrieved information.
An end-to-end RAG solution involves several components, including a knowledge base, a retrieval system, and a generation system. Building and deploying these components can be complex and error-prone, especially when dealing with data and models at scale.
This article demonstrates how to seamlessly automate the deployment of an end-to-end RAG solution using Knowledge Bases for Amazon Bedrock and AWS CloudFormation, allowing organizations to quickly and effortlessly set up a powerful RAG system.
Solution Overview
The solution provides automated end-to-end deployment of a RAG workflow using Knowledge Bases for Amazon Bedrock. We use AWS CloudFormation to configure the necessary resources, including:
- An AWS Identity and Access Management (IAM) role.
- An Amazon OpenSearch Serverless collection and vector index.
- A knowledge base with its associated data source.
The RAG workflow allows you to use data from documents stored in an Amazon Simple Storage Service (Amazon S3) bucket and integrate them with the powerful natural language processing capabilities of the FMs provided in Amazon Bedrock. The solution simplifies the setup process, allowing you to quickly deploy and start querying the data using the chosen FM.
Prerequisites
To implement the solution, you must have:
- An active AWS account and familiarity with FMs, Amazon Bedrock, and OpenSearch Serverless.
- An S3 bucket where your documents are stored in a supported format (.txt, .md, .html, .doc/.docx, .csv, .xls/.xlsx, .pdf).
- The Amazon Titan Embeddings G1-Text model enabled in Amazon Bedrock.
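As an optional check, the following AWS CLI command (a minimal sketch; it assumes the AWS CLI with Bedrock support and credentials configured for your target Region) lists the Titan text embeddings models available in that Region. Note that this confirms availability only; model access itself is granted on the Model access page of the Amazon Bedrock console.
# List Titan text embeddings models available in the current Region
aws bedrock list-foundation-models --query "modelSummaries[?contains(modelId, 'titan-embed-text')].modelId"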
Solution Configuration
Once the prerequisites are met, you are ready to configure the solution:
- Clone the GitHub repository containing the solution files:
git clone https://github.com/aws-samples/amazon-bedrock-samples.git
- Navigate to the solution directory:
cd knowledge-bases/features-examples/04-infrastructure/e2e-rag-deployment-using-bedrock-kb-cfn
- Run the deploy.sh script, which creates the deployment bucket, prepares the CloudFormation templates, and uploads the necessary templates and artifacts to that bucket:
bash deploy.sh
If you provide a bucket name as an argument to deploy.sh, a bucket with that name is created; otherwise, the default naming format is used.
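For example (the custom bucket name below is a placeholder), the script can be run either way:
# Use the default deployment bucket naming
bash deploy.sh
# Or create a deployment bucket with a specific name (placeholder shown)
bash deploy.sh my-e2e-rag-deployment-bucket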
- After the script completes, make a note of the S3 URL for main-template-out.yml.
- In the AWS CloudFormation console, create a new stack using that URL.
- Provide a stack name and specify the RAG workflow details according to your use case.
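If you prefer to script this step, the stack can also be created with the AWS CLI; the stack name and template URL below are placeholders, and the IAM capability flag is assumed because the template provisions an IAM role:
# Create the stack from the uploaded template (values are placeholders)
aws cloudformation create-stack \
  --stack-name e2e-rag-stack \
  --template-url https://<DEPLOYMENT_BUCKET>.s3.amazonaws.com/main-template-out.yml \
  --capabilities CAPABILITY_NAMED_IAM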
Testing the Solution
Once the deployment is successful (it takes roughly 7 to 10 minutes), you can start testing the solution.
- In the Amazon Bedrock console, navigate to the created knowledge base and choose “Sync” to start the data ingestion job.
- After data synchronization, select the desired FM for retrieval and generation.
- Begin querying your data using natural language queries.
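These console steps can also be performed from the AWS CLI once you have the knowledge base and data source IDs from the stack outputs or the Bedrock console; the IDs, the sample question, and the model ARN below are placeholders and example choices, not part of the deployed templates:
# Start the ingestion (sync) job for the knowledge base data source
aws bedrock-agent start-ingestion-job --knowledge-base-id <KB_ID> --data-source-id <DS_ID>
# Ask a natural language question against the knowledge base (example model ARN shown)
aws bedrock-agent-runtime retrieve-and-generate \
  --input '{"text": "What topics are covered in the onboarding documents?"}' \
  --retrieve-and-generate-configuration '{"type": "KNOWLEDGE_BASE", "knowledgeBaseConfiguration": {"knowledgeBaseId": "<KB_ID>", "modelArn": "arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-3-sonnet-20240229-v1:0"}}'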
Cleanup
To avoid future charges, delete the resources used in this solution:
- In the Amazon S3 console, manually delete the contents within the created deployment bucket, then delete the bucket.
- In the AWS CloudFormation console, select the main stack and choose “Delete”.
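The same cleanup can be done from the AWS CLI (the bucket and stack names below are placeholders matching the earlier sketch):
# Empty the deployment bucket, then delete it
aws s3 rm s3://<DEPLOYMENT_BUCKET> --recursive
aws s3 rb s3://<DEPLOYMENT_BUCKET>
# Delete the main CloudFormation stack
aws cloudformation delete-stack --stack-name e2e-rag-stack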
Conclusion
In this article, we presented an automated solution for deploying an end-to-end RAG workflow using Knowledge Bases for Amazon Bedrock and AWS CloudFormation. By leveraging AWS services and preconfigured CloudFormation templates, you can quickly set up a powerful question answering system without the complexity of building and deploying each RAG component individually. This approach not only saves time and effort but also provides a consistent and reproducible setup, allowing you to focus on using the RAG workflow to extract valuable insights from your data.
About the Authors
Sandeep Singh is a Senior Generative AI Data Scientist at Amazon Web Services, specializing in generative AI and machine learning. Yanyan Zhang is also a Senior Generative AI Data Scientist at Amazon Web Services, working on advanced AI/ML technologies. Mani Khanuja is a Tech Lead for Generative AI Specialists and the author of the book “Applied Machine Learning and High Performance Computing on AWS”.
via: AWS machine learning blog