Amazon has introduced a new feature in its Amazon Bedrock service that makes it easier for businesses to use high-performance artificial intelligence models through a single interface. The initiative is designed to encourage the creation of generative AI applications while prioritizing security, privacy, and responsible AI.
The new batch inference capability targets large-scale workloads where response time is not critical. With batch processing, organizations can analyze large datasets more efficiently, at up to 50% lower cost than on-demand pricing, which is particularly appealing for those handling large volumes of information.
As companies expand their use of Amazon Bedrock for big data analysis, sound monitoring and management practices for batch inference jobs become essential. To address this need, a solution has been developed that relies on AWS serverless services such as Lambda, DynamoDB, and EventBridge. The integration not only reduces operational overhead but also ensures reliable performance when processing data at scale.
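As a rough sketch of how these pieces fit together, the Lambda handler below assumes an EventBridge rule that matches Bedrock batch inference state-change events and forwards them to the function, which records each transition in a DynamoDB table. The table name and the event field names are illustrative assumptions, not the exact schema of the deployed solution.

```python
import os
from datetime import datetime, timezone

import boto3

# Hypothetical table name; in practice it would come from the CloudFormation stack.
table = boto3.resource("dynamodb").Table(os.environ.get("JOB_TABLE", "BatchInferenceJobs"))


def handler(event, context):
    """Record a Bedrock batch inference job state change in DynamoDB.

    Assumes an EventBridge rule along the lines of:
      {"source": ["aws.bedrock"],
       "detail-type": ["Batch Inference Job State Change"]}
    The detail field names below are assumptions about the event payload.
    """
    detail = event.get("detail", {})
    table.put_item(
        Item={
            "jobArn": detail.get("batchJobArn", "unknown"),
            "status": detail.get("status", "unknown"),
            "updatedAt": datetime.now(timezone.utc).isoformat(),
        }
    )
```

Because EventBridge delivers the event, no polling loop is needed; the table simply accumulates an audit trail of every status transition.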
A clear example is a financial services company that handles millions of customer interactions, from credit histories to spending patterns. The company sees an opportunity to apply advanced AI capabilities to personalized recommendations, but processing these enormous volumes of data in real time is not always a requirement, which makes batch inference a natural fit.
The proposed architecture uses batch inference in Amazon Bedrock. It starts by uploading credit and product data to an Amazon S3 bucket; from there, an initial set of Lambda functions builds the JSONL files that the inference job needs. A batch inference job is then started in Bedrock, while automated monitoring through EventBridge ensures that any change in the job's status triggers the appropriate action, such as recording that status in DynamoDB.
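A minimal sketch of the submission step, assuming boto3 and the Anthropic messages format for each JSONL record, might look like the following. The bucket name, role ARN, and prompt-building logic are placeholders; the real solution derives them from the uploaded credit and product data.

```python
import json

import boto3

s3 = boto3.client("s3")
bedrock = boto3.client("bedrock")

# Placeholder values; the deployed stack supplies the real bucket and role ARN.
BUCKET = "my-batch-inference-bucket"
ROLE_ARN = "arn:aws:iam::123456789012:role/BedrockBatchRole"
MODEL_ID = "anthropic.claude-3-5-sonnet-20240620-v1:0"


def write_jsonl(records):
    """Write one JSONL line per customer record in the format batch inference expects."""
    lines = [
        json.dumps({
            "recordId": f"REC{i:07d}",
            "modelInput": {
                "anthropic_version": "bedrock-2023-05-31",
                "max_tokens": 512,
                "messages": [{
                    "role": "user",
                    "content": f"Suggest products for this customer profile: {rec}",
                }],
            },
        })
        for i, rec in enumerate(records)
    ]
    s3.put_object(Bucket=BUCKET, Key="input/batch.jsonl", Body="\n".join(lines).encode())


def start_job():
    """Start the batch job; EventBridge then picks up its state changes."""
    resp = bedrock.create_model_invocation_job(
        jobName="credit-recommendations",
        roleArn=ROLE_ARN,
        modelId=MODEL_ID,
        inputDataConfig={"s3InputDataConfig": {"s3Uri": f"s3://{BUCKET}/input/batch.jsonl"}},
        outputDataConfig={"s3OutputDataConfig": {"s3Uri": f"s3://{BUCKET}/output/"}},
    )
    return resp["jobArn"]
```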
The benefits of this automated approach include real-time visibility into job status, streamlined operations, and better resource allocation, all of which translate into more efficient use of batch inference.
Implementing the solution requires an active AWS account, permissions to create the resources and access the models, and deployment in a region where batch inference is supported. The process is backed by an AWS CloudFormation template, which enables repeatable deployments and streamlines adoption of this new functionality.
In addition, best practices are proposed to strengthen operational monitoring, such as configuring CloudWatch alarms for failed jobs and publishing custom metrics for greater visibility into the performance of inference jobs.
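One way to realize both practices, sketched below under assumed names, is to have the monitoring Lambda publish a custom metric whenever a job reaches a failed state and to define an alarm on that metric; the namespace, metric name, and threshold are illustrative choices rather than values prescribed by the solution.

```python
import boto3

cloudwatch = boto3.client("cloudwatch")

# Hypothetical namespace and metric name; align them with your stack's conventions.
NAMESPACE = "BatchInference"
METRIC = "FailedJobs"


def record_failure():
    """Emit one data point each time a batch inference job fails."""
    cloudwatch.put_metric_data(
        Namespace=NAMESPACE,
        MetricData=[{"MetricName": METRIC, "Value": 1.0, "Unit": "Count"}],
    )


def create_failed_job_alarm():
    """Alarm as soon as any job fails within a five-minute window."""
    cloudwatch.put_metric_alarm(
        AlarmName="bedrock-batch-inference-failures",
        Namespace=NAMESPACE,
        MetricName=METRIC,
        Statistic="Sum",
        Period=300,
        EvaluationPeriods=1,
        Threshold=1.0,
        ComparisonOperator="GreaterThanOrEqualToThreshold",
        TreatMissingData="notBreaching",
    )
```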
The estimated cost of running this solution is under one dollar, assuming the use of Anthropic's Claude 3.5 model. Beyond improving the ability to process large volumes of financial data, the approach opens the door to new applications, such as fraud pattern detection and financial trend analysis, while preserving real-time visibility into operations.
—
via: MiMub in Spanish