Organizations are increasingly turning to generative AI to enrich the customer experience, streamline operations, and drive innovation. The sustained growth of generative AI workloads, however, brings significant performance, reliability, and availability challenges. Companies want ways to scale their inference workloads across multiple AWS regions while keeping application performance consistent.
To address this growing need, Amazon Bedrock introduced cross-region inference (CRIS). The feature automatically routes inference requests across regions, allowing applications to absorb traffic spikes and maximize performance without developers having to forecast fluctuations in demand. CRIS works through “inference profiles,” which specify a base model and the set of regions to which requests can be routed.
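For teams integrating this from code, here is a minimal sketch (not from the original article) of how an inference profile ID can be passed in place of a plain model ID when calling the Bedrock Converse API with boto3. The `us.`-prefixed profile ID is illustrative; verify the profile IDs available in your own account.

```python
# Sketch: calling Claude Sonnet 4.5 through a geography-specific inference profile
# using the Amazon Bedrock Converse API via boto3.
import boto3

bedrock_runtime = boto3.client("bedrock-runtime", region_name="us-east-1")

# A "us." prefix denotes a US geography inference profile for the underlying model.
# This exact ID is an assumption; check the Bedrock console for your account.
US_PROFILE_ID = "us.anthropic.claude-sonnet-4-5-20250929-v1:0"

response = bedrock_runtime.converse(
    modelId=US_PROFILE_ID,
    messages=[
        {"role": "user", "content": [{"text": "Summarize cross-region inference in one sentence."}]}
    ],
    inferenceConfig={"maxTokens": 256},
)

print(response["output"]["message"]["content"][0]["text"])
```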
Recently, global cross-region inference with Anthropic’s Claude Sonnet 4.5 became available on Amazon Bedrock. Users can now choose between a geography-specific inference profile and a global one. This gives organizations greater flexibility, since Amazon Bedrock automatically selects the optimal region to process each inference request. Global CRIS can also route requests to supported commercial AWS regions worldwide, making better use of available capacity and improving model performance, especially during periods of unplanned high demand.
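To see which profiles are available, one possible approach is to enumerate the system-defined inference profiles with the Bedrock control-plane API, as in the sketch below. The `global.` prefix shown for the Claude Sonnet 4.5 profile is an assumption to confirm against what your account actually lists.

```python
# Sketch: discovering system-defined inference profiles (geographic and global)
# with the Bedrock control-plane client, then picking a global profile ID.
import boto3

bedrock = boto3.client("bedrock", region_name="us-east-1")

profiles = bedrock.list_inference_profiles(typeEquals="SYSTEM_DEFINED")
for summary in profiles["inferenceProfileSummaries"]:
    print(summary["inferenceProfileId"], "->", summary["inferenceProfileArn"])

# Global profiles are expected to carry a "global." prefix, e.g. (assumed ID):
GLOBAL_PROFILE_ID = "global.anthropic.claude-sonnet-4-5-20250929-v1:0"
```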
Global cross-region inference absorbs unexpected traffic spikes by drawing on compute capacity in other regions. Its routing logic weighs factors such as model availability, capacity, and latency to send each request to the most suitable region.
Even with cross-region inference, Amazon CloudWatch and AWS CloudTrail continue to record logs and metrics only in the source region. This simplifies monitoring and management, letting organizations keep a centralized view of application performance. Data security is preserved as well: the data transmitted during inference is encrypted and stays within AWS’s secure network.
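As a rough illustration of that centralized view, the sketch below queries Bedrock invocation metrics in the source region with CloudWatch. The `AWS/Bedrock` namespace and `Invocations` metric are standard for Bedrock; the dimension value used here (the global inference profile ID) is an assumption to verify against the metrics emitted in your account.

```python
# Sketch: reading Bedrock invocation metrics from the source region only, since
# logs and metrics for cross-region requests stay where the request originated.
from datetime import datetime, timedelta, timezone
import boto3

SOURCE_REGION = "us-east-1"  # the region your application calls Bedrock from
cloudwatch = boto3.client("cloudwatch", region_name=SOURCE_REGION)

now = datetime.now(timezone.utc)
stats = cloudwatch.get_metric_statistics(
    Namespace="AWS/Bedrock",
    MetricName="Invocations",
    Dimensions=[{"Name": "ModelId", "Value": "global.anthropic.claude-sonnet-4-5-20250929-v1:0"}],
    StartTime=now - timedelta(hours=24),
    EndTime=now,
    Period=3600,
    Statistics=["Sum"],
)

# Print hourly invocation counts for the last 24 hours.
for point in sorted(stats["Datapoints"], key=lambda p: p["Timestamp"]):
    print(point["Timestamp"], int(point["Sum"]))
```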
To use global cross-region inference with Claude Sonnet 4.5, developers need to take a few key steps, such as specifying the global inference profile ID in their Amazon Bedrock API calls and granting the appropriate permissions with AWS Identity and Access Management (IAM).
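A minimal sketch of those IAM permissions might look like the following. The ARN patterns follow Bedrock’s documented structure, but the account ID, profile ID, and model ID are placeholders/assumptions to adapt to your environment.

```python
# Sketch: creating an IAM policy that allows invoking the global inference profile
# and the underlying foundation model in any region the profile may route to.
import json
import boto3

ACCOUNT_ID = "123456789012"                                       # placeholder
PROFILE_ID = "global.anthropic.claude-sonnet-4-5-20250929-v1:0"   # assumed profile ID
MODEL_ID = "anthropic.claude-sonnet-4-5-20250929-v1:0"            # assumed model ID

policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["bedrock:InvokeModel", "bedrock:InvokeModelWithResponseStream"],
            "Resource": [
                # The inference profile itself, in the source region.
                f"arn:aws:bedrock:us-east-1:{ACCOUNT_ID}:inference-profile/{PROFILE_ID}",
                # The foundation model in any region the profile may route to.
                f"arn:aws:bedrock:*::foundation-model/{MODEL_ID}",
            ],
        }
    ],
}

iam = boto3.client("iam")
iam.create_policy(
    PolicyName="BedrockGlobalCrisInvoke",
    PolicyDocument=json.dumps(policy),
)
```

Attach the resulting policy to the role or user that calls Bedrock, then pass the global profile ID as the `modelId` in your API calls.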
Adopting this capability not only improves the performance and reliability of AI applications but also brings cost efficiencies, with savings of roughly 10% on input and output token pricing compared with geography-specific cross-region inference. Companies can thus get more value from their investment in Amazon Bedrock and use their resources more efficiently, achieving better performance without additional cost.
With this evolution of global cross-region inference, organizations that adopt the capability can expect meaningful improvements in their AI applications, handling high-volume workloads more effectively and approaching disaster recovery scenarios in new ways.
via: MiMub in Spanish