Amazon has launched new capabilities for Amazon Bedrock that help organizations evaluate foundation models and Retrieval-Augmented Generation (RAG) systems more effectively. With the new Amazon Bedrock Evaluations, users can assess models hosted on Amazon Bedrock as well as models running on other platforms.
One of the standout additions is “LLM-as-a-judge,” a technique in which a language model scores the outputs of another model, performing automated evaluations with quality comparable to human review. This approach can assess various dimensions of responsible AI, such as correctness and completeness, without manual intervention. Organizations can also define custom metrics that align with their specific business requirements, making the evaluation of their generative AI applications more meaningful.
The new system also includes predefined templates and metrics based on general-purpose criteria, and users can design custom metrics when those do not fit their needs. Available features include the ability to integrate dynamic content into evaluation prompts, as well as advanced options for defining customized output formats.
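To make the idea of a custom LLM-as-a-judge metric more concrete, here is a minimal Python sketch. It is not the Amazon Bedrock API: the `{{prompt}}` and `{{prediction}}` placeholder names, the 1-to-5 scale, the JSON output format, and the `fake_judge` stand-in for a real model call are all assumptions made for illustration. It only shows how dynamic content could be filled into a judge prompt and how a structured verdict could be turned into a numeric score.

```python
import json
import re
from typing import Callable, Dict

# Hypothetical judge prompt. The {{...}} placeholders stand in for the
# "dynamic content" inserted per evaluated sample (names are assumptions).
JUDGE_TEMPLATE = """You are evaluating an answer for completeness.
Question: {{prompt}}
Answer: {{prediction}}
Reply with JSON: {"score": <integer 1-5>, "reason": "<one sentence>"}"""


def render(template: str, values: Dict[str, str]) -> str:
    """Replace each {{name}} placeholder; raises KeyError on unknown names."""
    return re.sub(r"\{\{(\w+)\}\}", lambda m: values[m.group(1)], template)


def judge_score(prompt: str, prediction: str,
                call_judge: Callable[[str], str]) -> float:
    """Ask the judge model for a verdict and normalize its score to 0..1."""
    judge_prompt = render(
        JUDGE_TEMPLATE, {"prompt": prompt, "prediction": prediction}
    )
    verdict = json.loads(call_judge(judge_prompt))  # assumed output format
    score = int(verdict["score"])
    if not 1 <= score <= 5:
        raise ValueError(f"score out of range: {score}")
    return (score - 1) / 4  # map 1..5 onto 0.0..1.0


def fake_judge(judge_prompt: str) -> str:
    """Stand-in for a real model call; always returns the same verdict."""
    return json.dumps({"score": 4, "reason": "Covers most key points."})


if __name__ == "__main__":
    print(judge_score("What is RAG?", "Retrieval plus generation.", fake_judge))
```

In a real setup, `fake_judge` would be replaced by a call to whichever judge model the evaluation uses, and the rubric and scale would follow the organization's own criteria.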
This update aims to help businesses maintain the quality of their AI systems and continuously improve them in line with their strategic objectives. Custom metrics not only expand what can be evaluated but also tie the analysis of results more closely to each organization's context, so that evaluation outcomes are easier to connect to business performance.
via: MiMub in Spanish