Today, protecting personally identifiable information (PII) is not only a regulatory requirement but also a cornerstone of consumer trust and business integrity. Organizations use natural language processing services like Amazon Lex to build conversational interfaces and Amazon CloudWatch to monitor and analyze operational data. One risk many of them face, however, is the inadvertent exposure of sensitive data through logs, voice chat transcriptions, and metrics. That risk grows with the increasing sophistication of cyber threats and the strict penalties attached to data protection breaches.
To address this challenge, one approach combines slot obfuscation in Amazon Lex with the data protection capabilities of CloudWatch Logs, which are designed to detect and mask PII in logs.
In Amazon Lex, slots capture and store user input during a conversation. With slot obfuscation enabled, information collected through Amazon Lex’s conversational interfaces, such as names, addresses, or other PII entered by users, is obscured in conversation logs: the slot value is replaced with the slot name, so a value captured by a slot called full_name is logged as {full_name}. This reduces the risk of exposing sensitive data in chat logs and recordings.
In CloudWatch Logs, data protection policies and custom data identifiers add a further layer of security by masking PII in session attributes, input transcriptions, and other log data that is sensitive for your organization. This approach minimizes the footprint of sensitive information across these services and supports compliance with data protection regulations.
Implementing this protection involves controls in four areas:
Amazon Lex: Monitor and protect data using slot obfuscation and selective logging of conversation records.
CloudWatch Logs: Monitor and protect data with log streams and log group policies.
Amazon S3: Monitor and protect data using bucket security settings and encryption.
Service Control Policies (SCPs): Use data governance controls and risk management policies to prevent changes to Amazon Lex chatbots and CloudWatch log groups, and restrict the viewing of unmasked data in CloudWatch Logs Insights.
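As a rough illustration of the SCP item above, the sketch below builds a deny-only policy document in Python. The action names (logs:Unmask, logs:PutDataProtectionPolicy, logs:DeleteDataProtectionPolicy, lex:UpdateSlot, lex:DeleteSlot, lex:UpdateBotAlias) are existing IAM actions, but which ones to deny is an assumption made for this example, and the exempted role ARN is a placeholder rather than anything named in the article.

```python
import json


def build_pii_guardrail_scp(exempt_role_arn: str) -> dict:
    """Build an SCP that blocks unmasking and tampering with PII controls.

    exempt_role_arn is a hypothetical break-glass role that stays exempt
    from every deny statement.
    """
    # Deny applies to every principal whose ARN does not match the exempt role.
    exempt = {"ArnNotLike": {"aws:PrincipalArn": exempt_role_arn}}
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Viewing unmasked log data requires the logs:Unmask permission.
                "Sid": "DenyUnmaskingLogData",
                "Effect": "Deny",
                "Action": "logs:Unmask",
                "Resource": "*",
                "Condition": exempt,
            },
            {
                "Sid": "DenyDataProtectionPolicyChanges",
                "Effect": "Deny",
                "Action": [
                    "logs:PutDataProtectionPolicy",
                    "logs:DeleteDataProtectionPolicy",
                ],
                "Resource": "*",
                "Condition": exempt,
            },
            {
                # Illustrative subset of Lex actions that could undo
                # slot obfuscation or conversation log settings.
                "Sid": "DenyLexBotChanges",
                "Effect": "Deny",
                "Action": ["lex:UpdateSlot", "lex:DeleteSlot", "lex:UpdateBotAlias"],
                "Resource": "*",
                "Condition": exempt,
            },
        ],
    }


if __name__ == "__main__":
    print(json.dumps(build_pii_guardrail_scp("arn:aws:iam::*:role/SecurityAdmin"), indent=2))
```

The resulting JSON would be attached through AWS Organizations; a real policy would need its action list reviewed against how the bots and log groups are actually administered.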
The first step is to identify and classify data flowing through your systems. Then, locate where this information is stored or processed in your systems and applications. For services involving Amazon Lex and CloudWatch, it is crucial to identify all data sources and their roles in handling PII.
In Amazon Lex, slot obfuscation automatically obscures PII within conversation logs. It is enabled per slot; to turn it on from the Amazon Lex console, follow the steps in the Amazon Lex documentation.
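Outside the console, obfuscation can also be set through the Lex V2 model-building API, where a slot carries an obfuscationSetting field. The following is a minimal sketch, not the article's own procedure: with_obfuscation is a hypothetical helper that adds that field to a slot request dictionary (which would then be passed to a call such as update_slot on the boto3 lexv2-models client), and illustrate_log_line only imitates the documented effect on a log line for demonstration.

```python
import copy

# Lex V2 accepts "None" or "DefaultObfuscation" for obfuscationSettingType.
DEFAULT_OBFUSCATION = {"obfuscationSettingType": "DefaultObfuscation"}


def with_obfuscation(slot_request: dict) -> dict:
    """Return a copy of a slot request with default obfuscation enabled.

    slot_request is assumed to hold the other arguments of the slot
    update call (slotName, valueElicitationSetting, and so on).
    """
    updated = copy.deepcopy(slot_request)
    updated["obfuscationSetting"] = dict(DEFAULT_OBFUSCATION)
    return updated


def illustrate_log_line(utterance: str, slot_name: str, slot_value: str) -> str:
    """Imitate what an obfuscated log entry looks like (illustration only).

    Amazon Lex performs this substitution itself; this function merely
    shows the slot value being replaced by the slot name in braces.
    """
    return utterance.replace(slot_value, "{" + slot_name + "}")


if __name__ == "__main__":
    request = with_obfuscation({"slotName": "full_name"})
    print(request["obfuscationSetting"])
    print(illustrate_log_line("My name is John Stiles", "full_name", "John Stiles"))
```

Copying the request rather than mutating it keeps the original slot definition available for comparison before the change is applied.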
With selective logging of conversation records, Amazon Lex lets you choose which text and audio data from live conversations is captured, filtering certain types of information out of the logs. This helps minimize the risk of exposing private or confidential information.
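To make the selective logging setting more concrete, here is a hedged sketch of the conversation log settings that a Lex V2 bot alias accepts. The field names follow the conversationLogSettings shape as I understand it from the Lex V2 API (including selectiveLoggingEnabled) and should be checked against the current API reference; the ARNs and log prefixes are placeholders chosen for this example.

```python
def build_conversation_log_settings(
    log_group_arn: str,
    audio_bucket_arn: str,
    kms_key_arn: str,
    selective: bool = True,
) -> dict:
    """Build conversation log settings for a Lex V2 bot alias.

    Text logs go to a CloudWatch log group and audio logs to an S3 bucket;
    selective controls whether selective logging is requested for both.
    """
    return {
        "textLogSettings": [
            {
                "enabled": True,
                "destination": {
                    "cloudWatch": {
                        "cloudWatchLogGroupArn": log_group_arn,
                        "logPrefix": "lex/",  # placeholder prefix
                    }
                },
                "selectiveLoggingEnabled": selective,
            }
        ],
        "audioLogSettings": [
            {
                "enabled": True,
                "destination": {
                    "s3Bucket": {
                        "s3BucketArn": audio_bucket_arn,
                        "kmsKeyArn": kms_key_arn,
                        "logPrefix": "lex-audio/",  # placeholder prefix
                    }
                },
                "selectiveLoggingEnabled": selective,
            }
        ],
    }
```

The dictionary would be supplied as the conversationLogSettings argument when creating or updating a bot alias; which parts of a conversation are then logged is decided in the bot's own configuration, not in this structure.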
In CloudWatch Logs, a data protection policy set on a log group can audit and mask sensitive data that appears in the log events of its log streams.
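A log group data protection policy is a JSON document with an audit statement and a de-identify (masking) statement that list the same data identifiers. The sketch below assembles such a document under a few assumptions: the policy name is made up, EmailAddress stands in for whichever managed identifiers apply, and the EmployeeId custom identifier and its regular expression are purely illustrative.

```python
import json

# Managed data identifiers are referenced by ARN with this prefix.
MANAGED_PREFIX = "arn:aws:dataprotection::aws:data-identifier/"


def build_data_protection_policy(managed_ids: list, custom_ids: dict) -> dict:
    """Build a CloudWatch Logs data protection policy document.

    managed_ids: names of managed identifiers, e.g. ["EmailAddress"].
    custom_ids: mapping of custom identifier name to regex (hypothetical values).
    """
    # Custom identifiers are referenced by name, managed ones by ARN.
    identifiers = [MANAGED_PREFIX + name for name in managed_ids] + list(custom_ids)
    policy = {
        "Name": "lex-pii-protection",  # placeholder name
        "Description": "Audit and mask PII in Amazon Lex conversation logs",
        "Version": "2021-06-01",
        "Statement": [
            {
                "Sid": "audit",
                "DataIdentifier": identifiers,
                "Operation": {"Audit": {"FindingsDestination": {}}},
            },
            {
                "Sid": "mask",
                "DataIdentifier": identifiers,
                "Operation": {"Deidentify": {"MaskConfig": {}}},
            },
        ],
    }
    if custom_ids:
        policy["Configuration"] = {
            "CustomDataIdentifier": [
                {"Name": name, "Regex": regex} for name, regex in custom_ids.items()
            ]
        }
    return policy


if __name__ == "__main__":
    doc = build_data_protection_policy(["EmailAddress"], {"EmployeeId": r"EmployeeId-\d{9}"})
    print(json.dumps(doc, indent=2))
```

The serialized document can then be applied to a log group, for example with the put_data_protection_policy operation of the CloudWatch Logs API. Both statements share one identifier list because the audit and masking statements are expected to match.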
For storage services like Amazon S3, it is important to properly configure buckets with encryption, access controls, and lifecycle policies to protect stored data. It is also recommended to use Amazon Kinesis to capture, process, and analyze real-time audio data, and then export it to a secure and compliant storage solution like Amazon S3.
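The bucket settings mentioned above can be sketched as three request payloads, one each for default encryption, public access blocking, and lifecycle expiry. This is an illustrative outline only: the KMS key ARN, the rule ID, and the retention period are assumptions, and each payload would be passed to the matching S3 API operation (put_bucket_encryption, put_public_access_block, put_bucket_lifecycle_configuration in boto3).

```python
def build_bucket_hardening(kms_key_arn: str, retention_days: int) -> dict:
    """Build S3 configuration payloads for a bucket holding conversation data.

    retention_days is a hypothetical retention period; pick it to match
    your own data retention requirements.
    """
    if retention_days <= 0:
        raise ValueError("retention_days must be a positive number of days")
    return {
        # ServerSideEncryptionConfiguration for put_bucket_encryption
        "encryption": {
            "Rules": [
                {
                    "ApplyServerSideEncryptionByDefault": {
                        "SSEAlgorithm": "aws:kms",
                        "KMSMasterKeyID": kms_key_arn,
                    },
                    "BucketKeyEnabled": True,
                }
            ]
        },
        # PublicAccessBlockConfiguration for put_public_access_block
        "public_access_block": {
            "BlockPublicAcls": True,
            "IgnorePublicAcls": True,
            "BlockPublicPolicy": True,
            "RestrictPublicBuckets": True,
        },
        # LifecycleConfiguration for put_bucket_lifecycle_configuration
        "lifecycle": {
            "Rules": [
                {
                    "ID": "expire-conversation-data",  # placeholder rule ID
                    "Status": "Enabled",
                    "Filter": {"Prefix": ""},  # empty prefix applies to all objects
                    "Expiration": {"Days": retention_days},
                }
            ]
        },
    }
```

Keeping the three payloads in one function makes it easier to review the bucket's protection settings together before applying them.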
In summary, securing PII within AWS services like Amazon Lex and CloudWatch requires a proactive and comprehensive approach. By following these steps and adopting data protection practices, organizations can create a robust security framework that not only protects sensitive data but also complies with regulatory standards and mitigates risks associated with data breaches and unauthorized access.
via: MiMub in Spanish