
KMS Key Rotation Caused "Decrypt error: AccessDenied" on Secrets: The Key Policy Update and Grant Audit That Restored Access

In modern cloud infrastructure, data security and access control are of utmost importance. Companies often use encryption and key management to protect sensitive resources, including application secrets, credentials, and configuration settings. AWS Key Management Service (KMS) plays a pivotal role in managing encryption keys. However, automatic or manual key rotation processes can sometimes cause service disruptions. This article explores a real-world incident triggered by KMS key rotation, which led to a Decrypt error: AccessDenied when accessing encrypted secrets, and how administrators restored access through a meticulous key policy and grant audit.

TL;DR

A recent issue involving AWS KMS key rotation caused access denial errors during decryption of stored secrets. The root cause was found to be missing permissions in the new key’s policy and lack of appropriate grants to the consuming service. By conducting a comprehensive review and auditing KMS key policies and grants, access was successfully restored. The incident highlights the importance of maintaining continuity in key permissions across key rotations.

The Incident: Secrets Fail to Decrypt After Key Rotation

A mid-sized enterprise using AWS Secrets Manager suddenly faced failures across multiple applications. The log files pointed to a common issue: “Decrypt error: AccessDenied”. This occurred when services attempted to access secrets encrypted with AWS KMS keys. The behavior was unexpected, as the applications had shown no issues with secrets access just a day earlier.

The underlying configuration used AWS Secrets Manager with customer-managed keys (CMKs) for encryption. These CMKs were subject to a regular key rotation policy for compliance with security best practices. The rotation generated a new key version, but failed to migrate all the required permissions, which became evident only when the consumer services could no longer decrypt the secrets.

Identifying the Root Cause

The DevOps team started their investigation by reviewing the error message details. The critical parts included:

A quick check in the AWS KMS console showed that key rotation had been triggered less than 24 hours before the issue began. Each time AWS rotates a CMK, it generates new key material under the same key ID and ARN. This should, in principle, be seamless, but in this case the key users and policies were not correctly carried forward to the new key version.

Where Things Went Wrong

The CMK used to encrypt secrets had a tightly scoped key policy and additional grants that allowed access for specific IAM roles. The rotation replaced the key material, but it also introduced an inadvertent misconfiguration:

  1. The new key version lacked explicit permission grants to the ECS Task and Lambda IAM roles.
  2. Automated grant propagation in the Dev team’s KMS rotation script failed due to a bug.
  3. No alerts were in place to detect KMS decryption failures during CI/CD deployment.

As a result, attempts to decrypt secrets in Secrets Manager failed since the services could not use the new key version, producing the AccessDenied error.
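The missing alerting in item 3 could be approached by scanning CloudTrail records for denied KMS Decrypt calls. The following is a minimal sketch, not the team's actual tooling: it assumes the records have already been parsed from CloudTrail JSON into dictionaries, the function name `kms_decrypt_denials` is hypothetical, and the error code is matched by prefix because the exact string may vary.

```python
from __future__ import annotations


def kms_decrypt_denials(events: list[dict]) -> list[dict]:
    """Return a summary of each CloudTrail record that is a denied KMS Decrypt call."""
    denials = []
    for event in events:
        if event.get("eventSource") != "kms.amazonaws.com":
            continue
        if event.get("eventName") != "Decrypt":
            continue
        # Assumption: denied calls carry an errorCode beginning with "AccessDenied".
        if not str(event.get("errorCode", "")).startswith("AccessDenied"):
            continue
        identity = event.get("userIdentity", {})
        denials.append(
            {
                "time": event.get("eventTime"),
                "principal": identity.get("arn"),
                "message": event.get("errorMessage"),
            }
        )
    return denials
```

A scheduled job could run this over recent records and notify the on-call channel whenever the returned list is non-empty.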

The Fix: Auditing Key Policies and Grants

After isolating the failure to the recent key rotation event, the next step was to restore access. The process involved two parallel investigations:

Review of Key Policies

The team used the AWS CLI to retrieve the key policy of the rotated key:

aws kms get-key-policy --key-id key-id --policy-name default

They compared the key policy of the pre-rotation key with the current one and identified missing principal ARNs for critical IAM roles. These were manually reinstated using a revised policy document.
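A comparison like this can be scripted rather than done by eye. The sketch below is illustrative and not the team's actual tooling: it assumes both policy documents are available as JSON strings (for example, saved from `aws kms get-key-policy ... --query Policy --output text`), and the function names are hypothetical. It reports principals that appear in Allow statements of the earlier policy but not in the current one.

```python
from __future__ import annotations

import json


def principals_in_policy(policy_json: str) -> set[str]:
    """Collect the AWS principals named in the Allow statements of a key policy."""
    policy = json.loads(policy_json)
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):  # a single statement may appear unwrapped
        statements = [statements]
    found: set[str] = set()
    for statement in statements:
        if statement.get("Effect") != "Allow":
            continue
        principal = statement.get("Principal", {})
        if isinstance(principal, str):  # for example "*"
            found.add(principal)
            continue
        aws_principals = principal.get("AWS", [])
        if isinstance(aws_principals, str):  # a single ARN instead of a list
            aws_principals = [aws_principals]
        found.update(aws_principals)
    return found


def missing_principals(before_json: str, after_json: str) -> list[str]:
    """Principals allowed by the earlier policy that the current policy no longer names."""
    return sorted(principals_in_policy(before_json) - principals_in_policy(after_json))
```

The sketch only compares principals; it ignores conditions and actions, so a principal that is still listed but with a narrower action set would not be flagged.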

Audit of KMS Grants

KMS grants delegate permission for specific key operations to AWS principals. They are often temporary and are frequently created behind the scenes by integrated AWS services. The team inspected the active grants:

aws kms list-grants --key-id key-id

The output confirmed that grants for Lambda functions and ECS tasks were absent. They used the following command to recreate the necessary grants:

aws kms create-grant \
  --key-id key-id \
  --grantee-principal arn:aws:iam::123456789012:role/LambdaRole \
  --operations Decrypt \
  --name LambdaSecretsAccess

After grants were reinstated and policies updated, the affected services were restarted, and secret access was restored.
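The grant audit can also be automated by checking the `list-grants` output against a list of expected grantees. This is a minimal sketch under stated assumptions: the input is the parsed JSON from `aws kms list-grants` (all pages already merged), the function name `missing_grants` is hypothetical, and the expected mapping of role ARNs to operations is maintained by the team.

```python
from __future__ import annotations


def missing_grants(
    list_grants_output: dict, expected: dict[str, set[str]]
) -> dict[str, list[str]]:
    """Map each expected grantee to the operations that no active grant covers."""
    granted: dict[str, set[str]] = {}
    # Note: list-grants is paginated; this assumes every page was merged into "Grants".
    for grant in list_grants_output.get("Grants", []):
        principal = grant["GranteePrincipal"]
        granted.setdefault(principal, set()).update(grant.get("Operations", []))

    gaps: dict[str, list[str]] = {}
    for principal, operations in expected.items():
        lacking = set(operations) - granted.get(principal, set())
        if lacking:
            gaps[principal] = sorted(lacking)
    return gaps
```

An empty result means every expected role holds a grant for each operation it needs. The check ignores grant constraints, so a grant restricted by an encryption context still counts as present.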


Preventing Future Incidents

To safeguard against similar issues, the team implemented several preventive measures:

Lessons Learned

This incident highlights vital security and operational lessons:

FAQ

1. What is the Decrypt AccessDenied error in AWS KMS?

This error means the IAM principal (such as a Lambda execution role or an ECS task role) doesn’t have permission to use the KMS key to decrypt data. It often relates to missing grants or an incomplete key policy.

2. Does rotating a KMS key require updating all secrets or data?

No. CMKs in AWS are rotated under the same logical key ID, so existing encrypted resources do not need re-encryption. However, permissions to use the new key material must remain intact.

3. What’s the difference between a key policy and a grant in AWS KMS?

A key policy is the primary authorization mechanism for a KMS key, and every key has exactly one. Grants provide scoped, often temporary permissions to AWS principals, such as IAM users or roles, for specific KMS operations like Decrypt.

4. How can I verify which IAM roles have access to my KMS key?

You can inspect the key policy document and also list grants using the AWS CLI. Additionally, IAM Access Analyzer, which supports KMS keys, can analyze key policies and grants to flag access given to principals outside your account or organization.

5. How do I safely rotate KMS CMKs without causing outages?

Always test service access before and after the rotation. Automate grant comparison during rotation, and use IAM simulation tools to validate that intended services retain decryption permissions.

By learning from this incident, teams can improve the reliability of their secrets management workflows and make their systems more resilient to KMS-related configuration errors.
