Security in the Cloud: AWS Encrypted EBS with MFA Policy in Terraform

Phelan Guan
8 min read · Mar 14, 2021

Intro

As organizations embrace DevOps practices across their ranks, it becomes necessary to put security controls in place so that a compromise or abuse of permissions does not cripple ongoing development and deployments. However, finding the right balance between the principle of least privilege and giving teams enough freedom to act can be tricky, especially as dependencies are layered on top of one another.

TL;DR

Hey, I get it: sometimes you are itching to test out something, anything different from what you’ve got right now. So here’s the code, with the analysis to follow:

MFA_policy.json (Setting the MFA requirement)

group_policy.json (The group policy to allow a user to create encrypted EBS volumes with EC2)

The Problem

I wanted to ensure users of the organization’s AWS account adhered to an MFA requirement and to any subsequent group policies. Part of the organization’s policy was to encrypt all hard disks. Writing these requirements independently of each other is straightforward thanks to AWS’ thorough documentation. However, I ran into a raft of permissions issues when I tried to apply this seemingly innocent and straightforward code block as one of my restricted-permissions users:

resource "aws_instance" "instance" {
  ami               = data.aws_ami.ubuntu_bionic.id
  instance_type     = var.instance_size
  availability_zone = data.aws_availability_zones.available.names[0]
  key_name          = var.instance_ssh_key_pair_name
  subnet_id         = aws_subnet.public.id

  metadata_options {
    http_tokens   = "required"
    http_endpoint = "enabled"
  }

  associate_public_ip_address = true

  root_block_device {
    encrypted = true
  }
}

Terraform made it so easy! I just say encrypted = true and my EBS is encrypted! Unfortunately, this was not one of those stories. Running terraform apply gave me the below error:

Error: Error waiting for instance (i-xxxxxxxxxxxxxxxxx) to become ready: Failed to reach target state. Reason: Client.InternalError: Client error on launch

  on ../main.tf line xx, in resource "aws_instance" "instance":
  14: resource "aws_instance" "instance" {

What AWS Recommends

In this article, AWS recommends implementing Multi-Factor Authentication (MFA) by defining actions for IAM self-management. At the very end, the recommendation to enforce MFA is in this snippet:

{
  "Sid": "DenyAllExceptListedIfNoMFA",
  "Effect": "Deny",
  "NotAction": [
    "iam:CreateVirtualMFADevice",
    "iam:EnableMFADevice",
    "iam:GetUser",
    "iam:ListMFADevices",
    "iam:ListVirtualMFADevices",
    "iam:ResyncMFADevice",
    "sts:GetSessionToken"
  ],
  "Resource": "*",
  "Condition": {
    "BoolIfExists": {
      "aws:MultiFactorAuthPresent": "false"
    }
  }
}

After the policy was applied, I got the same nondescript error:

Error: Error waiting for instance (i-xxxxxxxxxxxxxxxxx) to become ready: Failed to reach target state. Reason: Client.InternalError: Client error on launch

  on ../main.tf line xx, in resource "aws_instance" "instance":
  14: resource "aws_instance" "instance" {

Client internal error?? That’s all AWS could give me?? As it turns out, the error is actually far more specific than it looks. A quick Google search yielded these results:

Client Internal Error

Encrypted Volumes Stops Immediately

This failure is due to the EBS volume not attaching or encrypting properly. When the EC2 instance tries to boot, it can’t access the EBS volume, so it gives up. In fact, when I checked the console, my instances would start the creation process, then self-terminate without ever indicating they had an attached volume. The second link gave me a step-by-step permissions walk-through for attaching encrypted EBS volumes. At this point, let us first talk about what happens under the hood for an EC2 instance to use an encrypted EBS volume.

To tell you about our Lord and Savior, the AWS Key Management Service

Behind the Scenes

Here is the full process involving AWS Key Management Service (KMS):

  1. A volume is defined as ‘encrypted’ in EBS
  2. EBS calls KMS to request a Data Encryption Key (DEK)
  3. KMS generates a DEK from the specified Customer Master Key (CMK)**
  4. The CMK encrypts the DEK
  5. The DEK is then stored on the encrypted EBS volume as metadata
  6. The EBS volume is then attached to an EC2 instance
  7. EC2 sends a ‘decrypt’ request to KMS with the encrypted DEK from the volume
  8. KMS decrypts the DEK into a plaintext DEK and sends it back to the EC2 instance
  9. EC2 stores the plaintext DEK in its hypervisor memory for as long as the EBS volume is attached to the instance
  10. EC2 uses the DEK to perform I/O encryption to the volume using the AES-256 algorithm

**AWS makes life slightly confusing because in the console it makes reference to Customer Managed Keys as a CMK as opposed to AWS managed keys. However, for all intents and purposes, whether a key is managed by the customer or AWS, it is still a Customer Master Key when AWS Documentation refers to CMK.
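To tie step 3 back to Terraform, here is a minimal sketch that names the key explicitly instead of relying on the implicit default. It assumes the region's default EBS key is the one in use; the data source name `ebs_default` and the resource name `encrypted_example` are illustrative, while the `ami` and `instance_type` values reuse names from the instance block earlier in this post.

```hcl
# Look up the KMS key that EBS uses by default in the current region.
# Unless a custom default has been configured, this is the AWS managed key.
data "aws_ebs_default_kms_key" "ebs_default" {}

# Illustrative resource name; ami and instance_type reuse names from the post.
resource "aws_instance" "encrypted_example" {
  ami           = data.aws_ami.ubuntu_bionic.id
  instance_type = var.instance_size

  root_block_device {
    encrypted = true

    # Optional: naming the key makes it explicit which CMK the data
    # encryption key (DEK) is generated from. If omitted, EBS falls back
    # to the same regional default key.
    kms_key_id = data.aws_ebs_default_kms_key.ebs_default.key_arn
  }
}
```

Either way, the user launching the instance still needs the KMS permissions discussed in the next section, because EC2 and EBS call KMS on the user's behalf.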

Back to the policy

So from the above steps, we can see why we have to grant the following permissions to the user in order to properly encrypt an EBS Volume:

/user-groups/group_policy.json

{
  "Sid": "AllowKMS",
  "Effect": "Allow",
  "Action": [
    "kms:Decrypt",
    "kms:Encrypt",
    "kms:RevokeGrant",
    "kms:GenerateDataKey",
    "kms:GenerateDataKeyWithoutPlaintext",
    "kms:DescribeKey",
    "kms:CreateGrant",
    "kms:ListGrants"
  ],
  "Resource": "arn:aws:kms:*:*:key/*"
}

I created an instance through the console and, lo and behold, I could create an EC2 instance with an attached encrypted EBS volume. Problem solved, let’s do a terraform apply, merge to the origin repository and call it a day!

….

Except one small problem.

….

When I used Terraform to launch an instance, it still produced the Client.InternalError error. How can this be??

Devil in the (documentation) details

Here is the offending block from what AWS recommends:

{
  "Sid": "DenyAllExceptListedIfNoMFA",
  "Effect": "Deny",
  "NotAction": [
    "iam:CreateVirtualMFADevice",
    "iam:EnableMFADevice",
    "iam:GetUser",
    "iam:ListMFADevices",
    "iam:ListVirtualMFADevices",
    "iam:ResyncMFADevice",
    "sts:GetSessionToken",
    "iam:ChangePassword"
  ],
  "Resource": "*",
  "Condition": {
    "Bool": {
      "aws:MultiFactorAuthPresent": "false"
    }
  }
}

At first glance it is straightforward and innocuous: deny the user the ability to do anything except setting up MFA when the user does not have an MFA set. However, there are several key things you have to know:

1. The aws:MultiFactorAuthPresent key isn’t available under all circumstances. Even though it is a global condition context key, aws:MultiFactorAuthPresent is only present in requests made with temporary credentials, such as session tokens, and not in requests made with long-term credentials, such as the access keys used by the CLI or API. Oddly enough, when a user logs into the console, AWS generates a session token for the user behind the scenes. Documentation here.

2. Using the boolean operator Bool. Since Terraform used long-term credentials through access keys, the aws:MultiFactorAuthPresent key was not available for the policy to evaluate, even though the user had set up MFA. Thus, as far as our condition statement is concerned, a Terraform request can never show MFA as present.

3. Order of evaluation. AWS IAM evaluates DENY statements before any ALLOW statements. With an explicit deny (when we set "Effect": "Deny"), once a request meets ANY deny criteria, evaluation stops and the request is denied, even though we set the proper KMS permissions later on. Documentation here.

These three details are what caused our Terraform failures, even though we were successful in creating an EC2 instance with an encrypted EBS volume using the console.
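For readers who prefer to keep policies in HCL rather than JSON files, the deny statement from AWS's recommendation can also be written with the AWS provider's `aws_iam_policy_document` data source. This is a sketch, not the policy I deployed, and the data source name `deny_without_mfa` is illustrative. The comments spell out how the `IfExists` suffix treats a request that carries no MFA key at all.

```hcl
# Illustrative name; mirrors the "DenyAllExceptListedIfNoMFA" statement.
data "aws_iam_policy_document" "deny_without_mfa" {
  statement {
    sid    = "DenyAllExceptListedIfNoMFA"
    effect = "Deny"

    # Everything EXCEPT these actions is denied when the condition matches.
    not_actions = [
      "iam:CreateVirtualMFADevice",
      "iam:EnableMFADevice",
      "iam:GetUser",
      "iam:ListMFADevices",
      "iam:ListVirtualMFADevices",
      "iam:ResyncMFADevice",
      "sts:GetSessionToken",
    ]

    resources = ["*"]

    # With the IfExists suffix, a request whose context has no
    # aws:MultiFactorAuthPresent key (for example, one signed with
    # long-term access keys) also matches this condition, so the
    # explicit deny applies and overrides any Allow elsewhere.
    condition {
      test     = "BoolIfExists"
      variable = "aws:MultiFactorAuthPresent"
      values   = ["false"]
    }
  }
}
```

The rendered JSON is available as `data.aws_iam_policy_document.deny_without_mfa.json` and can be passed to the `policy` argument of an IAM policy resource.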

The Solution, part 1

First off, we need to ensure MFA is still in effect. While the policy below will not force Terraform to use MFA when making programmatic requests, it will not allow programmatic requests to be processed unless MFA is in place for console access. This worked for my purposes and organization, but it may not work for yours.

What’s important to note here is that in "Sid": "AllowMFAEnableWhenNoMFA" the statement is no longer an explicit deny, but a default deny. As you can see from the chart below, by replacing the DENY statement with ALLOW logic, we allow the request to move through the various stages of KMS policy, as opposed to a blanket DENY, which would immediately end the evaluation flow. This is important if you are using a customer managed CMK, where you only want certain users to be able to encrypt their EBS volumes. For my use case, I wanted all EC2 instances to have encrypted EBS volumes using the default region’s AWS managed CMK.

Policy evaluation flow. Source: AWS Documentation
/user-create/mfa_policy.json
...
{
  "Sid": "AllowMFAEnableWhenNoMFA",
  "Effect": "Allow",
  "Action": [
    "iam:CreateVirtualMFADevice",
    "iam:EnableMFADevice",
    "iam:GetUser",
    "iam:ListMFADevices",
    "iam:ListVirtualMFADevices",
    "iam:ResyncMFADevice",
    "sts:GetSessionToken",
    "iam:ChangePassword"
  ],
  "Resource": "*",
  "Condition": {
    "BoolIfExists": {
      "aws:MultiFactorAuthPresent": "false"
    }
  }
}

The Solution, part 2

Previously, we had included "Sid": "AllowServices" as part of the base user policy, but AWS best practice recommends breaking out policies by function. So I decided to leave the MFA requirement policy attached to each user, and to allow users to actually do things in AWS via group membership. The following policy is attached to a group:

/user-groups/group_policy.json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowServices",
      "Action": [
        "ec2:*",
        "s3:*"
      ],
      "Effect": "Allow",
      "Resource": "*",
      "Condition": {
        "BoolIfExists": {
          "aws:MultiFactorAuthPresent": "true"
        }
      }
    },
    {
      "Sid": "AllowKMSAttachment",
      "Action": [
        "kms:CreateGrant",
        "kms:ListGrants",
        "kms:RevokeGrant"
      ],
      "Effect": "Allow",
      "Resource": "*",
      "Condition": {
        "Bool": {
          "kms:GrantIsForAWSResource": true
        },
        "BoolIfExists": {
          "aws:MultiFactorAuthPresent": "true"
        }
      }
    },
    {
      "Sid": "AllowKMS",
      "Action": [
        "kms:Decrypt",
        "kms:Encrypt",
        "kms:RevokeGrant",
        "kms:GenerateDataKey",
        "kms:GenerateDataKeyPair",
        "kms:GenerateDataKeyWithoutPlaintext",
        "kms:DescribeKey",
        "kms:ReEncrypt*"
      ],
      "Effect": "Allow",
      "Resource": "arn:aws:kms:*:*:key/*",
      "Condition": {
        "BoolIfExists": {
          "aws:MultiFactorAuthPresent": "true"
        }
      }
    }
  ]
}
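Attaching this policy to a group and adding the user to it can be sketched in Terraform roughly as follows. The group name `ebs-users`, the resource names `ebs_users` and `allow_encrypted_ebs`, and the file location are assumptions for illustration; `aws_iam_user.basic_user` is the user resource defined in main.tf further down.

```hcl
# Hypothetical group that carries the "do things in AWS" permissions.
resource "aws_iam_group" "ebs_users" {
  name = "ebs-users"
}

# Inline policy on the group, read from the JSON file shown above.
# The path is an assumption; adjust it to where group_policy.json lives.
resource "aws_iam_group_policy" "allow_encrypted_ebs" {
  name   = "allow_encrypted_ebs"
  group  = aws_iam_group.ebs_users.name
  policy = file("${path.module}/group_policy.json")
}

# Add the user (who already carries the per-user MFA policy) to the group.
resource "aws_iam_user_group_membership" "basic_user" {
  user   = aws_iam_user.basic_user.name
  groups = [aws_iam_group.ebs_users.name]
}
```

Keeping the MFA policy on the user and the service permissions on the group means a user can be moved between groups without touching the MFA requirement.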

Miscellaneous notes

  • KMS CMK management is a two-way street when it comes to administration and use permissions. The user must be authorized to use the key via grants, which is controlled by the policy attached to the user. However, the CMK itself has a policy attached that governs who is able to administer it, given in the Principal section. Because my use case was for AWS managed default encryption keys, it allows all users to utilize the CMK for encryption. See more here (https://docs.aws.amazon.com/kms/latest/developerguide/key-policies.html#key-policy-default-allow-root-enable-iam).
  • You may have noticed that the AWS-provided policy listed the managed resource like so:
"Resource": "arn:aws:iam::*:user/${aws:username}",

But in my final policy version, the managed resource was listed as such:

"Resource": "arn:aws:iam::*:user/${username}",

The reason for the difference is in Terraform’s interpolation, which creates a cleaner way of generating the permission boundaries for this policy. The final version of the policy resource was created as such in main.tf:

/main.tf

data "template_file" "iam_policy" {
  template = file("${path.module}/mfa_policy.json")
  vars = {
    username = aws_iam_user.basic_user.name
  }
}

resource "aws_iam_user" "basic_user" {
  name          = var.user_name
  path          = "/"
  force_destroy = true
}

resource "aws_iam_user_policy" "mfa_requirement" {
  name   = "require_mfa"
  user   = aws_iam_user.basic_user.name
  policy = data.template_file.iam_policy.rendered
}
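On Terraform 0.12 and later, the built-in `templatefile` function can do the same rendering without a separate `template_file` data source. A minimal sketch, assuming the same `mfa_policy.json` file and `aws_iam_user.basic_user` resource as above; it would replace the data source and the user policy resource rather than sit alongside them:

```hcl
resource "aws_iam_user_policy" "mfa_requirement" {
  name = "require_mfa"
  user = aws_iam_user.basic_user.name

  # templatefile() reads the file and substitutes ${username} with the
  # value passed in the map, so no data source is needed.
  policy = templatefile("${path.module}/mfa_policy.json", {
    username = aws_iam_user.basic_user.name
  })
}
```

One thing to watch with either approach: any literal IAM policy variable left in the JSON file, such as `${aws:username}`, has to be escaped as `$${aws:username}` so Terraform does not try to interpolate it.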

Parting thoughts

It’s not uncommon for cloud practitioners to complain about AWS IAM management, because the devil is truly in the details: factors that work fine in other contexts can combine unexpectedly and result in non-functioning code. In future situations where I have to troubleshoot permissions, I would use this workflow:

  1. Check if the required permission was explicitly granted
  2. Check using the AWS Policy Simulator if the permission applies to the specified resource
  3. Evaluate the Actions allowed/disallowed by DENY statements
  4. Evaluate the Condition logic of any DENY statements

I expended a lot of time in a doom loop between steps 1 and 2 without ever moving on to 3 and 4 until a fresh pair of eyes mentioned that they could create the encrypted EBS EC2 instance when the MFA requirement policy was removed.

Thanks for reading! You can find the complete code at my GitLab repo!
