Terraform Cloud and Vault Provider Integration for Admin and Operator — AWS STS Assume Dynamic Credentials, Rotate Secret Engine Mounts Automatically, and Sentinel Workspace Policy Checks on Variable Modification and STS Assume

One of the challenges of delivering infrastructure in a multi-cloud, zero-trust world is developing a robust framework that lets operators develop at speed without being blocked, while letting admins control access to resources so operators don’t have access to the entire world. This article clarifies some of those aspects.

Trust Model

The admin workspaces create the necessary infrastructure for the operator workspace to consume. The operator is only allowed to provision ec2 resources: the generated STS assume-role credentials carry a policy, attached in the admin workspace, that allows ec2 only. The operator cannot modify this policy.

The data the operator workspace pulls is scoped to creating ec2-only instances meaning:

  • The approle role_id and secret_id have an auto generated policy from the Vault policy admin workspace scoped to only the secrets it needs. Example below.
path "auth/aws-creds-approle-ec2-only" {

Operation

The vault-policy workspace provisions Vault policies, approles, and secrets backends; the TFC manager workspace provisions the operator workspace; and the operator workspace consumes the values provided by the admin workspaces.

When the operator triggers the workspace, it generates dynamic AWS credentials by assuming the STS role that allows ec2-only provisioning. It first calls out to Vault to retrieve the backend and role needed to set the values for the AWS provider.
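As a minimal sketch of that flow, the operator workspace could wire its providers together roughly as follows. This is not the article's exact configuration: every variable name is hypothetical, the AppRole is assumed to be mounted at the default auth/approle path, and the region default is arbitrary.

```hcl
# Hypothetical variables; the TFC manager workspace would set these
# as (sensitive) workspace variables.
variable "vault_addr" { type = string }
variable "approle_role_id" {
  type      = string
  sensitive = true
}
variable "approle_secret_id" {
  type      = string
  sensitive = true
}
variable "aws_backend_path" { type = string } # Vault AWS secrets engine mount
variable "aws_backend_role" { type = string } # ec2-only role in that backend
variable "region" {
  type    = string
  default = "us-east-1" # assumption, pick your own
}

# Log in to Vault with the scoped AppRole credentials.
provider "vault" {
  address = var.vault_addr

  auth_login {
    path = "auth/approle/login" # assumes the default AppRole mount path
    parameters = {
      role_id   = var.approle_role_id
      secret_id = var.approle_secret_id
    }
  }
}

# Ask Vault for short-lived STS credentials for the ec2-only role.
data "vault_aws_access_credentials" "ec2_only" {
  backend = var.aws_backend_path
  role    = var.aws_backend_role
  type    = "sts"
}

# Feed the dynamic credentials to the AWS provider for this run only.
provider "aws" {
  region     = var.region
  access_key = data.vault_aws_access_credentials.ec2_only.access_key
  secret_key = data.vault_aws_access_credentials.ec2_only.secret_key
  token      = data.vault_aws_access_credentials.ec2_only.security_token
}
```

Nothing static is stored in the operator workspace beyond the AppRole identifiers, and those only unlock the ec2-only path.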

Sentinel Checks

These checks use the third-generation Sentinel common functions.

https://medium.com/hashicorp-engineering/using-new-sentinel-features-in-terraform-cloud-c1ade728cbb

If, for any reason, values change, we want the run to fail when Sentinel finds data that the admin considers insecure for the workspace.
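For orientation, a policy set that enforces two such checks could be declared in a sentinel.hcl file along these lines. The policy names, file paths, and the location of the common functions module are all hypothetical, and the enforcement level is an assumption.

```hcl
# sentinel.hcl (sketch). All names and paths below are hypothetical.

# Make the common functions available to the policies.
module "tfplan-functions" {
  source = "./common-functions/tfplan-functions/tfplan-functions.sentinel"
}

# Fails the run if the operator overrides the approle role_id variable.
policy "restrict-role-id-variable" {
  source            = "./restrict-role-id-variable.sentinel"
  enforcement_level = "hard-mandatory" # assumption
}

# Fails the run if the AWS provider assumes an unexpected role ARN.
policy "restrict-assumed-role" {
  source            = "./restrict-assumed-role.sentinel"
  enforcement_level = "hard-mandatory" # assumption
}
```

With hard-mandatory the run cannot proceed until the policy passes; soft-mandatory would instead let an authorized administrator review and override the failure.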

Variable check

The first check will look at the approle role_id that the TFC manager workspace pulls from Vault and sets as a sensitive value in the operator workspace. If an operator attempts to overwrite the role_id, Sentinel will enforce a mandatory policy check and won’t allow the run to continue until an administrator has a chance to check what’s going on.

# Import the tfplan/v2 import, but use the alias "tfplan"

Assume role check

The second check verifies that the assume role matches the ARN that we want for the workspace. This is a break-the-glass check in case Vault somehow fails: if the operator attempts to hardcode the ARN value, Vault should already throw an error, since the backend role is set for ec2-only.

# This policy restricts accounts that can be assumed by the AWS provider.

Workspaces

The vault policy admin workspace does the following:

  • Provisions an ec2-only IAM role from an assume role policy file.
  • Attaches an IAM role policy to the ec2-only IAM role.
  • Creates an AWS secret backend in Vault.
  • Creates an AWS secret backend role in Vault.
  • Creates a Vault auth backend for approle that attaches to an ec2-only policy.
  • Creates 3 specific Vault secret mounts:
  1. One to store ec2-only secrets for the operator workspace.
  2. One to store ec2-only-admin secrets for the TFC manager workspace.
  3. One to store secret paths for the TFC manager workspace.
  • Uses data template files to reference input values and render the output values that are written to Vault for the three mounts above.
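The first few steps of that list can be sketched as below. This is an illustration under assumptions rather than the article's actual code: resource names, mount paths, and JSON file names are hypothetical, and the policy body only shows the general shape of an ec2-only scope.

```hcl
# Hypothetical inputs: root credentials Vault uses to call AWS STS.
variable "aws_access_key" { type = string }
variable "aws_secret_key" {
  type      = string
  sensitive = true
}

# ec2-only IAM role built from an assume role policy file.
resource "aws_iam_role" "ec2_only" {
  name               = "ec2-only"
  assume_role_policy = file("${path.module}/assume-role-policy.json")
}

resource "aws_iam_role_policy" "ec2_only" {
  name   = "ec2-only"
  role   = aws_iam_role.ec2_only.id
  policy = file("${path.module}/ec2-only-policy.json")
}

# AWS secrets backend and a role that can only assume the ec2-only IAM role.
resource "vault_aws_secret_backend" "aws" {
  path       = "aws-ec2-only" # hypothetical mount path
  access_key = var.aws_access_key
  secret_key = var.aws_secret_key
}

resource "vault_aws_secret_backend_role" "ec2_only" {
  backend         = vault_aws_secret_backend.aws.path
  name            = "ec2-only"
  credential_type = "assumed_role"
  role_arns       = [aws_iam_role.ec2_only.arn]
}

# Vault policy scoped to generating STS credentials for that one role.
resource "vault_policy" "ec2_only" {
  name   = "ec2-only"
  policy = <<-EOT
    path "${vault_aws_secret_backend.aws.path}/sts/${vault_aws_secret_backend_role.ec2_only.name}" {
      capabilities = ["read", "update"]
    }
    # The Vault provider creates a child token by default.
    path "auth/token/create" {
      capabilities = ["update"]
    }
  EOT
}

# AppRole auth backend whose tokens carry only the ec2-only policy.
resource "vault_auth_backend" "approle" {
  type = "approle"
}

resource "vault_approle_auth_backend_role" "ec2_only" {
  backend        = vault_auth_backend.approle.path
  role_name      = "ec2-only"
  token_policies = [vault_policy.ec2_only.name]
}

resource "vault_approle_auth_backend_role_secret_id" "ec2_only" {
  backend   = vault_auth_backend.approle.path
  role_name = vault_approle_auth_backend_role.ec2_only.role_name
}
```

The role_id and secret_id produced here are what the admin workspaces later store in Vault for the TFC manager workspace to distribute.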

The TFC manager workspace does the following:

  1. Provisions the operator workspace.
  2. Pulls secret path data from Vault to write sensitive variable values to the operator workspace.
  3. This includes the approle role_id and secret_id that is scoped to a token that is only capable of retrieving secrets from the operator path.
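A rough sketch of those steps, assuming the secret paths are stored as key/value pairs and using hypothetical key names, workspace names, and variables:

```hcl
variable "tfc_org" { type = string }
variable "secret_paths_path" { type = string } # hypothetical: where the secret paths live

# Read the mount that stores the secret paths.
data "vault_generic_secret" "secret_paths" {
  path = var.secret_paths_path
}

# Follow the stored path to the approle data for the operator.
# "ec2_only_admin_path" is a hypothetical key name.
data "vault_generic_secret" "operator" {
  path = data.vault_generic_secret.secret_paths.data["ec2_only_admin_path"]
}

resource "tfe_workspace" "operator" {
  name         = "operator-ec2-only" # hypothetical
  organization = var.tfc_org
}

# Push the scoped approle identifiers into the operator workspace as
# sensitive Terraform variables.
resource "tfe_variable" "role_id" {
  workspace_id = tfe_workspace.operator.id
  key          = "approle_role_id"
  value        = data.vault_generic_secret.operator.data["role_id"]
  category     = "terraform"
  sensitive    = true
}

resource "tfe_variable" "secret_id" {
  workspace_id = tfe_workspace.operator.id
  key          = "approle_secret_id"
  value        = data.vault_generic_secret.operator.data["secret_id"]
  category     = "terraform"
  sensitive    = true
}
```

Marking the variables sensitive keeps the values write-only in the Terraform Cloud UI and API, although they still exist in the manager workspace's state.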

The operator workspace does the following:

  1. Pulls the AWS access credentials needed for STS assume from Vault for ec2-only.
  2. Generates short-lived, dynamic AWS credentials for the run.
  3. Only allows ec2 provisioning.
  4. Gives the operator no access beyond provisioning within the scope provided by the admin.

AWS STS Credentials

https://www.vaultproject.io/docs/secrets/aws

Each run of the operator workspace generates short-lived, dynamic AWS STS credentials scoped to the assume role set in the Vault policy admin workspace.

Vault policy admin workspace backend. Example below.

resource "vault_aws_secret_backend" "aws" {

Custom Operator Workspace Permissions

https://www.hashicorp.com/blog/introducing-custom-workspace-permissions

The following permissions are granted to the operator workspace, which allows them to edit variables and start runs.
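With the tfe provider, custom permissions of this kind can be expressed as in the sketch below. Only the run and variable permissions come from the description above; the remaining values, and both variable names, are assumptions, and run_tasks is only accepted by newer provider versions.

```hcl
variable "operator_team_id" { type = string }      # hypothetical
variable "operator_workspace_id" { type = string } # hypothetical

resource "tfe_team_access" "operators" {
  team_id      = var.operator_team_id
  workspace_id = var.operator_workspace_id

  permissions {
    runs              = "apply" # start runs; "plan" would allow queuing plans only
    variables         = "write" # edit variables
    state_versions    = "read-outputs" # assumption
    sentinel_mocks    = "none"         # assumption
    workspace_locking = false          # assumption
    run_tasks         = false          # assumption
  }
}
```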

TFC Manager Workspace (Admin)

The TFC manager workspace will create the operator workspace, pull the needed values for it from Vault, and set them as sensitive. Because of a chicken-and-egg problem of the workspace referencing itself, we create a separate secrets mount to store the secret paths so that we can then reference their values and push them to the Terraform variables.

https://registry.terraform.io/providers/hashicorp/tfe/latest/docs

data "vault_generic_secret" "secret-paths" {

Rotating Secret Paths

Since the Vault policy admin workspace generates the backends required for these operators, we can rotate the secret path names whenever we please. This is helpful if someone manages to grab the plain values of the secret paths for nefarious use. It may be paranoid, but rotating the mount names means a leaked path stops working, while the workspaces always look up the current path name from Vault.

For example: The path secret1234567890/ec2-only can become secret1gmf054gjka/ec2-only, and the workspaces can run off of triggers to automatically update the operator variables when a random_string is changed. As we see, we assign a random_string value to each secret backend so that its secret engine mount path can be rotated.
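One way to sketch this pattern, assuming a hypothetical rotation variable that an admin bumps whenever a new mount name is wanted:

```hcl
# Hypothetical input: change this value to force a new mount name.
variable "mount_rotation" {
  type    = string
  default = "1"
}

# Random suffix for the mount path; a new value is generated whenever
# the keeper changes.
resource "random_string" "ec2_only_mount" {
  length  = 10
  upper   = false
  special = false

  keepers = {
    rotation = var.mount_rotation
  }
}

# KV mount whose path follows the random suffix,
# e.g. secret<random>/ec2-only for the secrets stored under it.
resource "vault_mount" "ec2_only" {
  path = "secret${random_string.ec2_only_mount.result}"
  type = "kv"
}

# Expose the current path so it can be written to the secret-paths mount.
output "ec2_only_secret_path" {
  value = "${vault_mount.ec2_only.path}/ec2-only"
}
```

Because downstream workspaces read the path from Vault rather than hardcoding it, a rotation only requires the admin workspaces to run again.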

# generate random string to affix to secret engine used to store values that the operator workspace will consume

K/V Values Stored in Vault

The following secret information is used by the admin to set intended variable values. The admin workspaces write the appropriate values and set them. The operator workspace consumes the approle information that’s scoped to its secret path only — as well as only being allowed to generate AWS STS credentials scoped to ec2.

Wrapping & Scaling Up

The admin workspaces provide an admin workflow that allows administrators the ability to provision policy and secret data and propagate that to operator workspaces with a code change and automated Terraform Cloud run.

This model scales: an administrator can deliver an ec2-only provisioning workspace, or a workspace scoped to any other IAM role, to a developer by creating it from an admin workspace, with Vault as the broker for data retrieval and security.

Allowing Vault to generate dynamic, short-lived AWS STS credentials gives tightly controlled access per run without the need to store static keys, and scopes access to the permissions the operator’s run requires.

Accessing Vault with generated approle information allows for short-lived, rotating credentials whose IDs are referenced through data sources in Terraform code, and storing them in kv/ mounts allows data retrieval between workspaces using scoped approle information.

Setting a run trigger for the vault policy, TFC manager, and operator workspaces allows for admins to make changes that propagate automatically (if needed).
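A sketch of such a chain with the tfe provider, assuming the three workspace IDs are passed in as (hypothetical) variables and that changes flow from the Vault policy workspace to the TFC manager workspace to the operator workspace:

```hcl
variable "vault_policy_workspace_id" { type = string }
variable "tfc_manager_workspace_id" { type = string }
variable "operator_workspace_id" { type = string }

# A successful apply in the Vault policy workspace queues a run in the
# TFC manager workspace.
resource "tfe_run_trigger" "manager_follows_policy" {
  workspace_id  = var.tfc_manager_workspace_id
  sourceable_id = var.vault_policy_workspace_id
}

# A successful apply in the TFC manager workspace queues a run in the
# operator workspace.
resource "tfe_run_trigger" "operator_follows_manager" {
  workspace_id  = var.operator_workspace_id
  sourceable_id = var.tfc_manager_workspace_id
}
```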
