<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0"><channel><title><![CDATA[Julius Oh]]></title><description><![CDATA[Julius Oh]]></description><link>https://juliusoh.tech</link><generator>RSS for Node</generator><lastBuildDate>Mon, 13 Apr 2026 22:22:38 GMT</lastBuildDate><atom:link href="https://juliusoh.tech/rss.xml" rel="self" type="application/rss+xml"/><language><![CDATA[en]]></language><ttl>60</ttl><item><title><![CDATA[Deploying Applications to AWS ECS with Terraform: Infrastructure as Code Guide]]></title><description><![CDATA[Introduction
 Deploying containerized applications to AWS can be complex, involving multiple interconnected services: ECS task definitions, load balancers, target  groups, listener rules, and DNS configuration. While you could manage these with CLI c...]]></description><link>https://juliusoh.tech/deploying-applications-to-aws-ecs-with-terraform-infrastructure-as-code-guide</link><guid isPermaLink="true">https://juliusoh.tech/deploying-applications-to-aws-ecs-with-terraform-infrastructure-as-code-guide</guid><category><![CDATA[Devops]]></category><category><![CDATA[AWS]]></category><dc:creator><![CDATA[Julius Oh]]></dc:creator><pubDate>Wed, 01 Oct 2025 16:15:09 GMT</pubDate><content:encoded><![CDATA[<p>  <strong>Introduction</strong></p>
<p>Deploying containerized applications to AWS can be complex, involving multiple interconnected services: ECS task definitions, load balancers, target groups, listener rules, and DNS configuration. While you could manage these with CLI commands, <strong>Infrastructure as Code (IaC)</strong> with Terraform provides versioning, reproducibility, and team collaboration.</p>
<p>In this guide, I'll show you how to deploy any containerized application to AWS ECS with Application Load Balancer and custom domain configuration—all defined in Terraform.</p>
<p><strong>What We'll Build</strong></p>
<p>By the end of this tutorial, you'll have Terraform code that creates:</p>
<ul>
<li><p>✅ ECS Task Definition (containerized app blueprint)</p>
</li>
<li><p>✅ ECS Service (manages running containers)</p>
</li>
<li><p>✅ ALB Target Group (health checks &amp; routing)</p>
</li>
<li><p>✅ ALB Listener Rule (custom domain routing)</p>
</li>
<li><p>✅ Route53 DNS Record (points domain to ALB)</p>
</li>
<li><p>✅ CloudWatch Log Group (centralized logging)</p>
</li>
</ul>
<p><strong>Best part</strong>: All infrastructure is versioned, reviewable, and reproducible!</p>
<p><strong>Architecture Overview</strong></p>
<pre><code class="lang-plaintext">  User Request (https://myapp.yourdomain.com)
           ↓
  Route53 DNS (Terraform managed)
           ↓
  Application Load Balancer
           ↓
  ALB Listener Rule (Terraform managed)
           ↓
  Target Group (Terraform managed)
           ↓
  ECS Service (Terraform managed)
           ↓
  ECS Tasks (Docker Containers)
</code></pre>
<p>  <strong>Prerequisites</strong></p>
<ul>
<li><p>Terraform installed (v1.0+)</p>
</li>
<li><p>AWS CLI configured with credentials</p>
</li>
<li><p>Docker image in ECR or Docker Hub</p>
</li>
<li><p>Existing AWS infrastructure:</p>
<ul>
<li><p>VPC with subnets</p>
</li>
<li><p>ECS cluster</p>
</li>
<li><p>Application Load Balancer with an HTTPS listener (port 443)</p>
</li>
<li><p>Route53 hosted zone</p>
</li>
</ul>
</li>
</ul>
<p><strong>Project Structure</strong></p>
<pre><code class="lang-plaintext">  terraform/
  ├── main.tf                 # Main resource definitions
  ├── variables.tf            # Input variables
  ├── outputs.tf              # Output values
  ├── terraform.tfvars        # Variable values (gitignored)
  ├── versions.tf             # Provider versions
  └── data.tf                 # Data sources (existing resources)
</code></pre>
<p>  <strong>Step 1: Set Up Terraform Configuration</strong></p>
<p>  versions.tf - Provider Configuration</p>
<pre><code class="lang-plaintext"> terraform {

    required_version = "&gt;= 1.0"

    required_providers {

      aws = {

        source  = "hashicorp/aws"

        version = "~&gt; 5.0"

      }

    }

    # Optional: Remote state backend

    backend "s3" {

      bucket         = "my-terraform-state"

      key            = "ecs/my-app/terraform.tfstate"

      region         = "us-east-1"

      encrypt        = true

      dynamodb_table = "terraform-state-lock"

    }

  }

  provider "aws" {

    region = var.aws_region


    default_tags {

      tags = {

        Environment = var.environment

        Project     = var.project_name

        ManagedBy   = "Terraform"

      }

    }

  }
</code></pre>
<p>  <strong>Why remote backend?</strong></p>
<ul>
<li><p>✅ Team collaboration (shared state)</p>
</li>
<li><p>✅ State locking (prevents conflicts)</p>
</li>
<li><p>✅ Encryption at rest</p>
</li>
<li><p>✅ Version history (with S3 bucket versioning enabled)</p>
</li>
</ul>
<p>variables.tf - Input Variables</p>
<pre><code class="lang-plaintext">  variable "aws_region" {

    description = "AWS region"

    type        = string

    default     = "us-east-1"

  }

  variable "environment" {

    description = "Environment name (e.g., prod, staging)"

    type        = string

  }

  variable "project_name" {

    description = "Project name for resource naming"

    type        = string

  }

  variable "app_name" {

    description = "Application name"

    type        = string

  }

  variable "app_image" {

    description = "Docker image URL"

    type        = string

  }

  variable "app_port" {

    description = "Port the application listens on"

    type        = number

    default     = 8080

  }

  variable "app_cpu" {

    description = "CPU units for the task (1024 = 1 vCPU)"

    type        = number

    default     = 256

  }

  variable "app_memory" {

    description = "Memory for the task (MiB)"

    type        = number

    default     = 512

  }

  variable "desired_count" {

    description = "Number of tasks to run"

    type        = number

    default     = 2

  }

  variable "health_check_path" {

    description = "Health check endpoint"

    type        = string

    default     = "/"

  }

  variable "health_check_matcher" {

    description = "Expected HTTP status codes"

    type        = string

    default     = "200"

  }

  variable "domain_name" {

    description = "Custom domain for the app (e.g., myapp.example.com)"

    type        = string

  }

  variable "hosted_zone_name" {

    description = "Route53 hosted zone (e.g., example.com)"

    type        = string

  }

  variable "vpc_id" {

    description = "VPC ID where resources will be created"

    type        = string

  }

  variable "ecs_cluster_name" {

    description = "Name of existing ECS cluster"

    type        = string

  }

  variable "alb_arn" {

    description = "ARN of existing Application Load Balancer"

    type        = string

  }

  variable "alb_listener_arn" {

    description = "ARN of ALB HTTPS listener (port 443)"

    type        = string

  }

  variable "environment_variables" {

    description = "Environment variables for the container"

    type        = map(string)

    default     = {}

  }

  variable "secrets" {

    description = "Secrets from AWS Secrets Manager"

    type = list(object({

      name      = string

      valueFrom = string

    }))

    default = []

  }

</code></pre>
<p>  terraform.tfvars - Variable Values</p>
<pre><code class="lang-plaintext">  aws_region      = "us-east-1"

  environment     = "production"

  project_name    = "my-company"

  app_name        = "my-app"

  # Docker image

  app_image = "123456789012.dkr.ecr.us-east-1.amazonaws.com/my-app:latest"

  # Container configuration

  app_port        = 8080

  app_cpu         = 512

  app_memory      = 1024

  desired_count   = 2

  # Health check

  health_check_path    = "/health"

  health_check_matcher = "200,302"  # Accept 200 OK and 302 redirects

  # DNS

  domain_name       = "myapp.example.com"

  hosted_zone_name  = "example.com"

  # Existing infrastructure

  vpc_id              = "vpc-0123456789abcdef0"

  ecs_cluster_name    = "production-cluster"

  alb_arn             = "arn:aws:elasticloadbalancing:us-east-1:123456789012:loadbalancer/app/production-alb/abc123"

  alb_listener_arn    = "arn:aws:elasticloadbalancing:us-east-1:123456789012:listener/app/production-alb/abc123/def456"

  # Environment variables

  environment_variables = {

    ENV           = "production"

    LOG_LEVEL     = "info"

    PORT          = "8080"

  }

  # Secrets (stored in AWS Secrets Manager)

  secrets = [

    {

      name      = "DATABASE_URL"

      valueFrom = "arn:aws:secretsmanager:us-east-1:123456789012:secret:prod/database-url-abc123"

    },

    {

      name      = "API_KEY"

      valueFrom = "arn:aws:secretsmanager:us-east-1:123456789012:secret:prod/api-key-def456"

    }

  ]
</code></pre>
<p>  <strong>🔒 Important</strong>: Add terraform.tfvars to .gitignore - it contains sensitive values!</p>
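<p>  Most of the variables above are free-form, so a typo in terraform.tfvars would only surface at plan or apply time. A minimal sketch of validation blocks that reject obviously invalid values early. The two rules below are illustrative assumptions, not part of the original configuration, and these definitions would replace the matching ones in variables.tf:</p>
<pre><code class="lang-plaintext">  variable "app_port" {
    description = "Port the application listens on"
    type        = number
    default     = 8080

    # Assumed rule: only accept a valid TCP port number
    validation {
      condition     = var.app_port &gt; 0 &amp;&amp; var.app_port &lt;= 65535
      error_message = "The app_port value must be between 1 and 65535."
    }
  }

  variable "health_check_path" {
    description = "Health check endpoint"
    type        = string
    default     = "/"

    # Assumed rule: ALB health check paths start with a slash
    validation {
      condition     = substr(var.health_check_path, 0, 1) == "/"
      error_message = "The health_check_path value must start with a slash."
    }
  }
</code></pre>
<p>  With these in place, terraform plan fails with the error message as soon as a value breaks a rule, before any AWS API call is made.</p>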
<p>  <strong>Step 2: Data Sources for Existing Resources</strong></p>
<p>  data.tf - Query Existing Infrastructure</p>
<pre><code class="lang-plaintext"> # Get existing ECS cluster

  data "aws_ecs_cluster" "main" {

    cluster_name = var.ecs_cluster_name

  }

  # Get existing ALB

  data "aws_lb" "main" {

    arn = var.alb_arn

  }

  # Get Route53 hosted zone

  data "aws_route53_zone" "main" {

    name         = var.hosted_zone_name

    private_zone = false

  }

  # Get VPC

  data "aws_vpc" "main" {

    id = var.vpc_id

  }

  # Get current AWS account

  data "aws_caller_identity" "current" {}

  # Get current AWS region

  data "aws_region" "current" {}
</code></pre>
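<p>  The same approach can resolve the HTTPS listener, so you would not need to paste its ARN into terraform.tfvars. A minimal sketch, assuming the existing ALB has exactly one listener on port 443 (the data source name "https" is my own choice):</p>
<pre><code class="lang-plaintext">  # Hypothetical addition to data.tf: look up the HTTPS listener on the existing ALB
  data "aws_lb_listener" "https" {
    load_balancer_arn = data.aws_lb.main.arn
    port              = 443
  }
</code></pre>
<p>  If you go this route, data.aws_lb_listener.https.arn can stand in for var.alb_listener_arn wherever the listener is referenced.</p>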
<p>  <strong>Why use data sources?</strong></p>
<ul>
<li><p>✅ Reference existing infrastructure without hardcoding ARNs</p>
</li>
<li><p>✅ Validate resources exist before creating new ones</p>
</li>
<li><p>✅ Get dynamic values (like ALB DNS name)</p>
</li>
</ul>
<p><strong>Step 3: Main Infrastructure Resources</strong></p>
<p>main.tf - Core Resources</p>
<pre><code class="lang-plaintext"> # ============================================================

  # CloudWatch Log Group for Container Logs

  # ============================================================

  resource "aws_cloudwatch_log_group" "app" {

    name              = "/ecs/${var.environment}/${var.app_name}"

    retention_in_days = 30

    tags = {

      Name = "${var.environment}-${var.app_name}-logs"

    }

  }

  # ============================================================

  # IAM Role for ECS Task Execution

  # ============================================================

  resource "aws_iam_role" "ecs_task_execution" {

    name = "${var.environment}-${var.app_name}-ecs-task-execution"

    assume_role_policy = jsonencode({

      Version = "2012-10-17"

      Statement = [

        {

          Action = "sts:AssumeRole"

          Effect = "Allow"

          Principal = {

            Service = "ecs-tasks.amazonaws.com"

          }

        }

      ]

    })

  }

  resource "aws_iam_role_policy_attachment" "ecs_task_execution" {

    role       = aws_iam_role.ecs_task_execution.name

    policy_arn = "arn:aws:iam::aws:policy/service-role/AmazonECSTaskExecutionRolePolicy"

  }

  # Additional policy for Secrets Manager access

  resource "aws_iam_role_policy" "secrets_access" {

    count = length(var.secrets) &gt; 0 ? 1 : 0

    name = "${var.environment}-${var.app_name}-secrets-access"

    role = aws_iam_role.ecs_task_execution.id

    policy = jsonencode({

      Version = "2012-10-17"

      Statement = [

        {

          Effect = "Allow"

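          # kms:Decrypt applies to KMS keys, not to secret ARNs. If a secret uses a

          # customer-managed key, grant kms:Decrypt on that key's ARN separately.

          Action = ["secretsmanager:GetSecretValue"]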

          Resource = [for secret in var.secrets : secret.valueFrom]

        }

      ]

    })

  }

  # ============================================================

  # IAM Role for ECS Task (Application Runtime)

  # ============================================================

  resource "aws_iam_role" "ecs_task" {

    name = "${var.environment}-${var.app_name}-ecs-task"

    assume_role_policy = jsonencode({

      Version = "2012-10-17"

      Statement = [

        {

          Action = "sts:AssumeRole"

          Effect = "Allow"

          Principal = {

            Service = "ecs-tasks.amazonaws.com"

          }

        }

      ]

    })

  }

  # Add custom policies for your app (e.g., S3 access, DynamoDB, etc.)

  # resource "aws_iam_role_policy" "app_permissions" { ... }

  # ============================================================

  # ECS Task Definition

  # ============================================================

  resource "aws_ecs_task_definition" "app" {

    family                   = "${var.environment}-${var.app_name}"

    network_mode             = "bridge"  # Use "awsvpc" for Fargate

    requires_compatibilities = ["EC2"]   # Use ["FARGATE"] for Fargate

    cpu                      = var.app_cpu

    memory                   = var.app_memory

    execution_role_arn       = aws_iam_role.ecs_task_execution.arn

    task_role_arn            = aws_iam_role.ecs_task.arn

    container_definitions = jsonencode([

      {

        name      = var.app_name

        image     = var.app_image

        cpu       = var.app_cpu

        memory    = var.app_memory

        essential = true

        portMappings = [

          {

            containerPort = var.app_port

            hostPort      = 0  # Dynamic port mapping (use var.app_port for Fargate)

            protocol      = "tcp"

          }

        ]

        environment = [

          for key, value in var.environment_variables : {

            name  = key

            value = value

          }

        ]

        secrets = var.secrets

        logConfiguration = {

          logDriver = "awslogs"

          options = {

            "awslogs-group"         = aws_cloudwatch_log_group.app.name

            "awslogs-region"        = data.aws_region.current.name

            "awslogs-stream-prefix" = var.app_name

          }

        }

        # Runs inside the container, so the image must include curl

        healthCheck = {

          command = [

            "CMD-SHELL",

            "curl -f http://localhost:${var.app_port}${var.health_check_path} || exit 1"

          ]

          interval    = 30

          timeout     = 5

          retries     = 3

          startPeriod = 60

        }

      }

    ])

    tags = {

      Name = "${var.environment}-${var.app_name}"

    }

  }

  # ============================================================

  # ALB Target Group

  # ============================================================

  resource "aws_lb_target_group" "app" {

    name                 = "${var.environment}-${var.app_name}-tg"

    port                 = var.app_port

    protocol             = "HTTP"

    vpc_id               = var.vpc_id

    target_type          = "instance"  # Use "ip" for Fargate

    deregistration_delay = 30

    health_check {

      enabled             = true

      healthy_threshold   = 2

      unhealthy_threshold = 3

      timeout             = 5

      interval            = 30

      path                = var.health_check_path

      protocol            = "HTTP"

      matcher             = var.health_check_matcher

    }

    tags = {

      Name = "${var.environment}-${var.app_name}-tg"

    }

  }

  # ============================================================

  # ALB Listener Rule

  # ============================================================

  resource "aws_lb_listener_rule" "app" {

    listener_arn = var.alb_listener_arn

    priority     = 100  # Adjust as needed (lower = higher priority)

    action {

      type             = "forward"

      target_group_arn = aws_lb_target_group.app.arn

    }

    condition {

      host_header {

        values = [var.domain_name]

      }

    }

    tags = {

      Name = "${var.environment}-${var.app_name}-rule"

    }

  }

  # ============================================================

  # ECS Service

  # ============================================================

  resource "aws_ecs_service" "app" {

    name            = "${var.environment}-${var.app_name}"

    cluster         = data.aws_ecs_cluster.main.id

    task_definition = aws_ecs_task_definition.app.arn

    desired_count   = var.desired_count



    # Launch type (EC2 or FARGATE)

    launch_type = "EC2"  # Change to "FARGATE" if using Fargate

    # Deployment configuration

    deployment_maximum_percent         = 200

    deployment_minimum_healthy_percent = 50

    # Load balancer configuration

    load_balancer {

      target_group_arn = aws_lb_target_group.app.arn

      container_name   = var.app_name

      container_port   = var.app_port

    }

    # Placement constraints (EC2 only)

    placement_constraints {

      type = "distinctInstance"

    }

    # Depends on listener rule to avoid race condition

    depends_on = [aws_lb_listener_rule.app]

    tags = {

      Name = "${var.environment}-${var.app_name}"

    }

    lifecycle {

      ignore_changes = [desired_count]  # Allow manual scaling without Terraform drift

    }

  }

  # ============================================================

  # Route53 DNS Record

  # ============================================================

  resource "aws_route53_record" "app" {

    zone_id = data.aws_route53_zone.main.zone_id

    name    = var.domain_name

    type    = "A"

    alias {

      name                   = data.aws_lb.main.dns_name

      zone_id                = data.aws_lb.main.zone_id

      evaluate_target_health = false

    }

  }

</code></pre>
<p>  <strong>Step 4: Outputs</strong></p>
<p>  outputs.tf - Export Important Values</p>
<pre><code class="lang-plaintext">  output "task_definition_arn" {

    description = "ARN of the ECS task definition"

    value       = aws_ecs_task_definition.app.arn

  }

  output "service_name" {

    description = "Name of the ECS service"

    value       = aws_ecs_service.app.name

  }

  output "target_group_arn" {

    description = "ARN of the target group"

    value       = aws_lb_target_group.app.arn

  }

  output "cloudwatch_log_group" {

    description = "CloudWatch log group name"

    value       = aws_cloudwatch_log_group.app.name

  }

  output "app_url" {

    description = "Application URL"

    value       = "https://${var.domain_name}"

  }

  output "dns_name" {

    description = "DNS record created"

    value       = aws_route53_record.app.fqdn

  }
</code></pre>
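<p>  The ECS service above ignores changes to desired_count, which pairs naturally with letting AWS adjust the task count. A minimal sketch of target-tracking auto scaling for this service. The capacity limits (2 to 6 tasks), the 60% CPU target, and the resource names are assumptions for illustration only:</p>
<pre><code class="lang-plaintext">  # Register the service's desired count as a scalable target
  resource "aws_appautoscaling_target" "app" {
    service_namespace  = "ecs"
    scalable_dimension = "ecs:service:DesiredCount"
    resource_id        = "service/${var.ecs_cluster_name}/${aws_ecs_service.app.name}"
    min_capacity       = 2 # assumed lower bound
    max_capacity       = 6 # assumed upper bound
  }

  # Add or remove tasks to keep average CPU near the target value
  resource "aws_appautoscaling_policy" "cpu" {
    name               = "${var.environment}-${var.app_name}-cpu"
    policy_type        = "TargetTrackingScaling"
    service_namespace  = aws_appautoscaling_target.app.service_namespace
    scalable_dimension = aws_appautoscaling_target.app.scalable_dimension
    resource_id        = aws_appautoscaling_target.app.resource_id

    target_tracking_scaling_policy_configuration {
      target_value = 60 # assumed target, in percent

      predefined_metric_specification {
        predefined_metric_type = "ECSServiceAverageCPUUtilization"
      }
    }
  }
</code></pre>
<p>  On the EC2 launch type the cluster still needs enough instance capacity for the extra tasks, so treat this as one half of the scaling picture.</p>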
<p>  <strong>Step 5: Deploy Your Infrastructure</strong></p>
<p>  Initialize Terraform</p>
<pre><code class="lang-plaintext">cd terraform/
terraform init
</code></pre>
<p>  <strong>What this does</strong>:</p>
<ul>
<li><p>Downloads AWS provider plugins</p>
</li>
<li><p>Initializes remote backend (if configured)</p>
</li>
<li><p>Prepares working directory</p>
</li>
</ul>
<p>Plan Changes</p>
<pre><code class="lang-plaintext">terraform plan -out=tfplan
</code></pre>
<p><strong>What this shows</strong>:</p>
<ul>
<li><p>Resources to be created (green +)</p>
</li>
<li><p>Resources to be modified (yellow ~)</p>
</li>
<li><p>Resources to be destroyed (red -)</p>
</li>
<li><p>Total changes</p>
</li>
</ul>
<p><strong>Review carefully!</strong> This is your preview before applying changes.</p>
<p>Apply Changes</p>
<pre><code class="lang-plaintext">terraform apply tfplan
</code></pre>
<p>  <strong>What happens</strong>:</p>
<ol>
<li><p>Creates CloudWatch log group</p>
</li>
<li><p>Creates IAM roles and policies</p>
</li>
<li><p>Registers ECS task definition</p>
</li>
<li><p>Creates ALB target group</p>
</li>
<li><p>Creates ALB listener rule</p>
</li>
<li><p>Creates ECS service (launches containers)</p>
</li>
<li><p>Creates Route53 DNS record</p>
</li>
</ol>
<p><strong>Typical completion time</strong>: 2-5 minutes</p>
<p>Verify Deployment</p>
<pre><code class="lang-plaintext">  # Check outputs

  terraform output

  # Check service status

  aws ecs describe-services \

    --cluster production-cluster \

    --services $(terraform output -raw service_name)

  # Check target health

  aws elbv2 describe-target-health \

    --target-group-arn $(terraform output -raw target_group_arn)

  # View logs

  aws logs tail $(terraform output -raw cloudwatch_log_group) --follow
</code></pre>
<p>  <strong>Advanced: Terraform Modules</strong></p>
<p>  For reusability across multiple apps, create a module:</p>
<p>  Module Structure</p>
<pre><code class="lang-plaintext">  modules/

  └── ecs-app/

      ├── main.tf

      ├── variables.tf

      ├── outputs.tf

      └── README.md

  environments/

  ├── production/

  │   ├── main.tf          # Uses module

  │   ├── variables.tf

  │   └── terraform.tfvars

  └── staging/

      ├── main.tf

      ├── variables.tf

      └── terraform.tfvars
</code></pre>
<p>  modules/ecs-app/main.tf</p>
<p>  Move all resources from the previous main.tf into this module.</p>
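<p>  If the resources were already applied from the flat layout, moving them into a module changes their addresses, and Terraform would plan to destroy and recreate them. A minimal sketch of moved blocks that tell Terraform the resources were only renamed. This assumes Terraform 1.1 or later (newer than the 1.0 minimum in versions.tf), that the environment keeps using the same state, and that the module is called as "my_app":</p>
<pre><code class="lang-plaintext">  # environments/production/main.tf
  # One moved block per resource that now lives inside the module
  moved {
    from = aws_ecs_service.app
    to   = module.my_app.aws_ecs_service.app
  }

  moved {
    from = aws_lb_target_group.app
    to   = module.my_app.aws_lb_target_group.app
  }
</code></pre>
<p>  The remaining resources follow the same pattern. After a successful apply, the next terraform plan should report no changes for the moved resources.</p>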
<p>  environments/production/main.tf - Use Module</p>
<pre><code class="lang-plaintext">  module "my_app" {

    source = "../../modules/ecs-app"

    aws_region       = var.aws_region

    environment      = "production"

    project_name     = "my-company"

    app_name         = "my-app"

    app_image        = "123456789012.dkr.ecr.us-east-1.amazonaws.com/my-app:v1.2.3"

    app_port         = 8080

    desired_count    = 3

    domain_name       = "myapp.example.com"

    hosted_zone_name  = "example.com"



    vpc_id            = data.aws_vpc.main.id

    ecs_cluster_name  = "production-cluster"

    alb_arn           = data.aws_lb.production.arn

    alb_listener_arn  = data.aws_lb_listener.https.arn



    environment_variables = {

      ENV = "production"

    }

  }

  module "another_app" {

    source = "../../modules/ecs-app"



    # Different configuration for another app (each app needs its own listener rule priority)

    app_name = "api-service"

    app_port = 3000

    # ...

  }
</code></pre>
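<p>  One detail to watch when calling the module more than once: the listener rule in main.tf hardcodes priority = 100, and an ALB listener does not allow two rules with the same priority. A minimal sketch of exposing the priority as a module input. The variable name listener_rule_priority is hypothetical:</p>
<pre><code class="lang-plaintext">  # modules/ecs-app/variables.tf (hypothetical addition)
  variable "listener_rule_priority" {
    description = "Priority of the ALB listener rule (must be unique per listener)"
    type        = number
  }

  # modules/ecs-app/main.tf: same rule as before, with the priority passed in
  resource "aws_lb_listener_rule" "app" {
    listener_arn = var.alb_listener_arn
    priority     = var.listener_rule_priority

    action {
      type             = "forward"
      target_group_arn = aws_lb_target_group.app.arn
    }

    condition {
      host_header {
        values = [var.domain_name]
      }
    }
  }
</code></pre>
<p>  Each module call would then set its own value, for example listener_rule_priority = 100 for one app and 110 for the next.</p>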
<p>  <strong>Benefits</strong>:</p>
<ul>
<li><p>✅ Deploy multiple apps with same pattern</p>
</li>
<li><p>✅ Consistent configuration across environments</p>
</li>
<li><p>✅ Easy to maintain and update</p>
</li>
<li><p>✅ Reusable across projects</p>
</li>
</ul>
<p><strong>Deployment Workflow with CI/CD</strong></p>
<p>GitHub Actions Example</p>
<p>.github/workflows/deploy.yml:</p>
<pre><code class="lang-plaintext">  name: Deploy to ECS

  on:

    push:

      branches: [main]

  env:

    AWS_REGION: us-east-1

    ECR_REPOSITORY: my-app

  jobs:

    deploy:

      runs-on: ubuntu-latest

      steps:

        - name: Checkout code

          uses: actions/checkout@v3

        - name: Configure AWS credentials

          uses: aws-actions/configure-aws-credentials@v2

          with:

            aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}

            aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}

            aws-region: ${{ env.AWS_REGION }}

        - name: Login to Amazon ECR

          id: login-ecr

          uses: aws-actions/amazon-ecr-login@v1

        - name: Build, tag, and push image

          env:

            ECR_REGISTRY: ${{ steps.login-ecr.outputs.registry }}

            IMAGE_TAG: ${{ github.sha }}

          run: |

            docker build -t $ECR_REGISTRY/$ECR_REPOSITORY:$IMAGE_TAG .

            docker push $ECR_REGISTRY/$ECR_REPOSITORY:$IMAGE_TAG

            echo "IMAGE=$ECR_REGISTRY/$ECR_REPOSITORY:$IMAGE_TAG" &gt;&gt; $GITHUB_ENV

        - name: Setup Terraform

          uses: hashicorp/setup-terraform@v2

          with:

            terraform_version: 1.5.0

        - name: Terraform Init

          working-directory: ./terraform

          run: terraform init

        - name: Terraform Plan

          working-directory: ./terraform

          run: |

            terraform plan \

              -var="app_image=${{ env.IMAGE }}" \

              -out=tfplan

        - name: Terraform Apply

          working-directory: ./terraform

          run: terraform apply -auto-approve tfplan

        - name: Wait for deployment

          run: |

            aws ecs wait services-stable \

              --cluster production-cluster \

              --services production-my-app
</code></pre>
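<p>  The last workflow step only reports whether the service stabilized. As a complement, ECS can stop and roll back a failed rolling deployment by itself. A minimal sketch of the extra arguments inside the existing aws_ecs_service.app resource, assuming the default rolling deployment type (the 60-second grace period is an illustrative value, not a recommendation):</p>
<pre><code class="lang-plaintext">  resource "aws_ecs_service" "app" {
    # ... existing arguments from main.tf stay as they are ...

    # Give new tasks time to start before failed ALB health checks replace them
    health_check_grace_period_seconds = 60

    # Mark a deployment as failed when tasks cannot reach a steady state,
    # then return to the last deployment that completed successfully
    deployment_circuit_breaker {
      enable   = true
      rollback = true
    }
  }
</code></pre>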
<p>  Deployment Process</p>
<ol>
<li><p><strong>Push to main branch</strong> → Triggers workflow</p>
</li>
<li><p><strong>Build Docker image</strong> → Tag with Git SHA</p>
</li>
<li><p><strong>Push to ECR</strong> → Store image</p>
</li>
<li><p><strong>Terraform plan</strong> → Show changes</p>
</li>
<li><p><strong>Terraform apply</strong> → Update infrastructure</p>
</li>
<li><p><strong>Wait for stability</strong> → Ensure deployment succeeds</p>
</li>
</ol>
<p><strong>Managing Updates</strong></p>
<p>Update Application Code</p>
<h1 id="heading-1-update-appimage-in-terraformtfvars">1. Update app_image in terraform.tfvars</h1>
<pre><code class="lang-plaintext">  app_image = "123456789012.dkr.ecr.us-east-1.amazonaws.com/my-app:v1.2.4"
</code></pre>
<h1 id="heading-2-plan-and-apply">2. Plan and apply</h1>
<pre><code class="lang-plaintext">  terraform plan -out=tfplan

  terraform apply tfplan
</code></pre>
<p>  <strong>What Terraform does</strong>:</p>
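<ul>
<li><p>Creates a new task definition revision (for example, :2)</p>
</li>
<li><p>Updates the ECS service to use the new task definition</p>
</li>
<li><p>ECS performs a rolling deployment (no downtime, provided the new tasks pass their health checks)</p>
</li>
</ul>
<p>Scale Application</p>
<h1 id="heading-update-desiredcount-in-terraformtfvars">Update desired_count in terraform.tfvars</h1>
<pre><code class="lang-plaintext">  desired_count = 5
</code></pre>
<h1 id="heading-apply-changes">Apply changes</h1>
<pre><code class="lang-plaintext">  terraform apply -var="desired_count=5"
</code></pre>
<p><strong>Note</strong>: The ECS service defined earlier sets ignore_changes = [desired_count], so Terraform uses desired_count only when it first creates the service. To scale an existing service through Terraform, remove desired_count from ignore_changes; otherwise, scale outside Terraform, for example with aws ecs update-service --desired-count 5.</p>
<p>Update Environment Variables</p>
<h1 id="heading-edit-terraformtfvars">Edit terraform.tfvars</h1>
<pre><code class="lang-plaintext">  environment_variables = {
    ENV              = "production"
    LOG_LEVEL        = "debug"  # Changed
    NEW_FEATURE_FLAG = "true"   # Added
  }
</code></pre>
<h1 id="heading-apply">Apply</h1>
<pre><code class="lang-plaintext">  terraform apply
</code></pre>
<p><strong>Important</strong>: Changing environment variables creates a new task definition revision and triggers a deployment.</p>
<p><strong>State Management Best Practices</strong></p>
<p>Remote State with S3</p>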
<pre><code class="lang-plaintext">  terraform {

    backend "s3" {

      bucket         = "my-company-terraform-state"

      key            = "production/ecs/my-app/terraform.tfstate"

      region         = "us-east-1"

      encrypt        = true

      dynamodb_table = "terraform-state-lock"

      # Enable versioning on S3 bucket

      # Enable encryption

      # Enable bucket logging

    }

  }
</code></pre>
<p>  Create State Backend</p>
<h1 id="heading-create-s3-bucket">Create S3 bucket</h1>
<pre><code class="lang-plaintext">  aws s3 mb s3://my-company-terraform-state --region us-east-1
</code></pre>
<h1 id="heading-enable-versioning">Enable versioning</h1>
<pre><code class="lang-plaintext">  aws s3api put-bucket-versioning \

    --bucket my-company-terraform-state \

    --versioning-configuration Status=Enabled
</code></pre>
<h1 id="heading-enable-encryption">Enable encryption</h1>
<pre><code class="lang-plaintext">  aws s3api put-bucket-encryption \

    --bucket my-company-terraform-state \

    --server-side-encryption-configuration '{

      "Rules": [{

        "ApplyServerSideEncryptionByDefault": {

          "SSEAlgorithm": "AES256"

        }

      }]

    }'
</code></pre>
<h1 id="heading-create-dynamodb-table-for-locking">Create DynamoDB table for locking</h1>
<pre><code class="lang-plaintext">  aws dynamodb create-table \

    --table-name terraform-state-lock \

    --attribute-definitions AttributeName=LockID,AttributeType=S \

    --key-schema AttributeName=LockID,KeyType=HASH \

    --billing-mode PAY_PER_REQUEST
</code></pre>
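<p>  If you would rather keep the backend itself in code, the same four CLI steps can be sketched in Terraform. This assumes a separate bootstrap configuration that is applied once with local state, since the bucket cannot store the state of its own creation. The resource names tf_state and tf_lock are my own:</p>
<pre><code class="lang-plaintext">  # State bucket (same name as in the CLI commands above)
  resource "aws_s3_bucket" "tf_state" {
    bucket = "my-company-terraform-state"
  }

  # Keep previous state versions for recovery
  resource "aws_s3_bucket_versioning" "tf_state" {
    bucket = aws_s3_bucket.tf_state.id

    versioning_configuration {
      status = "Enabled"
    }
  }

  # Encrypt state objects at rest with S3-managed keys
  resource "aws_s3_bucket_server_side_encryption_configuration" "tf_state" {
    bucket = aws_s3_bucket.tf_state.id

    rule {
      apply_server_side_encryption_by_default {
        sse_algorithm = "AES256"
      }
    }
  }

  # Lock table; the S3 backend expects a string hash key named LockID
  resource "aws_dynamodb_table" "tf_lock" {
    name         = "terraform-state-lock"
    billing_mode = "PAY_PER_REQUEST"
    hash_key     = "LockID"

    attribute {
      name = "LockID"
      type = "S"
    }
  }
</code></pre>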
<p>  State Commands</p>
<h1 id="heading-view-current-state">View current state</h1>
<pre><code class="lang-plaintext">  terraform state list
</code></pre>
<h1 id="heading-show-specific-resource">Show specific resource</h1>
<pre><code class="lang-plaintext">  terraform state show aws_ecs_service.app
</code></pre>
<h1 id="heading-import-existing-resource">Import existing resource</h1>
<pre><code class="lang-plaintext">  terraform import aws_ecs_service.app arn:aws:ecs:...
</code></pre>
<h1 id="heading-remove-resource-from-state-doesnt-delete">Remove resource from state (doesn't delete)</h1>
<pre><code class="lang-plaintext">  terraform state rm aws_ecs_service.app
</code></pre>
<h1 id="heading-move-resource-to-different-address">Move resource to different address</h1>
<pre><code class="lang-plaintext">  terraform state mv aws_ecs_service.app aws_ecs_service.renamed
</code></pre>
<p>  <strong>Troubleshooting Common Issues</strong></p>
<p>  Issue 1: Task Definition Already Exists</p>
<p>  <strong>Error</strong>:</p>
<p>  Error: creating ECS Task Definition: ClientException: Family already exists</p>
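<p>  <strong>Solution</strong>: Import the existing task definition by its full ARN (including the revision number), or use a different family name</p>
<pre><code class="lang-plaintext">  terraform import aws_ecs_task_definition.app arn:aws:ecs:us-east-1:123456789012:task-definition/production-my-app:1
</code></pre>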
<p>  Issue 2: Target Group In Use</p>
<p>  <strong>Error</strong>:</p>
<p>  Error: deleting Target Group: ResourceInUse: Target group is in use</p>
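<p>  <strong>Solution</strong>: Remove the listener rule first, then the target group. Terraform orders this automatically because the listener rule references the target group's ARN, which creates an implicit dependency.</p>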
<p>  Issue 3: Service Won't Stabilize</p>
<p>  <strong>Symptoms</strong>: Terraform times out waiting for service to become stable</p>
<p>  <strong>Check</strong>:</p>
<h1 id="heading-service-events">Service events</h1>
<pre><code class="lang-plaintext">  aws ecs describe-services --cluster production-cluster --services production-my-app
</code></pre>
<h1 id="heading-target-health">Target health</h1>
<pre><code class="lang-plaintext">  aws elbv2 describe-target-health --target-group-arn arn:...
</code></pre>
<h1 id="heading-container-logs">Container logs</h1>
<pre><code class="lang-plaintext">  aws logs tail /ecs/production/my-app --follow
</code></pre>
<p>  <strong>Common causes</strong>:</p>
<ul>
<li><p>Health check path returning wrong status code</p>
</li>
<li><p>Container port mismatch</p>
</li>
<li><p>Security group blocking traffic (with dynamic host ports, the container instances must accept the ephemeral port range from the ALB)</p>
</li>
<li><p>Container crashing on startup</p>
</li>
</ul>
<p>Issue 4: DNS Not Resolving</p>
<p><strong>Check</strong>:</p>
<h1 id="heading-verify-record-created">Verify record created</h1>
<pre><code class="lang-plaintext"> aws route53 list-resource-record-sets \

    --hosted-zone-id Z1234567890ABC \

    --query "ResourceRecordSets[?Name=='myapp.example.com.']"
</code></pre>
<h1 id="heading-test-dns-resolution">Test DNS resolution</h1>
<pre><code class="lang-plaintext">  dig myapp.example.com
  nslookup myapp.example.com
</code></pre>
<p>  <strong>Cost Optimization</strong></p>
<ol>
<li><p>Right-Size Resources</p>
<h1 id="heading-monitor-cloudwatch-metrics">Monitor CloudWatch metrics</h1>
<h1 id="heading-adjust-based-on-actual-usage">Adjust based on actual usage</h1>
<pre><code class="lang-plaintext">  app_cpu    = 256   # Start small
  app_memory = 512   # Increase if needed
</code></pre>
</li>
<li><p>Use Spot Instances (Non-Production)</p>
<h1 id="heading-in-ecs-capacity-provider">In ECS capacity provider</h1>
<pre><code class="lang-plaintext">  capacity_providers = ["FARGATE_SPOT"]
</code></pre>
<h1 id="heading-70-cheaper-than-fargate-on-demand">Up to 70% cheaper than Fargate on-demand</h1>
</li>
<li><p>Log Retention</p>
</li>
</ol>
<pre><code class="lang-plaintext">  resource "aws_cloudwatch_log_group" "app" {

    retention_in_days = 7  # vs 30 or 90

  }
</code></pre>
<ol start="4">
<li><p>Cleanup Unused Resources</p>
<h1 id="heading-remove-old-task-definition-revisions">Remove old task definition revisions</h1>
</li>
</ol>
<pre><code class="lang-plaintext">  aws ecs list-task-definitions --family-prefix production-my-app --status INACTIVE
</code></pre>
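<p>  For the Spot suggestion in item 2, the capacity_providers setting belongs to the cluster's capacity provider configuration. A minimal sketch, assuming you run on Fargate rather than the EC2 launch type used in this guide, and that nothing else manages capacity providers for this cluster (this resource takes over that setting, and the weights are illustrative):</p>
<pre><code class="lang-plaintext">  # Attach Fargate capacity providers to the existing cluster
  resource "aws_ecs_cluster_capacity_providers" "main" {
    cluster_name       = var.ecs_cluster_name
    capacity_providers = ["FARGATE", "FARGATE_SPOT"]

    # Services that do not set their own strategy default to Spot
    default_capacity_provider_strategy {
      capacity_provider = "FARGATE_SPOT"
      weight            = 1
    }
  }
</code></pre>
<p>  A service that should use this would drop launch_type and rely on the default strategy, or declare its own capacity_provider_strategy block. Spot tasks can be interrupted, which is why the heading limits this to non-production.</p>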
<p>  <strong>Security Best Practices</strong></p>
<ol>
<li><p>Use Secrets Manager</p>
<p>Never put secrets in plain environment variables! Use the <code>secrets</code> parameter of the container definition instead:</p>
</li>
</ol>
<pre><code class="lang-plaintext">  secrets = [
    {
      name      = "DATABASE_PASSWORD"
      valueFrom = aws_secretsmanager_secret.db_password.arn
    }
  ]
</code></pre>
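<p>For the <code>secrets</code> block above to work, the task execution role has to be allowed to read the secret. A minimal sketch, assuming the execution role is defined elsewhere as <code>aws_iam_role.ecs_execution</code> (a hypothetical name, as is the policy name):</p>
<pre><code class="lang-plaintext">  # Sketch: let the ECS task execution role fetch the secret at container start.
  # aws_iam_role.ecs_execution and the policy name are assumptions.
  resource "aws_iam_role_policy" "read_db_password" {
    name = "read-db-password"
    role = aws_iam_role.ecs_execution.id

    policy = jsonencode({
      Version = "2012-10-17"
      Statement = [
        {
          Effect   = "Allow"
          Action   = ["secretsmanager:GetSecretValue"]
          Resource = [aws_secretsmanager_secret.db_password.arn]
        }
      ]
    })
  }
</code></pre>
<p>If the secret is encrypted with a customer-managed KMS key, the same role also needs <code>kms:Decrypt</code> on that key.</p>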
<ol start="2">
<li><p>Least Privilege IAM</p>
<p>Give the task role only the permissions the app needs:</p>
</li>
</ol>
<pre><code class="lang-plaintext">  resource "aws_iam_role_policy" "app" {
    # Assumes a task role named aws_iam_role.task is defined elsewhere
    role = aws_iam_role.task.id

    policy = jsonencode({
      Version = "2012-10-17"
      Statement = [
        {
          Effect   = "Allow"
          Action   = ["s3:GetObject"]
          Resource = ["arn:aws:s3:::my-bucket/*"]
        }
      ]
    })
  }
</code></pre>
<ol start="3">
<li>Enable Container Insights</li>
</ol>
<pre><code class="lang-plaintext">  resource "aws_ecs_cluster" "main" {
    setting {
      name  = "containerInsights"
      value = "enabled"
    }
  }
</code></pre>
<ol start="4">
<li><p>Network Isolation</p>
<p>Use the <code>awsvpc</code> network mode with private subnets:</p>
</li>
</ol>
<pre><code class="lang-plaintext">  # In aws_ecs_task_definition
  network_mode = "awsvpc"

  # In aws_ecs_service
  network_configuration {
    subnets          = var.private_subnet_ids
    security_groups  = [aws_security_group.app.id]
    assign_public_ip = false
  }
</code></pre>
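<p>The security group referenced above does most of the isolation work. As a rough sketch, it could allow inbound traffic only from the load balancer's security group. <code>var.vpc_id</code>, <code>var.container_port</code>, and <code>aws_security_group.alb</code> are assumed names for illustration.</p>
<pre><code class="lang-plaintext">  # Sketch: tasks accept traffic only from the ALB security group.
  # var.vpc_id, var.container_port, and aws_security_group.alb are assumptions.
  resource "aws_security_group" "app" {
    name_prefix = "my-app-"
    vpc_id      = var.vpc_id

    ingress {
      description     = "App traffic from the load balancer only"
      from_port       = var.container_port
      to_port         = var.container_port
      protocol        = "tcp"
      security_groups = [aws_security_group.alb.id]
    }

    egress {
      description = "Outbound for image pulls, logs, and AWS APIs"
      from_port   = 0
      to_port     = 0
      protocol    = "-1"
      cidr_blocks = ["0.0.0.0/0"]
    }
  }
</code></pre>
<p>Tasks in private subnets still need a route out (a NAT gateway or VPC endpoints) to pull images and ship logs.</p>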
<p>  <strong>Terraform vs AWS CLI: Comparison</strong></p>
<pre><code class="lang-plaintext">  | Aspect             | AWS CLI                  | Terraform                               |
  |--------------------|--------------------------|-----------------------------------------|
  | Reproducibility    | Manual re-execution      | Declarative, version-controlled         |
  | Team Collaboration | Difficult (manual docs)  | Easy (code review, shared state)        |
  | Rollback           | Manual, error-prone      | terraform apply previous version        |
  | Drift Detection    | None                     | terraform plan shows drift              |
  | Dependencies       | Manual ordering          | Automatic dependency graph              |
  | Documentation      | Separate                 | Infrastructure as code IS documentation |
  | Learning Curve     | Moderate (many commands) | Steeper initially, easier long-term     |
  | Multi-cloud        | AWS only                 | Works across providers                  |
</code></pre>
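<p>The "shared state" entry in the table depends on a remote backend. A minimal sketch of an S3 backend with DynamoDB locking follows; the bucket, key, region, and table names are placeholders, and both the bucket and the table are assumed to exist already.</p>
<pre><code class="lang-plaintext">  # Sketch: remote state in S3 with DynamoDB locking.
  # All names below are placeholders.
  terraform {
    backend "s3" {
      bucket         = "my-terraform-state"            # assumed bucket name
      key            = "ecs/my-app/terraform.tfstate"  # assumed state path
      region         = "us-east-1"                     # assumed region
      dynamodb_table = "terraform-locks"               # assumed lock table
      encrypt        = true
    }
  }
</code></pre>
<p>The lock table needs a string partition key named <code>LockID</code>. Run <code>terraform init</code> after adding or changing a backend block.</p>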
<p>  <strong>Key Takeaways</strong></p>
<ol>
<li><p><strong>Infrastructure as Code</strong>: Terraform makes infrastructure reproducible, version-controlled, and collaborative</p>
</li>
<li><p><strong>Modules</strong>: Reuse common patterns across multiple apps and environments</p>
</li>
<li><p><strong>Remote State</strong>: Essential for team collaboration and state locking</p>
</li>
<li><p><strong>Variables</strong>: Separate configuration from code for environment-specific values</p>
</li>
<li><p><strong>Secrets Management</strong>: Never hardcode secrets - use Secrets Manager</p>
</li>
<li><p><strong>CI/CD Integration</strong>: Automate deployments with GitHub Actions or similar</p>
</li>
<li><p><strong>Incremental Adoption</strong>: Existing resources can be brought under management with <code>terraform import</code></p>
</li>
</ol>
<p><strong>Next Steps</strong></p>
<ol>
<li><p>Set up remote state backend (S3 + DynamoDB)</p>
</li>
<li><p>Create reusable modules for common patterns</p>
</li>
<li><p>Implement CI/CD pipeline with Terraform</p>
</li>
<li><p>Add auto-scaling with Application Auto Scaling</p>
</li>
<li><p>Enable Container Insights for monitoring</p>
</li>
<li><p>Implement blue/green deployments with CodeDeploy</p>
</li>
<li><p>Add WAF rules to ALB for security</p>
</li>
</ol>
<p><strong>Resources</strong></p>
<ul>
<li><p><a target="_blank" href="https://registry.terraform.io/providers/hashicorp/aws/latest/docs">https://registry.terraform.io/providers/hashicorp/aws/latest/docs</a></p>
</li>
<li><p><a target="_blank" href="https://docs.aws.amazon.com/AmazonECS/latest/bestpracticesguide/">https://docs.aws.amazon.com/AmazonECS/latest/bestpracticesguide/</a></p>
</li>
<li><p><a target="_blank" href="https://www.terraform-best-practices.com/">https://www.terraform-best-practices.com/</a></p>
</li>
<li><p><a target="_blank" href="https://github.com/terraform-aws-modules/terraform-aws-ecs">https://github.com/terraform-aws-modules/terraform-aws-ecs</a></p>
</li>
</ul>
<hr />
<p>  <strong>Questions or suggestions?</strong> Drop a comment below!</p>
<p>  <strong>Found this helpful?</strong> Share with your team and follow for more DevOps content!</p>
]]></content:encoded></item><item><title><![CDATA[Building a Secure DevOps Pipeline: Managing Secrets with GCP KMS and Terraform]]></title><description><![CDATA[Introduction
In modern DevOps practices, managing secrets securely is one of the most critical aspects of infrastructure management. Whether it's API keys, database credentials, or JWT private keys, these sensitive pieces of information need to be ha...]]></description><link>https://juliusoh.tech/building-a-secure-devops-pipeline-managing-secrets-with-gcp-kms-and-terraform</link><guid isPermaLink="true">https://juliusoh.tech/building-a-secure-devops-pipeline-managing-secrets-with-gcp-kms-and-terraform</guid><category><![CDATA[Devops]]></category><category><![CDATA[GCP]]></category><category><![CDATA[Terraform]]></category><category><![CDATA[secrets]]></category><category><![CDATA[Security]]></category><dc:creator><![CDATA[Julius Oh]]></dc:creator><pubDate>Thu, 25 Sep 2025 21:10:06 GMT</pubDate><content:encoded><![CDATA[<h3 id="heading-introduction">Introduction</h3>
<p>In modern DevOps practices, managing secrets securely is one of the most critical aspects of infrastructure management. Whether it's API keys, database credentials, or JWT private keys, these sensitive pieces of information need to be handled with care. In this article, I'll walk through how I implemented a comprehensive secrets management solution using Google Cloud Platform's Key Management Service (KMS) and Secret Manager, automated with Terraform and integrated into a CI/CD pipeline.</p>
<h3 id="heading-the-challenge">The Challenge</h3>
<p>When managing multiple services in a cloud environment, you face several challenges:</p>
<p>- Storing hundreds of secrets securely</p>
<p>- Maintaining different configurations for staging and production environments</p>
<p>- Ensuring secrets are encrypted at rest and in transit</p>
<p>- Automating secret rotation and deployment</p>
<p>- Providing secure access to applications without exposing credentials</p>
<h3 id="heading-the-architecture">The Architecture</h3>
<p>My solution leverages several GCP services and tools:</p>
<pre><code class="lang-plaintext">┌─────────────────┐
│   Developer     │
│   Encrypts      │
│   Secrets       │
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│   GCP KMS       │
│  (Encryption)   │
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│   Terraform     │
│   Variables     │
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│   Terraform     │
│   Decrypts &amp;    │
│   Provisions    │
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│  GCP Secret     │
│   Manager       │
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│External Secrets │
│   Operator      │
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│   Kubernetes    │
│     Pods        │
└─────────────────┘
</code></pre>
<h3 id="heading-how-encryption-works">How Encryption Works</h3>
<h3 id="heading-step-1-encrypting-secrets-with-gcp-kms">Step 1: Encrypting Secrets with GCP KMS</h3>
<p>The first layer of security comes from Google Cloud KMS. Before any secret enters our repository, it's encrypted using environment-specific KMS keys:</p>
<pre><code class="lang-plaintext">echo -n "your-secret-value" | gcloud kms encrypt \
  --location=global \
  --keyring=stage-keyring \
  --key=key-name \
  --plaintext-file=- \
  --ciphertext-file=- \
  --project=your-project-id \
  | base64
</code></pre>
<p>This command:</p>
<p>1. Takes your plaintext secret</p>
<p>2. Encrypts it using a specific KMS key ring and key</p>
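<p>3. Base64-encodes the binary ciphertext so it can be stored as text (GNU <code>base64</code> wraps its output at 76 characters by default; add <code>-w 0</code> if you need a single line)</p>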
<p>The resulting encrypted value looks something like:</p>
<pre><code class="lang-plaintext">CiQA0TkZJRZqLSLT2w+OLxu1imQozdof+xBGTjepJbwrUW/bynESPQBCuzVB...
</code></pre>
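<p>The key ring and key used for encryption can themselves be managed in Terraform. The sketch below is an assumption about how that might look, not the setup from this project; the names follow the <code>${var.environment}-keyring</code> and <code>${var.environment}-key</code> pattern used in the decryption code later in this article.</p>
<pre><code class="lang-plaintext"># Sketch: one key ring and one symmetric key per environment.
# Resource labels ("env") are illustrative.
resource "google_kms_key_ring" "env" {
  name     = "${var.environment}-keyring"
  location = "global"
  project  = var.project_id
}

resource "google_kms_crypto_key" "env" {
  name     = "${var.environment}-key"
  key_ring = google_kms_key_ring.env.id
  purpose  = "ENCRYPT_DECRYPT"

  # Losing the key makes every stored ciphertext unreadable
  lifecycle {
    prevent_destroy = true
  }
}
</code></pre>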
<h3 id="heading-step-2-storing-encrypted-values-in-terraform">Step 2: Storing Encrypted Values in Terraform</h3>
<p>The encrypted values are stored in Terraform variable files (<code>*.tfvars</code>). Here's the structure:</p>
<pre><code class="lang-plaintext">services_config = {
  "hge-django" = {
    secrets = {
      AWS_ACCESS_KEY_ID     = "CiQA0TkZJRZqLSLT2w+OLxu1imQo..."
      AWS_SECRET_ACCESS_KEY = "CiQA0TkZJUeuYXTwfli7eaNdDg9s..."
      DATABASE_URL          = "CiQA0TkZJRJ/hBTBsbb4UFp+O3JG..."
      API_HOST              = "https://api.com/api"  # Plaintext
    }
    encrypted_keys = [
      "AWS_ACCESS_KEY_ID",
      "AWS_SECRET_ACCESS_KEY",
      "DATABASE_URL"
    ]
  }
}
</code></pre>
<p>Notice how:</p>
<p>- Sensitive values are encrypted (AWS keys, database URLs)</p>
<p>- Non-sensitive values can remain in plaintext (API endpoints)</p>
<p>- The <code>encrypted_keys</code> array tells Terraform which values need decryption</p>
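<p>A variable declaration matching this structure might look like the sketch below. The type is inferred from the example above rather than taken from the project, and the validation rule is an optional addition that catches a key listed in <code>encrypted_keys</code> but missing from <code>secrets</code>.</p>
<pre><code class="lang-plaintext"># Sketch: type and validation inferred from the tfvars example above.
variable "services_config" {
  description = "Per-service secrets; values listed in encrypted_keys are KMS ciphertext"

  type = map(object({
    secrets        = map(string)
    encrypted_keys = list(string)
  }))

  # Fail early if a key is marked as encrypted but has no value
  validation {
    condition = alltrue([
      for name, cfg in var.services_config : alltrue([
        for k in cfg.encrypted_keys : contains(keys(cfg.secrets), k)
      ])
    ])
    error_message = "Every entry in encrypted_keys must also be a key in secrets."
  }
}
</code></pre>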
<h3 id="heading-the-terraform-magic">The Terraform Magic</h3>
<p>Dynamic Decryption</p>
<p>The heart of the system is in the Terraform configuration. Here's how it works:</p>
<pre><code class="lang-plaintext">locals {

  # Flatten and decrypt secrets for all services

  decrypted_secrets = {
    for service_name, config in var.services_config : service_name =&gt; {
      for key, value in config.secrets : key =&gt; (
        contains(config.encrypted_keys, key) ?
        data.google_kms_secret.secrets["${service_name}-${key}"].plaintext :
        value
      )
    }
  }

  # Prepare encrypted secrets for KMS decryption

  encrypted_secrets = merge([
    for service_name, config in var.services_config : {
      for key in config.encrypted_keys :
      "${service_name}-${key}" =&gt; {
        service = service_name
        key     = key
        value   = config.secrets[key]
      }
    }
  ]...)
}

# Decrypt using KMS

data "google_kms_secret" "secrets" {
  for_each = local.encrypted_secrets
  crypto_key = "projects/${var.project_id}/locations/global/keyRings/${var.environment}-keyring/cryptoKeys/${var.environment}-key"
  ciphertext = each.value.value

}
</code></pre>
<p>This Terraform code:</p>
<p>1. Iterates through all service configurations</p>
<p>2. Identifies which secrets are encrypted</p>
<p>3. Decrypts them using GCP KMS</p>
<p>4. Combines decrypted and plaintext values</p>
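<p>To see what the intermediate map looks like, here is the value <code>local.encrypted_secrets</code> would take for the <code>hge-django</code> example, written out by hand as a literal. The local name <code>encrypted_secrets_example</code> is made up for illustration, and the ciphertexts are truncated exactly as in the example above.</p>
<pre><code class="lang-plaintext"># Illustration only: the flat map produced by the merge(...) expression
# for the "hge-django" example. Keys are "service-key" pairs.
locals {
  encrypted_secrets_example = {
    "hge-django-AWS_ACCESS_KEY_ID" = {
      service = "hge-django"
      key     = "AWS_ACCESS_KEY_ID"
      value   = "CiQA0TkZJRZqLSLT2w+OLxu1imQo..."
    }
    "hge-django-AWS_SECRET_ACCESS_KEY" = {
      service = "hge-django"
      key     = "AWS_SECRET_ACCESS_KEY"
      value   = "CiQA0TkZJUeuYXTwfli7eaNdDg9s..."
    }
    "hge-django-DATABASE_URL" = {
      service = "hge-django"
      key     = "DATABASE_URL"
      value   = "CiQA0TkZJRJ/hBTBsbb4UFp+O3JG..."
    }
  }
}
</code></pre>
<p><code>API_HOST</code> does not appear because it is not listed in <code>encrypted_keys</code>; it passes through unchanged. The same composite keys index the <code>google_kms_secret</code> data source, which is how each decrypted value finds its way back to the right service and key.</p>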
<h3 id="heading-provisioning-to-secret-manager">Provisioning to Secret Manager</h3>
<p>Once decrypted, the secrets are provisioned to GCP Secret Manager:</p>
<pre><code class="lang-plaintext">
module "secret-manager" {
  source     = "GoogleCloudPlatform/secret-manager/google"
  version    = "~&gt; 0.5"
  project_id = var.project_id
  secrets = nonsensitive(flatten([
    for service_name, secrets in local.decrypted_secrets : [
      for key, value in secrets : {
        name        = "${service_name}-${lower(key)}"
        secret_data = value
      }
      if value != ""
    ]
  ]))
  secret_accessors_list = [
    "group:gcp-organization-admins@company.com"
  ]
}
</code></pre>
<p>This creates secrets in GCP Secret Manager with:</p>
<p>- Consistent naming: <code>service-name-secret-key</code></p>
<p>- Proper access controls</p>
<p>- Automatic versioning</p>
<h3 id="heading-the-workflow">The Workflow</h3>
<p>Here's the complete workflow for adding a new secret:</p>
<p>1. <strong>Developer encrypts the secret</strong> using GCP KMS</p>
<p>2. <strong>Updates the Terraform variables</strong> file with the encrypted value</p>
<p>3. <strong>Creates a pull request</strong> for review</p>
<p>4. <strong>CI runs Terraform plan</strong> to validate changes</p>
<p>5. <strong>After approval and merge</strong>, CI/CD deploys to staging</p>
<p>6. <strong>Manual approval</strong> triggers production deployment</p>
<p>7. <strong>External Secrets Operator</strong> syncs to Kubernetes</p>
<p>8. <strong>Applications</strong> access secrets from Kubernetes secrets</p>
<h3 id="heading-security-best-practices">Security Best Practices</h3>
<h3 id="heading-1-encryption-at-rest">1. Encryption at Rest</h3>
<p>All sensitive values are encrypted before entering version control. Even if someone gains access to the repository, they can't decrypt without KMS access.</p>
<h3 id="heading-2-principle-of-least-privilege">2. Principle of Least Privilege</h3>
<p>- KMS keys are environment-specific</p>
<p>- Secret Manager access is controlled via IAM groups</p>
<p>- Applications only access secrets they need</p>
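<p>One way to express "applications only access secrets they need" is a per-secret IAM binding. This is a sketch under assumptions: <code>var.eso_service_account_email</code> is a hypothetical variable for the service account that reads secrets, and the secret ID follows the <code>service-name-secret-key</code> naming produced by the module call above.</p>
<pre><code class="lang-plaintext"># Sketch: grant one service account read access to one secret only.
# var.eso_service_account_email is a hypothetical variable.
variable "eso_service_account_email" {
  type        = string
  description = "Service account allowed to read this secret"
}

resource "google_secret_manager_secret_iam_member" "hge_django_database_url" {
  project   = var.project_id
  secret_id = "hge-django-database_url"
  role      = "roles/secretmanager.secretAccessor"
  member    = "serviceAccount:${var.eso_service_account_email}"

  # The secret must exist before access can be granted
  depends_on = [module.secret-manager]
}
</code></pre>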
<h3 id="heading-3-audit-trail">3. Audit Trail</h3>
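<p>- Cloud Audit Logs record KMS key administration by default; Data Access logs, which cover encrypt and decrypt calls, must be enabled explicitly</p>
<p>- Secret Manager access is logged through Cloud Audit Logs as well</p>
<p>- Terraform state, which contains the decrypted values, is stored in a GCS bucket with restricted access</p>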
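<p>Two of these points can be sketched in Terraform. Encrypt and decrypt calls land in Data Access audit logs, which usually have to be switched on per service, and the state backend needs a bucket. The bucket name and prefix are placeholders, and the audit configuration is an assumption about a reasonable setup rather than the one used in this project.</p>
<pre><code class="lang-plaintext"># Sketch: enable Data Access audit logs for Cloud KMS.
# Note: this resource is authoritative for the given service.
resource "google_project_iam_audit_config" "kms" {
  project = var.project_id
  service = "cloudkms.googleapis.com"

  audit_log_config {
    log_type = "DATA_READ"
  }

  audit_log_config {
    log_type = "DATA_WRITE"
  }
}

# Sketch: keep state in a GCS bucket (names are placeholders).
terraform {
  backend "gcs" {
    bucket = "my-terraform-state"  # assumed bucket name
    prefix = "secrets/stage"       # assumed state prefix
  }
}
</code></pre>
<p>Because the state holds decrypted values, access to that bucket should be at least as tight as access to the KMS key.</p>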
<h3 id="heading-4-separation-of-concerns">4. Separation of Concerns</h3>
<p>- Developers can add secrets without accessing production systems</p>
<p>- DevOps team controls deployment</p>
<p>- Applications never see encryption keys</p>
<h3 id="heading-5-automated-rotation">5. Automated Rotation</h3>
<p>With everything in Terraform, rotating secrets is as simple as:</p>
<p>1. Generate new secret</p>
<p>2. Encrypt with KMS</p>
<p>3. Update tfvars</p>
<p>4. Deploy through CI/CD</p>
<h3 id="heading-handling-different-secret-types">Handling Different Secret Types</h3>
<p>The system handles various types of secrets elegantly:</p>
<p>Simple Credentials</p>
<pre><code class="lang-plaintext">DB_USER = "app"  # Can be plaintext
DB_PASS = "CiQA0TkZJabdTVyzUvmU..."  # Encrypted
</code></pre>
<p>Structured Values Stored as Strings</p>
<pre><code class="lang-plaintext">ADMINS = "['admin@company.com']"  # Python list as string
</code></pre>
<p>Multi-line Keys (like RSA Private Keys)</p>
<pre><code class="lang-plaintext">DOCUSIGN_JWT_PRIVATE_KEY = "CiQA0TkZJevqtcXkUTOY3Ddo..."  # Base64 encoded
</code></pre>
<p>The system handles these by:</p>
<p>1. Encrypting the entire content (including newlines)</p>
<p>2. Base64 encoding for storage</p>
<p>3. Proper decryption maintaining format</p>
<p>Kubernetes Integration</p>
<p>The final piece is getting secrets to applications. External Secrets Operator handles this:</p>
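<p>1. <strong>Polls Secret Manager</strong> for changes on a configurable refresh interval</p>
<p>2. <strong>Creates and updates Kubernetes Secrets</strong> automatically</p>
<p>3. <strong>Pods consume those Secrets</strong> as environment variables or mounted files</p>
<p>4. <strong>Rotation</strong> reaches mounted files without a pod restart; environment variables are only re-read when the pod restarts</p>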
<p>Example pod configuration:</p>
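<pre><code class="lang-plaintext">apiVersion: v1
kind: Pod
metadata:
  name: app
spec:
  containers:
  - name: app
    image: your-registry/app:latest  # placeholder image
    envFrom:
    - secretRef:
        name: secrets  # Kubernetes Secret created by External Secrets Operator
</code></pre>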
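<p>The Kubernetes Secret named <code>secrets</code> has to come from somewhere. One option is to declare the <code>ExternalSecret</code> from Terraform with the <code>kubernetes_manifest</code> resource. In this sketch the store name, namespace, refresh interval, and API version are assumptions; the API version in particular depends on which External Secrets Operator release is installed.</p>
<pre><code class="lang-plaintext"># Sketch: an ExternalSecret that copies one Secret Manager entry
# into the Kubernetes Secret "secrets". Store name, namespace, and
# apiVersion are assumptions.
resource "kubernetes_manifest" "hge_django_secrets" {
  manifest = {
    apiVersion = "external-secrets.io/v1beta1"
    kind       = "ExternalSecret"

    metadata = {
      name      = "secrets"
      namespace = "default"
    }

    spec = {
      refreshInterval = "1h"

      secretStoreRef = {
        kind = "ClusterSecretStore"
        name = "gcp-secret-manager"
      }

      # Name of the Kubernetes Secret to create
      target = {
        name = "secrets"
      }

      data = [
        {
          secretKey = "DATABASE_URL"
          remoteRef = {
            key = "hge-django-database_url"
          }
        }
      ]
    }
  }
}
</code></pre>
<p>Each entry in <code>data</code> maps one Secret Manager name to one key in the resulting Kubernetes Secret, which <code>envFrom</code> then exposes to the container.</p>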
<h3 id="heading-lessons-learned">Lessons Learned</h3>
<p>1. Start with Encryption Early</p>
<p>Retrofitting encryption is harder than starting with it. Design your secret management from day one.</p>
<p>2. Environment Separation is Critical</p>
<p>Using separate KMS keys for staging and production prevents accidental cross-environment secret usage.</p>
<p>3. Automate Everything</p>
<p>Manual secret management doesn't scale. Automation reduces errors and improves security.</p>
<p>4. Monitor and Audit</p>
<p>Set up alerts for:</p>
<p>- Failed decryption attempts</p>
<p>- Unusual secret access patterns</p>
<p>- Secret rotation reminders</p>
<p>5. Documentation is Security</p>
<p>Well-documented processes mean fewer mistakes. Our README provides clear instructions for the entire team.</p>
<h3 id="heading-performance-considerations">Performance Considerations</h3>
<p>Caching</p>
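<p>Within a single run, Terraform reads each data source once and reuses the decrypted value everywhere it is referenced, avoiding repeated KMS calls.</p>
<p>Batch Operations</p>
<p>Terraform works on independent resources concurrently (10 at a time by default, adjustable with <code>-parallelism</code>), so secrets are decrypted and provisioned in parallel, reducing deployment time.</p>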
<p>Secret Manager Quotas</p>
<p>Be aware of GCP quotas:</p>
<p>- 60,000 secret versions per project</p>
<p>- 90,000 access requests per minute</p>
<h3 id="heading-cost-optimization">Cost Optimization</h3>
<p>The solution is cost-effective:</p>
<p>- <strong>KMS</strong>: $0.06 per active key version per month + $0.03 per 10,000 operations</p>
<p>- <strong>Secret Manager</strong>: $0.06 per active secret version per month + $0.03 per 10,000 access operations</p>
<p>- <strong>Total monthly cost</strong>: ~$50 for 100 secrets with moderate usage</p>
<h3 id="heading-future-improvements">Future Improvements</h3>
<p>Looking ahead, potential enhancements include:</p>
<p>1. <strong>Secret Rotation Automation</strong>: Implementing automatic rotation for database passwords and API keys</p>
<p>2. <strong>Break-glass Procedures</strong>: Emergency access workflows for critical situations</p>
<p>3. <strong>Multi-region Replication</strong>: For disaster recovery</p>
<p>4. <strong>Secret Usage Analytics</strong>: Track which services use which secrets</p>
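<p>For the multi-region replication idea, the Google provider lets a secret pin its replicas to specific locations. The sketch below defines a standalone secret to show the shape of the <code>replication</code> block; the secret ID and regions are placeholders, and how the module used earlier exposes replication settings is not covered here.</p>
<pre><code class="lang-plaintext"># Sketch: a secret replicated to two chosen regions.
# The secret ID and locations are placeholders.
resource "google_secret_manager_secret" "replicated_example" {
  project   = var.project_id
  secret_id = "hge-django-database_url-replicated"

  replication {
    user_managed {
      replicas {
        location = "us-central1"
      }
      replicas {
        location = "us-east1"
      }
    }
  }
}
</code></pre>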
<h3 id="heading-conclusion">Conclusion</h3>
<p>Building a secure, scalable secrets management system requires careful planning and the right tools. By combining GCP KMS for encryption, Terraform for infrastructure as code, and automated CI/CD pipelines, we've created a solution that:</p>
<p>- Keeps secrets secure at every stage</p>
<p>- Scales with our infrastructure</p>
<p>- Maintains compliance requirements</p>
<p>- Reduces operational overhead</p>
<p>The key takeaway? Security doesn't have to be complicated. With the right architecture and automation, you can build a secrets management system that's both secure and developer-friendly.</p>
<p>Remember: <strong>Your secrets are only as secure as your weakest link</strong>. Invest in proper secrets management early, and your future self (and security team) will thank you.</p>
<hr />
<p><em>Have questions or suggestions? Feel free to reach out at juliusoh@gmail.com. Security is a community effort, and sharing knowledge makes us all stronger.</em></p>
<h3 id="heading-resources">Resources</h3>
<p>- <a target="_blank" href="https://cloud.google.com/kms/docs">GCP KMS Documentation</a></p>
<p>- <a target="_blank" href="https://cloud.google.com/secret-manager/docs">GCP Secret Manager</a></p>
<p>- <a target="_blank" href="https://registry.terraform.io/providers/hashicorp/google/latest">Terraform Google Provider</a></p>
<p>- <a target="_blank" href="https://external-secrets.io/">External Secrets Operator</a></p>
]]></content:encoded></item></channel></rss>