Tags: AWS, ECS, Containers, DevOps

AWS ECS Production Deployment: The Complete Guide

Deploy containerized applications on AWS ECS with auto-scaling, blue/green deployments, and production-grade monitoring.

Azynth Team
14 min read

Amazon ECS is AWS's container orchestration service. It's simpler than Kubernetes but powerful enough for most production workloads. Here's how to deploy containers on ECS like a pro.

ECS on EC2 vs ECS on Fargate vs Kubernetes (EKS)

ECS on EC2:

  • You manage EC2 instances
  • Full control over instance types
  • Lower cost for steady workloads

ECS on Fargate:

  • Serverless containers
  • No instance management
  • Pay per vCPU-second and per GB-second of memory
  • Roughly a 30% cost premium over well-utilized EC2 capacity

Kubernetes (EKS):

  • More complex, more powerful
  • Better for multi-cloud
  • Larger ecosystem

Recommendation: Start with Fargate, move to EC2 if cost or customization needs demand it.
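To put the Fargate premium in perspective, the sketch below turns the roughly 30% figure quoted above into a break-even utilization for EC2. It assumes a deliberately simplified model that does not come from AWS pricing pages: Fargate costs (1 + premium) times the EC2 price per unit of capacity, and EC2 capacity is paid for whether or not tasks use it. The function name is illustrative.

```python
def break_even_utilization(fargate_premium: float) -> float:
    """Return the EC2 utilization above which EC2 is cheaper than Fargate.

    Simplified model (an assumption, not AWS pricing):
      - Fargate bills only for what tasks request, at (1 + premium) x the EC2 unit price.
      - EC2 bills for the whole instance, so its effective unit price is price / utilization.
    EC2 is cheaper when 1 / utilization < 1 + premium.
    """
    if fargate_premium < 0:
        raise ValueError("premium must be non-negative")
    return 1.0 / (1.0 + fargate_premium)


if __name__ == "__main__":
    # Uses the ~30% premium quoted in the comparison above.
    print(f"Break-even utilization: {break_even_utilization(0.30):.0%}")
```

Under this model, instances that stay below roughly three-quarters utilization are likely cheaper to replace with Fargate, which is consistent with the recommendation to start there.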

Core Concepts

ECS Cluster
├── Services (long-running tasks)
│   ├── Task Definition (container specs)
│   ├── Tasks (running containers)
│   └── Load Balancer
└── Scheduled Tasks (cron jobs)

Task Definition: The Blueprint

{
  "family": "api",
  "networkMode": "awsvpc",
  "requiresCompatibilities": ["FARGATE"],
  "cpu": "512",
  "memory": "1024",
  "taskRoleArn": "arn:aws:iam::ACCOUNT:role/api-task-role",
  "executionRoleArn": "arn:aws:iam::ACCOUNT:role/ecs-execution-role",
  "containerDefinitions": [
    {
      "name": "api",
      "image": "123456789.dkr.ecr.us-east-1.amazonaws.com/api:v1.2.3",
      "portMappings": [
        { "containerPort": 8080, "protocol": "tcp" }
      ],
      "environment": [
        { "name": "NODE_ENV", "value": "production" }
      ],
      "secrets": [
        {
          "name": "DATABASE_URL",
          "valueFrom": "arn:aws:secretsmanager:us-east-1:ACCOUNT:secret:db-url"
        }
      ],
      "logConfiguration": {
        "logDriver": "awslogs",
        "options": {
          "awslogs-group": "/ecs/api",
          "awslogs-region": "us-east-1",
          "awslogs-stream-prefix": "ecs"
        }
      },
      "healthCheck": {
        "command": ["CMD-SHELL", "curl -f http://localhost:8080/health || exit 1"],
        "interval": 30,
        "timeout": 5,
        "retries": 3,
        "startPeriod": 60
      }
    }
  ]
}
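Fargate only accepts certain task-level cpu and memory pairings, so a quick pre-deploy check can catch an invalid task definition early. The sketch below is a hypothetical helper (not part of any AWS SDK) that validates the "cpu" and "memory" strings used above. The table reflects the combinations AWS documents at the time of writing; verify it against the current ECS documentation before relying on it.

```python
# Valid Fargate CPU units -> memory (MiB) combinations.
# Assumption: mirrors the AWS documentation at the time of writing.
FARGATE_MEMORY_BY_CPU = {
    256: [512, 1024, 2048],
    512: list(range(1024, 4096 + 1, 1024)),
    1024: list(range(2048, 8192 + 1, 1024)),
    2048: list(range(4096, 16384 + 1, 1024)),
    4096: list(range(8192, 30720 + 1, 1024)),
    8192: list(range(16384, 61440 + 1, 4096)),
    16384: list(range(32768, 122880 + 1, 8192)),
}


def is_valid_fargate_size(task_definition: dict) -> bool:
    """Check the task-level cpu/memory of a task definition dict.

    Only the plain integer string form ("512", "1024") is handled here;
    forms such as "0.5 vCPU" or "1 GB" are out of scope for this sketch.
    """
    cpu = int(task_definition["cpu"])
    memory = int(task_definition["memory"])
    return memory in FARGATE_MEMORY_BY_CPU.get(cpu, [])


if __name__ == "__main__":
    print(is_valid_fargate_size({"cpu": "512", "memory": "1024"}))
```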

Infrastructure as Code with Terraform

# Assumed to be defined elsewhere: aws_security_group.alb, aws_security_group.api,
# aws_acm_certificate.main, and aws_ecs_task_definition.api.

# VPC for ECS
module "vpc" {
  source = "terraform-aws-modules/vpc/aws"

  name = "ecs-vpc"
  cidr = "10.0.0.0/16"

  azs             = ["us-east-1a", "us-east-1b", "us-east-1c"]
  private_subnets = ["10.0.1.0/24", "10.0.2.0/24", "10.0.3.0/24"]
  public_subnets  = ["10.0.101.0/24", "10.0.102.0/24", "10.0.103.0/24"]

  enable_nat_gateway     = true
  single_nat_gateway     = false
  one_nat_gateway_per_az = true # HA: one per AZ
}

# ECS Cluster
resource "aws_ecs_cluster" "main" {
  name = "production"

  setting {
    name  = "containerInsights"
    value = "enabled"
  }
}

# Application Load Balancer
resource "aws_lb" "api" {
  name               = "api-lb"
  load_balancer_type = "application"
  subnets            = module.vpc.public_subnets
  security_groups    = [aws_security_group.alb.id]
}

resource "aws_lb_target_group" "api" {
  name        = "api-tg"
  port        = 8080
  protocol    = "HTTP"
  vpc_id      = module.vpc.vpc_id
  target_type = "ip" # required for awsvpc/Fargate tasks

  health_check {
    path                = "/health"
    healthy_threshold   = 2
    unhealthy_threshold = 3
    timeout             = 5
    interval            = 30
    matcher             = "200"
  }

  deregistration_delay = 30
}

resource "aws_lb_listener" "api" {
  load_balancer_arn = aws_lb.api.arn
  port              = 443
  protocol          = "HTTPS"
  ssl_policy        = "ELBSecurityPolicy-TLS-1-2-2017-01"
  certificate_arn   = aws_acm_certificate.main.arn

  default_action {
    type             = "forward"
    target_group_arn = aws_lb_target_group.api.arn
  }
}

# ECS Service
resource "aws_ecs_service" "api" {
  name            = "api"
  cluster         = aws_ecs_cluster.main.id
  task_definition = aws_ecs_task_definition.api.arn
  desired_count   = 3
  launch_type     = "FARGATE"

  network_configuration {
    subnets          = module.vpc.private_subnets
    security_groups  = [aws_security_group.api.id]
    assign_public_ip = false
  }

  load_balancer {
    target_group_arn = aws_lb_target_group.api.arn
    container_name   = "api"
    container_port   = 8080
  }

  # Rolling update: keep all 3 tasks healthy and allow up to 6 during a deploy.
  # These are top-level arguments in the AWS provider, not a nested block.
  deployment_maximum_percent         = 200
  deployment_minimum_healthy_percent = 100

  deployment_circuit_breaker {
    enable   = true
    rollback = true
  }

  # The listener must exist before the service registers with the target group
  depends_on = [aws_lb_listener.api]
}
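The service's maximum percent (200) and minimum healthy percent (100) settings decide how many tasks may run while a rolling deployment is in progress. The sketch below works out those bounds for a given desired count. It assumes the rounding ECS documents for these settings (the lower bound rounds up, the upper bound rounds down); the function name is illustrative.

```python
def rolling_update_bounds(
    desired_count: int,
    minimum_healthy_percent: int,
    maximum_percent: int,
) -> tuple:
    """Return (min_running, max_running) task counts during a rolling update."""
    # Ceiling division without floats: -(-a // b) == ceil(a / b)
    min_running = -(-desired_count * minimum_healthy_percent // 100)
    # Floor division for the upper bound
    max_running = desired_count * maximum_percent // 100
    return min_running, max_running


if __name__ == "__main__":
    # Values from the service definition above: 3 tasks, 100% / 200%.
    low, high = rolling_update_bounds(3, 100, 200)
    print(f"ECS keeps at least {low} and at most {high} tasks running")
```

With 100% and 200%, ECS can start a full replacement set before stopping any old task, which is why this combination avoids capacity dips at the cost of temporarily doubling the task count.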

Auto-Scaling

Scale based on CPU, memory, or custom metrics:

# Target tracking: Maintain 70% CPU
resource "aws_appautoscaling_target" "api" {
  service_namespace  = "ecs"
  scalable_dimension = "ecs:service:DesiredCount"
  resource_id        = "service/${aws_ecs_cluster.main.name}/${aws_ecs_service.api.name}"
  min_capacity       = 3
  max_capacity       = 20
}

resource "aws_appautoscaling_policy" "api_cpu" {
  name               = "api-cpu-scaling"
  policy_type        = "TargetTrackingScaling"
  service_namespace  = aws_appautoscaling_target.api.service_namespace
  scalable_dimension = aws_appautoscaling_target.api.scalable_dimension
  resource_id        = aws_appautoscaling_target.api.resource_id

  target_tracking_scaling_policy_configuration {
    predefined_metric_specification {
      predefined_metric_type = "ECSServiceAverageCPUUtilization"
    }

    target_value       = 70.0
    scale_in_cooldown  = 300
    scale_out_cooldown = 60
  }
}

# Predefined ALB metric: scale on request count per target
resource "aws_appautoscaling_policy" "api_requests" {
  name               = "api-request-scaling"
  policy_type        = "TargetTrackingScaling"
  service_namespace  = aws_appautoscaling_target.api.service_namespace
  scalable_dimension = aws_appautoscaling_target.api.scalable_dimension
  resource_id        = aws_appautoscaling_target.api.resource_id

  target_tracking_scaling_policy_configuration {
    predefined_metric_specification {
      predefined_metric_type = "ALBRequestCountPerTarget"
      resource_label         = "${aws_lb.api.arn_suffix}/${aws_lb_target_group.api.arn_suffix}"
    }

    target_value = 1000 # Requests per target per minute
  }
}
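For intuition about what a 70% CPU target means in practice, the sketch below estimates the task count a target-tracking policy would steer toward. AWS does not publish the exact algorithm, so this is a simplified proportional model (an assumption), clamped to the min_capacity and max_capacity values from the scaling target. Cooldowns and alarm evaluation periods are ignored.

```python
import math


def estimate_desired_count(
    current_tasks: int,
    metric_value: float,
    target_value: float,
    min_capacity: int,
    max_capacity: int,
) -> int:
    """Estimate the task count needed to bring the metric back to its target.

    Simplified proportional model (an assumption, not the AWS algorithm):
    desired = ceil(current * metric / target), clamped to [min, max].
    """
    if target_value <= 0:
        raise ValueError("target_value must be positive")
    proportional = math.ceil(current_tasks * metric_value / target_value)
    return max(min_capacity, min(max_capacity, proportional))


if __name__ == "__main__":
    # 3 tasks averaging 90% CPU against the 70% target from the policy above.
    print(estimate_desired_count(3, 90.0, 70.0, min_capacity=3, max_capacity=20))
```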

Blue/Green Deployments

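Zero-downtime deployments with CodeDeploy. The ECS service must use the CODE_DEPLOY deployment controller, which replaces the rolling-update and circuit-breaker settings on the service above, and it needs two target groups (blue and green):

# Assumed to be defined elsewhere: aws_iam_role.codedeploy and the
# aws_lb_target_group.api_blue / api_green target groups.
resource "aws_codedeploy_app" "api" {
  name             = "api"
  compute_platform = "ECS"
}

resource "aws_codedeploy_deployment_group" "api" {
  app_name               = aws_codedeploy_app.api.name
  deployment_group_name  = "api-deployment-group"
  service_role_arn       = aws_iam_role.codedeploy.arn
  deployment_config_name = "CodeDeployDefault.ECSAllAtOnce"

  # Required for ECS blue/green deployments
  deployment_style {
    deployment_option = "WITH_TRAFFIC_CONTROL"
    deployment_type   = "BLUE_GREEN"
  }

  blue_green_deployment_config {
    terminate_blue_instances_on_deployment_success {
      action                           = "TERMINATE"
      termination_wait_time_in_minutes = 5
    }

    deployment_ready_option {
      action_on_timeout = "CONTINUE_DEPLOYMENT"
    }
  }

  ecs_service {
    cluster_name = aws_ecs_cluster.main.name
    service_name = aws_ecs_service.api.name
  }

  load_balancer_info {
    target_group_pair_info {
      prod_traffic_route {
        listener_arns = [aws_lb_listener.api.arn]
      }

      target_group {
        name = aws_lb_target_group.api_blue.name
      }

      target_group {
        name = aws_lb_target_group.api_green.name
      }
    }
  }
}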
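The deployment_config_name controls how quickly traffic moves from blue to green. The sketch below models the three styles CodeDeploy offers for ECS as (minute, percent of traffic on green) schedules: all at once (as configured above), canary (for example CodeDeployDefault.ECSCanary10Percent5Minutes), and linear (for example CodeDeployDefault.ECSLinear10PercentEvery1Minutes). The function names are illustrative and the model ignores bake time and health evaluation.

```python
def all_at_once() -> list:
    """Shift 100% of traffic immediately."""
    return [(0, 100)]


def canary(percent: int, interval_minutes: int) -> list:
    """Shift `percent` first, then the remainder after `interval_minutes`."""
    if not 0 < percent < 100:
        raise ValueError("percent must be between 1 and 99")
    return [(0, percent), (interval_minutes, 100)]


def linear(percent: int, interval_minutes: int) -> list:
    """Shift `percent` more traffic every `interval_minutes` until 100%."""
    if not 0 < percent <= 100:
        raise ValueError("percent must be between 1 and 100")
    schedule = []
    shifted = 0
    minute = 0
    while shifted < 100:
        shifted = min(100, shifted + percent)
        schedule.append((minute, shifted))
        minute += interval_minutes
    return schedule


if __name__ == "__main__":
    print("all at once:", all_at_once())
    print("canary 10% / 5 min:", canary(10, 5))
    print("linear 10% / 1 min:", linear(10, 1))
```

A canary or linear configuration limits how many users see a bad release before a rollback, at the cost of a slower deployment.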

Secrets Management

Never hardcode secrets:

# Store secrets in Secrets Manager
resource "aws_secretsmanager_secret" "db_url" {
  name = "production/database-url"
}

resource "aws_secretsmanager_secret_version" "db_url" {
  secret_id     = aws_secretsmanager_secret.db_url.id
  secret_string = var.database_url
}

# Grant ECS task execution role access
resource "aws_iam_role_policy" "ecs_secrets" {
  role = aws_iam_role.ecs_execution.id

  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect = "Allow"
      Action = [
        "secretsmanager:GetSecretValue"
      ]
      Resource = [
        aws_secretsmanager_secret.db_url.arn
      ]
    }]
  })
}
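Inside the container, ECS injects each entry of the task definition's secrets list as an ordinary environment variable. The sketch below shows one way an application might read DATABASE_URL at startup and fail fast when it is absent, which mostly helps when the same code runs outside ECS. The exception class and function name are hypothetical.

```python
import os
from typing import Mapping


class MissingSecretError(RuntimeError):
    """Raised when a required secret is not present in the environment."""


def load_database_url(env: Mapping[str, str] = os.environ) -> str:
    """Return DATABASE_URL from the environment, or raise if it is missing."""
    value = env.get("DATABASE_URL", "").strip()
    if not value:
        raise MissingSecretError(
            "DATABASE_URL is not set; check the task definition's secrets "
            "entry and the execution role's permissions"
        )
    return value


if __name__ == "__main__":
    try:
        load_database_url()
        # Never log the secret itself, only that it was found.
        print("DATABASE_URL loaded")
    except MissingSecretError as error:
        print(f"Startup aborted: {error}")
```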

Scheduled Tasks (Cron Jobs)

# EventBridge rule for scheduled task
resource "aws_cloudwatch_event_rule" "daily_report" {
  name                = "daily-report"
  description         = "Run daily report at 2 AM UTC"
  schedule_expression = "cron(0 2 * * ? *)"
}

resource "aws_cloudwatch_event_target" "daily_report" {
  rule      = aws_cloudwatch_event_rule.daily_report.name
  target_id = "daily-report-task"
  arn       = aws_ecs_cluster.main.arn
  role_arn  = aws_iam_role.events.arn

  ecs_target {
    task_count          = 1
    task_definition_arn = aws_ecs_task_definition.report.arn
    launch_type         = "FARGATE"

    network_configuration {
      subnets         = module.vpc.private_subnets
      security_groups = [aws_security_group.tasks.id]
    }
  }
}
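EventBridge cron expressions differ from classic Unix cron: they have six fields (minutes, hours, day of month, month, day of week, year), and day of month and day of week cannot both be specified, so one of them must be a question mark. The sketch below is a hypothetical helper that checks only this structure. It does not validate the values inside each field.

```python
import re

FIELD_NAMES = ("minutes", "hours", "day_of_month", "month", "day_of_week", "year")


def parse_eventbridge_cron(expression: str) -> dict:
    """Split a cron(...) schedule expression into its six named fields."""
    match = re.fullmatch(r"cron\((.*)\)", expression.strip())
    if match is None:
        raise ValueError("expression must look like cron(<six fields>)")
    fields = match.group(1).split()
    if len(fields) != len(FIELD_NAMES):
        raise ValueError(f"expected 6 fields, got {len(fields)}")
    parsed = dict(zip(FIELD_NAMES, fields))
    # Structural rule: at least one of the two day fields must be '?'.
    if parsed["day_of_month"] != "?" and parsed["day_of_week"] != "?":
        raise ValueError("use '?' in either day_of_month or day_of_week")
    return parsed


if __name__ == "__main__":
    # The daily 2 AM UTC schedule from the rule above.
    print(parse_eventbridge_cron("cron(0 2 * * ? *)"))
```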

Logging and Monitoring

# CloudWatch Log Group
resource "aws_cloudwatch_log_group" "api" {
  name              = "/ecs/api"
  retention_in_days = 30
}

# Dashboard of service-level CPU and memory utilization.
# AWS/ECS metrics are published per cluster and service, so both dimensions are needed.
resource "aws_cloudwatch_dashboard" "ecs" {
  dashboard_name = "ECS-Production"

  dashboard_body = jsonencode({
    widgets = [
      {
        type = "metric"
        properties = {
          metrics = [
            ["AWS/ECS", "CPUUtilization", "ClusterName", aws_ecs_cluster.main.name, "ServiceName", aws_ecs_service.api.name],
            ["AWS/ECS", "MemoryUtilization", "ClusterName", aws_ecs_cluster.main.name, "ServiceName", aws_ecs_service.api.name]
          ]
          period = 300
          stat   = "Average"
          region = "us-east-1"
          title  = "ECS Resource Utilization"
        }
      }
    ]
  })
}
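The awslogs driver forwards whatever the container writes to stdout and stderr, so structured logs only require the application to print one JSON object per line. The sketch below is one possible setup using Python's standard logging module. The field names and the "api" logger name are assumptions, not a required schema.

```python
import json
import logging
import sys


class JsonFormatter(logging.Formatter):
    """Render each log record as a single-line JSON object."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        if record.exc_info:
            payload["exception"] = self.formatException(record.exc_info)
        return json.dumps(payload)


def build_logger(name: str = "api") -> logging.Logger:
    """Return a logger that writes JSON lines to stdout."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    if not logger.handlers:  # avoid duplicate handlers on repeated calls
        handler = logging.StreamHandler(sys.stdout)
        handler.setFormatter(JsonFormatter())
        logger.addHandler(handler)
    logger.propagate = False
    return logger


if __name__ == "__main__":
    build_logger().info("listening on port %s", 8080)
```

JSON lines like these can be filtered by field in CloudWatch Logs Insights instead of relying on free-text search.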

CI/CD Pipeline

# GitHub Actions: Build and deploy
name: Deploy to ECS

on:
  push:
    branches: [main]

# Required for assuming the AWS role through GitHub OIDC
permissions:
  id-token: write
  contents: read

jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Configure AWS credentials
        uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: ${{ secrets.AWS_ROLE_ARN }}
          aws-region: us-east-1

      - name: Login to ECR
        id: ecr-login
        uses: aws-actions/amazon-ecr-login@v2

      - name: Build and push image
        env:
          ECR_REGISTRY: ${{ steps.ecr-login.outputs.registry }}
          IMAGE_TAG: ${{ github.sha }}
        run: |
          docker build -t "$ECR_REGISTRY/api:$IMAGE_TAG" .
          docker push "$ECR_REGISTRY/api:$IMAGE_TAG"

      - name: Update task definition
        id: task-def
        uses: aws-actions/amazon-ecs-render-task-definition@v1
        with:
          task-definition: task-definition.json
          container-name: api
          image: ${{ steps.ecr-login.outputs.registry }}/api:${{ github.sha }}

      - name: Deploy to ECS
        uses: aws-actions/amazon-ecs-deploy-task-definition@v1
        with:
          task-definition: ${{ steps.task-def.outputs.task-definition }}
          service: api
          cluster: production
          wait-for-service-stability: true
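Conceptually, the "Update task definition" step takes task-definition.json, swaps in the freshly built image for one named container, and hands the result to the deploy step. The sketch below illustrates that idea in plain Python. It is not the action's actual implementation, and the function name is hypothetical.

```python
import copy


def render_image(task_definition: dict, container_name: str, image: str) -> dict:
    """Return a copy of the task definition with one container's image replaced."""
    rendered = copy.deepcopy(task_definition)  # leave the input untouched
    for container in rendered.get("containerDefinitions", []):
        if container.get("name") == container_name:
            container["image"] = image
            return rendered
    raise ValueError(f"no container named {container_name!r} in task definition")


if __name__ == "__main__":
    base = {
        "family": "api",
        "containerDefinitions": [{"name": "api", "image": "placeholder"}],
    }
    # The registry and tag here are placeholders for illustration only.
    updated = render_image(base, "api", "example-registry/api:abc123")
    print(updated["containerDefinitions"][0]["image"])
```

Tagging images with the commit SHA, as the workflow does, makes every deployed task definition revision traceable to a specific commit.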

Cost Optimization

1. Fargate Spot

Up to 70% discount for interruptible workloads that can tolerate a two-minute termination notice:

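resource "aws_ecs_service" "batch" {
  name            = "batch"
  cluster         = aws_ecs_cluster.main.id
  task_definition = aws_ecs_task_definition.batch.arn # assumed to be defined elsewhere
  desired_count   = 1

  # FARGATE_SPOT must be associated with the cluster (aws_ecs_cluster_capacity_providers).
  # Omit launch_type when using this block; network_configuration is left out for brevity.
  capacity_provider_strategy {
    capacity_provider = "FARGATE_SPOT"
    weight            = 100
    base              = 0
  }
}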

2. Right-sizing

Monitor and adjust:

# CloudWatch metrics show actual usage (service-level metrics need both dimensions)
aws cloudwatch get-metric-statistics \
  --namespace AWS/ECS \
  --metric-name CPUUtilization \
  --dimensions Name=ClusterName,Value=production Name=ServiceName,Value=api \
  --start-time 2024-01-01T00:00:00Z \
  --end-time 2024-01-07T23:59:59Z \
  --period 3600 \
  --statistics Average Maximum
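Once utilization numbers are available, a rough sizing rule can turn them into a candidate task size. The sketch below takes the peak CPU utilization observed for the current size, adds headroom, and rounds up to the next Fargate CPU size. The 30% headroom default and the function name are assumptions for illustration. Memory would need the same treatment, and any change should be load-tested.

```python
# Fargate task-level CPU sizes in CPU units (1024 units = 1 vCPU).
FARGATE_CPU_SIZES = (256, 512, 1024, 2048, 4096, 8192, 16384)


def suggest_cpu_units(
    current_cpu_units: int,
    peak_utilization_percent: float,
    headroom: float = 0.3,
) -> int:
    """Suggest the smallest Fargate CPU size covering peak usage plus headroom."""
    if not 0 <= peak_utilization_percent:
        raise ValueError("utilization must be non-negative")
    needed = current_cpu_units * (peak_utilization_percent / 100.0) * (1.0 + headroom)
    for size in FARGATE_CPU_SIZES:
        if size >= needed:
            return size
    return FARGATE_CPU_SIZES[-1]  # already at the largest size


if __name__ == "__main__":
    # A 512-unit task peaking at 35% CPU is a candidate for downsizing.
    print(suggest_cpu_units(512, 35.0))
```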

3. Savings Plans

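Compute Savings Plans cover both Fargate and EC2 usage: commit to a steady hourly spend for a one- or three-year term in exchange for roughly 30-50% savings, depending on the term and payment option.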

Production Checklist

  • Health checks configured
  • Auto-scaling enabled
  • Secrets in Secrets Manager/Parameter Store
  • Logs exported to CloudWatch
  • Container Insights enabled
  • Task role follows least privilege
  • Deployment circuit breaker enabled
  • Multi-AZ deployment
  • Load balancer in front
  • Resource limits set (CPU/memory)
  • Blue/green deployments for critical services
  • Monitoring and alerting configured

When NOT to Use ECS

  • Heavy Kubernetes investment: Stick with K8s
  • Multi-cloud requirement: Use Kubernetes
  • Complex service mesh needs: Consider EKS + Istio
  • Self-hosted requirement: Use Docker Swarm or Nomad

Conclusion

ECS strikes the balance between simplicity and power. It's the sweet spot for teams that want containers without Kubernetes complexity.

Start with Fargate for simplicity, optimize costs with EC2 launch type when needed, and leverage AWS integrations for a seamless production experience.

