Phase 2 of 6

Backend Deployment

Deploy FastAPI backend to ECS Fargate with load balancing and auto-scaling

2-3 days
Intermediate Level
7 main steps
🚀 2025 Updates Applied!
This phase uses ARM64 Fargate tasks on AWS Graviton processors (AWS advertises up to 40% better price-performance than x86), Fargate platform version 1.4.0, and enhanced IAM security with ABAC.

Prerequisites

Step 1: Store Secrets in AWS Secrets Manager 2025 Update

🔐 Why Secrets Manager?
Never hardcode API keys or passwords! Secrets Manager securely stores sensitive data and supports automatic rotation.
1.1
Create secrets.json File

First, create a JSON file with all your secrets. Never commit this file to Git!

json
{
  "SUPABASE_URL": "https://your-project.supabase.co",
  "SUPABASE_ANON_KEY": "your-anon-key",
  "SUPABASE_SERVICE_ROLE_KEY": "your-service-role-key",
  "ANTHROPIC_API_KEY": "your-anthropic-key",
  "OPENAI_API_KEY": "your-openai-key",
  "GEMINI_API_KEY": "your-gemini-key",
  "TAVILY_API_KEY": "your-tavily-key",
  "FIRECRAWL_API_KEY": "your-firecrawl-key",
  "DAYTONA_API_KEY": "your-daytona-key",
  "DAYTONA_SERVER_URL": "https://app.daytona.io/api",
  "DAYTONA_TARGET": "us",
  "REDIS_HOST": "will-be-updated-in-phase-4",
  "REDIS_PORT": "6379",
  "REDIS_PASSWORD": "",
  "REDIS_SSL": "False",
  "SENTRY_DSN": "your-sentry-dsn",
  "LANGFUSE_PUBLIC_KEY": "your-langfuse-public-key",
  "LANGFUSE_SECRET_KEY": "your-langfuse-secret-key",
  "LANGFUSE_HOST": "https://cloud.langfuse.com",
  "STRIPE_SECRET_KEY": "your-stripe-secret-key",
  "STRIPE_WEBHOOK_SECRET": "your-stripe-webhook-secret",
  "ENV_MODE": "production"
}
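Since this file must stay out of Git, one way to guard against an accidental commit is to ignore it and tighten its permissions. This is a minimal sketch, assuming you run it from the repository root where secrets.json lives:

```bash
# Add secrets.json to .gitignore only if it is not already listed
touch .gitignore
grep -qxF 'secrets.json' .gitignore || echo 'secrets.json' >> .gitignore

# Restrict the file to the current user (skipped if the file does not exist yet)
if [ -f secrets.json ]; then
  chmod 600 secrets.json
fi
```

Running it twice is safe: the grep check keeps the entry from being duplicated.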
1.2
Upload Secrets to AWS
bash
# Create secret in Secrets Manager
aws secretsmanager create-secret \
  --name helium/backend/production \
  --description "Helium backend production secrets" \
  --secret-string file://secrets.json \
  --region us-east-1

# Verify secret was created
aws secretsmanager describe-secret \
  --secret-id helium/backend/production \
  --region us-east-1
✅ 2025 Security Enhancement:
Secrets Manager encrypts secrets at rest with AWS KMS by default. Automatic rotation is optional and is configured per secret.
1.3
Create IAM Roles for ECS Enhanced 2025

ECS needs two types of roles: Task Role (for your application) and Execution Role (for ECS agent).

bash
# Create trust policy for ECS tasks
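cat > ecs-trust-policy.json <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": { "Service": "ecs-tasks.amazonaws.com" },
      "Action": "sts:AssumeRole"
    }
  ]
}
EOF

# Create policy that allows reading the production secret
# (replace ACCOUNT_ID with your AWS account ID)
cat > ecs-secrets-policy.json <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "secretsmanager:GetSecretValue",
      "Resource": "arn:aws:secretsmanager:us-east-1:ACCOUNT_ID:secret:helium/backend/production-*"
    }
  ]
}
EOF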
💡 Understanding IAM Roles:
Task Role: What your application can do (access S3, call APIs)
Execution Role: What ECS can do on your behalf (pull images, write logs, inject secrets from Secrets Manager)
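The two roles can then be created from the policy files. The sketch below assumes ecs-trust-policy.json and ecs-secrets-policy.json exist in the current directory and uses the role names referenced later in the task definition. The inline policy name helium-read-secrets is a hypothetical choice, not something this guide prescribes.

```bash
# Sketch: create the execution role and the task role.
create_ecs_roles() {
  # Execution role: used by ECS to pull images, write logs, and inject secrets
  aws iam create-role \
    --role-name helium-ecs-execution-role \
    --assume-role-policy-document file://ecs-trust-policy.json
  aws iam attach-role-policy \
    --role-name helium-ecs-execution-role \
    --policy-arn arn:aws:iam::aws:policy/service-role/AmazonECSTaskExecutionRolePolicy
  # Inline policy name is an assumption; pick any name you like
  aws iam put-role-policy \
    --role-name helium-ecs-execution-role \
    --policy-name helium-read-secrets \
    --policy-document file://ecs-secrets-policy.json

  # Task role: assumed by your application code inside the container
  aws iam create-role \
    --role-name helium-ecs-task-role \
    --assume-role-policy-document file://ecs-trust-policy.json
}
```

Call create_ecs_roles once the policy files are in place. The secrets policy goes on the execution role because ECS, not your application, resolves the "secrets" entries at task start. If you keep "awslogs-create-group": "true" in the task definitions, the execution role likely also needs logs:CreateLogGroup, which the managed policy does not grant.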

Step 2: Create ECS Cluster Graviton Support

2.1
Create Cluster with Fargate
⚡ 2025 Performance Boost:
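We'll configure the cluster to support both Fargate (standard) and Fargate Spot (up to 70% cheaper, suited to interruptible workloads). With the weights below, services run on standard Fargate by default and Fargate Spot is opt-in per service.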
bash
# Create ECS cluster
aws ecs create-cluster \
  --cluster-name helium-production-cluster \
  --capacity-providers FARGATE FARGATE_SPOT \
  --default-capacity-provider-strategy \
    capacityProvider=FARGATE,weight=1 \
    capacityProvider=FARGATE_SPOT,weight=0 \
  --region us-east-1

# Verify cluster
aws ecs describe-clusters \
  --clusters helium-production-cluster \
  --region us-east-1

Step 3: Create ECS Task Definition Platform 1.4.0

3.1
Build and Push Docker Image
bash
# Get your AWS account ID
ACCOUNT_ID=$(aws sts get-caller-identity --query Account --output text)

# Navigate to backend directory
cd backend

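# Build Docker image for ARM64 to match the task definition's runtimePlatform
# (on an x86 machine this relies on Docker's emulation support)
docker build --platform linux/arm64 -t helium-backend:latest .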

# Login to ECR
aws ecr get-login-password --region us-east-1 | \
  docker login --username AWS --password-stdin \
  $ACCOUNT_ID.dkr.ecr.us-east-1.amazonaws.com

# Tag image
docker tag helium-backend:latest \
  $ACCOUNT_ID.dkr.ecr.us-east-1.amazonaws.com/helium-backend:latest

# Push to ECR
docker push $ACCOUNT_ID.dkr.ecr.us-east-1.amazonaws.com/helium-backend:latest
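The push only succeeds if the helium-backend repository already exists in ECR. If it was not created in an earlier phase, a small guard like the following can create it on demand. This is a sketch, and enabling scan-on-push is an optional assumption rather than a requirement of this guide.

```bash
# Create the ECR repository only when it is missing.
# $1 = repository name (defaults to helium-backend)
ensure_ecr_repo() {
  local repo="${1:-helium-backend}"
  if ! aws ecr describe-repositories --repository-names "$repo" \
       --region us-east-1 >/dev/null 2>&1; then
    aws ecr create-repository --repository-name "$repo" \
      --image-scanning-configuration scanOnPush=true \
      --region us-east-1
  fi
}
```

Run ensure_ecr_repo before the docker push step.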
3.2
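Create Task Definition with Graviton (ARM64)
💰 Cost Savings Alert!
Fargate runs ARM64 tasks on AWS Graviton processors, which AWS advertises at up to 40% better price-performance than comparable x86 tasks. Highly recommended, provided your image and its dependencies build for linux/arm64.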

Create task-definition-backend.json:

json
{
  "family": "helium-backend",
  "networkMode": "awsvpc",
  "requiresCompatibilities": ["FARGATE"],
  "cpu": "2048",
  "memory": "4096",
  "runtimePlatform": {
    "cpuArchitecture": "ARM64",
    "operatingSystemFamily": "LINUX"
  },
  "executionRoleArn": "arn:aws:iam::ACCOUNT_ID:role/helium-ecs-execution-role",
  "taskRoleArn": "arn:aws:iam::ACCOUNT_ID:role/helium-ecs-task-role",
  "containerDefinitions": [
    {
      "name": "helium-backend",
      "image": "ACCOUNT_ID.dkr.ecr.us-east-1.amazonaws.com/helium-backend:latest",
      "essential": true,
      "portMappings": [
        {
          "containerPort": 8000,
          "protocol": "tcp"
        }
      ],
      "environment": [
        {
          "name": "ENV_MODE",
          "value": "production"
        },
        {
          "name": "AWS_REGION",
          "value": "us-east-1"
        }
      ],
      "secrets": [
        {
          "name": "SUPABASE_URL",
          "valueFrom": "arn:aws:secretsmanager:us-east-1:ACCOUNT_ID:secret:helium/backend/production:SUPABASE_URL::"
        },
        {
          "name": "SUPABASE_ANON_KEY",
          "valueFrom": "arn:aws:secretsmanager:us-east-1:ACCOUNT_ID:secret:helium/backend/production:SUPABASE_ANON_KEY::"
        },
        {
          "name": "ANTHROPIC_API_KEY",
          "valueFrom": "arn:aws:secretsmanager:us-east-1:ACCOUNT_ID:secret:helium/backend/production:ANTHROPIC_API_KEY::"
        }
      ],
      "logConfiguration": {
        "logDriver": "awslogs",
        "options": {
          "awslogs-group": "/ecs/helium-backend",
          "awslogs-region": "us-east-1",
          "awslogs-stream-prefix": "ecs",
          "awslogs-create-group": "true"
        }
      },
      "healthCheck": {
        "command": [
          "CMD-SHELL",
          "curl -f http://localhost:8000/api/health || exit 1"
        ],
        "interval": 30,
        "timeout": 5,
        "retries": 3,
        "startPeriod": 60
      }
    }
  ]
}
⚠️ Important:
Replace ACCOUNT_ID with your actual AWS account ID. Add all secrets from your secrets.json file to the "secrets" array. The container health check runs inside the container, so the image must include curl.
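Both edits can be scripted rather than done by hand. The helpers below are a sketch: fill_account_id and secrets_entries are hypothetical names, and the key extraction assumes a flat secrets.json with one simple string key per entry, as in step 1.1.

```bash
# Print a file with every ACCOUNT_ID placeholder replaced.
# $1 = AWS account ID, $2 = path to the task definition
fill_account_id() {
  sed "s/ACCOUNT_ID/$1/g" "$2"
}

# Print one "secrets" array entry per top-level key of a flat JSON file.
# $1 = path to secrets.json, $2 = secret ARN without a trailing colon
secrets_entries() {
  local file="$1" arn="$2" key first=1
  # Key names are the quoted identifiers that directly precede a colon
  for key in $(grep -oE '"[A-Za-z0-9_]+"[[:space:]]*:' "$file" | tr -d '": \t'); do
    [ "$first" -eq 1 ] || printf ',\n'
    printf '        { "name": "%s", "valueFrom": "%s:%s::" }' "$key" "$arn" "$key"
    first=0
  done
  printf '\n'
}
```

For example, secrets_entries secrets.json "arn:aws:secretsmanager:us-east-1:$ACCOUNT_ID:secret:helium/backend/production" prints entries you can paste into the "secrets" array, and fill_account_id "$ACCOUNT_ID" task-definition-backend.json > rendered.json writes a copy with the placeholder filled in.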
3.3
Register Task Definition
bash
# Register task definition
aws ecs register-task-definition \
  --cli-input-json file://task-definition-backend.json \
  --region us-east-1

# Verify registration
aws ecs describe-task-definition \
  --task-definition helium-backend \
  --region us-east-1

Step 4: Create Application Load Balancer

🌐 What is an ALB?
Application Load Balancer distributes incoming traffic across multiple ECS tasks, provides SSL termination, and performs health checks.
4.1
Request SSL Certificate

First, request an SSL certificate for your domain using AWS Certificate Manager (ACM).

bash
# Request certificate
aws acm request-certificate \
  --domain-name api.he2.ai \
  --subject-alternative-names "*.he2.ai" \
  --validation-method DNS \
  --region us-east-1

# Note the certificate ARN from the output
# You'll need to add DNS records to validate the certificate
⏳ Validation Required:
After requesting the certificate, you must add DNS validation records to your domain. This usually takes 5-30 minutes.
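To find the DNS records to add and to know when validation has finished, something like the following can help. It is a sketch with hypothetical function names; pass the certificate ARN printed by request-certificate.

```bash
# Show the CNAME records ACM expects for DNS validation.
# $1 = certificate ARN
show_validation_records() {
  aws acm describe-certificate \
    --certificate-arn "$1" \
    --query 'Certificate.DomainValidationOptions[].ResourceRecord' \
    --output table \
    --region us-east-1
}

# Block until ACM reports the certificate as validated.
# $1 = certificate ARN
wait_for_certificate() {
  aws acm wait certificate-validated \
    --certificate-arn "$1" \
    --region us-east-1
}
```

Add the records shown by show_validation_records at your DNS provider, then run wait_for_certificate before creating the HTTPS listener.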
4.2
Create Target Group
bash
# Create target group
aws elbv2 create-target-group \
  --name helium-backend-tg \
  --protocol HTTP \
  --port 8000 \
  --vpc-id vpc-xxxxxxxxx \
  --target-type ip \
  --health-check-path /api/health \
  --health-check-interval-seconds 30 \
  --health-check-timeout-seconds 5 \
  --healthy-threshold-count 2 \
  --unhealthy-threshold-count 3 \
  --region us-east-1
4.3
Create Application Load Balancer
bash
# Create ALB
aws elbv2 create-load-balancer \
  --name helium-production-alb \
  --subnets subnet-public-1a subnet-public-1b \
  --security-groups sg-alb-xxxxxxxxx \
  --scheme internet-facing \
  --type application \
  --ip-address-type ipv4 \
  --region us-east-1

# Create HTTPS listener (after certificate is validated)
aws elbv2 create-listener \
  --load-balancer-arn arn:aws:elasticloadbalancing:us-east-1:ACCOUNT_ID:loadbalancer/app/helium-production-alb/xxxxxxxxx \
  --protocol HTTPS \
  --port 443 \
  --certificates CertificateArn=arn:aws:acm:us-east-1:ACCOUNT_ID:certificate/xxxxxxxxx \
  --default-actions Type=forward,TargetGroupArn=arn:aws:elasticloadbalancing:us-east-1:ACCOUNT_ID:targetgroup/helium-backend-tg/xxxxxxxxx \
  --region us-east-1

# Create HTTP listener (redirect to HTTPS)
aws elbv2 create-listener \
  --load-balancer-arn arn:aws:elasticloadbalancing:us-east-1:ACCOUNT_ID:loadbalancer/app/helium-production-alb/xxxxxxxxx \
  --protocol HTTP \
  --port 80 \
  --default-actions Type=redirect,RedirectConfig='{Protocol=HTTPS,Port=443,StatusCode=HTTP_301}' \
  --region us-east-1
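Rather than copying the long ARNs by hand, you can look them up by name and reuse them as shell variables. This sketch assumes the resource names used in this guide; lookup_arns is a hypothetical helper name.

```bash
# Resolve the ALB, target group, and certificate ARNs into shell variables.
lookup_arns() {
  ALB_ARN=$(aws elbv2 describe-load-balancers --names helium-production-alb \
    --query 'LoadBalancers[0].LoadBalancerArn' --output text --region us-east-1)
  TG_ARN=$(aws elbv2 describe-target-groups --names helium-backend-tg \
    --query 'TargetGroups[0].TargetGroupArn' --output text --region us-east-1)
  CERT_ARN=$(aws acm list-certificates \
    --query "CertificateSummaryList[?DomainName=='api.he2.ai'].CertificateArn | [0]" \
    --output text --region us-east-1)
}
```

After calling lookup_arns, pass "$ALB_ARN", "$TG_ARN", and "$CERT_ARN" to the create-listener commands in place of the placeholder ARNs.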

Step 5: Create ECS Service

5.1
Create Backend Service

The ECS service ensures your desired number of tasks are always running.

bash
# Create ECS service
aws ecs create-service \
  --cluster helium-production-cluster \
  --service-name helium-backend-service \
  --task-definition helium-backend \
  --desired-count 2 \
  --launch-type FARGATE \
  --platform-version LATEST \
  --network-configuration "awsvpcConfiguration={subnets=[subnet-private-1a,subnet-private-1b],securityGroups=[sg-ecs-xxxxxxxxx],assignPublicIp=DISABLED}" \
  --load-balancers "targetGroupArn=arn:aws:elasticloadbalancing:us-east-1:ACCOUNT_ID:targetgroup/helium-backend-tg/xxxxxxxxx,containerName=helium-backend,containerPort=8000" \
  --health-check-grace-period-seconds 60 \
  --enable-execute-command \
  --region us-east-1

# Check service status
aws ecs describe-services \
  --cluster helium-production-cluster \
  --services helium-backend-service \
  --region us-east-1
✅ Platform Version LATEST:
For Linux tasks, LATEST currently resolves to Fargate platform version 1.4.0. New tasks pick up platform updates when they launch; running tasks keep the revision they started with until they are replaced.
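Instead of polling describe-services by hand, you can let the CLI wait until the deployment settles. A minimal sketch, using the cluster and service names from this step; wait_for_backend is a hypothetical helper name.

```bash
# Wait for the backend service to reach a steady state, then show task counts.
wait_for_backend() {
  aws ecs wait services-stable \
    --cluster helium-production-cluster \
    --services helium-backend-service \
    --region us-east-1

  aws ecs describe-services \
    --cluster helium-production-cluster \
    --services helium-backend-service \
    --query 'services[0].{desired:desiredCount,running:runningCount}' \
    --output table \
    --region us-east-1
}
```

The waiter gives up after a fixed number of attempts, so a non-zero exit usually means tasks are failing to start and the service events are worth checking.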
5.2
Test Health Endpoint
bash
# Get ALB DNS name
ALB_DNS=$(aws elbv2 describe-load-balancers \
  --names helium-production-alb \
  --query 'LoadBalancers[0].DNSName' \
  --output text \
  --region us-east-1)

echo "ALB DNS: $ALB_DNS"

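# Test health endpoint (wait 2-3 minutes for tasks to start)
# The certificate covers api.he2.ai, not the ALB hostname, so -k skips
# certificate verification for this direct test only
curl -k "https://$ALB_DNS/api/health"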

# Expected response: {"status": "healthy"}
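Because tasks take a few minutes to pass their health checks, a retry loop is more convenient than re-running curl by hand. The function below is a sketch; wait_for_healthy and its default attempt count and delay are illustrative choices.

```bash
# Poll a URL until it returns HTTP 200 or the attempts run out.
# $1 = URL, $2 = max attempts (default 18), $3 = seconds between attempts (default 10)
wait_for_healthy() {
  local url="$1" attempts="${2:-18}" delay="${3:-10}" i=1 code=""
  while [ "$i" -le "$attempts" ]; do
    # -k skips certificate verification, since the ALB hostname
    # does not match the certificate's domain
    code=$(curl -k -s -o /dev/null -w '%{http_code}' "$url" || true)
    if [ "$code" = "200" ]; then
      echo "healthy after $i attempt(s)"
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "still unhealthy after $attempts attempts (last status: $code)" >&2
  return 1
}
```

For example: wait_for_healthy "https://$ALB_DNS/api/health". Once DNS for api.he2.ai points at the ALB, test that hostname instead and drop the -k flag.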

Step 6: Configure Auto Scaling Predictive Scaling

📈 What is Auto Scaling?
Auto scaling automatically adjusts the number of running tasks based on CPU, memory, or custom metrics. This ensures your app handles traffic spikes without manual intervention.
6.1
Register Scalable Target
bash
# Register scalable target
aws application-autoscaling register-scalable-target \
  --service-namespace ecs \
  --scalable-dimension ecs:service:DesiredCount \
  --resource-id service/helium-production-cluster/helium-backend-service \
  --min-capacity 2 \
  --max-capacity 20 \
  --region us-east-1
6.2
Create CPU-Based Scaling Policy
bash
# Create target tracking scaling policy
aws application-autoscaling put-scaling-policy \
  --service-namespace ecs \
  --scalable-dimension ecs:service:DesiredCount \
  --resource-id service/helium-production-cluster/helium-backend-service \
  --policy-name helium-backend-cpu-scaling \
  --policy-type TargetTrackingScaling \
  --target-tracking-scaling-policy-configuration '{
    "TargetValue": 70.0,
    "PredefinedMetricSpecification": {
      "PredefinedMetricType": "ECSServiceAverageCPUUtilization"
    },
    "ScaleInCooldown": 300,
    "ScaleOutCooldown": 60
  }' \
  --region us-east-1
💡 How This Works:
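Target tracking adds tasks when average CPU stays above 70% and removes them when it stays well below that target. The cooldowns make scale-out fast (60 seconds between scale-out activities) and scale-in slow (300 seconds between scale-in activities) to avoid flapping.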
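A rough way to reason about the outcome is a proportional rule: the new task count is about the current count times observed CPU divided by the target, rounded up and clamped to the min and max capacity. AWS does not publish the exact algorithm, so treat this sketch as a mental model only; estimate_desired_count is a hypothetical helper.

```bash
# Approximate the desired task count under target tracking (integer math).
# $1 = current tasks, $2 = observed average CPU %, $3 = target CPU %
# $4 = min capacity, $5 = max capacity
estimate_desired_count() {
  local current="$1" cpu="$2" target="$3" min="$4" max="$5" desired
  # Ceiling of current * cpu / target without floating point
  desired=$(( (current * cpu + target - 1) / target ))
  if [ "$desired" -lt "$min" ]; then desired="$min"; fi
  if [ "$desired" -gt "$max" ]; then desired="$max"; fi
  echo "$desired"
}
```

With the settings above, estimate_desired_count 4 90 70 2 20 prints 6: four tasks at 90% CPU need about six tasks to bring the average back to 70%.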

Step 7: Create Background Worker Service

🔄 What are Workers?
Workers handle background tasks (like agent execution) asynchronously using Dramatiq and Redis. They run separately from your API.
7.1
Create Worker Task Definition

Workers use the same Docker image and the same secrets as the backend, but run a different command. Create task-definition-worker.json:

json
{
  "family": "helium-worker",
  "networkMode": "awsvpc",
  "requiresCompatibilities": ["FARGATE"],
  "cpu": "2048",
  "memory": "4096",
  "runtimePlatform": {
    "cpuArchitecture": "ARM64",
    "operatingSystemFamily": "LINUX"
  },
  "executionRoleArn": "arn:aws:iam::ACCOUNT_ID:role/helium-ecs-execution-role",
  "taskRoleArn": "arn:aws:iam::ACCOUNT_ID:role/helium-ecs-task-role",
  "containerDefinitions": [
    {
      "name": "helium-worker",
      "image": "ACCOUNT_ID.dkr.ecr.us-east-1.amazonaws.com/helium-backend:latest",
      "essential": true,
      "command": [
        "uv", "run", "dramatiq",
        "--skip-logging",
        "--processes", "8",
        "--threads", "8",
        "run_agent_background"
      ],
      "environment": [
        {
          "name": "ENV_MODE",
          "value": "production"
        }
      ],
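      "secrets": [
        {
          "name": "SUPABASE_URL",
          "valueFrom": "arn:aws:secretsmanager:us-east-1:ACCOUNT_ID:secret:helium/backend/production:SUPABASE_URL::"
        },
        {
          "name": "SUPABASE_ANON_KEY",
          "valueFrom": "arn:aws:secretsmanager:us-east-1:ACCOUNT_ID:secret:helium/backend/production:SUPABASE_ANON_KEY::"
        },
        {
          "name": "ANTHROPIC_API_KEY",
          "valueFrom": "arn:aws:secretsmanager:us-east-1:ACCOUNT_ID:secret:helium/backend/production:ANTHROPIC_API_KEY::"
        }
      ],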
      "logConfiguration": {
        "logDriver": "awslogs",
        "options": {
          "awslogs-group": "/ecs/helium-worker",
          "awslogs-region": "us-east-1",
          "awslogs-stream-prefix": "ecs",
          "awslogs-create-group": "true"
        }
      }
    }
  ]
}
7.2
Deploy Worker Service
bash
# Register worker task definition
aws ecs register-task-definition \
  --cli-input-json file://task-definition-worker.json \
  --region us-east-1

# Create worker service
aws ecs create-service \
  --cluster helium-production-cluster \
  --service-name helium-worker-service \
  --task-definition helium-worker \
  --desired-count 2 \
  --launch-type FARGATE \
  --platform-version LATEST \
  --network-configuration "awsvpcConfiguration={subnets=[subnet-private-1a,subnet-private-1b],securityGroups=[sg-ecs-xxxxxxxxx],assignPublicIp=DISABLED}" \
  --region us-east-1
💰 Cost Optimization Tip:
Consider using Fargate Spot for worker tasks to save up to 70%. Workers can tolerate interruptions better than API servers.
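To opt the worker service into Fargate Spot, replace --launch-type with a capacity provider strategy. The sketch below is an alternative to the create-service call in 7.2, not an addition to it. The base and weight values are illustrative assumptions, and you should confirm that Fargate Spot supports ARM64 tasks in your Region before relying on it.

```bash
# Create the worker service with one on-demand task as a floor
# and the remaining tasks weighted 3:1 toward Fargate Spot.
create_worker_service_spot() {
  aws ecs create-service \
    --cluster helium-production-cluster \
    --service-name helium-worker-service \
    --task-definition helium-worker \
    --desired-count 2 \
    --capacity-provider-strategy \
      capacityProvider=FARGATE,weight=1,base=1 \
      capacityProvider=FARGATE_SPOT,weight=3 \
    --platform-version LATEST \
    --network-configuration "awsvpcConfiguration={subnets=[subnet-private-1a,subnet-private-1b],securityGroups=[sg-ecs-xxxxxxxxx],assignPublicIp=DISABLED}" \
    --region us-east-1
}
```

The --launch-type flag is left out on purpose, because it cannot be combined with --capacity-provider-strategy. Keeping base=1 on standard Fargate means at least one worker survives a Spot interruption.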

Phase 2 Verification Checklist

  • Secrets stored in AWS Secrets Manager
  • IAM roles created (task role and execution role)
  • ECS cluster created
  • Docker image built and pushed to ECR
  • Task definitions registered (backend + worker)
  • SSL certificate requested and validated
  • Application Load Balancer created
  • Target group created and healthy
  • ECS backend service running (2+ tasks)
  • Health endpoint responding via ALB
  • Auto scaling configured
  • Worker service running
  • All logs appearing in CloudWatch