Welcome to Chapter 14! So far, we’ve built a robust, containerized Node.js API. In this chapter, we take a significant leap towards production by deploying our application to a scalable, serverless environment: Amazon Elastic Container Service (ECS) with AWS Fargate. This move reduces our operational burden, letting us focus on development rather than infrastructure management.
Deploying to a cloud environment like AWS ECS Fargate is crucial for real-world applications. It provides high availability, scalability, and integration with other AWS services, ensuring our API can handle varying loads and remain resilient. We’ll leverage Fargate’s serverless compute engine to run our Docker containers without provisioning or managing servers. A critical aspect of production deployment is secure secrets management. We will integrate AWS Secrets Manager to handle sensitive environment variables like database credentials and API keys, ensuring they are never hardcoded or exposed.
By the end of this chapter, you will have your Node.js API running live on AWS ECS Fargate, accessible via an Application Load Balancer, with all sensitive configurations securely managed. This setup forms the foundation for a production-grade, highly available backend service. Before proceeding, ensure you have an active AWS account and the AWS CLI configured on your local machine.
Planning & Design
Deploying to AWS ECS Fargate involves several interconnected AWS services. Understanding their roles and how they interact is key to a successful deployment. Our application will be containerized, pushed to a container registry, and then orchestrated by ECS. An Application Load Balancer (ALB) will distribute incoming traffic across multiple instances of our application, and AWS Secrets Manager will inject sensitive configurations securely.
Component Architecture
Here’s a visual representation of our deployment architecture on AWS:
- Client Application: Your frontend (or Postman/Insomnia) making requests.
- Application Load Balancer (ALB): Distributes incoming traffic to the ECS tasks. Handles SSL termination and health checks.
- ECS Cluster (Fargate): A logical grouping of tasks. Fargate is the serverless compute engine that runs your containers without server management.
- ECS Service: Maintains the desired number of tasks (instances of your application) within the cluster, handles scaling, and integrates with the ALB.
- ECS Task: An instance of your application running in a Docker container, defined by a Task Definition.
- ECR Repository: Amazon Elastic Container Registry, where our Docker images are stored.
- AWS Secrets Manager: Securely stores and manages sensitive information (e.g., database credentials, API keys) which are injected into ECS tasks as environment variables.
- CloudWatch Logs: Collects logs from our ECS tasks for monitoring and debugging.
- RDS Database: Our relational database (e.g., PostgreSQL) from previous chapters, now securely accessed by ECS tasks.
File Structure Additions
We’ll be creating a new directory and some configuration files to define our AWS resources:
.
├── src/
├── ...
├── Dockerfile # Updated for production
├── .env # Local environment variables
├── .dockerignore
└── ecs/ # New directory for ECS configurations
├── task-definition.json # Defines our container and its settings
└── service-definition.json # Defines how our service runs in the cluster
Step-by-Step Implementation
We’ll break the deployment process into manageable steps: preparing our AWS environment, securing our secrets, defining our container orchestration, and finally deploying and verifying the service.
Feature 1: Prepare AWS Infrastructure
Before we deploy our application, we need to set up the foundational AWS services: an Elastic Container Registry (ECR) to store our Docker image, an ECS Cluster to host our service, and appropriate IAM roles for permissions.
a) Setup/Configuration
We’ll use the AWS CLI for most of these steps. Ensure your AWS CLI is configured with credentials that have sufficient permissions to create these resources.
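Before creating anything, it is worth confirming which account and region the CLI will act against; a mis-set default region is a common source of "resource not found" errors later. A quick sanity check:

```shell
# Confirm the identity (account ID, user/role) the AWS CLI will act as
aws sts get-caller-identity

# Confirm the default region (should match the --region flags used below)
aws configure get region
```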
- Create an ECR Repository: This is where your Docker images will live.
- Create an ECS Cluster: This is the logical grouping where your Fargate tasks will run.
- Create IAM Roles:
- ECS Task Execution Role: Grants permissions for ECS to pull images from ECR, send logs to CloudWatch, and fetch secrets from Secrets Manager.
- ECS Task Role (Optional but recommended): If your application needs to interact with other AWS services (e.g., S3, DynamoDB) directly via the AWS SDK, this role grants those permissions. For now, we’ll focus on the execution role.
b) Core Implementation
Let’s execute the AWS CLI commands. Replace your-project-name with a unique identifier for your project.
1. Create ECR Repository:
# Create an ECR repository for our application
aws ecr create-repository \
--repository-name your-project-name-api \
--image-tag-mutability MUTABLE \
--image-scanning-configuration scanOnPush=true \
--region us-east-1
- `--repository-name`: A unique name for your Docker image repository.
- `--image-tag-mutability MUTABLE`: Allows tags to be overwritten (useful for the `latest` tag in development, but consider `IMMUTABLE` for production releases).
- `--image-scanning-configuration scanOnPush=true`: Enables vulnerability scanning when images are pushed.
- `--region`: Specify your desired AWS region.
2. Create ECS Cluster:
# Create an ECS cluster
aws ecs create-cluster \
--cluster-name your-project-name-cluster \
--region us-east-1
- `--cluster-name`: A descriptive name for your ECS cluster.
3. Create IAM Task Execution Role and Policy:
First, create a trust policy JSON file (ecs-task-execution-trust-policy.json) that allows the ecs-tasks.amazonaws.com service to assume this role.
// ecs-task-execution-trust-policy.json
{
"Version": "2012-10-17",
"Statement": [
{
"Sid": "",
"Effect": "Allow",
"Principal": {
"Service": "ecs-tasks.amazonaws.com"
},
"Action": "sts:AssumeRole"
}
]
}
Now, create the role using the AWS CLI:
# Create the IAM role
aws iam create-role \
--role-name your-project-name-ecsTaskExecutionRole \
--assume-role-policy-document file://ecs-task-execution-trust-policy.json \
--description "Allows ECS tasks to call AWS services on your behalf."
# Attach the managed policy for ECS task execution
aws iam attach-role-policy \
--role-name your-project-name-ecsTaskExecutionRole \
--policy-arn arn:aws:iam::aws:policy/service-role/AmazonECSTaskExecutionRolePolicy
# Attach a policy to allow Secrets Manager access (inline policy for simplicity)
# IMPORTANT: Replace YOUR_AWS_ACCOUNT_ID and YOUR_AWS_REGION
aws iam put-role-policy \
--role-name your-project-name-ecsTaskExecutionRole \
--policy-name SecretsManagerAccessPolicy \
--policy-document '{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": [
"secretsmanager:GetSecretValue",
"kms:Decrypt"
],
"Resource": [
"arn:aws:secretsmanager:YOUR_AWS_REGION:YOUR_AWS_ACCOUNT_ID:secret:your-project-name-api-*"
]
}
]
}'
- The `AmazonECSTaskExecutionRolePolicy` grants permissions for ECR pulls and CloudWatch logs.
- The `SecretsManagerAccessPolicy` specifically allows the role to retrieve secret values from Secrets Manager, using a wildcard `*` for secrets prefixed with `your-project-name-api-`. This adheres to the principle of least privilege by limiting access to only the relevant secrets. Remember to replace `YOUR_AWS_ACCOUNT_ID` and `YOUR_AWS_REGION`.
c) Testing This Component
- ECR: Go to the AWS Console -> ECR. Verify the `your-project-name-api` repository exists.
- ECS Cluster: Go to the AWS Console -> ECS. Verify the `your-project-name-cluster` cluster is listed.
- IAM Role: Go to the AWS Console -> IAM -> Roles. Search for `your-project-name-ecsTaskExecutionRole`. Verify it has `AmazonECSTaskExecutionRolePolicy` and `SecretsManagerAccessPolicy` attached.
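If you prefer scripted checks over clicking through the console, the same verification can be done from the CLI (each command errors out if the resource is missing):

```shell
# Confirm the ECR repository exists
aws ecr describe-repositories --repository-names your-project-name-api --region us-east-1

# Confirm the ECS cluster exists
aws ecs describe-clusters --clusters your-project-name-cluster --region us-east-1

# Confirm the role's managed and inline policies
aws iam list-attached-role-policies --role-name your-project-name-ecsTaskExecutionRole
aws iam list-role-policies --role-name your-project-name-ecsTaskExecutionRole
```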
Feature 2: Configure Secrets Management with AWS Secrets Manager
Hardcoding sensitive information in configuration files or environment variables is a major security risk. AWS Secrets Manager provides a secure way to store, manage, and retrieve your credentials.
a) Setup/Configuration
Identify all sensitive environment variables from your .env file that should not be directly exposed in your Task Definition. Common examples include:
- `DATABASE_URL`
- `JWT_SECRET`
- `API_KEY` (for external services)
- `CLOUDINARY_API_KEY`, `CLOUDINARY_API_SECRET` (from Chapter 11)
We will store these as separate secrets in Secrets Manager or as a single JSON secret. For simplicity and granular control, let’s store them individually.
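On the application side, the only contract is that these variables exist at startup. A small fail-fast check makes a missing secret show up as an immediate, readable error instead of a cryptic database failure later. This is a sketch: `assertRequiredEnv` and the `src/config.ts` location are assumptions to adapt to your project.

```typescript
// src/config.ts (hypothetical helper) -- fail fast if a secret-backed
// variable was not injected into the environment.
const REQUIRED_VARS = [
  'DATABASE_URL',
  'JWT_SECRET',
  'CLOUDINARY_API_KEY',
  'CLOUDINARY_API_SECRET',
] as const;

export function assertRequiredEnv(env: Record<string, string | undefined>): void {
  const missing = REQUIRED_VARS.filter((name) => !env[name]);
  if (missing.length > 0) {
    // Report only the variable NAMES, never the values, to avoid leaking secrets
    throw new Error(`Missing required environment variables: ${missing.join(', ')}`);
  }
}
```

Call `assertRequiredEnv(process.env)` once, before the server starts listening.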
b) Core Implementation
Using the AWS CLI, create individual secrets for your application. Replace placeholders with your actual values.
# Store DATABASE_URL
aws secretsmanager create-secret \
--name your-project-name-api-DATABASE_URL \
--secret-string "postgresql://user:password@host:port/database" \
--region us-east-1
# Store JWT_SECRET
aws secretsmanager create-secret \
--name your-project-name-api-JWT_SECRET \
--secret-string "your_super_secret_jwt_key" \
--region us-east-1
# Store CLOUDINARY_API_KEY
aws secretsmanager create-secret \
--name your-project-name-api-CLOUDINARY_API_KEY \
--secret-string "your_cloudinary_api_key" \
--region us-east-1
# Store CLOUDINARY_API_SECRET
aws secretsmanager create-secret \
--name your-project-name-api-CLOUDINARY_API_SECRET \
--secret-string "your_cloudinary_api_secret" \
--region us-east-1
- `--name`: Follow a consistent naming convention, e.g., `your-project-name-api-VARIABLE_NAME`. This helps with IAM policy filtering.
- `--secret-string`: The actual sensitive value.
Important: Note down the ARNs (Amazon Resource Names) of these secrets. You can retrieve them via the AWS Console or by running:
aws secretsmanager list-secrets --filters Key=name,Values=your-project-name-api- \
--query 'SecretList[*].ARN' --output text --region us-east-1
c) Testing This Component
- Secrets Manager: Go to the AWS Console -> Secrets Manager. Verify that secrets with names like `your-project-name-api-DATABASE_URL`, `your-project-name-api-JWT_SECRET`, etc., are listed. You can click on them (carefully) to verify their content.
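You can also spot-check one value from the CLI. Be careful: this prints the secret in plain text to your terminal and shell history.

```shell
# Retrieve one secret value to confirm it was stored correctly
aws secretsmanager get-secret-value \
  --secret-id your-project-name-api-JWT_SECRET \
  --query SecretString --output text \
  --region us-east-1
```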
Feature 3: Create ECS Task Definition
The Task Definition is a blueprint for your application. It describes how your Docker container should run, including the Docker image to use, CPU and memory allocation, port mappings, and environment variables (including secrets).
a) Setup/Configuration
Create a new directory ecs at the root of your project and inside it, create task-definition.json:
your-project/
├── ecs/
│ └── task-definition.json
└── ...
b) Core Implementation
Populate ecs/task-definition.json with the following content. Remember to replace placeholders like YOUR_AWS_ACCOUNT_ID, YOUR_AWS_REGION, your-project-name, and the actual Secret ARNs.
// ecs/task-definition.json
{
"family": "your-project-name-api-task",
"networkMode": "awsvpc",
"cpu": "256",
"memory": "512",
"executionRoleArn": "arn:aws:iam::YOUR_AWS_ACCOUNT_ID:role/your-project-name-ecsTaskExecutionRole",
"requiresCompatibilities": [
"FARGATE"
],
"containerDefinitions": [
{
"name": "your-project-name-api-container",
"image": "YOUR_AWS_ACCOUNT_ID.dkr.ecr.YOUR_AWS_REGION.amazonaws.com/your-project-name-api:latest",
"portMappings": [
{
"containerPort": 3000,
"hostPort": 3000,
"protocol": "tcp"
}
],
"essential": true,
"environment": [
{
"name": "NODE_ENV",
"value": "production"
},
{
"name": "PORT",
"value": "3000"
},
{
"name": "LOG_LEVEL",
"value": "info"
}
// Add any other non-sensitive environment variables here
],
"secrets": [
{
"name": "DATABASE_URL",
"valueFrom": "arn:aws:secretsmanager:YOUR_AWS_REGION:YOUR_AWS_ACCOUNT_ID:secret:your-project-name-api-DATABASE_URL-xxxxxx"
},
{
"name": "JWT_SECRET",
"valueFrom": "arn:aws:secretsmanager:YOUR_AWS_REGION:YOUR_AWS_ACCOUNT_ID:secret:your-project-name-api-JWT_SECRET-xxxxxx"
},
{
"name": "CLOUDINARY_API_KEY",
"valueFrom": "arn:aws:secretsmanager:YOUR_AWS_REGION:YOUR_AWS_ACCOUNT_ID:secret:your-project-name-api-CLOUDINARY_API_KEY-xxxxxx"
},
{
"name": "CLOUDINARY_API_SECRET",
"valueFrom": "arn:aws:secretsmanager:YOUR_AWS_REGION:YOUR_AWS_ACCOUNT_ID:secret:your-project-name-api-CLOUDINARY_API_SECRET-xxxxxx"
}
],
"logConfiguration": {
"logDriver": "awslogs",
"options": {
"awslogs-group": "/ecs/your-project-name-api",
"awslogs-region": "us-east-1",
"awslogs-stream-prefix": "ecs"
}
}
}
]
}
- `family`: A unique name for your task definition.
- `networkMode`: `awsvpc` is required for Fargate.
- `cpu` / `memory`: Define the compute resources for your task. Start small (e.g., 256 CPU units = 0.25 vCPU, 512 MB memory) and scale up if needed.
- `executionRoleArn`: The IAM role we created earlier that allows ECS to pull images and access secrets.
- `image`: The full ECR path to your Docker image.
- `portMappings`: Our Node.js app listens on port 3000 inside the container.
- `environment`: Non-sensitive environment variables.
- `secrets`: This is where we link to our AWS Secrets Manager secrets. The `name` is the environment variable name your application expects, and `valueFrom` is the ARN of the secret.
- `logConfiguration`: Sends container logs to the `/ecs/your-project-name-api` CloudWatch log group. Note that the `awslogs` driver does not create this group automatically; create it once with `aws logs create-log-group --log-group-name /ecs/your-project-name-api --region us-east-1`.
Register this Task Definition with ECS:
# Register the task definition
aws ecs register-task-definition \
--cli-input-json file://ecs/task-definition.json \
--region us-east-1
c) Testing This Component
- ECS Task Definitions: Go to the AWS Console -> ECS -> Task Definitions. Verify `your-project-name-api-task` is listed. Click on it and review its details, especially the container definition, image, and ports, and confirm the secrets are referenced by the correct ARNs.
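The same check can be scripted; this prints the registered secret mappings of the latest revision so you can eyeball the ARNs:

```shell
# Show the secrets block of the latest task definition revision
aws ecs describe-task-definition \
  --task-definition your-project-name-api-task \
  --query 'taskDefinition.containerDefinitions[0].secrets' \
  --region us-east-1
```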
Feature 4: Create ECS Service and Load Balancer
The ECS Service maintains a desired count of tasks, handles health checks, and integrates with a load balancer to distribute traffic. We’ll use an Application Load Balancer (ALB) for this.
a) Setup/Configuration
- Create an ALB: This will be the entry point for our application.
- Create a Target Group: The ALB forwards traffic to a Target Group, which registers our ECS tasks.
- Create a new file `ecs/service-definition.json`.
b) Core Implementation
1. Create ALB and Target Group (using AWS Console or CLI):
For simplicity, let’s outline the steps using the AWS Console, as ALB setup can be complex via CLI.
- VPC and Subnets: Ensure you have a default VPC or a custom VPC with at least two public subnets. Our ALB will reside in these public subnets.
- Security Group for ALB: Create a security group that allows inbound HTTP (port 80) and/or HTTPS (port 443) traffic from anywhere (`0.0.0.0/0`).
- Create Target Group:
  - Go to EC2 -> Target Groups -> Create target group.
  - Choose `IP addresses` as the target type.
  - Protocol: `HTTP`, Port: `3000` (our app’s port).
  - VPC: Select your VPC.
  - Health checks: Path `/health` (if you have one, or `/` otherwise). Protocol: `HTTP`.
  - Name: `your-project-name-api-tg`.
- Create Application Load Balancer:
  - Go to EC2 -> Load Balancers -> Create load balancer -> Application Load Balancer.
  - Name: `your-project-name-api-alb`.
  - Scheme: `Internet-facing`.
  - IP address type: `ipv4`.
  - VPC: Select your VPC.
  - Mappings: Select at least two public subnets.
  - Security groups: Select the ALB security group you created.
  - Listeners and routing:
    - Listener 1: `HTTP:80`. Default action: Forward to `your-project-name-api-tg`.
    - (Optional) Listener 2: `HTTPS:443`. Requires an SSL certificate from AWS Certificate Manager (ACM). Forward to `your-project-name-api-tg`.
After creating the ALB and Target Group, note down the ARN of the Target Group.
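If you would rather script this step, the console actions above map to the `aws elbv2` commands below. This is a sketch: the VPC, subnet, and security group IDs are placeholders, and the listener command needs the ARNs returned by the first two calls.

```shell
# Create the target group (IP target type, as required for Fargate)
aws elbv2 create-target-group \
  --name your-project-name-api-tg \
  --protocol HTTP --port 3000 \
  --vpc-id vpc-xxxxxxxxxxxxxxxxx \
  --target-type ip \
  --health-check-path /health \
  --region us-east-1

# Create the internet-facing ALB in two public subnets
aws elbv2 create-load-balancer \
  --name your-project-name-api-alb \
  --scheme internet-facing \
  --type application \
  --subnets subnet-xxxxxxxxxxxxxxxxx subnet-yyyyyyyyyyyyyyyyy \
  --security-groups sg-zzzzzzzzzzzzzzzzzzz \
  --region us-east-1

# Forward HTTP:80 to the target group (use the ARNs returned above)
aws elbv2 create-listener \
  --load-balancer-arn YOUR_ALB_ARN \
  --protocol HTTP --port 80 \
  --default-actions Type=forward,TargetGroupArn=YOUR_TARGET_GROUP_ARN \
  --region us-east-1
```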
2. Create ECS Service Definition:
Populate ecs/service-definition.json. Replace placeholders like your-project-name, YOUR_AWS_ACCOUNT_ID, YOUR_AWS_REGION, and YOUR_TARGET_GROUP_ARN.
// ecs/service-definition.json
{
"cluster": "your-project-name-cluster",
"serviceName": "your-project-name-api-service",
"taskDefinition": "your-project-name-api-task",
"desiredCount": 2,
"launchType": "FARGATE",
"networkConfiguration": {
"awsvpcConfiguration": {
"subnets": [
"subnet-xxxxxxxxxxxxxxxxx",
"subnet-yyyyyyyyyyyyyyyyy"
],
"securityGroups": [
"sg-zzzzzzzzzzzzzzzzzzz"
],
"assignPublicIp": "ENABLED"
}
},
"loadBalancers": [
{
"targetGroupArn": "arn:aws:elasticloadbalancing:YOUR_AWS_REGION:YOUR_AWS_ACCOUNT_ID:targetgroup/your-project-name-api-tg/abcdef1234567890",
"containerName": "your-project-name-api-container",
"containerPort": 3000
}
],
"healthCheckGracePeriodSeconds": 60,
"schedulingStrategy": "REPLICA",
"deploymentConfiguration": {
"maximumPercent": 200,
"minimumHealthyPercent": 100
},
"propagateTags": "SERVICE",
"enableECSManagedTags": true
}
- `cluster`: The name of your ECS cluster.
- `serviceName`: A unique name for your service.
- `taskDefinition`: The family name of the task definition we registered.
- `desiredCount`: The number of tasks (instances of your application) you want to run. Start with 2 for high availability.
- `networkConfiguration`:
  - `subnets`: IDs of the subnets in your VPC where your Fargate tasks will run. Ideally these are private subnets, separate from your ALB’s public subnets, for better security.
  - `securityGroups`: A security group for your Fargate tasks that allows inbound traffic on port 3000 only from the ALB’s security group. This is crucial for security.
  - `assignPublicIp`: `ENABLED` if your tasks need direct outbound internet access (e.g., to pull images, connect to external APIs). Tasks in private subnets behind a NAT gateway can use `DISABLED`.
- `loadBalancers`: Links the service to your ALB Target Group.
- `healthCheckGracePeriodSeconds`: Time allowed for a task to start up before health checks count against it.
Create/Update ECS Service:
# Create the ECS service
aws ecs create-service \
--cli-input-json file://ecs/service-definition.json \
--region us-east-1
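After issuing `create-service`, you can block until the deployment settles and then inspect the rollout; this is handy in scripts, since `aws ecs wait` polls until the service is stable or times out:

```shell
# Wait until the service reaches a steady state
aws ecs wait services-stable \
  --cluster your-project-name-cluster \
  --services your-project-name-api-service \
  --region us-east-1

# Summarize the deployment: desired vs. running tasks, latest event
aws ecs describe-services \
  --cluster your-project-name-cluster \
  --services your-project-name-api-service \
  --query 'services[0].{desired:desiredCount,running:runningCount,latestEvent:events[0].message}' \
  --region us-east-1
```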
c) Testing This Component
- ALB: In the AWS Console -> EC2 -> Load Balancers. Check the DNS name of your ALB.
- Target Group: In the AWS Console -> EC2 -> Target Groups. Check that `your-project-name-api-tg` registers tasks and reports them healthy. Initially it will show unhealthy or empty targets, as tasks are not yet running.
- ECS Service: In the AWS Console -> ECS -> Clusters -> `your-project-name-cluster` -> Services. Verify `your-project-name-api-service` is listed and its `runningCount` matches its `desiredCount`.
Feature 5: Update Dockerfile for Production Readiness
Our existing Dockerfile might be good for local development, but for production, we want a smaller, more secure image. We’ll implement a multi-stage build, run as a non-root user, and add a health check.
a) Setup/Configuration
Open your Dockerfile at the root of your project.
b) Core Implementation
Modify your Dockerfile to use a multi-stage build. This significantly reduces the final image size by separating build dependencies from runtime dependencies.
# Dockerfile
# Stage 1: Build dependencies
FROM node:20-alpine AS builder
WORKDIR /app
# Copy package.json and package-lock.json first to leverage the Docker layer cache
COPY package.json package-lock.json ./
# Install ALL dependencies here -- devDependencies include the TypeScript
# compiler that "npm run build" needs; we prune them after compiling
RUN npm ci
# Copy the rest of the application code
COPY . .
# Build TypeScript code
RUN npm run build
# Drop devDependencies so the runtime stage copies a lean node_modules
RUN npm prune --omit=dev
# Stage 2: Production runtime
FROM node:20-alpine
WORKDIR /app
# Copy only necessary files from the builder stage
COPY --from=builder /app/node_modules ./node_modules
COPY --from=builder /app/dist ./dist
COPY --from=builder /app/package.json ./package.json
# If you keep non-secret production settings in a separate env file, copy it too
# (a Dockerfile comment cannot share a line with an instruction)
COPY --from=builder /app/.env.production ./.env.production
# Expose the port our app runs on
EXPOSE 3000
# Run as a non-root user for security best practices
RUN addgroup --system appgroup && adduser --system --ingroup appgroup appuser
USER appuser
# Health check for the container. node:20-alpine does not ship curl,
# so we use a small Node one-liner instead (adjust the endpoint if needed).
HEALTHCHECK --interval=30s --timeout=10s --retries=3 \
  CMD node -e "require('http').get('http://localhost:3000/health', (r) => process.exit(r.statusCode === 200 ? 0 : 1)).on('error', () => process.exit(1))"
# Command to run the application
CMD ["node", "dist/src/main.js"]
- Multi-stage build: the `builder` stage handles compilation; `node:20-alpine` is reused for the final slim runtime.
- `--omit=dev`: Keeps devDependencies out of the production image.
- `dist` directory: Assumes your TypeScript project compiles to `dist`.
- Non-root user: `addgroup`, `adduser`, and `USER appuser` enhance security by not running the application as root.
- `HEALTHCHECK`: Defines a probe (`/health` is common) that Docker (and ECS) can use to determine if the container is healthy and responsive. Ensure you have a `/health` endpoint in your application that returns a 200 OK. If not, create one in `src/app.ts` or `src/main.ts`.
Example health endpoint (add this to your main app file, e.g., src/app.ts):
// src/app.ts (or wherever your main Fastify/Express app is initialized)
// ... existing imports and app setup ...
// Add a health check endpoint
app.get('/health', async (request, reply) => {
// You might want to check database connection, external services here
// For now, a simple 200 OK is sufficient
reply.status(200).send({ status: 'healthy', timestamp: new Date().toISOString() });
});
// ... rest of your routes and error handling ...
c) Testing This Component
- Build locally: `docker build -t your-project-name-api:latest .`
- Check image size: `docker images your-project-name-api`. The final image should be significantly smaller than a single-stage build.
- Run locally: `docker run -p 3000:3000 your-project-name-api:latest`. Verify the app starts and the `/health` endpoint is reachable.
Feature 6: Build and Push Docker Image to ECR
Now that our Dockerfile is optimized, we’ll build the production image and push it to the ECR repository we created earlier.
a) Setup/Configuration
Ensure your AWS CLI is configured and you have Docker installed and running.
b) Core Implementation
1. Authenticate Docker to ECR:
Replace YOUR_AWS_ACCOUNT_ID and YOUR_AWS_REGION.
# Get the ECR login command
aws ecr get-login-password --region us-east-1 | docker login --username AWS --password-stdin YOUR_AWS_ACCOUNT_ID.dkr.ecr.us-east-1.amazonaws.com
2. Build and Tag the Docker Image:
# Build the Docker image
docker build -t your-project-name-api .
# Tag the image for ECR
docker tag your-project-name-api:latest YOUR_AWS_ACCOUNT_ID.dkr.ecr.us-east-1.amazonaws.com/your-project-name-api:latest
3. Push the Image to ECR:
# Push the image to ECR
docker push YOUR_AWS_ACCOUNT_ID.dkr.ecr.us-east-1.amazonaws.com/your-project-name-api:latest
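Pushing only a `latest` tag makes it hard to tell which code is actually running. A small helper script (a sketch: `build-and-push.sh` is a hypothetical name, and it assumes the `git` CLI is available) tags each image with the commit SHA as well:

```shell
#!/bin/sh
# build-and-push.sh (hypothetical helper): tag each image with the git commit
# SHA for traceability, keeping "latest" for convenience.
set -eu

ACCOUNT_ID=YOUR_AWS_ACCOUNT_ID
REGION=us-east-1
REPO=your-project-name-api
TAG=$(git rev-parse --short HEAD)
ECR="$ACCOUNT_ID.dkr.ecr.$REGION.amazonaws.com/$REPO"

# Authenticate, build, tag, and push
aws ecr get-login-password --region "$REGION" | \
  docker login --username AWS --password-stdin "$ACCOUNT_ID.dkr.ecr.$REGION.amazonaws.com"
docker build -t "$REPO:$TAG" .
docker tag "$REPO:$TAG" "$ECR:$TAG"
docker tag "$REPO:$TAG" "$ECR:latest"
docker push "$ECR:$TAG"
docker push "$ECR:latest"
```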
c) Testing This Component
- ECR Console: Go to AWS Console -> ECR -> Repositories -> `your-project-name-api`. You should see your `latest` image with the current timestamp.
Production Considerations
Deploying to production involves more than just getting the code to run. We need to consider scalability, performance, security, and observability.
- Scalability: ECS Fargate inherently supports scaling. Configure Service Auto Scaling on your ECS service to automatically adjust `desiredCount` based on metrics like CPU utilization or request count. This ensures your application can handle traffic spikes.
- Performance:
  - Right-sizing Fargate tasks: Monitor CPU and memory usage in CloudWatch. Adjust the `cpu` and `memory` parameters in your `task-definition.json` to balance cost and performance.
  - ALB optimization: Ensure your ALB is configured with appropriate idle timeouts and connection settings.
- Security:
  - IAM roles: Always use IAM roles with the principle of least privilege. Our `ecsTaskExecutionRole` is a good example.
  - VPC and subnets: Deploy ECS tasks into private subnets and restrict inbound traffic using security groups, allowing only the ALB to reach the application port.
  - Secrets Manager: Never hardcode credentials. Rotate secrets regularly.
  - Network ACLs: Add an extra layer of security at the subnet level.
- Logging and monitoring:
  - CloudWatch Logs: Our `logConfiguration` in the Task Definition sends container logs to CloudWatch. Use CloudWatch Logs Insights for powerful log analysis.
  - CloudWatch Metrics: ECS integrates with CloudWatch to provide CPU utilization, memory utilization, and network I/O metrics for your tasks and services. Set up alarms to be notified of critical issues.
- Database connectivity: Ensure your RDS instance’s security group allows inbound traffic from your ECS tasks’ security group.
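The Service Auto Scaling mentioned above can be wired up from the CLI. A sketch, with illustrative limits and thresholds: target-tracking on average CPU at 70%, scaling between 2 and 10 tasks.

```shell
# Register the service's desiredCount as a scalable target
aws application-autoscaling register-scalable-target \
  --service-namespace ecs \
  --scalable-dimension ecs:service:DesiredCount \
  --resource-id service/your-project-name-cluster/your-project-name-api-service \
  --min-capacity 2 --max-capacity 10 \
  --region us-east-1

# Add a target-tracking policy on average CPU utilization
aws application-autoscaling put-scaling-policy \
  --service-namespace ecs \
  --scalable-dimension ecs:service:DesiredCount \
  --resource-id service/your-project-name-cluster/your-project-name-api-service \
  --policy-name your-project-name-api-cpu-scaling \
  --policy-type TargetTrackingScaling \
  --target-tracking-scaling-policy-configuration '{
    "TargetValue": 70.0,
    "PredefinedMetricSpecification": {
      "PredefinedMetricType": "ECSServiceAverageCPUUtilization"
    },
    "ScaleInCooldown": 120,
    "ScaleOutCooldown": 60
  }' \
  --region us-east-1
```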
Code Review Checkpoint
At this stage, you’ve successfully containerized your application for production and deployed it to AWS ECS Fargate.
Summary of what was built:
- AWS ECR Repository: To host your Docker images.
- AWS ECS Cluster: To manage your Fargate tasks.
- IAM Task Execution Role: With policies for ECR, CloudWatch, and Secrets Manager access.
- AWS Secrets Manager Secrets: Securely storing sensitive application configurations.
- Optimized `Dockerfile`: Using a multi-stage build, a non-root user, and a health check.
- ECS Task Definition (`ecs/task-definition.json`): Describes how to run your container, linking the ECR image and Secrets Manager secrets.
- AWS Application Load Balancer (ALB) & Target Group: To expose your service to the internet.
- ECS Service (`ecs/service-definition.json`): Maintains the desired number of tasks and integrates with the ALB.
- Docker image pushed to ECR.
- ECS Service running tasks on Fargate.
Files created/modified:
- `Dockerfile` (modified for multi-stage build, health check, non-root user)
- `ecs/task-definition.json` (new)
- `ecs/service-definition.json` (new)
- `ecs-task-execution-trust-policy.json` (temporary, for IAM role creation)
- `src/app.ts` (or main app file, added `/health` endpoint)
How it integrates with existing code:
The Node.js application itself does not need significant changes, as it continues to read environment variables (which are now securely injected from Secrets Manager by ECS). The Dockerfile and ECS configurations wrap around our existing application code to provide a production-ready deployment.
Common Issues & Solutions
Deploying to a cloud environment can present unique challenges. Here are some common issues and how to troubleshoot them:
ECS Task Fails to Start or Stays in Pending State:
- Issue: The task definition might be incorrect, or the `ecsTaskExecutionRole` lacks necessary permissions.
- Debugging:
  - Check CloudWatch Logs in the `/ecs/your-project-name-api` log group. Look for errors during container startup.
  - Go to ECS Console -> Clusters -> your-project-name-cluster -> Tasks. Select the failed task and check the “Stopped reason” and “Events” tab for detailed error messages.
  - Verify the `executionRoleArn` in `task-definition.json` is correct and the role has `AmazonECSTaskExecutionRolePolicy` attached.
  - Ensure the ECR image path in `task-definition.json` is correct and the `ecsTaskExecutionRole` has permission to pull from ECR.
  - Check whether the `cpu` and `memory` allocations are sufficient for your application.
- Prevention: Thoroughly review IAM policies and task definition parameters. Start with simple task definitions and add complexity incrementally.
Application Load Balancer (ALB) Shows Unhealthy Targets:
- Issue: The ALB cannot connect to your ECS tasks, or your application’s health check is failing.
- Debugging:
  - Check the security group associated with your ECS tasks. It must allow inbound traffic on port 3000 from the ALB’s security group.
  - Verify the Target Group health check path and port. Ensure your application’s `/health` endpoint actually returns a 200 OK.
  - Check CloudWatch Logs for your ECS tasks for any application-level errors preventing them from starting or responding to health checks.
  - Ensure the tasks are running in the correct subnets and have network connectivity.
- Prevention: Design a robust health check endpoint in your application. Carefully configure security groups to allow necessary traffic while maintaining least privilege.
Secrets Are Not Loaded into the Container (Environment Variables Missing):
- Issue: The task definition might reference incorrect secret ARNs, or the `ecsTaskExecutionRole` lacks permission to access AWS Secrets Manager.
- Debugging:
  - In the ECS Console -> Task Definitions -> your-project-name-api-task, review the `secrets` section. Ensure the `valueFrom` ARNs exactly match your Secrets Manager secret ARNs.
  - Check the IAM policy attached to `your-project-name-ecsTaskExecutionRole`. It needs `secretsmanager:GetSecretValue` and `kms:Decrypt` permissions on the specific secret ARNs (or a broader resource if necessary, but prefer specific ARNs).
  - Temporarily add a command to your `Dockerfile` (e.g., `CMD ["sh", "-c", "env && node dist/src/main.js"]`) to print all environment variables on startup and confirm the secrets are present (then remove this for production).
- Prevention: Double-check secret ARNs. Use the `aws secretsmanager list-secrets` command to get the exact ARNs, and ensure IAM policies are correctly scoped.
Testing & Verification
Now that our application is deployed, let’s verify everything is working as expected.
Access the API:
- Get the DNS name of your Application Load Balancer from the AWS EC2 Console -> Load Balancers.
- Open your browser or Postman/Insomnia and access your API endpoints via the ALB’s DNS name. For example, `http://<ALB-DNS-NAME>/health` should return a `200 OK` with `{ "status": "healthy" }`.
- Test a few core API endpoints (e.g., user registration, login, data retrieval) to ensure full functionality.
Verify Secrets Injection:
- While you can’t directly inspect environment variables on a running Fargate task easily, if your application is working correctly with the database and other services that rely on secrets, it’s a good indication they are being injected.
- If you added temporary logging to print environment variables (as suggested in troubleshooting), ensure you remove it after verification for security.
Check CloudWatch Logs:
- Go to AWS Console -> CloudWatch -> Log Groups. Find the `/ecs/your-project-name-api` log group.
- You should see log streams from your running ECS tasks. Check for any errors or warnings from your application.
- Use CloudWatch Logs Insights to query and filter your logs for specific messages or errors.
Monitor ECS Service Health:
- Go to AWS Console -> ECS -> Clusters -> `your-project-name-cluster` -> Services.
- Observe `your-project-name-api-service`. Ensure `runningCount` matches `desiredCount` and the service status is `ACTIVE`. Check the “Events” tab for any deployment or scaling issues.
- In the “Metrics” tab, observe CPU and memory utilization for your tasks.
Summary & Next Steps
Congratulations! You have successfully deployed your production-ready Node.js API to AWS ECS Fargate, leveraging Docker for containerization and AWS Secrets Manager for secure configuration. This is a significant milestone, transforming your local application into a scalable, highly available cloud service. You’ve gained hands-on experience with:
- Setting up core AWS services (ECR, ECS, IAM, Secrets Manager, ALB).
- Optimizing Dockerfiles for production.
- Defining ECS Task Definitions and Services.
- Implementing secure secrets management.
- Troubleshooting common cloud deployment issues.
This chapter provides a solid foundation for running your backend services in a modern cloud environment. While we deployed manually using the CLI, in a professional setting, this process would be automated.
In the next chapter, Chapter 15: Setting Up CI/CD with GitHub Actions, we will automate the build, test, and deployment process we performed manually in this chapter. This will enable faster, more reliable, and consistent deployments, bringing us to a truly production-grade workflow.