Using AWS ECS Fargate Horizontal Auto Scaling

Introduction

In previous articles, I walked through deploying a NestJS Docker image that uses the S3 service on AWS ECS Fargate; you can review them to understand the basic concepts before proceeding. In this article, I will show how to configure auto scaling so the number of tasks increases or decreases automatically based on demand.

Prerequisites

To test the auto-scaling feature easily, the NestJS project needs an API with a relatively long processing time that drives CPU usage up while it runs. I will create a simple API like the one below; add it to your project or substitute any equivalent endpoint. Afterward, build the Docker image and push it to AWS ECR.

import {
  Controller,
  Get,
  ParseIntPipe,
  Query,
} from '@nestjs/common'

@Controller('test')
export class TestController {
  // CPU-bound endpoint: sums the integers 1..value-1 to keep the CPU busy.
  @Get('sum')
  sum(@Query('value', ParseIntPipe) value: number) {
    const start = Date.now()
    let result = 0
    for (let i = 1; i < value; i++) result += i
    const now = Date.now()
    const duration = now - start
    return { duration, now, result }
  }
}
}
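Since the loop sums the integers from 1 to value - 1, the result field in the response should equal value * (value - 1) / 2, which gives a quick way to sanity-check responses. A minimal sketch (the function names `slowSum` and `expectedSum` are mine, not part of the controller); note that very large values exceed `Number.MAX_SAFE_INTEGER` and lose precision:

```typescript
// Mirrors the controller's loop: sums the integers 1..value-1.
function slowSum(value: number): number {
  let result = 0
  for (let i = 1; i < value; i++) result += i
  return result
}

// Closed form for the same series, useful for verifying responses.
const expectedSum = (value: number): number => (value * (value - 1)) / 2
```

For example, both `slowSum(1_000)` and `expectedSum(1_000)` evaluate to 499500.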


Detail

Still using AWS CDK, let's create the file lib/ecs-fargate-cloudfront-autoscale-stack.ts:

import * as cdk from "aws-cdk-lib"
import * as cloudfront from "aws-cdk-lib/aws-cloudfront"
import * as origins from "aws-cdk-lib/aws-cloudfront-origins"
import * as ec2 from "aws-cdk-lib/aws-ec2"
import * as ecs from "aws-cdk-lib/aws-ecs"
import * as ecs_patterns from "aws-cdk-lib/aws-ecs-patterns"
import * as iam from "aws-cdk-lib/aws-iam"
import * as s3 from "aws-cdk-lib/aws-s3"
import { Construct } from "constructs"

export class EcsFargateCloudfrontAutoScaleStack extends cdk.Stack {
  constructor(scope: Construct, id: string, props?: cdk.StackProps) {
    super(scope, id, props)

    const vpc = new ec2.Vpc(this, "NestVpc", {
      maxAzs: 2,
      natGateways: 0,
      subnetConfiguration: [
        { name: "Public", subnetType: ec2.SubnetType.PUBLIC },
      ],
    })

    const imageUri = process.env.IMAGE || ""
    const bucket = process.env.BUCKET || ""
    const customHeader = process.env.CUSTOM_HEADER || ""
    const originVerifySecret = process.env.VERIFY_SECRET || ""

    const fargateService =
      new ecs_patterns.ApplicationLoadBalancedFargateService(
        this,
        "NestService",
        {
          vpc,
          cpu: 256,
          memoryLimitMiB: 512,
          assignPublicIp: true,
          circuitBreaker: { rollback: true },
          capacityProviderStrategies: [
            {
              capacityProvider: "FARGATE",
              base: 1,
              weight: 0,
            },
            {
              capacityProvider: "FARGATE_SPOT",
              base: 0,
              weight: 1,
            },
          ],
          taskImageOptions: {
            image: ecs.ContainerImage.fromRegistry(imageUri),
            containerPort: 3000,
            environment: {
              REGION: process.env.CDK_DEFAULT_REGION || "",
              BUCKET: bucket,
              VERIFY_SECRET: originVerifySecret,
            },
          },
          healthCheckGracePeriod: cdk.Duration.seconds(120),
        },
      )

    fargateService.taskDefinition.addToExecutionRolePolicy(
      new iam.PolicyStatement({
        actions: [
          "ecr:GetAuthorizationToken",
          "ecr:BatchCheckLayerAvailability",
          "ecr:GetDownloadUrlForLayer",
          "ecr:BatchGetImage",
        ],
        resources: ["*"],
      }),
    )

    fargateService.targetGroup.configureHealthCheck({
      path: "/health",
      port: "3000",
      healthyThresholdCount: 2,
      unhealthyThresholdCount: 5,
      interval: cdk.Duration.seconds(60),
      timeout: cdk.Duration.seconds(5),
    })

    const scaling = fargateService.service.autoScaleTaskCount({
      minCapacity: 1,
      maxCapacity: 10,
    })

    scaling.scaleOnCpuUtilization("CpuScaling", {
      targetUtilizationPercent: 50,
      scaleInCooldown: cdk.Duration.seconds(60),
      scaleOutCooldown: cdk.Duration.seconds(60),
    })

    const myBucket = s3.Bucket.fromBucketName(this, "ExistingBucket", bucket)
    fargateService.taskDefinition.addToTaskRolePolicy(
      new iam.PolicyStatement({
        actions: ["s3:PutObject", "s3:GetObject", "s3:ListBucket"],
        resources: [
          myBucket.bucketArn,
          myBucket.arnForObjects("*"),
        ],
      }),
    )

    const distribution = new cloudfront.Distribution(this, "NestDist", {
      defaultBehavior: {
        origin: new origins.LoadBalancerV2Origin(fargateService.loadBalancer, {
          customHeaders: { [customHeader]: originVerifySecret },
          protocolPolicy: cloudfront.OriginProtocolPolicy.HTTP_ONLY,
        }),
        viewerProtocolPolicy: cloudfront.ViewerProtocolPolicy.REDIRECT_TO_HTTPS,
        allowedMethods: cloudfront.AllowedMethods.ALLOW_ALL,
        cachePolicy: cloudfront.CachePolicy.CACHING_DISABLED,
        originRequestPolicy:
          cloudfront.OriginRequestPolicy.ALL_VIEWER_EXCEPT_HOST_HEADER,
      },
    })

    new cdk.CfnOutput(this, "URL", {
      value: `https://${distribution.distributionDomainName}`,
    })
  }
}

  • The sections creating the VPC, the fargateService, the permissions for ECS to pull images and use the S3 service, the health check, and the CloudFront distribution are all similar to the previous article.
  • One change is in capacityProviderStrategies, where I use two strategies: FARGATE and FARGATE_SPOT. FARGATE guarantees there is always at least one task running, while FARGATE_SPOT is cheaper but can be reclaimed by AWS at any time, so it is used for the extra capacity added when scaling out.
  • distribution: in the CloudFront configuration, I added an originRequestPolicy so that query-string parameters in the URL are forwarded to the origin.
  • scaling includes the following information:
    • minCapacity, maxCapacity: the minimum and maximum number of tasks when scaling.
    • scaleOnCpuUtilization: a target-tracking policy based on CPU; the service tries to keep average CPU utilization around 50%, with 60-second cooldowns between consecutive scale-out and scale-in activities.
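Besides CPU, the same scaling object supports other target-tracking metrics. As a sketch of what that could look like (the policy IDs MemScaling/ReqScaling and the thresholds here are my own assumptions, not values from the stack above):

```typescript
// Inside the stack, after autoScaleTaskCount() — hypothetical extra policies.
scaling.scaleOnMemoryUtilization("MemScaling", {
  targetUtilizationPercent: 70, // keep average memory usage near 70%
  scaleInCooldown: cdk.Duration.seconds(60),
  scaleOutCooldown: cdk.Duration.seconds(60),
})

// Scale on ALB requests per task; targetGroup ties the metric to this service.
scaling.scaleOnRequestCount("ReqScaling", {
  requestsPerTarget: 100,
  targetGroup: fargateService.targetGroup,
})
```

Multiple policies can coexist; ECS scales out if any policy calls for it, and scales in only when all of them allow it.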


Update the file bin/aws-cdk.ts as follows:

#!/usr/bin/env node
import 'dotenv/config'
import * as cdk from "aws-cdk-lib/core"
import { EcsFargateCloudfrontAutoScaleStack } from '../lib/ecs-fargate-cloudfront-autoscale-stack'

const app = new cdk.App()
new EcsFargateCloudfrontAutoScaleStack(app, "EcsFargateCloudfrontAutoScaleStack")


Results after deployment

 Deployment time: 484.35s

Outputs:
EcsFargateCloudfrontAutoScaleStack.NestServiceLoadBalancerDNSDB906E33 = EcsFar-NestS-VlhcVpMGxJaC-225255983.ap-southeast-1.elb.amazonaws.com
EcsFargateCloudfrontAutoScaleStack.NestServiceServiceURLA979D4F3 = http://EcsFar-NestS-VlhcVpMGxJaC-225255983.ap-southeast-1.elb.amazonaws.com
EcsFargateCloudfrontAutoScaleStack.URL = https://d3qe4tptisxo2c.cloudfront.net
 Total time: 551.74s


Resources have been created on the AWS Console.


Testing with Postman shows successful deployment.



When you send a large number of requests, observe the CloudWatch metrics: CPU usage gradually increases and the service scales out automatically.
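To generate that load, one option is a small Node.js (18+) script that fires batches of concurrent requests at the CloudFront URL from the stack output. This is a rough sketch: the base URL, the value parameter, and the batch size are assumptions you should tune to your deployment.

```typescript
// Builds the test endpoint URL for a given sum input.
const buildUrl = (base: string, value: number): string =>
  `${base}/test/sum?value=${value}`

// Fires `total` GET requests in batches of `concurrency`; each request
// keeps a task's CPU busy, pushing average utilization past the 50% target.
async function fireRequests(
  base: string,
  total: number,
  concurrency: number,
): Promise<void> {
  for (let sent = 0; sent < total; sent += concurrency) {
    const batch = Math.min(concurrency, total - sent)
    await Promise.all(
      Array.from({ length: batch }, () =>
        fetch(buildUrl(base, 2_000_000_000)).catch(() => undefined),
      ),
    )
  }
}

// Example (hypothetical URL from your own deployment output):
// fireRequests("https://d3qe4tptisxo2c.cloudfront.net", 200, 20)
```

Errors are swallowed on purpose: under heavy load some requests may time out, which is fine since the goal is only to raise CPU usage.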



Initially, there is only 1 task.


After automatic scaling, it will increase to 2 tasks.


Happy coding!

