Using AWS ECS Fargate Horizontal Auto Scaling

Introduction

In previous articles, I provided instructions on deploying a NestJS Docker image on AWS ECS Fargate with the S3 service; you can review them to understand the basic concepts before proceeding. In this article, I will guide you through configuring auto scaling so that the number of tasks automatically increases or decreases based on demand.

Prerequisites

In the NestJS project, to make the auto-scaling feature easy to test, we need an API with a relatively long processing time that drives CPU usage up while it runs. I will create a simple API as follows for testing; you can add it to your project or substitute any equivalent API of your choice. Afterward, build the Docker image and push it to AWS ECR.

import {
  Controller,
  Get,
  ParseIntPipe,
  Query,
} from '@nestjs/common'

@Controller('test')
export class TestController {
  @Get('sum')
  async sum(@Query('value', ParseIntPipe) value: number) {
    const start = Date.now()
    let result = 0
    for (let i = 1; i < value; i++) result += i
    const now = Date.now()
    const duration = now - start
    return {duration, now, result}
  }
}
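The handler's loop simply computes 1 + 2 + … + (value − 1), so its result field can be checked against the closed form value · (value − 1) / 2. Extracted as plain functions (my own sketch for verification, not part of the controller):

```typescript
// Core of the /test/sum handler: burns CPU in a tight loop.
function sumLoop(value: number): number {
  let result = 0
  for (let i = 1; i < value; i++) result += i
  return result
}

// Closed form of the same sum; handy for checking the endpoint's `result` field.
function sumClosedForm(value: number): number {
  return (value * (value - 1)) / 2
}
```

A request such as GET /test/sum?value=500000000 keeps one vCPU busy for a noticeable stretch, which is exactly what we need to push the CPU metric up. (For values that large the result exceeds Number.MAX_SAFE_INTEGER, so treat this endpoint as a load generator rather than an exact calculator.)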


Detail

Still using AWS CDK, let's create the file lib/ecs-fargate/ecs-fargate-cloudfront-auto-scale-stack.ts:

import * as cdk from "aws-cdk-lib"
import * as cloudfront from "aws-cdk-lib/aws-cloudfront"
import * as origins from "aws-cdk-lib/aws-cloudfront-origins"
import * as ec2 from "aws-cdk-lib/aws-ec2"
import * as ecs from "aws-cdk-lib/aws-ecs"
import * as ecs_patterns from "aws-cdk-lib/aws-ecs-patterns"
import * as iam from "aws-cdk-lib/aws-iam"
import * as s3 from "aws-cdk-lib/aws-s3"
import { Construct } from "constructs"

export class EcsFargateCloudfrontAutoScaleStack extends cdk.Stack {
  constructor(scope: Construct, id: string, props?: cdk.StackProps) {
    super(scope, id, props)

    const vpc = new ec2.Vpc(this, "NestVpc", {
      maxAzs: 2,
      natGateways: 0,
      subnetConfiguration: [
        { name: "Public", subnetType: ec2.SubnetType.PUBLIC },
      ],
    })

    const imageUri = process.env.IMAGE || ""
    const bucket = process.env.BUCKET || ""
    const customHeader = process.env.CUSTOM_HEADER || ""
    const originVerifySecret = process.env.VERIFY_SECRET || ""

    const fargateService =
      new ecs_patterns.ApplicationLoadBalancedFargateService(
        this,
        "NestService",
        {
          vpc,
          cpu: 256,
          memoryLimitMiB: 512,
          assignPublicIp: true,
          circuitBreaker: { rollback: true },
          capacityProviderStrategies: [
            {
              capacityProvider: "FARGATE",
              base: 1,
              weight: 0,
            },
            {
              capacityProvider: "FARGATE_SPOT",
              base: 0,
              weight: 1,
            },
          ],
          taskImageOptions: {
            image: ecs.ContainerImage.fromRegistry(imageUri),
            containerPort: 3000,
            environment: {
              REGION: process.env.CDK_DEFAULT_REGION || "",
              BUCKET: bucket,
              VERIFY_SECRET: originVerifySecret,
            },
          },
          healthCheckGracePeriod: cdk.Duration.seconds(120),
        },
      )

    fargateService.taskDefinition.addToExecutionRolePolicy(
      new iam.PolicyStatement({
        actions: [
          "ecr:GetAuthorizationToken",
          "ecr:BatchCheckLayerAvailability",
          "ecr:GetDownloadUrlForLayer",
          "ecr:BatchGetImage",
        ],
        resources: ["*"],
      }),
    )

    fargateService.targetGroup.configureHealthCheck({
      path: "/health",
      port: "3000",
      healthyThresholdCount: 2,
      unhealthyThresholdCount: 5,
      interval: cdk.Duration.seconds(60),
      timeout: cdk.Duration.seconds(5),
    })

    const scaling = fargateService.service.autoScaleTaskCount({
      minCapacity: 1,
      maxCapacity: 10,
    })

    scaling.scaleOnCpuUtilization("CpuScaling", {
      targetUtilizationPercent: 50,
      scaleInCooldown: cdk.Duration.seconds(60),
      scaleOutCooldown: cdk.Duration.seconds(60),
    })

    const myBucket = s3.Bucket.fromBucketName(this, "ExistingBucket", bucket)
    fargateService.taskDefinition.addToTaskRolePolicy(
      new iam.PolicyStatement({
        actions: ["s3:PutObject", "s3:GetObject", "s3:ListBucket"],
        resources: [
          myBucket.bucketArn,
          myBucket.arnForObjects("*"),
        ],
      }),
    )

    const distribution = new cloudfront.Distribution(this, "NestDist", {
      defaultBehavior: {
        origin: new origins.LoadBalancerV2Origin(fargateService.loadBalancer, {
          customHeaders: { [customHeader]: originVerifySecret },
          protocolPolicy: cloudfront.OriginProtocolPolicy.HTTP_ONLY,
        }),
        viewerProtocolPolicy: cloudfront.ViewerProtocolPolicy.REDIRECT_TO_HTTPS,
        allowedMethods: cloudfront.AllowedMethods.ALLOW_ALL,
        cachePolicy: cloudfront.CachePolicy.CACHING_DISABLED,
        originRequestPolicy:
          cloudfront.OriginRequestPolicy.ALL_VIEWER_EXCEPT_HOST_HEADER,
      },
    })

    new cdk.CfnOutput(this, "URL", {
      value: `https://${distribution.distributionDomainName}`,
    })
  }
}

  • The sections creating the VPC, the fargateService, the IAM permissions for ECS to pull images and use the S3 service, the health check, and the CloudFront distribution are all similar to the previous article.
  • The change here is in the capacityProviderStrategies section, where I use two strategies: FARGATE and FARGATE_SPOT. FARGATE (base: 1) guarantees there is always one task running on regular capacity, while FARGATE_SPOT (weight: 1) is cheaper but can be reclaimed by AWS at any time, so it is used for the extra tasks added when scaling out.
  • distribution: in the CloudFront configuration, I added an originRequestPolicy so that query-string parameters are forwarded to the origin.
  • scaling includes the following information:
    • minCapacity, maxCapacity: the minimum and maximum number of tasks when scaling.
    • scaleOnCpuUtilization: target tracking based on CPU; ECS adds or removes tasks to keep average CPU utilization around 50%, and the 60-second scaleInCooldown/scaleOutCooldown prevent it from scaling again immediately after an adjustment.
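It's worth noting that scaleOnCpuUtilization is target tracking, not a simple threshold alarm: CloudWatch adjusts the task count roughly in proportion to how far the metric is from the target, clamped to the min/max. A sketch of that arithmetic (my own illustration of the documented behavior, not an AWS API):

```typescript
// Rough illustration of target-tracking arithmetic (not an AWS API):
// new capacity ≈ ceil(current capacity × current metric / target metric),
// then clamped to the configured min/max.
function desiredTaskCount(
  current: number,
  cpuPercent: number,
  targetPercent: number,
  min: number,
  max: number,
): number {
  const raw = Math.ceil((current * cpuPercent) / targetPercent)
  return Math.min(max, Math.max(min, raw))
}
```

With this stack's settings (target 50%, min 1, max 10), two tasks averaging 80% CPU would scale out to about four; the 60-second cooldowns then keep the service from oscillating.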


Update the file bin/aws-cdk.ts as follows:

#!/usr/bin/env node
import "dotenv/config"
import * as cdk from "aws-cdk-lib/core"
import { EcsFargateCloudfrontAutoScaleStack } from "../lib/ecs-fargate/ecs-fargate-cloudfront-auto-scale-stack"

const app = new cdk.App()
new EcsFargateCloudfrontAutoScaleStack(app, "EcsFargateCloudfrontAutoScaleStack")


Results after deployment

 Deployment time: 484.35s

Outputs:
EcsFargateCloudfrontAutoScaleStack.NestServiceLoadBalancerDNSDB906E33 = EcsFar-NestS-VlhcVpMGxJaC-225255983.ap-southeast-1.elb.amazonaws.com
EcsFargateCloudfrontAutoScaleStack.NestServiceServiceURLA979D4F3 = http://EcsFar-NestS-VlhcVpMGxJaC-225255983.ap-southeast-1.elb.amazonaws.com
EcsFargateCloudfrontAutoScaleStack.URL = https://d3qe4tptisxo2c.cloudfront.net
 Total time: 551.74s


Resources have been created on the AWS Console.


Testing with Postman shows successful deployment.



When you send a large number of requests, you can observe in the CloudWatch metrics that CPU usage gradually increases and the service automatically scales out accordingly.
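To generate that load, a simple script is enough. A minimal sketch, assuming Node 18+ for the built-in fetch; TARGET_URL is a placeholder for your CloudFront URL:

```typescript
// Minimal load-generator sketch (assumes Node 18+ for global fetch).
// TARGET is a placeholder; point it at your CloudFront URL.
const TARGET = process.env.TARGET_URL || "https://example.cloudfront.net"

// Run `total` async jobs with at most `concurrency` in flight at once.
async function runWithConcurrency<T>(
  total: number,
  concurrency: number,
  job: (i: number) => Promise<T>,
): Promise<T[]> {
  const results: T[] = new Array(total)
  let next = 0
  async function worker() {
    while (next < total) {
      const i = next++
      results[i] = await job(i)
    }
  }
  await Promise.all(Array.from({ length: concurrency }, worker))
  return results
}

async function main() {
  // 200 requests, 20 at a time, each forcing a long CPU-bound loop.
  const statuses = await runWithConcurrency(200, 20, async () => {
    const res = await fetch(`${TARGET}/test/sum?value=500000000`)
    return res.status
  })
  console.log("responses received:", statuses.length)
}

// main() // uncomment to actually fire the requests
```

Let it run for a few minutes so the CPU metric stays above the target long enough for the scale-out alarm to fire.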



Initially, there is only 1 task.


After automatic scaling, it will increase to 2 tasks.


Happy coding!

