Autoscale Your API Builder Docker Application in the AMPLIFY Runtime Services (ARS)

The previous blog post, Publish Your API Builder Docker Image to the AMPLIFY Runtime Services (ARS), described how to publish your API Builder project to ARS.

In this blog post, we’ll describe how to enable autoscaling so that your application can handle peak loads of API requests. We’ll also apply a load to the application and see how ARS adds API Builder instances to handle the increased traffic.

Let’s get started.

Verify That Your App is Running

If you haven’t already done so, follow the prior blog post to publish your app to ARS.

Let’s check the application using the following command:

acs list apibm


Note that my application name is apibm.

The result is as follows:

Note that the minimum, maximum, desired and deployed servers are all 1 as we have not enabled autoscaling yet.

Let’s check our API to make sure the app is working properly using curl:



with response:

{
  "success": true,
  "request-id": "d9771da0-33a9-45b6-94b2-463468133e73",
  "key": "dogs",
  "dogs": [
    {
      "id": "5a24a9a67779e860d007b13e",
      "breed": "Poodle",
      "name": "Fido"
    },
    {
      "id": "5a24a9d27779e860d007b140",
      "breed": "Lab",
      "name": "Fred"
    },
    {
      "id": "5b3aafcb9de9003840480fe7",
      "breed": "Doberman",
      "name": "Doobie"
    }
  ]
}


The API is working fine, so let’s move on.
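
If you want to repeat this check from a script, a minimal sketch is shown below. It assumes the response body contains "success": true, as in the sample above. The function name is_healthy and the API_URL placeholder are illustrative only; substitute your own endpoint.

```shell
#!/bin/sh
# is_healthy reads a JSON response body on stdin and succeeds only if it
# contains "success": true, as in the sample response above.
# A plain grep keeps the sketch dependency-free; it is not a JSON parser.
is_healthy() {
  grep -Eq '"success"[[:space:]]*:[[:space:]]*true'
}

# Usage (API_URL is a placeholder for your own endpoint):
#   curl -s "$API_URL" | is_healthy && echo "API is up"
```

Because the function only inspects text on stdin, it can also be run against a saved response file.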

Enable Autoscaling

Referring to the ARS CLI reference, we’ll use the config command to set various configuration options.

Normally, you could set all of the following in a single config command, but due to a known issue with the CLI, we’ll set each one separately:

acs config --minsize 1 --maxsize 2 apibm
acs config --autoscaleup true apibm
acs config --autoscaledown true apibm
acs config --maxconn 1 apibm
acs config --maxconnws 1 apibm
acs config --maxqueuedrequests 1 apibm
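
If you configure more than one application, the same sequence can be wrapped in a small script. The sketch below issues the same commands, one option group per call. The names APP, DRY_RUN, apply_config and configure_autoscaling are illustrative and not part of the ARS CLI. With DRY_RUN=1, the commands are printed instead of executed.

```shell
#!/bin/sh
# Apply each autoscaling option with its own `acs config` call,
# mirroring the one-option-per-command workaround described above.
# APP and DRY_RUN are illustrative names, not part of the ARS CLI.
APP="${APP:-apibm}"

apply_config() {
  # "$@" holds one option group; the application name goes last.
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "acs config $* $APP"
  else
    acs config "$@" "$APP"
  fi
}

configure_autoscaling() {
  # Stop at the first command that fails.
  apply_config --minsize 1 --maxsize 2 &&
  apply_config --autoscaleup true &&
  apply_config --autoscaledown true &&
  apply_config --maxconn 1 &&
  apply_config --maxconnws 1 &&
  apply_config --maxqueuedrequests 1
}
```

Call configure_autoscaling after sourcing the script; set DRY_RUN=1 first to review the commands before running them for real.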


Note that currently in ARS v1, autoscaling is based on HTTP request queue size. We’ll set the queue size to 1 so that as soon as API Builder cannot respond to an HTTP request, ARS will autoscale up the number of API Builder containers according to our setup.

In ARS v2, autoscaling will be based on CPU load.

It is also worth noting that ARS handles load balancing for us automatically, distributing API requests across the API Builder application instances.

Now, when we list our application using ‘acs list apibm’, we see a slightly different response:

Note that the maximum number of servers is now 2, as we requested.

Apply a Load

Now that autoscaling is enabled, let’s apply a load to API Builder and see how ARS autoscales the number of servers.

To apply a load of API requests to API Builder, we’ll use an npm package called loadtest. We can install it as follows:

npm install -g loadtest


Note that you may need to prefix the command with sudo if you’re on a Mac.

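We can now apply a load on our API using the following command, with the URL of your API appended as the final argument. Here, -c 10 sets 10 concurrent clients and --rps 200 targets 200 requests per second: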

loadtest -c 10 --rps 200


Let that run for a while until the errors start to increase significantly. This can take several minutes.

The console log for loadtest should look similar to the following:

Check your API Builder app using ‘acs list apibm’ and expect to see something similar to:

Notice that the number of deployed servers increased to 2, the max we set.

If the number of deployed servers is still 1, then run loadtest again and wait a while.
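
Rather than re-running the list command by hand while you wait, you could poll it at an interval. The sketch below simply repeats ‘acs list’ and prints each result. The names APP, INTERVAL, CHECKS and watch_app are illustrative, and the 30-second default interval is an arbitrary choice, not an ARS recommendation.

```shell
#!/bin/sh
# Re-run `acs list` at a fixed interval so a change in the number of
# deployed servers is easy to spot. APP, INTERVAL and CHECKS are
# illustrative names, not ARS settings.
APP="${APP:-apibm}"
INTERVAL="${INTERVAL:-30}"   # seconds between checks
CHECKS="${CHECKS:-10}"       # number of checks before stopping

watch_app() {
  i=1
  while [ "$i" -le "$CHECKS" ]; do
    echo "--- check $i of $CHECKS ---"
    acs list "$APP"
    # Skip the pause after the final check.
    [ "$i" -lt "$CHECKS" ] && sleep "$INTERVAL"
    i=$((i + 1))
  done
}
```

Run watch_app in a second terminal while loadtest is still generating traffic.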


In this blog post, we enabled ARS autoscaling, applied a load to our API Builder app, and saw ARS scale the application up in response.