SAP BTP Autoscaler: an easy-to-use service with a lot of quick wins

The Application Autoscaler service on SAP BTP can be used to automatically scale the number of application instances up and down. It is a useful service for managing the workload of your application during busy hours. In this blog I will talk about the experiences my colleagues and I had while using this service, along with some tips & tricks and lessons learned.


We have an API deployed on Cloud Foundry that is used extensively. The API was originally developed for a UI5 application, but over time more and more applications in multiple teams started consuming it for different use cases. When the API was first launched it worked fine, but as it was used more often, we came across a problem: teams would run daily jobs that called our API at a high frequency, which increased our workload tremendously.
At first, we thought that simply increasing the memory of our application instance would fix the problem, but after a while this wasn't enough. Furthermore, our application didn't need that much memory all the time. So we decided to look at the autoscaler service to increase our instances automatically.

The first choice when using the autoscaler is whether to use a static or a dynamic configuration. The static autoscaler changes the number of instances at pre-determined times. It is a good solution when the increase in workload is predictable. The downside, however, is that if the time frame in which the workload increases shifts, you have to update the configuration to match the new time frames, and it may take a while before you even discover that the time frame of the high workload has shifted. A static scaling configuration would be a great option when your app is only active during working hours: you can configure it so that the app runs on fewer instances outside those hours.
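To give an idea of what a static (schedule-based) configuration looks like, the sketch below guarantees at least two instances during weekday working hours and drops back to one outside them. The time zone, times and instance counts are placeholders, and the exact field names follow the Application Autoscaler policy schema as I understand it, so verify them against the official documentation:

    {
        "instance_min_count": 1,
        "instance_max_count": 2,
        "schedules": {
            "timezone": "Europe/Amsterdam",
            "recurring_schedule": [
                {
                    "start_time": "08:00",
                    "end_time": "18:00",
                    "days_of_week": [1, 2, 3, 4, 5],
                    "instance_min_count": 2,
                    "instance_max_count": 4
                }
            ]
        }
    }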

The second option is to configure the autoscaler dynamically. The dynamic autoscaler increases the number of instances automatically whenever the workload reaches a certain threshold and scales the active instances back down when the workload drops again. This is a great solution when the workload is less predictable.


Since we knew when our workload was going to increase, our first thought was to use the static scaler. However, we also knew that the number of teams using our app would grow and that the times at which our app is used would become less predictable. We therefore decided it was best to go for the dynamic autoscaler.

Below are a few situations summarized in which a static or a dynamic configuration makes sense:

Static configuration:
- The app is always used during working hours and doesn't need a lot of instances running outside those hours.
- The app has a scheduled job which synchronizes the app data with external sources at certain times, and the time at which this happens never changes.
- The app has one specific user group, so it's easy to pinpoint when the workload is going to increase or decrease.

Dynamic configuration:
- The app is always used during working hours, but its users are in different time zones, so it's less predictable when the workload is going to increase.
- The app has a scheduled job, but the times at which it needs to run change very often.
- The app has many different user groups, which makes it difficult to determine when the workload is going to change; for example, when other teams are using your API for their own apps.

After choosing to go for a dynamic autoscaler, we configured our app as follows:

    {
        "instance_min_count": 1,
        "instance_max_count": 2,
        "scaling_rules": [
            {
                "metric_type": "memoryutil",
                "breach_duration_secs": 1200,
                "threshold": 40,
                "operator": "<",
                "cool_down_secs": 300,
                "adjustment": "-1"
            },
            {
                "metric_type": "memoryutil",
                "breach_duration_secs": 180,
                "threshold": 80,
                "operator": ">=",
                "cool_down_secs": 300,
                "adjustment": "+1"
            }
        ]
    }

We made a configuration that allows a maximum of two instances. With this configuration our app is upscaled whenever the memory utilization of an instance stays above 80% for three minutes, and downscaled whenever it stays below 40% for twenty minutes. If your configuration doesn't scale as expected, try tweaking the breach_duration_secs and threshold properties; the right values will differ depending on your use case.
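On SAP BTP, such a policy is typically applied by binding an Application Autoscaler service instance to the app, passing the policy file as a binding parameter. A sketch with the cf CLI (the app name, service instance name and plan are placeholders; check cf marketplace for the plans available on your landscape):

    # Create an autoscaler service instance (plan name may differ per landscape)
    cf create-service autoscaler standard my-autoscaler

    # Bind it to the app, passing the scaling policy as a parameter
    cf bind-service my-api my-autoscaler -c policy.json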

However, after a few weeks we started to notice that our app was unavailable again at certain times. It took a while for us to notice because we had 2 GB of memory per instance. After some debugging, we found a memory leak in our API; combined with the scheduled jobs, it increased our memory usage tremendously. A lesson learned from this is that simply increasing your memory or your number of instances is not the cure for all your problems. We finally made some changes to our app and configuration to fine-tune our solution: we increased our maximum number of instances to three and lowered the memory size to 1 GB.

We made a rule of thumb: if for whatever reason three or more active instances are needed, we will analyse our application to find the cause of the increased memory usage and, if needed, refactor our code.


In short, the autoscaler is a great service which can be implemented very easily. However, it is not a magic solution for all your workload issues. I recommend implementing it if you can, since the effort is low and the benefits are relatively high. I also recommend constantly asking yourself how many instances you actually need, and only scaling up or down to that amount.

Installing & configuring

If you need more information on how to implement the autoscaler, check out these links: