The problem of scalability

When a software service runs in production and handles multiple users at a time, it is important to ensure an appropriate level of efficiency and performance. When the service is deployed on a small scale, i.e. it requires a few machines at most, it can, in the worst case, be monitored and maintained manually by an administrator. However, when the service can grow to dozens of machines or more, manual maintenance is either impossible or very costly. Thus, it is essential that the service can manage itself. One type of self-management is especially important for software applications and services, namely self-scalability.

Scalability is the ability of an application to cope with an increased workload. A scalable application is able to preserve or even increase its level of efficiency when confronted with a larger workload. Naturally, the efficiency of an application can be preserved or increased under a heavy workload only when the computational and storage resources are increased as well. Hence, in practice an application is scalable when it can utilize additional resources without losing much efficiency. As with algorithms, we can define three types of application scalability:

  • Linear scalability, when additional resources of a given type always contribute the same amount of additional capacity to the application performance.
  • Sub-linear scalability, when additional resources of a given type contribute less additional capacity than in the linear case, e.g. due to the need for synchronization.
  • Super-linear scalability, when additional resources of a given type contribute more additional capacity than in the linear case.

In most cases, an application is scaled manually, i.e. after discovering a change in the workload pattern, an operator adds new resources and reconfigures the application. In contrast, a self-scalable application can scale itself without any external intervention. Once configured, it can adjust itself to different workload patterns dynamically.
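
The three scalability types defined above can be told apart from measurements. The following is a minimal sketch, not a method taken from the text: it assumes that capacity (e.g. requests per second) has been measured for several resource counts, including a single resource, and compares the capacity at the largest count with the ideal linear projection. The function names and the 5% tolerance are hypothetical choices made for illustration.

```python
def scaling_efficiency(capacity_by_units):
    """Return measured capacity divided by the ideal linear capacity.

    capacity_by_units maps a resource count to the measured capacity;
    it must contain an entry for a single resource (key 1).
    """
    base = capacity_by_units[1]
    return {n: c / (n * base) for n, c in sorted(capacity_by_units.items())}


def classify_scalability(capacity_by_units, tolerance=0.05):
    """Classify scaling as linear, sub-linear or super-linear.

    The decision uses the efficiency at the largest measured resource
    count; the tolerance (an assumed value) absorbs measurement noise.
    """
    efficiency = scaling_efficiency(capacity_by_units)
    at_largest = efficiency[max(efficiency)]
    if at_largest > 1 + tolerance:
        return "super-linear"
    if at_largest < 1 - tolerance:
        return "sub-linear"
    return "linear"


if __name__ == "__main__":
    # Made-up measurements: 4 machines deliver 3x the capacity of 1 machine.
    print(classify_scalability({1: 100, 2: 180, 4: 300}))
```

With the made-up measurements in the example, four machines reach only 75% of the ideal linear capacity, so the application would be classified as sub-linear.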

The advantages of self-scalability relate to the effort required to maintain applications once they are started. Traditional applications often require constant monitoring and possibly reconfiguration when the workload pattern changes, e.g. when the number of clients increases or the efficiency of the currently used resources decreases because other applications are running on them. In many situations, a full-time human administrator is required just to ensure that such a critical application is working well. Self-scalable applications, on the other hand, should perform any administrative actions automatically. Moreover, by using monitoring data, self-scalable applications can be more efficient than their manually operated counterparts, thanks to a much faster reaction time. This is especially important in dynamically changing environments, where the workload pattern cannot be determined or predicted upfront. However, building a self-scalable application can be a challenging task. Such functionality is often implemented in a separate module, commonly referred to as a management module, which is responsible for analyzing the application workload based on monitoring data and for performing the actual scaling procedure, e.g. starting a new instance of the application on a different server. The management module typically implements the following four functions:

  • Online monitoring, which is integrated with the management module. Like a human administrator, the application requires online data about the current workload in the system in order to react to any change.
  • Detecting events that should trigger the scaling procedure. The management module should automatically identify the moments when the application needs to be scaled. These events can differ between types of applications, e.g. when the application is CPU-bound, it is not necessary to monitor storage utilization.
  • Performing the scaling procedure, which involves acquiring additional resources for the application. This step is also application-dependent, e.g. it can involve setting up a new database connection or adding a new machine to a virtual private network.
  • Resource discovery, which involves identifying resources that can be used during the scaling procedure. Most often, such resources have to be prepared first, e.g. by installing dependencies.
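
The event-detection function from the list above could be sketched as follows. This is an illustrative assumption rather than a prescribed design: the `Sample` type, the threshold values, and the requirement that a threshold be crossed in several consecutive samples (to avoid reacting to short spikes) are all hypothetical. The sketch reflects the point that a CPU-bound application only needs to watch the CPU metric.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    """One monitoring sample; both values are utilization in [0.0, 1.0]."""
    cpu: float
    storage: float


def detect_event(samples, metric, high, low, window=3):
    """Return "scale_out", "scale_in" or None.

    Only the named metric is inspected, so a CPU-bound application can
    ignore storage utilization entirely. An event is reported only when
    the last `window` samples all cross the same threshold (an assumed
    policy, chosen to avoid reacting to momentary spikes).
    """
    recent = [getattr(s, metric) for s in samples[-window:]]
    if len(recent) < window:
        return None  # not enough data yet
    if all(value > high for value in recent):
        return "scale_out"
    if all(value < low for value in recent):
        return "scale_in"
    return None


if __name__ == "__main__":
    history = [Sample(0.90, 0.99), Sample(0.95, 0.99), Sample(0.92, 0.99)]
    # Storage is nearly full, but only the CPU metric is examined here.
    print(detect_event(history, "cpu", high=0.8, low=0.2))
```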

Besides implementing these functions, self-scalable applications require knowledge about the events that should trigger the scaling procedure. This knowledge can take the form of rules, which define the conditions under which the management module should perform some actions. Such knowledge is often gathered by observing the application in real-life scenarios; hence it can be difficult to discover automatically. Thus, the decision to enhance an existing application with self-scalability is not an obvious one. Self-scalability belongs to a popular set of features, often referred to as self-*, which relate to application autonomy. Among others, this set also includes the following capabilities:

  • self-healing, which is the ability of a system to automatically recover from a failure,
  • self-organization, which is the ability of a system to dynamically adjust its logical or physical organization at runtime to new requirements,
  • self-adapting, which is the ability of a system to adapt itself to a changing environment in an automatic manner,
  • self-protection, which reflects the need for proactive identification of and protection from arbitrary attacks.

These features are used to describe systems that should provide a high level of automation, i.e. systems that can be treated as self-aware. Such systems are the subject of research within the Autonomic Computing initiative, which intends to provide mechanisms and tools for developing more intelligent and self-managed computing systems, in which administrator intervention is reduced to a minimum. The research was inspired by the human autonomic nervous system, which controls key functions without any outside involvement. One way to design an autonomic system is to extend its existing part, which is responsible for functional requirements, with a component that takes care of non-functional requirements, e.g. availability.

A self-scalable service with modules responsible for scalability management is depicted in Fig. 1. A self-scalable service can be considered a wrapper around an actual service, which provides the desired functionality to external clients. Within a self-scalable service, the actual service can be instantiated multiple times on different computational resources, which belong to a computational resource pool supervised by the self-scalable service. The Load balancer, which constitutes a single access point to the service, is responsible for delegating requests from clients to different service instances. Each computational resource either hosts a service instance or stays idle. Moreover, computational resources include monitoring sensors, which periodically send information about the current workload on the resource to the Monitoring manager.

Fig. 1: An overview of scalability management within self-scalable services.

To enable self-scalability, the administrator defines scaling rules using the Scalability manager of the service. The Scalability manager starts a supervising process to enforce each defined scaling rule. A supervising process periodically collects monitoring data from the Monitoring manager, checks whether the scaling rule condition is met, and executes the scaling rule action if necessary. Scaling actions typically involve starting new instances of the actual service on idle computational resources, or stopping already running instances. To select a resource on which to start a new instance or stop a running one, the Scalability manager queries the Information manager, which stores information about the resources and instances of each service being scaled.
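
A single iteration of such a supervising process might look like the sketch below. It is a simplified illustration under stated assumptions, not the implementation behind Fig. 1: the class and method names, the use of average load as the monitored value, the choice of the first idle resource, and the policy of always keeping one instance running are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


class MonitoringManager:
    """Keeps the latest workload value reported by each resource's sensor."""

    def __init__(self) -> None:
        self._latest: Dict[str, float] = {}

    def report(self, resource: str, load: float) -> None:
        self._latest[resource] = load

    def average_load(self, resources: List[str]) -> float:
        loads = [self._latest[r] for r in resources if r in self._latest]
        return sum(loads) / len(loads) if loads else 0.0


class InformationManager:
    """Tracks which resources in the pool host a service instance."""

    def __init__(self, resources: List[str]) -> None:
        self.hosting: Dict[str, bool] = {r: False for r in resources}

    def idle(self) -> List[str]:
        return [r for r, busy in self.hosting.items() if not busy]

    def running(self) -> List[str]:
        return [r for r, busy in self.hosting.items() if busy]


@dataclass
class ScalingRule:
    condition: Callable[[float], bool]  # evaluated on the average load
    action: str                         # "start" or "stop"


def supervise_once(rule: ScalingRule, monitoring: MonitoringManager,
                   info: InformationManager) -> Optional[str]:
    """Run one supervision step; return the affected resource, if any."""
    running = info.running()
    if not rule.condition(monitoring.average_load(running)):
        return None
    if rule.action == "start":
        idle = info.idle()
        if not idle:
            return None            # the resource pool is exhausted
        chosen = idle[0]           # assumed policy: first idle resource
        info.hosting[chosen] = True
        return chosen
    if len(running) <= 1:
        return None                # assumed policy: keep one instance alive
    chosen = running[-1]
    info.hosting[chosen] = False
    return chosen
```

A real supervising process would call `supervise_once` periodically, for example in a loop with a sleep between iterations, and would start or stop the actual service instance where this sketch only updates the bookkeeping.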