Load Balancing

Load balancing lets you get the most out of your servers by evenly distributing the load across systems. By default, RunMyJobs uses the process count of Process Servers to distribute the load. However, this is not very efficient when different processes use resources differently, because resource-intensive processes count as much as lightweight processes. Fortunately, RunMyJobs offers several other load balancing strategies that can be used on their own or in combination.

Generic Load Balancing Using Queues

Generic Queue-based load balancing can be used for any type of workload. This approach is typically used to keep a weak system from being overloaded, or to keep work from being offloaded onto a server that has a more important role in another Queue.

When you create a Queue or Queue provider, you can specify an Execution Size that determines the number of concurrent processes allowed in the Queue. If several Process Servers serve the Queue, RunMyJobs sends each process to the Process Server with the fewest processes in status Running. You can influence this by setting an Execution Size on the Queue provider of a Process Server. RunMyJobs ignores the Process Server while the number of its running processes equals the Queue provider's Execution Size.

Note: You can include processes in status Waiting in your Execution Size, but these typically do not consume any resources on the remote system.
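
The selection rule described above can be sketched in a few lines of Python. This is only an illustration of the logic, not RunMyJobs code: the QueueProvider class, the pick_process_server function and the Process Server names are hypothetical, and an Execution Size of None is assumed to mean "no limit".

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class QueueProvider:
    # Hypothetical model of a Process Server serving a Queue; not a RunMyJobs API.
    process_server: str
    running: int                    # processes currently in status Running
    execution_size: Optional[int]   # None is assumed to mean "no limit"


def pick_process_server(providers):
    """Return the eligible provider with the fewest running processes, or None."""
    eligible = [
        p for p in providers
        if p.execution_size is None or p.running < p.execution_size
    ]
    if not eligible:
        return None
    return min(eligible, key=lambda p: p.running)


providers = [
    QueueProvider("PS_A", running=4, execution_size=4),    # at its Execution Size, ignored
    QueueProvider("PS_B", running=7, execution_size=None),
    QueueProvider("PS_C", running=5, execution_size=10),
]
chosen = pick_process_server(providers)
print(chosen.process_server if chosen else "no Process Server available")
```

In this sample, PS_A has the fewest running processes but is skipped because it has reached its Execution Size, so PS_C is chosen.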

Using Load Factors

You can use load factors to specify custom metrics for evaluating the load of a Process Server.

  • Multiplier: The relative weight of a specific load factor
  • Threshold: The maximum value allowed for this load factor. When this value is reached, the Process Server is set to Overloaded.
  • Monitor Value: The unit to use (CPU time, Page rate, Process Server check value, jmonitor value)
  • Load Threshold: The maximum allowed value of the sum of all load factors (multiplier * Monitor Value). When this is reached, a Process Server goes into status Overloaded.

Note: Load Threshold is specific to the Process Server, not to a load factor; there is only one Load Threshold per Process Server. On Process Servers with only one load factor, set Load Threshold to the same value as Threshold or higher; it only matters when you want to take the combined effect of multiple load factors into account.

Note: The Threshold and Load Threshold values must take the Multiplier values into account.
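
As a rough illustration of how these settings interact, the following Python sketch evaluates a set of load factors for one Process Server. It is not the product's implementation: the LoadFactor class, the is_overloaded function and all sample values are made up, and it assumes that a limit counts as reached when the weighted value is greater than or equal to it.

```python
from dataclasses import dataclass


@dataclass
class LoadFactor:
    name: str
    multiplier: float   # relative weight of this factor
    threshold: float    # limit for multiplier * value
    value: float        # current monitor value (made-up sample)

    @property
    def weighted(self) -> float:
        return self.multiplier * self.value


def is_overloaded(factors, load_threshold):
    # Assumption: "reached" means greater than or equal to the limit.
    if any(f.weighted >= f.threshold for f in factors):
        return True
    return sum(f.weighted for f in factors) >= load_threshold


def sample(process_count, cpu_busy):
    return [
        LoadFactor("Load", multiplier=2, threshold=100, value=process_count),
        LoadFactor("CPUBusy", multiplier=1, threshold=90, value=cpu_busy),
    ]


print(is_overloaded(sample(30, 50), load_threshold=150))  # 60 + 50 = 110, below every limit
print(is_overloaded(sample(50, 10), load_threshold=150))  # Load alone: 100 reaches its Threshold
print(is_overloaded(sample(45, 70), load_threshold=150))  # 90 + 70 = 160 reaches the Load Threshold
```

The third call shows why Load Threshold exists: neither factor reaches its own Threshold, but their combined weighted value does reach the Load Threshold.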

OS Metric Load Balancing Using MonitorValues

There are several options for load balancing OS processes across platform Process Servers:

  • Process count load balancing: The default option if no load factors are defined for the Process Servers. The system sends new processes to the Process Server with the fewest running processes.
  • OS metric load balancing: Uses near real-time monitoring data from the Platform Agents to decide where to run the process.
  • jmonitor load balancing: Uses near real-time monitoring data generated by jmonitor.

Process Count

The default load balancing technique uses the number of concurrent processes as the metric for balancing the load.

The Load and LoadThreshold monitor values for the Process Server have the following meanings:

  • /System/ProcessServer/${PSName}/Performance/Load: The number of processes the Process Server is currently processing.
  • /System/ProcessServer/${PSName}/Performance/LoadThreshold: By default, the maximum number of processes allowed to run simultaneously.

OS Metric

This type of load balancing requires a Platform Agent on each server and is typically used for Platform Agent workload. It uses load factors and a threshold for each Process Server. Two commonly used load factors are CPU usage (CPUBusy) and page rate (PageRate, the rate at which pages are written to or read from the swap area). You can also create Process Server checks to define your own criteria.

The Load and LoadThreshold monitor values for the Process Server have the following meanings:

  • /System/ProcessServer/${process_server}/Performance/Load: A representation of the load factors as configured.
  • /System/ProcessServer/${process_server}/Performance/LoadThreshold: The maximum load specified on the load factor tab.

Example 1

Two Process Servers accept processes that require a specific resource. Server prd5.example.com (Process Server MSLN_UNIXS5) is more powerful than prd7.example.com (Process Server MSLN_UNIXS7), so you want to run 1.5 times as many processes on prd5.example.com. The prd7.example.com Process Server should be capped at 50 concurrent processes, and prd5.example.com at 75 concurrent processes.

You specify the following load factors on the Process Servers:

Process Server     Multiplier  Threshold  Monitor Value                                        Load Threshold
prd5.example.com   2           150        /System/ProcessServer/MSLN_UNIXS5/Performance/Load   150
prd7.example.com   3           150        /System/ProcessServer/MSLN_UNIXS7/Performance/Load   150
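
The arithmetic behind these settings can be checked with a short Python sketch. It only restates the numbers from the example; the max_processes helper is a hypothetical name, and it assumes that a Process Server stops receiving work when Multiplier * running processes reaches the Threshold.

```python
# (Multiplier, Threshold) taken from the Example 1 settings
load_factors = {
    "prd5.example.com": (2, 150),
    "prd7.example.com": (3, 150),
}


def max_processes(multiplier, threshold):
    # The weighted load (multiplier * running processes) reaches the
    # threshold at threshold / multiplier running processes.
    return threshold // multiplier


limits = {host: max_processes(m, t) for host, (m, t) in load_factors.items()}
print(limits)
print(limits["prd5.example.com"] / limits["prd7.example.com"])
```

With the same Threshold on both servers, the ratio of the two limits is the inverse of the ratio of the Multipliers, which gives the 75 and 50 process caps and the factor of 1.5 asked for in the example.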

Example 2

This is the same situation as in Example 1, but the CPU utilization must also be taken into account: it must not exceed 90%. A process uses at most 5% CPU time.

Process Server     Multiplier  Threshold  Monitor Value                                           Load Threshold
prd5.example.com   2           150        /System/ProcessServer/MSLN_UNIXS5/Performance/Load      235
prd5.example.com   1           90         /System/ProcessServer/MSLN_UNIXS5/Performance/CPUBusy   235
prd7.example.com   3           150        /System/ProcessServer/MSLN_UNIXS7/Performance/Load      235
prd7.example.com   1           90         /System/ProcessServer/MSLN_UNIXS7/Performance/CPUBusy   235

Note: The Load Threshold is only reached when the sum of the weighted Load and CPUBusy monitor values for one Process Server reaches 235.
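
To get a feeling for which limit is reached first, the sketch below computes the process count at which each limit would be hit. It rests on a deliberately pessimistic assumption that is not part of the example: every process consumes the full 5% CPU and nothing else uses the CPU. The function and variable names are hypothetical.

```python
import math

CPU_PER_PROCESS = 5    # percent CPU per process, worst case from the example
LOAD_THRESHOLD = 235   # Load Threshold from the example

# Multiplier and Threshold per load factor, from the Example 2 settings
servers = {
    "prd5.example.com": {"Load": (2, 150), "CPUBusy": (1, 90)},
    "prd7.example.com": {"Load": (3, 150), "CPUBusy": (1, 90)},
}


def process_counts_at_limits(factors):
    """Process count at which each limit is first reached (worst case)."""
    load_mult, load_thr = factors["Load"]
    cpu_mult, cpu_thr = factors["CPUBusy"]
    load_units = load_mult                    # weighted units added per process
    cpu_units = cpu_mult * CPU_PER_PROCESS    # weighted units added per process
    return {
        "Load": math.ceil(load_thr / load_units),
        "CPUBusy": math.ceil(cpu_thr / cpu_units),
        "Load Threshold": math.ceil(LOAD_THRESHOLD / (load_units + cpu_units)),
    }


for host, factors in servers.items():
    print(host, process_counts_at_limits(factors))
```

Under this worst case, the CPUBusy Threshold is the first limit to be reached on both servers; with lighter processes, the Load Threshold and the Load factor become the limiting values.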

Example 3

You have two Process Servers, one of which is also used for other workload. Server prd1.example.com is a powerful system that is used by multiple Applications; prd3.example.com has been added to the pool to relieve prd1.example.com. You do not want processes to be assigned to Process Server prd1.example.com once its CPU usage reaches 80%, and you want a ratio of 1.5:1 between the two.

You configure the following load factors:

Server             Multiplier  Threshold  Monitor Value                                           Load Threshold
prd1.example.com   3           240        /System/ProcessServer/MSLN_UNIXS1/Performance/CPUBusy
prd3.example.com   2           200        /System/ProcessServer/MSLN_UNIXS3/Performance/CPUBusy

This means that 1% CPU usage is worth 3 units on prd1.example.com and only 2 units on prd3.example.com. In theory, with processes using the same amount of resources, prd3.example.com will peak sooner than prd1.example.com. As soon as the CPU usage on prd1.example.com reaches 80%, no new processes are dispatched to it.
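
The weighting can be made concrete with a small Python sketch that converts CPU usage into weighted units for both servers. The helper names are hypothetical, and the sketch assumes that a server stops receiving processes once its weighted value reaches the Threshold.

```python
# (Multiplier, Threshold) for the CPUBusy load factor, from the Example 3 settings
factors = {
    "prd1.example.com": (3, 240),
    "prd3.example.com": (2, 200),
}


def weighted_load(host, cpu_busy_percent):
    multiplier, _ = factors[host]
    return multiplier * cpu_busy_percent


def accepts_work(host, cpu_busy_percent):
    # Assumption: the server stops receiving processes once the weighted
    # value reaches the Threshold.
    _, threshold = factors[host]
    return weighted_load(host, cpu_busy_percent) < threshold


for cpu in (50, 79, 80):
    for host in factors:
        print(host, cpu, weighted_load(host, cpu), accepts_work(host, cpu))
```

At 80% CPU usage, prd1.example.com evaluates to 240 units and is no longer eligible, while prd3.example.com evaluates to 160 units at the same CPU usage and still accepts work.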

jmonitor

The jmonitor command line program is used to store monitoring values in the Redwood Server monitor tree. Although you are free to use any path, it is highly recommended to store the values under /System/ProcessServer/${process_server}/Custom/; you can create child nodes there to group specific values.