Skip to content
Snippets Groups Projects

easy-powercap

This repository contains a Batsim-compatible implementation of the EASY+powercap and Greedy knapsack scheduling algorithm.

Algorithm description

EASY+powercap (easypower.cpp)

This algorithm is strongly based on EASY backfilling. The main addition over EASY is that a powercap must be followed during a time window. In other words, a constraint on the maximum power consumption value is set for the whole platform. This algorithm uses additional data on the power consumption of jobs to take its decisions. We implemented two ways to sort the waiting queue: FCFS or Smallest Area First (SAF). For SAF, the area is calculated by multiplying the number of requested resources by the requested time.

Please read the algorithm implementation for all details.

Greedy knapsack (knapsack_greedy.cpp)

We model the same problem as a 0/1 knapsack optimization problem. This optimization problem considers a knapsack with capacity

KcKc
and
mm
items denoted by index
jj
. Each item
jj
has a profit
vjv_j
and a weight
wejwe_j
. It needs to determine the subset of items that fit into the knapsack and that maximizes the total profit. The classic formulation is to define a boolean
xjx_j
that is set to 1 if the item is selected and 0 if it is not:

maximize j=0j=m1xj×vjmaximize\ \sum_{j=0}^{j=m-1} x_j \times v_j

subject to j=0j=m1xj×wejKcsubject\ to\ \sum_{j=0}^{j=m-1} x_j \times we_j \leq Kc

In our case, the items are the jobs in the queue. The capacity

KcKc
is the power capping and the weight
we_j
is the predicted job power consumption. Regarding the profit, we modeled it as a QoS metric. Since we are trying to improve the turnaround, and the turnaround is highly impacted by the waiting time, we modeled two profit functions using the waiting time. The first one is the waiting time for every job (
v_j = wait_j
, where
wait_j
is the waiting time). Thus, the jobs that are longer in the queue receive higher priority, which can be seen as similar to FCFS. The second one is:

v_j = \frac{wait_j + \tilde{p}_j}{\tilde{p}_j}

Where

\tilde{p}_j
is the expected execution time. This formula should increase the priority of the jobs waiting longer, but according to their size. In other words, this functions makes long jobs wait longer than small jobs. However, at some point, these long jobs are prioritized because their priority increases with waiting time (aging mechanism). Therefore, no job can wait indefinitely, avoiding starvation. A way to solve this knapsack problem is by using dynamic programming. However, the size of our problem explodes the combinatorial possibilities (large number of jobs and large capacity). In addition, our Gaussian approach verifies if we are below the power capping by considering all jobs taken together. For these reasons, we propose a greedy approach to solve our knapsack problem in a reasonable time.

Please read the algorithm implementation for all details.

Power verification

Both easypower and knapsack_greedy implement some ways to verify the power capping. Giving that

\hat{P_j}
is the power prediction,
\mathcal{J}[t]
is the set of jobs running at time
t
, and
\overline{P}
is the power capping, they are:

  • Mean: This method uses the predicted mean of each job. Therefore,
    \sum_{i \in \mathcal{J}[t]} \hat{P_i}^{avg} \leq \overline{P}
  • Maximum: This method is similar to the previous one, but is the max instead of the mean. Therefore,
    \sum_{i \in \mathcal{J}[t]} \hat{P_i}^{\max} \leq \overline{P}
  • Gaussian: The Gaussian method uses a probabilistic model for jobs' energy consumption. This method tries to quantitatively formalize the intuition that, when many jobs execute concurrently, it is unlikely that all jobs simultaneously consume their peak power usage. In this case, to verify the power capping, we apply
    \sum_{i \in \mathcal{J}[t]} \hat{P_i}^{avg} + (\sigma \times (\sqrt{\sum_{i \in \mathcal{J}[t]} (\hat{P_j}^{std})^2})) < \overline{P}
    .
    \sigma
    controls the probability to be below (
    \sigma = 1
    for 0.68,
    \sigma = 2
    for 0.95, and
    \sigma = 3
    for 0.995)

Usage

This repository is a Nix flake and can be used via this interface. Alternatively, you can compile the algorithm dynamic library via meson + ninja with a command such as meson setup build && ninja -C build (dependencies are defined in flake.nix and flake.lock).

The compiled library can be used as a Batsim EDC (external decision component) library with a command such as:

batsim -l <path/to/compiled/easypower.so> 0 <INPUT-PARAMETERS> \
       -p <path/to/platform.xml> \
       -w <path/to/workload.json> \
       -e <path/to/output/directory/>

or

batsim -l <path/to/compiled/knapsack_greedy.so> 0 <INPUT-PARAMETERS> \
       -p <path/to/platform.xml> \
       -w <path/to/workload.json> \
       -e <path/to/output/directory/>

where <INPUT-PARAMETERS> is a string that contains a JSON object that must at least contain:

{
  "idle_watts": 240.0,
  "powercap_dynamic_watts": 323946,
  "normal_dynamic_watts": 719880.0,
  "powercap_end_time_seconds": 10800,
  "job_power_estimation_field": "mean_power_estimation",
  "predictor_name": "mean",
  "order": "fcfs",
  "type_knapsack": "waiting_time",
  "sigma_times": 1
}
  • idle_watts is the amount of watts that a single node consumes when idle
  • powercap_dynamic_watts is the value of the powercap during the constrained time window, in watts
  • normal_dynamic_watts is the value of the powercap after the constrained time window, in watts
  • powercap_end_time_seconds is the duration of the constrained time window. the time window always starts at t=0
  • job_power_estimation_field is the name of the field to use as the estimation of a job's dynamic power consumption. Please note that this field must be set in the workload file used
  • order: Only for easypower. This parameter controls which order to use for sorting waiting queue. The possible values are "fcfs" or "saf"
  • type_knapsack: Only for knapsack_greedy. This parameter controls which profit type we implement. It can be "waiting_time" or "waiting_time_ratio"
  • sigma_times: It controls the sigma times for the Gaussian verification

Versions

This has been tested with the following important dependencies:

  • batsim commit ee797ccebbb95410479663ee0547e752112fc83e (batprotocol branch as of 2024-04-04)
    • SimGrid release 3.34 (commit 036c801d55e3ab07b470c79640109080fed049a1, compiled by NUR-Kapack's definition 4d8ca88fd8a0a2287ee5c023877f14d53d4854c1)
  • batprotocol commit 25bc5bbf039c18a8024c4ab326047ba56800376a
  • intervalset commit 13d8f2d7000d4dc4e3202422658b0b5d83f83679

Please refer to flake.nix/flake.lock for a full pinning of the environment to compile the library.