easy-powercap
This repository contains a Batsim-compatible implementation of the EASY+powercap and Greedy knapsack scheduling algorithm.
Algorithm description
EASY+powercap (easypower.cpp)
This algorithm is strongly based on EASY backfilling. The main addition over EASY is that a powercap must be followed during a time window. In other words, a constraint on the maximum power consumption value is set for the whole platform. This algorithm uses additional data on the power consumption of jobs to take its decisions. We implemented two ways to sort the waiting queue: FCFS or Smallest Area First (SAF). For SAF, the area is calculated by multiplying the number of requested resources by the requested time.
Please read the algorithm implementation for all details.
Greedy knapsack (knapsack_greedy.cpp)
We model the same problem as a 0/1 knapsack optimization problem. This optimization problem considers a knapsack with capacity
In our case, the items are the jobs in the queue. The capacity
Where
Please read the algorithm implementation for all details.
Power verification
Both easypower and knapsack_greedy implement some ways to verify the power capping. Giving that
-
Mean
: This method uses the predicted mean of each job. Therefore,\sum_{i \in \mathcal{J}[t]} \hat{P_i}^{avg} \leq \overline{P} -
Maximum
: This method is similar to the previous one, but is the max instead of the mean. Therefore,\sum_{i \in \mathcal{J}[t]} \hat{P_i}^{\max} \leq \overline{P} -
Gaussian
: The Gaussian method uses a probabilistic model for jobs' energy consumption. This method tries to quantitatively formalize the intuition that, when many jobs execute concurrently, it is unlikely that all jobs simultaneously consume their peak power usage. In this case, to verify the power capping, we apply\sum_{i \in \mathcal{J}[t]} \hat{P_i}^{avg} + (\sigma \times (\sqrt{\sum_{i \in \mathcal{J}[t]} (\hat{P_j}^{std})^2})) < \overline{P}.\sigmacontrols the probability to be below (\sigma = 1for 0.68,\sigma = 2for 0.95, and\sigma = 3for 0.995)
Usage
This repository is a Nix flake and can be used via this interface.
Alternatively, you can compile the algorithm dynamic library via meson
+ ninja
with a command such as meson setup build && ninja -C build
(dependencies are defined in flake.nix and flake.lock).
The compiled library can be used as a Batsim EDC (external decision component) library with a command such as:
batsim -l <path/to/compiled/easypower.so> 0 <INPUT-PARAMETERS> \
-p <path/to/platform.xml> \
-w <path/to/workload.json> \
-e <path/to/output/directory/>
or
batsim -l <path/to/compiled/knapsack_greedy.so> 0 <INPUT-PARAMETERS> \
-p <path/to/platform.xml> \
-w <path/to/workload.json> \
-e <path/to/output/directory/>
where <INPUT-PARAMETERS>
is a string that contains a JSON object that must at least contain:
{
"idle_watts": 240.0,
"powercap_dynamic_watts": 323946,
"normal_dynamic_watts": 719880.0,
"powercap_end_time_seconds": 10800,
"job_power_estimation_field": "mean_power_estimation",
"predictor_name": "mean",
"order": "fcfs",
"type_knapsack": "waiting_time",
"sigma_times": 1
}
-
idle_watts
is the amount of watts that a single node consumes when idle -
powercap_dynamic_watts
is the value of the powercap during the constrained time window, in watts -
normal_dynamic_watts
is the value of the powercap after the constrained time window, in watts -
powercap_end_time_seconds
is the duration of the constrained time window. the time window always starts at t=0 -
job_power_estimation_field
is the name of the field to use as the estimation of a job's dynamic power consumption. Please note that this field must be set in the workload file used -
order
: Only for easypower. This parameter controls which order to use for sorting waiting queue. The possible values are"fcfs"
or"saf"
-
type_knapsack
: Only for knapsack_greedy. This parameter controls which profit type we implement. It can be"waiting_time"
or"waiting_time_ratio"
-
sigma_times
: It controls the sigma times for the Gaussian verification
Versions
This has been tested with the following important dependencies:
- batsim commit ee797ccebbb95410479663ee0547e752112fc83e (batprotocol branch as of 2024-04-04)
- SimGrid release 3.34 (commit 036c801d55e3ab07b470c79640109080fed049a1, compiled by NUR-Kapack's definition 4d8ca88fd8a0a2287ee5c023877f14d53d4854c1)
- batprotocol commit 25bc5bbf039c18a8024c4ab326047ba56800376a
- intervalset commit 13d8f2d7000d4dc4e3202422658b0b5d83f83679
Please refer to flake.nix/flake.lock for a full pinning of the environment to compile the library.