This repository contains a [Batsim](https://batsim.readthedocs.io/en/latest/)-compatible implementation of the EASY+powercap scheduling algorithm.
This repository contains a [Batsim](https://batsim.readthedocs.io/en/latest/)-compatible implementation of the EASY+powercap and Greedy knapsack scheduling algorithm.
## Algorithm description
### EASY+powercap (easypower.cpp)
This algorithm is strongly based on EASY backfilling.
The main addition over EASY is that a powercap must be followed during a time window.
In other words, a constraint on the maximum power consumption value is set for the whole platform.
This algorithm uses additional data on the power consumption of jobs to take its decisions.
This algorithm uses additional data on the power consumption of jobs to take its decisions.
We implemented two ways to sort the waiting queue: FCFS or Smallest Area First (SAF). For SAF, the area is calculated by multiplying the number of resources by the
Please read [the algorithm implementation](./easypower.cpp) for all details.
### Greedy knapsack (knapsack_greedy.cpp)
We model the same problem as a 0/1 knapsack optimization problem. This optimization problem considers a knapsack with capacity $Kc$ and $m$ items denoted by index $j$. Each item $j$ has a profit $v_j$ and a weight $we_j$. It needs to determine the subset of items that fit into the knapsack and that maximizes the total profit. The classic formulation is to define a boolean $x_j$ that is set to 1 if the item is selected and 0 if it is not:
In our case, the items are the jobs in the queue. The capacity $Kc$ is the power capping and the weight $we_j$ is the predicted job power consumption. Regarding the profit, we modeled it as a QoS metric. Since we are trying to improve the turnaround, and the turnaround is highly impacted by the waiting time, we modeled two profit functions using the waiting time. The first one is the waiting time for every job ($v_j = wait_j$, where $wait_j$ is the waiting time). Thus, the jobs that are longer in the queue receive higher priority, which can be seen as similar to FCFS. The second one is:
$v_j = \frac{wait_j + \tilde{p}_j}{\tilde{p}_j}$
Where $\tilde{p}_j$ is the expected execution time. This formula should increase the priority of the jobs waiting longer, but according to their size. In other words, this functions makes long jobs wait longer than small jobs. However, at some point, these long jobs are prioritized because their priority increases with waiting time (aging mechanism). Therefore, no job can wait indefinitely, avoiding starvation. A way to solve this knapsack problem is by using dynamic programming. However, the size of our problem explodes the combinatorial possibilities (large number of jobs and large capacity). In addition, our Gaussian approach verifies if we are below the power capping by considering all jobs taken together. For these reasons, we propose a greedy approach to solve our knapsack problem in a reasonable time.
Please read [the algorithm implementation](./knapsack_greedy.cpp) for all details.
### Power verification
Both easypower and knapsack_greedy implement some ways to verify the power capping. Giving that $\hat{P_j}$ is the power prediction, $\mathcal{J}[t]$ is the set of jobs running at time $t$, and $\overline{P}$ is the power capping, they are:
-`Mean`: This method uses the predicted mean of each job. Therefore, $\sum_{i \in \mathcal{J}[t]} \hat{P_i}^{avg} \leq \overline{P}$
-`Maximum`: This method is similar to the previous one, but is the max instead of the mean. Therefore, $\sum_{i \in \mathcal{J}[t]} \hat{P_i}^{\max} \leq \overline{P}$
-`Gaussian`: The Gaussian method uses a probabilistic model for jobs' energy consumption. This method tries to quantitatively formalize the intuition that, when many jobs execute concurrently, it is unlikely that all jobs simultaneously consume their peak power usage. In this case, to verify the power capping, we apply $\sum_{i \in \mathcal{J}[t]} \hat{P_i}^{avg} + (\sigma \times (\sqrt{\sum_{i \in \mathcal{J}[t]} (\hat{P_j}^{std})^2})) < \overline{P}$. $\sigma$ controls the probability to be below ($\sigma = 1$ for 0.68, $\sigma = 2$ for 0.95, and $\sigma = 3$ for 0.995)
## Usage
This repository is a [Nix flake](https://nixos.wiki/wiki/Flakes) and can be used via this interface.
Alternatively, you can compile the algorithm dynamic library via `meson` + `ninja` with a command such as `meson setup build && ninja -C build` (dependencies are defined in [flake.nix](./flake.nix) and [flake.lock](./flake.lock)).
-`idle_watts` is the amount of watts that a **single node** consumes when idle
-`powercap_dynamic_watts` is the value of the powercap during the constrained time window, in watts
-`normal_dynamic_watts` is the value of the powercap after the constrained time window, in watts
-`powercap_end_time_seconds` is the duration of the constrained time window. the time window always starts at t=0
-`job_power_estimation_field` is the name of the field to use as the estimation of a job's dynamic power consumption.
Please note that this field must be set in the workload file used.
-`job_power_estimation_field` is the name of the field to use as the estimation of a job's dynamic power consumption. Please note that this field must be set in the workload file used
-`order`: Only for easypower. This parameter controls which order to use for sorting waiting queue. The possible values are `"fcfs"` or `"saf"`
-`type_knapsack`: Only for knapsack_greedy. This parameter controls which profit type we implement. It can be `"waiting_time"` or `"waiting_time_ratio"`
-`sigma_times`: It controls the sigma times for the Gaussian verification
## Versions
This has been tested with the following important dependencies: