This notebook analyzes the power consumption of the Marconi 100 nodes during 2022 (from 2022-01 to 2022-09), as available in the [M100 ExaData trace](https://gitlab.com/ecs-lab/exadata).
This notebook is part of the work conducted for the article "Light-weight prediction for improving energy consumption in HPC platforms" published at Euro-Par 2024.
For the full context of this work, please refer to the article preprint, which is available on [[hal long-term open-access link]](https://hal.science/hal-04566184).
The goal of this notebook is to model how the nodes behave in terms of power consumption.
# Read the aggregated data
```{r, echo = TRUE}
suppressMessages(library(tidyverse))
suppressMessages(library(viridis))
library(knitr)
data = read_csv(params$m100_node_power_aggregation, show_col_types = FALSE) %>% transmute(
```
We can see that only a single node seems to have small power values. Let us take a look at the data directly.
```{r}
knitr::kable(all_nodes_agg)
```
We can see that the power values are discrete with a 20 W precision.
This is consistent with the ExaData documentation, which states that the values were obtained via IPMI from a BMC.
This measurement system is mostly a failure control system not intended for high precision.
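
The 20 W granularity can be checked directly on aggregated counts. The following is a minimal sketch on a hypothetical toy table: the `toy` tibble and its values are made up for illustration, and only the column names `node`, `power` and `nb_occ` follow the ones used in this notebook.

```r
library(dplyr)

# Hypothetical toy aggregate: one row per (node, power value) with its occurrence count.
toy = tibble(
  node   = c(1, 1, 1, 2, 2),
  power  = c(0, 240, 260, 480, 500),
  nb_occ = c(3, 10, 5, 7, 2)
)

# Distinct observed power values, sorted in increasing order.
power_values = toy %>% distinct(power) %>% arrange(power) %>% pull(power)

# Every value should be a multiple of the 20 W step...
all_multiples_of_20 = all(power_values %% 20 == 0)
# ...and the smallest gap between two distinct values gives the observed step.
observed_step = min(diff(power_values))
```

On the real data, the same two checks would be applied to `data` instead of `toy`.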
We can also see that all the measurements below 240 W are at 0 W. Let us see which nodes these measurements come from.
```{r}
nodes_with_0w_measures = data %>%
  filter(power == 0) %>%
  group_by(node) %>%
  summarize(total_nb_occ = sum(nb_occ))
knitr::kable(nodes_with_0w_measures)
```
All 0 W measurements come from node 155!
As the 0 W values are unexpected and as they all come from the same node, we have decided to **filter out 0 W values for the rest of this analysis and power modeling**.
```{r}
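# Drop the unexpected 0 W measurements.
data = data %>% filter(power > 0)
# Convert node to a factor before grouping, so that the grouping column
# is not modified inside a grouped mutate().
cumulated_data = data %>%
  mutate(node = as.factor(node)) %>%
  group_by(node) %>%
  arrange(power) %>%
  # Per-node cumulative number of measurements, in increasing power order.
  mutate(cum_nb_occ = cumsum(nb_occ))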
```
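
The cumulative counts can be turned into a per-node empirical CDF by dividing them by each node's total number of measurements. Here is a minimal sketch on hypothetical toy data; the `toy` values and the `ecdf_value` column name are assumptions made for illustration, not necessarily what the plots of this notebook use.

```r
library(dplyr)

# Hypothetical toy aggregate: one row per (node, power value) with its occurrence count.
toy = tibble(
  node   = c(1, 1, 1, 2, 2),
  power  = c(260, 240, 300, 500, 480),
  nb_occ = c(5, 10, 5, 2, 6)
)

toy_ecdf = toy %>%
  group_by(node) %>%
  # Sort by power within each node so that cumsum() follows increasing power.
  arrange(power, .by_group = TRUE) %>%
  mutate(
    cum_nb_occ = cumsum(nb_occ),
    # Fraction of the node's measurements that are <= the current power value.
    ecdf_value = cum_nb_occ / sum(nb_occ)
  ) %>%
  ungroup()
```

By construction, the last row of each node has an `ecdf_value` of 1.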
# Node power modeling
On the previous per-node eCDF power plot, we could see that most nodes have a wide range of power values, but that some of them were idle most of the time. This is shown more clearly in the following figure (nodes whose median power value is lower than 450 W are classified as lazy).
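
As the data is aggregated as occurrence counts, the median of a node has to be computed from the cumulative counts. The classification can be sketched as follows on hypothetical toy data; the `toy` values, the `median_power` and `lazy` names, and the choice of the lower weighted median are assumptions made for illustration.

```r
library(dplyr)

lazy_threshold = 450  # W, threshold stated in the text

# Hypothetical toy aggregate: node 1 is mostly idle, node 2 is not.
toy = tibble(
  node   = c(1, 1, 1, 2, 2, 2),
  power  = c(240, 260, 900, 300, 800, 1000),
  nb_occ = c(80, 10, 10, 10, 50, 40)
)

node_median = toy %>%
  group_by(node) %>%
  arrange(power, .by_group = TRUE) %>%
  summarize(
    # Smallest power value whose cumulative count reaches half of the node's measurements.
    median_power = power[which(cumsum(nb_occ) >= sum(nb_occ) / 2)[1]]
  ) %>%
  # A node is lazy when its median power is below the threshold.
  mutate(lazy = median_power < lazy_threshold)
```

In this toy example, node 1 spends 80 % of its measurements at 240 W and is therefore classified as lazy, while node 2 is not.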
SimGrid's power model for computation hosts requires three power values: the minimum power of a powered-on node (this is typically a CPU sleep state), the power when a tiny amount of work is done, and the power when the node is at full capacity.
As we are not doing a controlled experiment but using existing traces with little information about the applications that ran, we propose to use the minimum and maximum values of each node to instantiate this model.
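
As a sketch of how per-node minimum and maximum values could instantiate these three values, the following builds the `Idle:Epsilon:AllCores` string that SimGrid's host energy plugin reads from a host's `wattage_per_state` property. Reusing the minimum power for the epsilon value is an assumption made here for illustration, as are the toy values and the column name holding the string; this is not necessarily how the platform used in the article was generated.

```r
library(dplyr)

# Hypothetical per-node minimum and maximum power values (W).
toy_minmax = tibble(
  node      = c(1, 2),
  min_power = c(240, 260),
  max_power = c(1800, 2000)
)

toy_model = toy_minmax %>%
  mutate(
    # Idle:Epsilon:AllCores, with the minimum power reused as the epsilon value (assumption).
    wattage_per_state = sprintf("%.1f:%.1f:%.1f", min_power, min_power, max_power)
  )
```

With a single pstate per host, each string would go in the `value` attribute of the host's `wattage_per_state` property in the platform description.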
Here is an unbiased visualization of the minimum and maximum power values of all nodes (the minimum and maximum values are in the plot, and both axes use the same linear scale). Each node is a point.
```{r, fig.width=6, fig.height=6}
minmax_per_node = data %>%
  group_by(node) %>%
  summarize(
    min_power = min(power),
    max_power = max(power)
  ) %>% mutate(
    node = as.factor(node)
  )
p = minmax_per_node %>%
  ggplot() +
  geom_jitter(aes(x=min_power, y=max_power)) +
  theme_bw() +
  labs(
    x = "Node minimum power value (W)",
    y = "Node maximum power value (W)"
  )
p +
  expand_limits(x=0, y=0) +
  expand_limits(x=2100, y=2100)
```
Here is a zoomed view.
```{r}
p
```
Vertical bands are expected since the x axis range is small and the power values are discrete with a 20 W precision.
We can check whether the laziness of nodes impacts their minimum and maximum power values.
As the number of lazy nodes is small (2 % of total nodes), and as remaining close to the minimum power value half of the time does not seem to be the normal behavior of HPC nodes, we have decided to **filter out lazy nodes** for the rest of this analysis and modeling.
We would have liked to be able to run controlled applications on M100 nodes to have sane values of the power consumption of each node.
Here, using only the ExaData M100 traces, we think that the minimum and maximum power values of non-lazy nodes can be used to generate the power model of nodes similar to M100 nodes. However, these values cannot be used safely since we cannot be sure that the maximum value came from the same application execution. In addition, we realized that the "Usage trace replay" Batsim profile type introduced in Batsim-4.2.0, which we planned to use, had poor performance on large platforms such as Marconi100. These two limitations led us to simply replay the power traces of each job *a posteriori*, instead of using SimGrid to compute power values during the simulation.
The following code produces the power model file needed by the script that generates the SimGrid platform.