Commit 461eae8c authored by Millian Poquet
artifacts: guide overview embryo

parent e2298aad
#import "@preview/big-todo:0.2.0": *
#import "@preview/showybox:2.0.1": showybox
#set page(
paper: "a4",
margin: 2cm,
footer: context [
#set align(center)
#counter(page).display() / #counter(page).final().at(0)
]
)
#set heading(numbering: "1.1.")
#let subsetbox(content, ..args) = showybox(..args,
frame: (
footer-color: green.lighten(80%),
thickness: 0.125mm,
radius: 0mm,
footer-inset: (x: 0.5em, y: 0.2em),
),
footer-style: (
color: luma(50),
align: right,
sep-thickness: 0.125mm,
),
content
)
#let fullbox(content, ..args) = showybox(..args,
frame: (
radius: 0mm,
thickness: 0.5mm,
footer-inset: (x: 0.5em, y: 0.25em),
),
footer-style: (
color: luma(50),
align: right,
sep-thickness: 0.5mm,
),
content
)
#set par(justify: true)
#show link: x => underline(offset: 0.5mm, stroke: .25mm, text(font: "DejaVu Sans Mono", weight: "semibold", size: 9.5pt, fill: blue, x))
#let todo = todo.with(inline: true)
#[
#line(length:100%, stroke: .5mm)
#set text(size: 20pt)
//#set align(center)
#show par: set block(spacing: 3mm)
#set par(leading: 2mm)
*Artifacts Overview*
#set par(leading: 2mm)
#show par: set block(spacing: 4mm)
#set text(size: 12pt)
#set align(left)
*Conference.* Euro-Par 2024\
*Article.* Light-weight prediction for improving energy consumption in HPC platforms\
*Links*.
#set list(marker: none, body-indent: 5mm)
- Preprint on HAL (long-term). #link("https://hal.science/hal-04566184")
- Artifact data on Zenodo (long-term). #todo([zenodo link])
- Artifact code on Software Heritage (long-term). #todo([software heritage link])
- Artifact code on our GitLab. #todo([git link])
#line(length:100%, stroke: .5mm)
]
//#outline(indent: 5mm)
= Introduction
This document shows how to reproduce the experimental sections (6.2 to 6.5) of article @lightpredenergy.
We hope that this document is enough to reproduce all the experiments from scratch.
However, as reproducing the exact analyses and experiments conducted by the authors requires downloading and storing a large amount of input trace data (#box([$tilde.eq$ 300 GB)]) and doing some heavy computations,
various intermediate and final results have been cached and made available on #todo[Zenodo] so that subparts of the experimental pipeline can be reproduced on their own. In particular, the final analyses of the article are done in standalone notebooks whose input data is available.
Unless otherwise specified, all commands shown in this document in boxes similar to the one below are written in #link("https://en.wikipedia.org/wiki/Bourne_shell")[`sh`] and are thus compatible with `bash` and `zsh`. Every command that takes a significant amount of time, storage or bandwidth has its overhead given in the second part of the box. Unless otherwise specified, the times given are those we obtained on a powerful machine (2x Intel Xeon Gold 6130).
#fullbox(footer: [Time: 00:00:01.])[
```sh
echo 'Example command'
sleep 1
```
]
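To compare your own run times with those given in the boxes, a command can be timed by hand. The following is a minimal sketch and not part of the artifact: the `fmt_duration` helper is a hypothetical name introduced here, and `date +%s` is assumed to be available (it is on GNU and BSD systems).

```sh
# Print a duration given in seconds as HH:MM:SS, the format used in the boxes.
fmt_duration() {
  printf '%02d:%02d:%02d\n' $(($1 / 3600)) $(($1 % 3600 / 60)) $(($1 % 60))
}

start=$(date +%s)
sleep 1   # replace this line with the command to measure
end=$(date +%s)
fmt_duration $((end - start))
```

For instance, `fmt_duration 18000` prints `05:00:00`.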
= Getting Started Guide
All the software environments required to reproduce the analyses and experiments of article @lightpredenergy are open source and have been packaged with #link("https://nixos.org/", [Nix]).
Nix can build the *full* software stack needed for this experiment as long as source code remains available. As we also put our source code on #link("https://www.softwareheritage.org/")[Software Heritage] we hope that this artifact will remain usable in the long term. For the convenience of the reviewers of this artifact, we have set up a binary cache with precompiled versions of the software used in the experiments.
No special hardware is required to reproduce our work. We think that our Nix environments will work on future Nix versions, but for the sake of traceability we stress that we have used Nix 2.18.0, either installed by #link("https://archive.softwareheritage.org/swh:1:rev:b5b47f1ea628ecaad5f2d95580ed393832b36dc8;origin=https://github.com/DavHau/nix-portable;visit=swh:1:snp:318694dfdf0449f0a95b20aab7e8370cff809a66")[nix-portable 0.10.0] or directly available on NixOS using channel `23.11`.
Our software environments likely work on all platforms supported by Nix (Linux on `i686`/`x86_64`/`aarch64` and macOS on `x86_64`/`aarch64` as of 2024-05-07) but we have only tested them on Linux on `x86_64`. More precisely, we have used the #link("https://www.grid5000.fr/w/Grenoble:Hardware#dahu")[Dahu Grid'5000 cluster] (Dell PowerEdge C6420, 2x Intel Xeon Gold 6130, 192 GiB of RAM) on the default operating system available on Grid'5000 as of 2024-05-07 (Debian `5.10.209-2` using Linux kernel `5.10.0-28-amd64`).
== Install Nix
If you are already using NixOS, Nix should already be usable on your system.
Otherwise up-to-date information on how to install Nix is available on #link("https://nixos.org/download/").
As of 2024-05-07 the recommended way to install Nix (on a Linux system running systemd, with SELinux disabled and `sudo` usable) is to run the following command.
#fullbox[
```sh
sh <(curl -L https://nixos.org/nix/install) --daemon
```
]
Please note that you may need to launch a new shell, to source a file or to modify your shell configuration script as indicated by the Nix installer.
*Test your installation.* Launching `nix-shell --version` should print the installed Nix version.
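If the installer fails, it may help to check the prerequisites named above (systemd running, `sudo` usable, SELinux disabled). The following is only a sketch under assumptions: it relies on the `/run/systemd/system` directory being present when systemd is the running init system, and on `getenforce` being installed whenever SELinux tooling is present. The variable names are illustrative.

```sh
# Report whether each prerequisite of the multi-user installer seems to be met.
if [ -d /run/systemd/system ]; then has_systemd=yes; else has_systemd=no; fi
if command -v sudo >/dev/null 2>&1; then has_sudo=yes; else has_sudo=no; fi
# getenforce prints Enforcing, Permissive or Disabled when SELinux tools exist.
if command -v getenforce >/dev/null 2>&1 && [ "$(getenforce)" != "Disabled" ]; then
  selinux_active=yes
else
  selinux_active=no
fi
echo "systemd: $has_systemd, sudo: $has_sudo, SELinux active: $selinux_active"
```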
== Enable Nix flakes
Our Nix packages rely on #link("https://nixos.wiki/wiki/Flakes")[Nix flakes], which are not enabled by default as of 2024-05-07.
Up-to-date information on how to enable them can be found on the #link("https://nixos.wiki/wiki/Flakes")[Nix flakes documentation].
If you are using NixOS, as of 2024-05-07 flakes can be enabled by setting the following in your system configuration file.
#h(1fr) #box(stroke: (thickness: .1mm, dash: "densely-dashed"), fill: luma(90%), outset:1mm)[
#set align(bottom)
```nix
nix.settings.experimental-features = [ "nix-command" "flakes" ];
```
]
Otherwise, as of 2024-05-07, the following commands should enable flakes on Linux distributions other than NixOS.
#fullbox[
```sh
mkdir -p ~/.config/nix/
echo 'experimental-features = nix-command flakes' > ~/.config/nix/nix.conf
```
]
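Note that the redirection `>` used above replaces any existing `nix.conf`. If you already have a Nix configuration that you want to keep, the following sketch appends the setting only when no line already enables flakes. The `enable_flakes` function is a hypothetical helper introduced here, not part of the artifact.

```sh
# Append the flakes setting to the given nix.conf unless a line already enables it.
enable_flakes() {
  conf_file="$1"
  mkdir -p "$(dirname "$conf_file")"
  if ! grep -qs '^experimental-features.*flakes' "$conf_file"; then
    echo 'experimental-features = nix-command flakes' >> "$conf_file"
  fi
}
```

It would be called as `enable_flakes ~/.config/nix/nix.conf`. Calling it several times leaves a single line in the file.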
*Testing your flakes configuration.* Launching `nix build 'github:nixos/nixpkgs?ref=23.11#hello'` should create a `result` symbolic link in your current directory. Then, launching `./result/bin/hello` should print `Hello, world!`.
== Configure Nix to use our binary cache
This step is completely optional but recommended for reviewers of this artifact, as it lets you download precompiled versions of our software environments instead of building them on your own machine.
#todo[cachix]
== Version traceability
#todo[dump all envs versions]
= Step-by-Step Instructions
#todo[clone repo]
#todo[introduce data caches on zenodo]
#todo[introduce how commands should be run (root of cloned repo)]
#todo[propose paths in the graph depending on what the reproducer wants to do]
== Trace analysis??
== Job power prediction <sec-job-power-pred>
*Inputs.* ???\
*Outputs.*
- Job power predictions for all prediction methods. #todo[cache mean/max power prediction tarballs]
== Modeling of the power behavior of Marconi100 nodes
*Inputs.* None.\
*Outputs.*
- Marconi100 power and job traces on your disk.
- Marconi100 nodes power model.
- Notebook that analyses the power profiles of M100 nodes.
=== Get power and job Marconi100 traces on your disk <sec-m100-power-job-traces>
This section downloads parts of the Marconi100 trace from Zenodo, checks that the downloaded parts have the right content (via an MD5 checksum), extracts the data needed by later stages of the pipeline (power usage traces, jobs information traces), then removes the unneeded extracted files and the downloaded files.
#fullbox(footer:[Download: 254 GB. Final disk used: 2.5 GB. Time: 00:40:00.])[
```sh
nix develop .#download-m100-months --command \
m100-data-downloader ./m100-data 22-01 22-02 22-03 22-04 22-05 22-06 \
22-07 22-08 22-09
```
]
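The verify, extract and clean up steps described above follow a common pattern, sketched below on a small stand-in archive. This is an illustration only and not the implementation of `m100-data-downloader`: the file names are hypothetical, the archive is created locally instead of being downloaded, and `md5sum` from GNU coreutils is assumed.

```sh
# Illustrative only: a stand-in archive replaces a real Zenodo download.
workdir="$(mktemp -d)"
mkdir "$workdir/src" "$workdir/out"
echo 'timestamp,power' > "$workdir/src/power.csv"
echo 'unneeded' > "$workdir/src/other.csv"
tar czf "$workdir/part.tar.gz" -C "$workdir/src" power.csv other.csv
# In practice the expected checksum is published alongside the data.
(cd "$workdir" && md5sum part.tar.gz > part.md5)

# 1. Verify the archive against the expected checksum.
# 2. Extract only the file needed by later stages.
# 3. Remove the archive once its content has been extracted.
(cd "$workdir" && md5sum -c part.md5) &&
  tar xzf "$workdir/part.tar.gz" -C "$workdir/out" power.csv &&
  rm "$workdir/part.tar.gz"
```

Chaining the steps with `&&` ensures that nothing is extracted or removed when the checksum does not match.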
=== Analyze Marconi100 power traces <sec-analyze-m100-power-traces>
*Outputs.*
- powermodel file
== Job scheduling with power prediction <sec-sched>
This section shows how to reproduce Sections 6.4 and 6.5 of article @lightpredenergy.
=== Prepare all the files required to run the simulation
==== Generate a SimGrid platform
The following command generates the SimGrid platform used for the simulations.
This requires a power model of the Marconi100 nodes (as produced by @sec-analyze-m100-power-traces).
#fullbox[
```sh
nix develop .#py-scripts --command \
m100-generate-sg-platform ./m100-data/22-powermodel_total.csv 100 \
-o ./expe-sched/m100-platform.xml
```
]
==== Generate simulation instances
The following commands generate workload parameters (_i.e._, when each workload should start and end). The start points are drawn at random within the 2022 M100 trace.
#fullbox[
```sh
nix develop .#gen-simu-instances --command \
m100-generate-expe-workload-params -o ./expe-sched/workload-params.json
nix develop .#gen-simu-instances --command \
m100-generate-expe-params ./expe-sched/workload-params.json \
-o ./expe-sched/simu-instances.json
```
]
==== Merge job power predictions and jobs information into a single file
The job power predictions (as produced by @sec-job-power-pred) are two archives that we assume are on your disk in the `./user-power-predictions` directory.
These archives contain gzipped files for each user.
To make the generation of simulation inputs more convenient, all the job power prediction files are merged into a single file with the following commands.
#fullbox(footer: [Temporary disk used: 519 MB. Final disk used: 25 MB. Time: 00:00:30.])[
```sh
mkdir ./user-power-predictions/tmp
nix develop .#merge-m100-power-predictions --command \
tar xf ./user-power-predictions/*mean.tar.gz --directory ./user-power-predictions/tmp
nix develop .#merge-m100-power-predictions --command \
tar xf ./user-power-predictions/*max.tar.gz --directory ./user-power-predictions/tmp
nix develop .#merge-m100-power-predictions --command \
gunzip ./user-power-predictions/tmp/*/*.gz
nix develop .#merge-m100-power-predictions --command \
m100-agg-power-predictions ./user-power-predictions/tmp \
./m100-data/22-job-power-estimations.csv
rm -rf ./user-power-predictions/tmp
```
]
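To give an idea of what such a merge involves, here is a minimal sketch on stand-in data. It is not the implementation of `m100-agg-power-predictions`: the directory layout, the file names and the `job_id,power` columns are hypothetical, and the real script may process the data differently.

```sh
# Hypothetical stand-in data: two gzipped per-user CSV files with the same header.
tmp="$(mktemp -d)"
mkdir -p "$tmp/user1" "$tmp/user2"
printf 'job_id,power\n1,100\n' | gzip > "$tmp/user1/pred.csv.gz"
printf 'job_id,power\n2,150\n' | gzip > "$tmp/user2/pred.csv.gz"

# Decompress every per-user file in place.
gunzip "$tmp"/*/*.gz
# Concatenate the files, keeping the header line of the first file only.
awk 'FNR == 1 && NR != 1 { next } { print }' "$tmp"/*/*.csv > "$tmp/merged.csv"
cat "$tmp/merged.csv"
```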
Similarly, Marconi100 job traces are also merged into a single file.
#fullbox(footer: [Final disk used: 343 MB. Time: 00:02:00.])[
```sh
nix develop .#py-scripts --command \
m100-agg-jobs-info ./m100-data/ ./m100-data/22-jobs.csv \
22-01 22-02 22-03 22-04 22-05 22-06 22-07 22-08 22-09
nix develop .#py-scripts --command \
m100-join-usable-jobs-info ./m100-data/22-job-power-estimations.csv \
./m100-data/22-jobs.csv \
./m100-data/22-jobs-with-prediction.csv
```
]
==== Generate workloads
The following command generates all the workloads needed by the simulation.
*This step is very long!*
#todo[zenodo]
#fullbox(footer: [Time: 05:00:00.])[
```sh
nix develop .#py-scripts --command \
m100-generate-expe-workloads ./expe-sched/workload-params.json \
./m100-data/22-jobs-with-prediction.csv \
./m100-data \
-o /tmp/wlds
```
]
=== Run the simulation campaign
The following command runs the whole simulation campaign. This requires all the files prepared in the previous section.
#fullbox[
```sh
nix develop .#simulation --command \
m100-run-batsim-instances ./expe-sched/simu-instances.json \
-w /tmp/wlds \
-o /tmp/simout \
--output_state_file ./expe-sched/exec-state.json \
--output_result_file ./expe-sched/agg-result.csv
```
]
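Since the campaign depends on files generated by earlier steps, it can be worth checking that they exist before launching it. The following sketch defines a hypothetical `check_inputs` helper, which is not part of the artifact, that lists the missing paths and returns a non-zero status if any is absent.

```sh
# Return non-zero and list the missing paths if any argument does not exist.
check_inputs() {
  missing=0
  for path in "$@"; do
    if [ ! -e "$path" ]; then
      echo "missing: $path" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

For example, `check_inputs ./expe-sched/simu-instances.json /tmp/wlds` can be run from the root of the repository before the command above.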
=== Analyze the simulation campaign outputs
#fullbox[
```sh
nix develop .#r-notebook --command \
Rscript notebooks/run-rmarkdown-notebook.R \
notebooks/simulation-output-analysis.Rmd
```
]
#bibliography("artifact-bib.yml")
#todo_outline