# Sufficiency behaviors under renewable production for data center users

This repository is the artifact associated with the article
J. Gatt, M. Madon, G. Da Costa,
*"Digital sufficiency behaviors to deal with intermittent energy sources in a data center"*,
published at the conference ICT4S'24.
It contains all the scripts and files needed to reproduce the experiments described there.
It is originally a fork from [demand-response-user](https://gitlab.irit.fr/sepia-pub/open-science/demand-response-user).
## Description of the main files

- folders `behavior_file/`, `data_energy/`, `platform/` and `workload/`: contain the simulation inputs
- `default.nix`: [Nix] file defining the software dependencies and their specific versions
- `scripts/prepare_workload.sh`: script to download and filter the input workload
- `scripts/run_expe.sh`: script to launch the full experimental campaign
  (with the help of `campaign.py`, which prepares and launches the experiments in parallel,
  each experiment being one instance of `instance.py`)
- `scripts/compute_stat.sh`: script to compute statistics on the experiment outputs
- `analyse_campaign.ipynb`: Jupyter notebook analyzing the results and plotting the graphs
## Steps to reproduce the experiments

### 1. Install dependencies (install Nix + ~4 minutes)

The main software used for our simulations are:
- [Batsim] v4.10 and [SimGrid](https://simgrid.org/) v3.31 for the infrastructure simulation
- [Batmen](https://gitlab.irit.fr/sepia-pub/mael/batmen) v2.0: our set of schedulers for [Batsim] and a plugin to simulate users
- [Batmen-tools](https://gitlab.irit.fr/sepia-pub/mael/batmen-tools/-/tree/main/): a set of tools to manipulate SWF files
- python3, pandas, jupyter, matplotlib, etc. for the data analysis

All the dependencies of the project and their specific versions are handled by the package manager [Nix].
We highly recommend using it if you want to reproduce the experiments on your machine.
If you do not have [Nix] installed, follow the instructions provided on their [installation page](https://nixos.org/download/#download-nix).

Once the installation is done, enter a nix-shell where all the dependencies are available.
This will download, compile and install them, which can take some time:

```bash
nix-shell -A exp_env --pure
```

### 2. Prepare input workload (~2 minutes)
Inside the nix-shell, run the following script to download (from the [Parallel Workloads Archive](https://www.cs.huji.ac.il/labs/parallel/workload/)) and filter the input workload used in the experiments:

```bash
scripts/prepare_workload.sh
```
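As an illustration of the kind of filtering such a preparation step performs (this is a sketch, not the actual logic of `prepare_workload.sh`): in the Standard Workload Format (SWF), lines starting with `;` are header comments and the second field of each job record is its submit time in seconds, so a trace can be cut down to its first `nb_days` days like this.

```python
# Sketch: keep only the jobs of an SWF trace submitted during the first
# `nb_days` days (illustrative; the real script may filter differently).

def filter_swf(lines, nb_days):
    """Keep SWF header comments (';' lines) and jobs submitted before the cutoff."""
    cutoff = nb_days * 86400  # SWF submit times are expressed in seconds
    kept = []
    for line in lines:
        if line.startswith(";"):        # header/comment line: always kept
            kept.append(line)
            continue
        fields = line.split()
        if not fields:
            continue
        submit_time = float(fields[1])  # field 2 of an SWF record: submit time
        if submit_time < cutoff:
            kept.append(line)
    return kept
```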
### 3. Launch campaign (~7 hours)

The experimental campaign consists of
- preparing the inputs for each instance (random seed, energy state windows, workload, behavior scenario),
- running the simulations with [Batsim] and Batmen, and
- computing aggregated metrics on all the simulation outputs.

The full experimental campaign described in the paper is quite long.
It took us 7 hours to complete in parallel on a 2x16 core Intel Xeon Gold 6130 machine.
For this reason, we provide two versions of the campaign:
- `scripts/run_expe.sh`: the campaign presented in the paper, with 164 days, 4 effort scenarios and 30 replicates
- `scripts/run_expe_test.sh`: a small campaign with 3 days, 2 effort scenarios and 30 replicates *(runtime < 1 minute)*

Both take care of calling `campaign.py` with the right parameters.
For example, for the full experimental campaign:

```bash
python3 campaign.py --nb-replicat 30 --expe-range -1 --window-mode 8 --nb-days 164 \
  --compress-mode --production-file data_energy/energy_trace_sizing_solar.csv data_energy/energy_trace_sizing_solar.csv
```
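The fan-out performed by the campaign can be pictured with a short sketch: one simulation instance per (seed, scenario) combination, executed by a pool of workers. This is an illustration with made-up parameters, not the actual code of `campaign.py`.

```python
# Sketch of a parallel campaign launcher: one instance per (seed, scenario)
# pair, at most `threads` running at once. `run_instance` is a placeholder
# for what instance.py does in the real repository.
from itertools import product
from multiprocessing import Pool

def run_instance(params):
    """Placeholder for one simulation run; here it just echoes its parameters."""
    seed, scenario = params
    return (seed, scenario, "done")

def run_campaign(nb_replicates, scenarios, threads):
    # Enumerate every (seed, behavior scenario) combination, then map them
    # onto a worker pool of the requested size.
    combos = list(product(range(nb_replicates), scenarios))
    with Pool(processes=threads) as pool:
        return pool.map(run_instance, combos)
```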
For more information on the available parameters, run `python3 campaign.py --help`.

As every experiment can take up to 20 GB of RAM,
you might be limited by the memory of your system.
When running the experiments, you can limit
the number of parallel runs using the `--threads n` command-line argument,
with `n` the maximum number of experiments to run in parallel.
By default, it uses every physical core available.
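A reasonable value for `n` follows from the numbers above: parallelism is bounded both by the physical cores and by how many ~20 GB experiments fit in RAM. A small illustrative helper (not part of the repository):

```python
# Pick a safe --threads value, assuming ~20 GB of RAM per experiment as
# reported above (illustrative helper, not provided by the repository).
def max_parallel_runs(physical_cores, ram_gb, gb_per_experiment=20):
    """Largest number of parallel experiments that fits in RAM, capped by cores."""
    return max(1, min(physical_cores, ram_gb // gb_per_experiment))

# On the paper's machine (2x16 cores, 188 GB RAM), memory is the bottleneck:
# 188 // 20 = 9 experiments at once, despite the 32 physical cores.
print(max_parallel_runs(32, 188))  # → 9
```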
### 4. Data analysis (~1 minute)

From the aggregated metrics produced by the previous steps, we can plot the graphs presented in the paper.
Everything is explained in the notebook `analyse_campaign.ipynb`.
You will have to change the variables defined in the first cell to match your setup.

Still from inside the nix-shell, you can run the notebook with:

```bash
jupyter notebook
```

Given the time required to reproduce the full experimental campaign,
we provide the campaign's aggregated metrics in the files `out/metrics_fullexpe.csv` and `out/metrics_relative_fullexpe.csv`.
Note that these metrics were obtained with a previous version of the campaign
and might differ from the ones you obtain.
In our reproduction, we noticed a relative difference of 0.5% in energy-related metrics and
2% in user-behavior-related metrics.
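Such relative differences can be checked by comparing your metrics file against the provided one with pandas. The column name below is made up for illustration; the real CSVs have their own schema.

```python
# Sketch: maximum relative difference between reproduced metrics and the
# provided reference (column names here are hypothetical, not taken from
# out/metrics_fullexpe.csv).
import pandas as pd

def max_relative_difference(reference, reproduced, columns):
    """Maximum |reproduced - reference| / |reference| over the given numeric columns."""
    rel = ((reproduced[columns] - reference[columns]).abs()
           / reference[columns].abs())
    return rel.max().max()

ref = pd.DataFrame({"energy_kwh": [100.0, 200.0]})
rep = pd.DataFrame({"energy_kwh": [100.5, 201.0]})
print(max_relative_difference(ref, rep, ["energy_kwh"]))  # → 0.005, i.e. 0.5%
```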
### Tips

Along with the experiments, various helper scripts are provided
to help you manage the experiment data:
- `scripts/compress_out.sh out/ out.tar.zst`
  allows you to compress the `out_dir` into a tar archive for archival purposes.
  In our case, it divided the space used by the experiment results by 7.
- `scripts/sync_expe_out.sh out/ path_to_backup/` allows you to
  back up the simulation data in a backup directory.
  It uses rsync, so running the command twice will only write the changes.
### Energy Data

The provided `example_trace_occitanie_2019.csv` comes from a modification of the Open Data Réseaux Electrique
energy production dataset for Occitanie in 2019 (the original file can be directly downloaded
[here](https://odre.opendatasoft.com/explore/dataset/eco2mix-regional-cons-def/information/?disjunctive.libelle_region&disjunctive.nature)).
The provided `energy_trace_sizing_solar.csv`
is the trace of the energy produced by the DataZero2 sizing algorithm.
### Advanced options

Inside the nix-shell `exp_env`, launch the command `python3 campaign.py --help` to get the details of every possible argument.

You will have to give at least the following argument:
- `--expe-range`: which experiments to run; pass `--expe-range -1` to run all available
  experiment types, otherwise provide the list of experiments explicitly
  (for experiments `[0,1]`, it will be `--expe-range 0 1`)
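The `-1` convention above can be sketched with argparse (the option name is taken from this README; the real `campaign.py` may parse it differently):

```python
# Sketch of the --expe-range semantics: [-1] means "every available
# experiment type", otherwise the listed indices are selected.
import argparse

def select_experiments(argv, all_experiments):
    parser = argparse.ArgumentParser()
    parser.add_argument("--expe-range", nargs="+", type=int, required=True)
    args = parser.parse_args(argv)
    if args.expe_range == [-1]:
        return list(all_experiments)               # -1: run everything
    return [all_experiments[i] for i in args.expe_range]
```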
### Information about experiments

The experiments took 7 hours on two Intel Xeon Gold 6130 CPUs and the output took 55 GB to store;
once compressed using `scripts/compress-out.sh`, it takes 7.8 GB.
As each experiment took approximately 20 GB of RAM and we had 188 GB available, we were not able to exploit the CPUs at full capacity,
so with more memory we could have better speed.
### List of metrics

The metrics computed for the experiments are the following (available in `result-big-expe`):
- `XP`, `dir`, `behavior`, `seed`:
  the experiment number (in our case always 0), the directory from which the data were computed,
  the behavior the users have (probability distributions can be found in `behavior_file/`),
check if the original submission time is the real submission time in the jobs.
Compute the waiting time and slowdown while taking the behaviors into account
(the submission time is the original one without "see you later"; for reconfig jobs, the execution time is the one before reconfiguring).
These were computed using the behavior stats and job data (1), or by crossing the data with the rigid case (2).
As the differences (sanity) are quite big, we are not sure of the accuracy of these computed data.
[Nix]: https://nixos.org/
[Batsim]: https://batsim.org/