From cfd14105cad0d976605acb2e06103b3b35c42a8b Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Ma=C3=ABl=20Madon?= <mael.madon@irit.fr>
Date: Wed, 13 Sep 2023 10:37:17 +0200
Subject: [PATCH] update readme (steps to reproduce) and notebooks for figure directory

---
 KTH.ipynb  |  2 +-
 README.md  | 51 +++++++++++++++++++++++++++++++++++++++++++++++----
 SDSC.ipynb |  2 +-
 3 files changed, 49 insertions(+), 6 deletions(-)

diff --git a/KTH.ipynb b/KTH.ipynb
index 1ded19b..193613c 100644
--- a/KTH.ipynb
+++ b/KTH.ipynb
@@ -56,7 +56,7 @@
     "import evalys.visu.legacy as vleg\n",
     "\n",
     "empty_wl = \"workload/empty_workload.json\"\n",
-    "! mkdir -p {EXPE_DIR} {WL_folder}\n",
+    "! mkdir -p {EXPE_DIR} {WL_folder} {fig_path}\n",
     "\n",
     "header=[\n",
     " \"JOB_ID\",\"SUBMIT_TIME\",\"WAIT_TIME\",\"RUN_TIME\",\"ALLOCATED_PROCESSOR_COUNT\",\"AVERAGE_CPU_TIME_USED\",\"USED_MEMORY\",\n",
diff --git a/README.md b/README.md
index 898958b..3cdc0ce 100644
--- a/README.md
+++ b/README.md
@@ -1,13 +1,56 @@
 # Compare traditional replay and replay with feedback
 
-This repository contains all the necessary material to reproduce the experiments and the graphs presented in the article XXXXXX.
+This repository contains all the necessary material to reproduce the experiments and the graphs presented in the article XXXXXX, submitted to [FGCS](https://www.sciencedirect.com/journal/future-generation-computer-systems):
 
-There are two notebooks: `expe_replay_feedback_KTH.ipynb` and `expe_replay_feedback_SDSC.ipynb` which contain essentially the same treatments, only the input workload changes.
+- `default.nix`: [Nix](https://nixos.org) file describing the software dependencies and their specific versions
+- `workload/`: folder storing the workload files
+- `platform/`: folder containing the platform files
+- `KTH.ipynb` and `SDSC.ipynb`: Jupyter notebook running and analysing the experiments
+
+The two notebooks are self-contained. They download the input data from the [Parallel Workload Archive](https://www.cs.huji.ac.il/labs/parallel/workload/), run the experiments presented in the article, do the data analysis and plot the graphs.
+Both files contain essentially the same treatments, **only the input workload changes**.
 
-Running entirely the two notebooks takes less than 30 minutes (on one core of an i5-1135 processor).
 
 *Note: it shouldn't be very hard to run the experiments with another workload from [Parallel Workload Archive](https://www.cs.huji.ac.il/labs/parallel/workload/). It requires adapting the variables set in the first cell of the notebook and creating a platform corresponding to the chosen workload. However, we use FCFS and EASY backfilling as scheduling algorithms, which might not correspond to the original scheduler used in the real infrastructure.*
 
 ## Steps to reproduce
 
-TODO
\ No newline at end of file
+Time needed to reproduce the experiments (on a i5-1135 processor):
+- downloading and compiling dependencies with Nix: ~10min
+- `KTH.ipynb`: 17min11
+- `SDSC.ipynb`: 41min53
+
+Disk space: 6.2GB for experiment files and 3.4GB for dependencies (in `/nix/store`)
+
+### With [Nix](https://nixos.org/download) package manager
+Clone the repository and go to the specific version (you can also download the version tagged `fgcs-submission` from gitlab web interface):
+
+```
+git clone git@gitlab.irit.fr:sepia-pub/open-science/expe-replay-feedback.git
+cd expe-replay-feedback
+git fetch --all --tags && git checkout tags/fgcs-submission
+```
+
+Enter a Nix-shell, where all the dependencies you might not have on your machine are carefully managed:
+
+```
+nix-shell -A exp_env
+```
+
+Start a jupyter notebook (or any IDE providing .ipynb support):
+
+```
+jupyter notebook
+```
+
+Open `KTH.ipynb` or `SDSC.ipynb` and run all the cells, in order.
+
+### Without Nix (not recommended)
+You will need the following dependencies:
+- [batsim](https://batsim.org/) v4.2
+- [simgrid](https://simgrid.org/) v3.34
+- [batexpe](https://framagit.org/batsim/batexpe.git) v1.2.0
+- [batmen](https://gitlab.irit.fr/sepia-pub/mael/batmen.git) tag `replay_feedback2023`
+- [swf2userSessions](https://gitlab.irit.fr/sepia-pub/mael/swf2userSessions.git) tag `replay_feedback2023`
+- [batmenTools](https://gitlab.irit.fr/sepia-pub/mael/batmen-tools.git) commit `caefe1c12a059c919d2710ee8a00b9c179faf907`
+- some python packages like [evalys](https://evalys.readthedocs.io/), pandas, jupyter...
\ No newline at end of file
diff --git a/SDSC.ipynb b/SDSC.ipynb
index b0e2e23..75692cc 100644
--- a/SDSC.ipynb
+++ b/SDSC.ipynb
@@ -56,7 +56,7 @@
     "import evalys.visu.legacy as vleg\n",
     "\n",
     "empty_wl = \"workload/empty_workload.json\"\n",
-    "! mkdir -p {EXPE_DIR} {WL_folder}\n",
+    "! mkdir -p {EXPE_DIR} {WL_folder} {fig_path}\n",
     "\n",
     "header=[\n",
     " \"JOB_ID\",\"SUBMIT_TIME\",\"WAIT_TIME\",\"RUN_TIME\",\"ALLOCATED_PROCESSOR_COUNT\",\"AVERAGE_CPU_TIME_USED\",\"USED_MEMORY\",\n",
-- 
GitLab