update config and readme

Commit d54ce8a4 authored 4 months ago by huongdm1896
--- a/README.md
+++ b/README.md
 # Measure Energy of Flower FL in G5K
+This project provides tools to measure the energy consumption of Flower-based federated learning (FL) experiments on the Grid'5000 (G5K) testbed. It includes scripts to manage distributed nodes, run FL experiments, and monitor energy usage.
+## Table of Contents
+- [Getting Started](#getting-started)
+- [Installation](#installation)
+- [Usage](#usage)
+  - [Step 1. Reserve the Hosts in G5K](#step-1-reserve-the-hosts-in-g5k)
+  - [Step 2. Configure](#step-2-configure)
+  - [Step 3. Collect IP](#step-3-collect-ip)
+  - [Step 4. Run the Campaign or Single Instance](#step-4-run-the-campaign-or-single-instance)
+  - [Step 5. Output](#step-5-output)
+  - [Step 6. Clean Up](#step-6-clean-up)
+- [Quickstart](#quickstart)
+- [Output Structure](#output-structure)
+- [License](#license)
 ## Getting Started
-These instructions will let you know how to run it. 
+The repository includes a Flower example (using TensorFlow) in the `Flower_v1` directory and the source of the measurement framework in `Run`. The example demonstrates how to use the framework to measure energy consumption.
-An example of Flower (using tensorflow) is stored in Flower_v1. The test example will be use this source.  
+## Installation
-### Prerequisites
+This framework requires:
+- **Python 3.9.2** or higher.
+- Additional dependencies listed in `requirements.txt`. Install them with:
+  ```bash
+  pip install -r requirements.txt
+  ```
+*Note:* `requirements.txt` includes TensorFlow for running the provided Flower example.
-This framework requires **Python 3.9.2** or higher. Check your Python package version:
-```bash
-python3 --version
-```
-Other dependencies are stored in *requirements.txt*. Run:
-```bash
-pip install -r requirements.txt
-```
-This *requirements.txt* includes tensorflow to run the provided example Flwr.
-### Installing
-Download code:
+Clone the repository and navigate to the `Run` directory:
 ```bash
 git clone https://gitlab.irit.fr/sepia-pub/delight/eflwr.git
-```
-Go to Run directory:
-```bash
 cd eflwr/Run
 ```
-## Running the tests
+## Usage
-**Remark:** EFLWR is configured for Flower only. Your Flower framework must contain server and client script to run on distributed nodes. The Flower_v1 directory provides an example. Once configured, EFLWR will automatically execute flower on the specified nodes and measure the energy consumption.
-### Step 1. Configure
-Configure the information of each exp in `config_instance*.json`, you can add more `config_instance*.json` with * is a number. Use your editor instead of *vim* (*nano* etc). See the examples in 2 config files which provied in Run directory.
-```bash
-vim config_instance1.json
-```
-```
-"instance": the name of your exp, anything you want, I suggest to input the specific name of your testing.
-"output_dir": where you want to store your log (your exp log and energy mornitoring log)
+- FL scripts can be updated in `Flower_v1`.
+- Configure each experiment instance in `Run/config_instance*.json`.
+- Follow these steps to configure and run your experiment (or jump to [Quickstart](#quickstart) to run an example).
-"server": 
+### Step 1. Reserve the Hosts in G5K
-    "command": which cmd sever use, my case is python3.
-    "args": includes file to run and arguments
-    "ip": address of server, this one will be automatic input when you run the code, just leave it blank
-    "port": default 8080m dont change it.
-"clients":
+Reserve the required number of hosts, one for the server plus one per client (*see the [G5K documentation](https://www.grid5000.fr/w/Getting_Started#Reserving_resources_with_OAR:_the_basics) for more details*).
-    "name": numbering/name your client, should be client1.2.3...
+```bash
-    "command": which cmd client use, my case is python3
+oarsub -I -l host=[number_of_hosts],walltime=[duration]
-    "args": includes file to run and arguments
-    "ip": address of client, this one will be automatic input when you run the code, just leave it blank.
 ```
-Note that if you create *x* clients in json config then you have to researve *x+1* hosts in g5k (1 for server and *x* for *x* clients).  
+### Step 2. Configure
+Edit the JSON configuration file (`config_instance*.json`) to specify the experiment details. You can create multiple `config_instance*.json` files, where `*` is the instance number (the numbers must be consecutive positive integers starting from 1).
-### Step 2. Reserve the hosts in g5k:
-Run below cmd:
 ```bash
-oarsub -I -l host=<number_of_hosts>,walltime=<number_of_during>
+vim config_instance1.json
 ```
-For example: 1 sever and 3 clients need 4 hosts.
+Example structure:
-```bash
-oarsub -I -l host=4,walltime=2
+```json
-```
+{
+    "instance": "fedAvg_cifar10",
+    "output_dir": "/home/mdo/Huong_DL/Log",
+    "dvfs": {
+        "dummy": false,
+        "baseline": false,
+        "frequencies": [2000000,2200000]
+    },
+    "server": {
+        "command": "python3",
+        "args": [
+            "Flower_v1/server.py",
+            "-r 50",
+            "-s fedAvg"
+        ],
+        "ip": "172.16.66.18",
+        "port": 8080
+    },
+    "clients": [
+        {
+            "name": "client1",
+            "command": "python3",
+            "args": [
+                "Flower_v1/client_1.py",
+                "cifar10",
+                "1",
+                "3"
+            ],
+            "ip": "172.16.66.2"
+        },
+        {
+            "name": "client2",
+            "command": "python3",
+            "args": [
+                "Flower_v1/client_1.py",
+                "cifar10",
+                "1",
+                "3"
+            ],
+            "ip": "172.16.66.3"
+        }
+    ]
+}
+```
+- **instance**: The name of your experiment.
+- **output_dir**: Where to store the log files (experiment log and energy monitoring log).
+- **dvfs**: DVFS settings. Enable at most one of the three options below; if none is enabled, all available frequencies are detected and each is used in turn.
+  - `dummy`: `true` or `false` (only uses the minimum and maximum frequencies)
+  - `baseline`: `true` or `false` (only uses the maximum frequency)
+  - `frequencies`: `null` or a list of integers (limits the run to the provided frequencies)
+**Remark:** check the available frequencies before using the `frequencies` option.
+- Set the permissions and disable Turbo Boost first:
+```bash
+bash "$(python3 -c "import expetator, os; print(os.path.join(os.path.dirname(expetator.__file__), 'leverages', 'dvfs_pct.sh'))")" init
+```
+- Run this command to get available frequencies:
+```bash
+python3 get_frequencies.py
+```
+- Copy the extracted frequency values into the configuration files.
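Before reserving hosts and launching a run, it can help to sanity-check a configuration file. The following is a minimal sketch and not part of the repository: the function names `active_dvfs_settings`, `check_config` and `hosts_needed` are hypothetical, and the checks only encode the rules stated above (at most one DVFS option enabled, one host for the server plus one per client).

```python
import json


def active_dvfs_settings(dvfs):
    """Return the names of the DVFS options that are switched on."""
    active = []
    if dvfs.get("dummy"):
        active.append("dummy")
    if dvfs.get("baseline"):
        active.append("baseline")
    if dvfs.get("frequencies"):  # null (None) or an empty list counts as "not set"
        active.append("frequencies")
    return active


def check_config(config):
    """Return a list of problems found in one instance configuration."""
    problems = [
        f"missing key: {key}"
        for key in ("instance", "output_dir", "dvfs", "server", "clients")
        if key not in config
    ]
    if problems:
        return problems
    active = active_dvfs_settings(config["dvfs"])
    if len(active) > 1:
        problems.append("more than one dvfs setting enabled: " + ", ".join(active))
    if not config["clients"]:
        problems.append("no clients defined")
    return problems


def hosts_needed(config):
    """One host for the server plus one host per client."""
    return len(config["clients"]) + 1


# Shortened example in the shape of config_instance*.json (values are illustrative).
example = json.loads("""
{
    "instance": "fedAvg_cifar10",
    "output_dir": "/tmp/Log",
    "dvfs": {"dummy": false, "baseline": false, "frequencies": [2000000, 2200000]},
    "server": {"command": "python3", "args": ["Flower_v1/server.py"], "ip": "", "port": 8080},
    "clients": [
        {"name": "client1", "command": "python3", "args": ["Flower_v1/client_1.py"], "ip": ""},
        {"name": "client2", "command": "python3", "args": ["Flower_v1/client_1.py"], "ip": ""}
    ]
}
""")

print(check_config(example))  # an empty list means no problems were found
print(hosts_needed(example))  # number of hosts to request with oarsub
```

To check a real file, load it with `json.load(open("config_instance1.json"))` and pass the result to `check_config`.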
 ### Step 3. Collect IP
-Run below cmd:
+Run the following command to generate a node list:
 ```bash
 uniq $OAR_NODEFILE > nodelist
 ```
-Then automatically fill out missing IP addresses in `json`:
+Automatically populate missing IP addresses in the JSON file:
 ```bash
 python3 collect_ip.py
 ```
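The source of `collect_ip.py` is not shown here, so the following is only a hypothetical sketch of the idea: read the host names from `nodelist`, resolve them, and write the addresses into the `ip` fields. The helper names `read_nodelist` and `fill_ips`, and the choice to give the first host to the server, are assumptions made for illustration.

```python
import socket


def read_nodelist(text):
    """Split the contents of a nodelist file into host names, skipping blank lines."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def fill_ips(config, hostnames, resolve=socket.gethostbyname):
    """Assign the first host to the server and the following hosts to the clients.

    The ordering is an assumption of this sketch, not documented behaviour.
    `resolve` can be swapped out so the function works without DNS access.
    """
    needed = len(config["clients"]) + 1
    if len(hostnames) < needed:
        raise ValueError(f"need {needed} hosts, nodelist has {len(hostnames)}")
    config["server"]["ip"] = resolve(hostnames[0])
    for client, host in zip(config["clients"], hostnames[1:]):
        client["ip"] = resolve(host)
    return config


# Demonstration with a fake resolver; host names and addresses are illustrative.
fake_dns = {
    "gros-26.nancy.grid5000.fr": "172.16.66.26",
    "gros-38.nancy.grid5000.fr": "172.16.66.38",
    "gros-4.nancy.grid5000.fr": "172.16.66.4",
}
config = {
    "server": {"ip": "", "port": 8080},
    "clients": [{"name": "client1", "ip": ""}, {"name": "client2", "ip": ""}],
}
hosts = read_nodelist("\n".join(fake_dns) + "\n")
fill_ips(config, hosts, resolve=fake_dns.__getitem__)
print(config["server"]["ip"], [c["ip"] for c in config["clients"]])
```

On a reserved node, the default `socket.gethostbyname` resolver would be used instead of the fake table.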
-### Step 4. Run the campaign or instance: 
+### Step 4. Run the Campaign or Single Instance
-Now you can run the monitoring campaign by run_measure.py. Note that, this function will scan all the json file in the Run directory with the name "config_instance*.json".
+Run a campaign:
 ```bash
-python3 run_measure.py -x <your_str> -r <number_of_repetation>
+python3 run_measure.py -x [experiment_name] -r [repetitions]
 ```
+Run a single instance:
-For example:
 ```bash
-python3 run_measure.py -x IamPretty -r 2
+python3 measure.py -c [config_file] -x [experiment_name] -r [repetitions]
 ```
-<your_str>: whatever_you_want to recorgnize your testing.  
+- **[experiment_name]**: The name you use to identify your experiment.
-<number_of_repetation>: number of repeatation of your exp
+- **[repetitions]**: Number of repetitions for the experiment.
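Because a campaign picks up every `config_instance*.json` in `Run`, and the instance numbers must be consecutive starting from 1, a small pre-flight check can catch a gap in the numbering. This is a sketch under assumptions, not how `run_measure.py` is implemented: `numbered_configs` and `single_instance_commands` are hypothetical helpers, and the generated command lines only use the `measure.py` flags shown above.

```python
import re

PATTERN = re.compile(r"^config_instance(\d+)\.json$")


def numbered_configs(filenames):
    """Return the config files sorted by instance number; raise if numbering has gaps."""
    found = {}
    for name in filenames:
        match = PATTERN.match(name)
        if match:
            found[int(match.group(1))] = name
    numbers = sorted(found)
    if numbers != list(range(1, len(numbers) + 1)):
        raise ValueError(f"instance numbers must be 1..n without gaps, got {numbers}")
    return [found[n] for n in numbers]


def single_instance_commands(filenames, experiment_name, repetitions):
    """Build one measure.py command line (as an argument list) per config file."""
    return [
        ["python3", "measure.py", "-c", name, "-x", experiment_name, "-r", str(repetitions)]
        for name in numbered_configs(filenames)
    ]


# Illustrative directory listing; unrelated files are ignored.
listing = ["config_instance2.json", "collect_ip.py", "config_instance1.json"]
for command in single_instance_commands(listing, "SingleTest", 2):
    print(" ".join(command))
```

In practice the listing would come from `os.listdir(".")` inside the `Run` directory.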
-In case you only need to run 1 instance, you can use measure.py instead:
+### Step 5. Output
-```bash
-python3 measure.py -c <config_instance*.json> -x <your_str> -r <number_of_repetation>
+The logs and energy monitoring data will be saved in the directory specified in the JSON configuration.
-```
+### Step 6. Clean Up
-For example:
+After the experiment:
+Exit the host:
+  ```bash
+  exit
+  ```
+Check the job ID:
+  ```bash
+  oarstat -u
+  ```
+Kill the job:
+  ```bash
+  oardel <job_id>
+  ```
+## Quickstart
+Follow these steps to run an example:
+1. Reserve 4 hosts (1 server + 3 clients) for 2 hours:
 ```bash
-python3 measure.py -c config_instance1.json -x IamPretty -r 2
+oarsub -I -l host=4,walltime=2
 ```
+2. Configure
-### Step 6. Output
+`config_instance1.json` and `config_instance2.json` provide two example instance configurations. All fields are already filled in, but "output_dir" and "args" must be updated to match your own directories.
-Check the output in the directory where you set in json file. The output structure:
+- `config_instance1.json`: fedAvg, cifar10, dvfs with min and max freq, 1 round.
-```plaintext
+- `config_instance2.json`: fedAvg2Clients, cifar10, dvfs with min and max freq, 1 round.
-/Flower_Test1
-├── Flower_instance_fedAvg_cifar10
-│   ├── Expetator
-│   ├── Expetator_gros-26.nancy.grid5000.fr_1732808824
-│   ├── Expetator_gros-26.nancy.grid5000.fr_1732808824_mojitos
-│   │   ├── gros-26.nancy.grid5000.fr_flower_1732808838
-│   │   ├── gros-38.nancy.grid5000.fr_flower_1732808838
-│   │   ├── gros-4.nancy.grid5000.fr_flower_1732808838
-│   │   └── gros-65.nancy.grid5000.fr_flower_1732808838
-│   ├── Expetator_gros-26.nancy.grid5000.fr_1732808824_power
-│   │   └── gros-26.nancy.grid5000.fr_flower_1732808838
-│   ├── Flwr_20241128_164718
-│   │   ├── Client_172.16.66.38
-│   │   ├── Client_172.16.66.4
-│   │   ├── Client_172.16.66.65
-│   │   ├── flower_log_summary.txt
-│   │   └── Server_172.16.66.26
-```
-### Step 7. Kill job in g5k after finish
-Exit the host:
+3. Collect IP
 ```bash
-exit
+uniq $OAR_NODEFILE > nodelist
+python3 collect_ip.py
 ```
-Check the job id:
+4. Run the Single Instance or Campaign
+Run instance 1 alone with `/Run/config_instance1.json`, 2 repetitions:
 ```bash
-oarstat -u
+python3 measure.py -c config_instance1.json -x SingleTest -r 2
 ```
-Kill the job:
+Run a campaign with all config_instance*.json in `/Run`, 2 repetitions:
 ```bash
-oardel <job_id>
+python3 run_measure.py -x CampaignTest -r 2
 ```
+## Output Structure
+Example output directory:
+```plaintext
+/Flower_<x>
+├── Flower_instance_<instance_name>
+│   ├── Expetator
+│   │   ├── config_instance*.json
+│   ├── Expetator_<host_info>
+│   ├── Expetator_<host_info>_power
+│   │   ├── <client_logs>
+│   ├── Flwr_<timestamp>
+│   │   ├── Client_<ip>
+│   │   ├── Server_<ip>
+```
+## License
+This project is licensed under [GPLv3]. 
\ No newline at end of file
--- a/Run/config_instance1.json
+++ b/Run/config_instance1.json
@@ -2,20 +2,15 @@
    "instance": "fedAvg_cifar10",
    "output_dir": "/home/mdo/Framework/eflwr/Log",
    "dvfs": {
-        "dummy": false,
+        "dummy": true,
        "baseline": false,
-        "frequencies": [
+        "frequencies": null
-            1000000,
-            1400000,
-            1800000,
-            2200000
-        ]
    },
    "server": {
        "command": "python3",
        "args": [
            "/home/mdo/Framework/eflwr/Flower_v1/server_1.py",
-            "-r 25",
+            "-r 1",
            "-s fedAvg"
        ],
        "additional_env_var": [

--- a/Run/config_instance2.json
+++ b/Run/config_instance2.json
@@ -2,20 +2,15 @@
    "instance": "fedAvg2Clients_cifar10",
    "output_dir": "/home/mdo/Framework/eflwr/Log",
    "dvfs": {
-        "dummy": false,
+        "dummy": true,
        "baseline": false,
-        "frequencies": [
+        "frequencies": null
-            1000000,
-            1400000,
-            1800000,
-            2200000
-        ]
    },
    "server": {
        "command": "python3",
        "args": [
            "/home/mdo/Framework/eflwr/Flower_v1/server_1.py",
-            "-r 25",
+            "-r 1",
            "-s fedAvg2Clients"
        ],
        "additional_env_var": [