Commit d54ce8a4 authored by huongdm1896
update config and readme

parent 5dbe9e9a
# Measure Energy of Flower FL in G5K
This project provides tools to measure the energy consumption of Flower-based federated learning (FL) experiments on the Grid'5000 (G5K) testbed. It includes scripts to manage distributed nodes, run FL experiments, and monitor energy usage.
## Table of Contents
- [Getting Started](#getting-started)
- [Installation](#installation)
- [Usage](#usage)
- [Step 1. Reserve the Hosts in G5K](#step-1-reserve-the-hosts-in-g5k)
- [Step 2. Configure](#step-2-configure)
- [Step 3. Collect IP](#step-3-collect-ip)
- [Step 4. Run the Campaign or Single Instance](#step-4-run-the-campaign-or-single-instance)
- [Step 5. Output](#step-5-output)
- [Step 6. Clean Up](#step-6-clean-up)
- [Quickstart](#quickstart)
- [Output Structure](#output-structure)
- [License](#license)
## Getting Started
These instructions explain how to run the framework. The repository includes an example of Flower (using TensorFlow) in the `Flower_v1` directory and the source of the measuring framework in `Run`. This example demonstrates how to use the framework to measure energy consumption.
## Installation
This framework requires:
- **Python 3.9.2** or higher. Check your Python version:
```bash
python3 --version
```
- Additional dependencies listed in `requirements.txt`. Install them with:
```bash
pip install -r requirements.txt
```
*Note:* `requirements.txt` includes TensorFlow for running the provided Flower example.
Clone the repository and navigate to the `Run` directory:
```bash
git clone https://gitlab.irit.fr/sepia-pub/delight/eflwr.git
cd eflwr/Run
```
## Usage
**Remark:** EFLWR is configured for Flower only. Your Flower project must contain server and client scripts that run on distributed nodes; the `Flower_v1` directory provides an example. Once configured, EFLWR automatically executes Flower on the specified nodes and measures the energy consumption.
- FL scripts can be updated in `Flower_v1`.
- Configure each experiment instance in `Run/config_instance*.json`.
- Follow these steps to configure and run your experiment (or jump to [Quickstart](#quickstart) to run an example).
### Step 1. Reserve the Hosts in G5K
Reserve the required number of hosts (*see the [G5K documentation](https://www.grid5000.fr/w/Getting_Started#Reserving_resources_with_OAR:_the_basics) for more details*):
```bash
oarsub -I -l host=[number_of_hosts],walltime=[duration]
```
If you define *x* clients in the JSON config, reserve *x+1* hosts in G5K (1 for the server and *x* for the clients). For example, 1 server and 3 clients need 4 hosts.
### Step 2. Configure
Edit the JSON configuration file (`config_instance*.json`) to specify the experiment details. You can create multiple `config_instance*.json` files, where `*` is the instance number (the numbers must be consecutive positive integers starting from 1). Use any editor you prefer in place of *vim* (*nano*, etc.):
```bash
vim config_instance1.json
```
The `server` and `clients` sections use the following fields:
- `command`: the command used to launch the script (for example `python3`).
- `args`: the script to run and its arguments.
- `ip`: the address of the server or client. It is filled in automatically in Step 3, so it can be left blank.
- `port` (server only): defaults to 8080; do not change it.
- `name` (clients only): a name or number for each client, such as `client1`, `client2`, `client3`.

Example structure:
```json
{
"instance": "fedAvg_cifar10",
"output_dir": "/home/mdo/Huong_DL/Log",
"dvfs": {
"dummy": false,
"baseline": false,
"frequencies": [2000000,2200000]
},
"server": {
"command": "python3",
"args": [
"Flower_v1/server.py",
"-r 50",
"-s fedAvg"
],
"ip": "172.16.66.18",
"port": 8080
},
"clients": [
{
"name": "client1",
"command": "python3",
"args": [
"Flower_v1/client_1.py",
"cifar10",
"1",
"3"
],
"ip": "172.16.66.2"
},
{
"name": "client2",
"command": "python3",
"args": [
"Flower_v1/client_1.py",
"cifar10",
"1",
"3"
],
"ip": "172.16.66.3"
}
]
}
```
- **instance**: The name of your experiment.
- **output_dir**: Where to store the log files (experiment log and energy monitoring log).
- **dvfs**: By default, the framework detects all available frequencies and goes through all of them. Enable at most one of the three settings below to restrict this:
  - `dummy`: `false` or `true` (uses only the min and max frequencies)
  - `baseline`: `false` or `true` (uses only the max frequency)
  - `frequencies`: `null` or a list of integers (limits the run to the provided frequencies)

**Remark:** Check the available frequencies before using the `frequencies` option.
- Set the permissions and disable Turbo Boost first:
```bash
bash "$(python3 -c "import expetator, os; print(os.path.join(os.path.dirname(expetator.__file__), 'leverages', 'dvfs_pct.sh'))")" init
```
- Run this command to get the available frequencies:
```bash
python3 get_frequencies.py
```
- Update the configuration files with the extracted frequency values.
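
The configuration rules above can be checked before a run. The following is a minimal sketch, not part of EFLWR: the function name `check_instance_config` is hypothetical, and it assumes the field layout shown in the example structure (`dvfs`, `server`, `clients`). It rejects a config that enables more than one `dvfs` setting and returns the number of hosts to reserve (one per client plus one for the server).

```python
import json


def check_instance_config(config: dict) -> int:
    """Validate an instance config and return the number of hosts to reserve.

    Hypothetical helper: the checks mirror the rules described in this README,
    not the actual validation performed by EFLWR.
    """
    for key in ("instance", "output_dir", "server", "clients"):
        if key not in config:
            raise ValueError(f"missing required field: {key}")

    dvfs = config.get("dvfs", {})
    freqs = dvfs.get("frequencies")
    enabled = [bool(dvfs.get("dummy")), bool(dvfs.get("baseline")), freqs is not None]
    if sum(enabled) > 1:
        raise ValueError("enable at most one of dummy, baseline, frequencies")

    # Frequencies must be plain integers (bool is excluded on purpose).
    if freqs is not None:
        for f in freqs:
            if isinstance(f, bool) or not isinstance(f, int):
                raise ValueError(f"frequency must be an integer: {f!r}")

    # One host for the server plus one per client.
    return len(config["clients"]) + 1


if __name__ == "__main__":
    with open("config_instance1.json") as fh:
        print("hosts to reserve:", check_instance_config(json.load(fh)))
```

Running the script next to `config_instance1.json` prints the host count to pass to `oarsub`.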
### Step 3. Collect IP
Run the following command to generate a node list:
```bash
uniq $OAR_NODEFILE > nodelist
```
Automatically populate missing IP addresses in the JSON file:
```bash
python3 collect_ip.py
```
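
To illustrate what populating the IP fields involves, here is a rough sketch. It is not the actual `collect_ip.py`: the function `fill_ips` is hypothetical, and the mapping (first host in `nodelist` becomes the server, the remaining hosts become clients in order) is an assumption.

```python
import socket
from typing import Callable, Iterable


def fill_ips(
    config: dict,
    hostnames: Iterable[str],
    resolve: Callable[[str], str] = socket.gethostbyname,
) -> dict:
    """Write resolved IPs into config["server"] and config["clients"].

    Assumption: the first host is the server, the rest are clients.
    """
    hosts = [h.strip() for h in hostnames if h.strip()]
    needed = 1 + len(config["clients"])
    if len(hosts) < needed:
        raise ValueError(f"need {needed} hosts, nodelist has {len(hosts)}")

    config["server"]["ip"] = resolve(hosts[0])
    for client, host in zip(config["clients"], hosts[1:]):
        client["ip"] = resolve(host)
    return config
```

The `resolve` parameter defaults to DNS lookup through `socket.gethostbyname`, and can be swapped for a lookup table when testing away from the reserved nodes.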
### Step 4. Run the Campaign or Single Instance
Run the campaign. Note that `run_measure.py` scans all JSON files named `config_instance*.json` in the `Run` directory:
```bash
python3 run_measure.py -x [experiment_name] -r [repetitions]
```
Run a single instance:
```bash
python3 measure.py -c [config_file] -x [experiment_name] -r [repetitions]
```
- **[config_file]**: The instance configuration file, for example `config_instance1.json`.
- **[experiment_name]**: The name you use to identify your experiment.
- **[repetitions]**: Number of repetitions for the experiment.
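
Because a campaign picks up every `config_instance*.json` file and the instance numbers must be consecutive starting from 1, it can help to check the directory before launching. The sketch below is an illustration only: `find_instance_configs` is a hypothetical helper, not a function of `run_measure.py`, and how EFLWR itself orders or validates the files is not specified here.

```python
import re
from pathlib import Path

# Matches config_instance<number>.json and captures the number.
PATTERN = re.compile(r"config_instance(\d+)\.json")


def find_instance_configs(run_dir) -> list:
    """Return the config files in numeric order, or raise if numbering has gaps."""
    numbered = {}
    for path in Path(run_dir).glob("config_instance*.json"):
        match = PATTERN.fullmatch(path.name)
        if match:
            numbered[int(match.group(1))] = path

    expected = list(range(1, len(numbered) + 1))
    if sorted(numbered) != expected:
        raise ValueError(
            f"instance numbers must be consecutive from 1, found {sorted(numbered)}"
        )
    return [numbered[i] for i in expected]
```

Sorting by the captured integer rather than by file name keeps `config_instance10.json` after `config_instance2.json`.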
### Step 5. Output
The logs and energy monitoring data will be saved in the directory specified in the JSON configuration.
### Step 6. Clean Up
After the experiment:
Exit the host:
```bash
exit
```
Check the job ID:
```bash
oarstat -u
```
Kill the job:
```bash
oardel <job_id>
```
## Quickstart
Follow these steps to run an example:
1. Reserve 4 hosts (1 server + 3 clients) for 2 hours:
```bash
oarsub -I -l host=4,walltime=2
```
2. Configure
`config_instance1.json` and `config_instance2.json` provide two examples of instance configuration. All fields are already set, but `output_dir` and `args` must be updated to match your own directories.
- `config_instance1.json`: fedAvg, cifar10, dvfs with min and max freq, 1 round.
- `config_instance2.json`: fedAvg2Clients, cifar10, dvfs with min and max freq, 1 round.
3. Collect IP
```bash
uniq $OAR_NODEFILE > nodelist
python3 collect_ip.py
```
4. Run the Single Instance or Campaign
Run single instance 1 with `Run/config_instance1.json`, 2 repetitions:
```bash
python3 measure.py -c config_instance1.json -x SingleTest -r 2
```
Run a campaign with all `config_instance*.json` files in `Run`, 2 repetitions:
```bash
python3 run_measure.py -x CampaignTest -r 2
```
5. Check the output in the directory set in the JSON file. An example of the output structure:
```plaintext
/Flower_Test1
├── Flower_instance_fedAvg_cifar10
│   ├── Expetator
│   ├── Expetator_gros-26.nancy.grid5000.fr_1732808824
│   ├── Expetator_gros-26.nancy.grid5000.fr_1732808824_mojitos
│   │   ├── gros-26.nancy.grid5000.fr_flower_1732808838
│   │   ├── gros-38.nancy.grid5000.fr_flower_1732808838
│   │   ├── gros-4.nancy.grid5000.fr_flower_1732808838
│   │   └── gros-65.nancy.grid5000.fr_flower_1732808838
│   ├── Expetator_gros-26.nancy.grid5000.fr_1732808824_power
│   │   └── gros-26.nancy.grid5000.fr_flower_1732808838
│   ├── Flwr_20241128_164718
│   │   ├── Client_172.16.66.38
│   │   ├── Client_172.16.66.4
│   │   ├── Client_172.16.66.65
│   │   ├── flower_log_summary.txt
│   │   └── Server_172.16.66.26
```
## Output Structure
Example output directory:
```plaintext
/Flower_<x>
├── Flower_instance_<instance_name>
│   ├── Expetator
│   │   ├── config_instance*.json
│   ├── Expetator_<host_info>
│   ├── Expetator_<host_info>_power
│   │   ├── <client_logs>
│   ├── Flwr_<timestamp>
│   │   ├── Client_<ip>
│   │   ├── Server_<ip>
```
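
The layout above can be walked programmatically when post-processing results. This is a minimal sketch under stated assumptions: `list_flower_runs` is a hypothetical helper, and the `Flwr_<timestamp>` format is taken to be `YYYYMMDD_HHMMSS`, as in a directory name such as `Flwr_20241128_164718`.

```python
from datetime import datetime
from pathlib import Path


def list_flower_runs(instance_dir) -> list:
    """Summarize each Flwr_<timestamp> directory under one instance directory.

    Assumes the timestamp format YYYYMMDD_HHMMSS; a directory whose name does
    not match raises ValueError from strptime.
    """
    runs = []
    for run_dir in sorted(Path(instance_dir).glob("Flwr_*")):
        if not run_dir.is_dir():
            continue
        started = datetime.strptime(run_dir.name[len("Flwr_"):], "%Y%m%d_%H%M%S")
        children = sorted(p.name for p in run_dir.iterdir())
        runs.append(
            {
                "started": started,
                "server": [n for n in children if n.startswith("Server_")],
                "clients": [n for n in children if n.startswith("Client_")],
            }
        )
    return runs
```

Each entry lists the server and client log directories of one run, which is a convenient starting point for matching Flower logs against the energy logs in the `Expetator_*` directories.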
## License
This project is licensed under [GPLv3].
\ No newline at end of file
@@ -2,20 +2,15 @@
     "instance": "fedAvg_cifar10",
     "output_dir": "/home/mdo/Framework/eflwr/Log",
     "dvfs": {
-        "dummy": false,
+        "dummy": true,
         "baseline": false,
-        "frequencies": [
-            1000000,
-            1400000,
-            1800000,
-            2200000
-        ]
+        "frequencies": null
     },
     "server": {
         "command": "python3",
         "args": [
             "/home/mdo/Framework/eflwr/Flower_v1/server_1.py",
-            "-r 25",
+            "-r 1",
             "-s fedAvg"
         ],
         "additional_env_var": [
...
@@ -2,20 +2,15 @@
     "instance": "fedAvg2Clients_cifar10",
     "output_dir": "/home/mdo/Framework/eflwr/Log",
     "dvfs": {
-        "dummy": false,
+        "dummy": true,
         "baseline": false,
-        "frequencies": [
-            1000000,
-            1400000,
-            1800000,
-            2200000
-        ]
+        "frequencies": null
     },
     "server": {
         "command": "python3",
         "args": [
             "/home/mdo/Framework/eflwr/Flower_v1/server_1.py",
-            "-r 25",
+            "-r 1",
             "-s fedAvg2Clients"
         ],
         "additional_env_var": [
...