huongdm1896 authored
1babdd8a

FedEator_Results_analysis

Remark: This is an introduction. For details on how to use and analyze the data, see ./UsageOfData.ipynb.

Unzip the folders:

unzip Log.zip #original data
unzip Data_analysis.zip #processed data (output from all steps below)
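Where the unzip command is not available, the same extraction can be sketched in Python with the standard zipfile module. The helper name extract_archives and its skip-if-missing behaviour are assumptions made for this illustration, not part of the repository.

```python
import zipfile
from pathlib import Path


def extract_archives(archives=("Log.zip", "Data_analysis.zip"), dest="."):
    """Extract each archive that exists into dest; return the names extracted.

    Hypothetical helper: missing archives are skipped instead of raising.
    """
    extracted = []
    for archive in archives:
        path = Path(archive)
        if not path.is_file():
            continue  # archive not present in the working directory
        with zipfile.ZipFile(path) as zf:
            zf.extractall(dest)
        extracted.append(path.name)
    return extracted


if __name__ == "__main__":
    print(extract_archives())
```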

The raw experiment results are stored in ./Log. It contains data from 3 experiments (one folder each):

  • Verify the usage of FedEator for CPUs: 4 CPU nodes (1 server + 3 clients) with different FL settings:

    • 4 CPU frequency settings: min, max, and 2 in between
    • each setting is repeated 2 times
    • Flower settings: client side (local training epochs: 1 or 2), server side (2 aggregation algorithms: FedAvg and FedAvg2Clients), dataset CIFAR10, 15 rounds
      --> Total: 4 instances (4 FL settings) * 4 frequencies * 2 repeats.
      Dir: ./Log/Flower_campaign
  • Verify the usage of FedEator for GPUs: 4 GPU nodes (1 server + 3 clients) with 1 FL setting:

    • 2 frequency settings: min and max
    • no repeat
    • Flower settings: client side (local training epochs: 1), server side (fedAvg), dataset CIFAR10, 15 rounds
      --> Total: 1 instance (1 FL setting) * 2 frequencies * 1 repeat.
      Dir: ./Log/Flower_GPUminmax
  • Verify the usage of FedEator in a large case: 10 CPU nodes (1 server + 9 clients) with 1 FL setting:

    • 4 frequency settings: min, max, and 2 in between
    • no repeat
    • Flower settings: client side (local training epochs: 1), server side (fedAvg), dataset CIFAR10, 15 rounds
      --> Total: 1 instance (1 FL setting) * 4 frequencies * 1 repeat.
      Dir: ./Log/Flower_Largecase
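The totals in the list above can be checked with a few lines of Python. This is only a sketch of the arithmetic: the dictionary name EXPERIMENTS and the function total_runs are made up for the illustration, and the numbers are copied from the three items above.

```python
# (FL settings, frequency settings, repeats) for each experiment folder,
# copied from the list above.
EXPERIMENTS = {
    "Flower_campaign": (4, 4, 2),
    "Flower_GPUminmax": (1, 2, 1),
    "Flower_Largecase": (1, 4, 1),
}


def total_runs(fl_settings: int, frequencies: int, repeats: int) -> int:
    """Number of Flower runs = FL settings * frequencies * repeats."""
    return fl_settings * frequencies * repeats


for name, dims in EXPERIMENTS.items():
    print(f"{name}: {total_runs(*dims)} runs")
# Flower_campaign: 32 runs
# Flower_GPUminmax: 2 runs
# Flower_Largecase: 4 runs
```

Within Flower_campaign, each of the 4 instance folders therefore holds 4 * 2 = 8 runs.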

Each experiment directory contains one folder per instance. For example, Flower_campaign contains 4 instance folders:

Log/Flower_campaign
├── Flower_instance_fedAvg_cifar10_epoch1
├── Flower_instance_fedAvg_cifar10_epoch2
├── Flower_instance_fedAvg2Clients_cifar10_epoch1
├── Flower_instance_fedAvg2Clients_cifar10_epoch2
...
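The instance folder names appear to encode the FL setting. The sketch below splits a name into aggregation strategy, dataset, and local epochs. The pattern Flower_instance_<strategy>_<dataset>_epoch<N> is inferred from the four names listed above and is an assumption; parse_instance_name and InstanceInfo are hypothetical names.

```python
import re
from typing import NamedTuple


class InstanceInfo(NamedTuple):
    strategy: str  # e.g. "fedAvg" or "fedAvg2Clients"
    dataset: str   # e.g. "cifar10"
    epochs: int    # local training epochs on the client side


# Assumed naming pattern, inferred from the listed folder names.
_INSTANCE_RE = re.compile(
    r"^Flower_instance_(?P<strategy>[^_]+)_(?P<dataset>[^_]+)_epoch(?P<epochs>\d+)$"
)


def parse_instance_name(name: str) -> InstanceInfo:
    """Split an instance folder name into its FL setting fields."""
    match = _INSTANCE_RE.match(name)
    if match is None:
        raise ValueError(f"unexpected instance folder name: {name!r}")
    return InstanceInfo(match["strategy"], match["dataset"], int(match["epochs"]))


print(parse_instance_name("Flower_instance_fedAvg2Clients_cifar10_epoch2"))
```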

Every instance folder has the same structure:

Log/Flower_campaign
├── Flower_instance_fedAvg_cifar10_epoch1: one instance folder
│   ├── Expetator
│   │   └── config_instance_1.json: metadata of the instance
│   ├── Expetator_<host_info>_<timestamp>_mojitos: mojitos outputs
│   ├── Expetator_<host_info>_<timestamp>_power: wattmeter outputs
│   ├── Expetator_<host_info>_<timestamp>: measurement log
│   ├── Flwr_<timestamp>: Flower log folder, one per setting/test in the instance
│   │   ├── Client_<ip>
│   │   ├── Client_<ip>
│   │   ├── Server_<ip>
│   │   ├── training_results_<instance_name>_<time>.csv
...
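As a rough illustration of the layout above, the entries of one instance folder can be sorted into categories by name. The category labels and the functions classify_entry and index_instance are hypothetical. The sketch relies only on the prefixes and suffixes shown in the tree above.

```python
from pathlib import Path


def classify_entry(name: str) -> str:
    """Map an entry name inside an instance folder to a category label."""
    if name == "Expetator":
        return "config"           # holds config_instance_1.json
    if name.startswith("Expetator_"):
        if name.endswith("_mojitos"):
            return "mojitos"      # mojitos outputs
        if name.endswith("_power"):
            return "power"        # wattmeter outputs
        return "measurement_log"  # Expetator_<host_info>_<timestamp>
    if name.startswith("Flwr_"):
        return "flower_run"       # one Flower log folder per run
    return "other"


def index_instance(instance_dir):
    """Group the entries of an instance folder by category."""
    groups = {}
    for entry in sorted(Path(instance_dir).iterdir()):
        groups.setdefault(classify_entry(entry.name), []).append(entry)
    return groups
```

For example, index_instance("Log/Flower_campaign/Flower_instance_fedAvg_cifar10_epoch1")["flower_run"] would list the Flwr_<timestamp> folders of that instance, assuming the archive was extracted in the current directory.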

For example: Flower_instance_fedAvg_cifar10_epoch1

Log/Flower_campaign/Flower_instance_fedAvg_cifar10_epoch1
├── Expetator
│   └── config_instance_1.json
├── Expetator_taurus-1.lyon.grid5000.fr_1740648134
├── Expetator_taurus-1.lyon.grid5000.fr_1740648134_mojitos 
│  < one file per node and per run: 4 nodes * 2 repeats * 4 CPU frequency settings -> 32 files >
│   ├── taurus-13.lyon.grid5000.fr_flower_1740648154
│   ├── taurus-13.lyon.grid5000.fr_flower_1740649591
│   ├── taurus-13.lyon.grid5000.fr_flower_1740650725
│   ├── taurus-13.lyon.grid5000.fr_flower_1740651708
│   ├── taurus-13.lyon.grid5000.fr_flower_1740652602
│   ├── taurus-13.lyon.grid5000.fr_flower_1740654039
│   ├── taurus-13.lyon.grid5000.fr_flower_1740655170
│   ├── taurus-13.lyon.grid5000.fr_flower_1740656160
│   ├── taurus-1.lyon.grid5000.fr_flower_1740648154
│   ├── taurus-1.lyon.grid5000.fr_flower_1740649591
│   ├── taurus-1.lyon.grid5000.fr_flower_1740650725
│   ├── taurus-1.lyon.grid5000.fr_flower_1740651708
│   ├── taurus-1.lyon.grid5000.fr_flower_1740652602
│   ├── taurus-1.lyon.grid5000.fr_flower_1740654039
│   ├── taurus-1.lyon.grid5000.fr_flower_1740655170
│   ├── taurus-1.lyon.grid5000.fr_flower_1740656160
│   ├── taurus-8.lyon.grid5000.fr_flower_1740648154
│   ├── taurus-8.lyon.grid5000.fr_flower_1740649591
│   ├── taurus-8.lyon.grid5000.fr_flower_1740650725
│   ├── taurus-8.lyon.grid5000.fr_flower_1740651708
│   ├── taurus-8.lyon.grid5000.fr_flower_1740652602
│   ├── taurus-8.lyon.grid5000.fr_flower_1740654039
│   ├── taurus-8.lyon.grid5000.fr_flower_1740655170
│   ├── taurus-8.lyon.grid5000.fr_flower_1740656160
│   ├── taurus-9.lyon.grid5000.fr_flower_1740648154
│   ├── taurus-9.lyon.grid5000.fr_flower_1740649591
│   ├── taurus-9.lyon.grid5000.fr_flower_1740650725
│   ├── taurus-9.lyon.grid5000.fr_flower_1740651708
│   ├── taurus-9.lyon.grid5000.fr_flower_1740652602
│   ├── taurus-9.lyon.grid5000.fr_flower_1740654039
│   ├── taurus-9.lyon.grid5000.fr_flower_1740655170
│   └── taurus-9.lyon.grid5000.fr_flower_1740656160
├── Expetator_taurus-1.lyon.grid5000.fr_1740648134_power
│  < one file per run, each containing data from all 4 nodes: 2 repeats * 4 CPU frequency settings -> 8 files >
│   ├── taurus-1.lyon.grid5000.fr_flower_1740648154
│   ├── taurus-1.lyon.grid5000.fr_flower_1740649591
│   ├── taurus-1.lyon.grid5000.fr_flower_1740650725
│   ├── taurus-1.lyon.grid5000.fr_flower_1740651708
│   ├── taurus-1.lyon.grid5000.fr_flower_1740652602
│   ├── taurus-1.lyon.grid5000.fr_flower_1740654039
│   ├── taurus-1.lyon.grid5000.fr_flower_1740655170
│   └── taurus-1.lyon.grid5000.fr_flower_1740656160
< Flwr logs: one folder per run, each containing the training logs from all 4 nodes: 2 repeats * 4 CPU frequency settings -> 8 folders. Flower data and system data are matched by log timestamp >
├── Flwr_20250227_102234
│   ├── Client_172.16.48.13
│   ├── Client_172.16.48.8
│   ├── Client_172.16.48.9
│   ├── Server_172.16.48.1
│   └── training_results_fedAvg_15_20250227_102251.csv <-- main results are stored in csv 
├── Flwr_20250227_104631
│   ├── Client_172.16.48.13
│   ├── Client_172.16.48.8
│   ├── Client_172.16.48.9
│   ├── Server_172.16.48.1
│   └── training_results_fedAvg_15_20250227_104641.csv
├── Flwr_20250227_110525
│   ├── Client_172.16.48.13
│   ├── Client_172.16.48.8
│   ├── Client_172.16.48.9
│   ├── Server_172.16.48.1
│   └── training_results_fedAvg_15_20250227_110533.csv
├── Flwr_20250227_112148
│   ├── Client_172.16.48.13
│   ├── Client_172.16.48.8
│   ├── Client_172.16.48.9
│   ├── Server_172.16.48.1
│   └── training_results_fedAvg_15_20250227_112155.csv
├── Flwr_20250227_113643
│   ├── Client_172.16.48.13
│   ├── Client_172.16.48.8
│   ├── Client_172.16.48.9
│   ├── Server_172.16.48.1
│   └── training_results_fedAvg_15_20250227_113655.csv
├── Flwr_20250227_120039
│   ├── Client_172.16.48.13
│   ├── Client_172.16.48.8
│   ├── Client_172.16.48.9
│   ├── Server_172.16.48.1
│   └── training_results_fedAvg_15_20250227_120049.csv
├── Flwr_20250227_121930
│   ├── Client_172.16.48.13
│   ├── Client_172.16.48.8
│   ├── Client_172.16.48.9
│   ├── Server_172.16.48.1
│   └── training_results_fedAvg_15_20250227_121939.csv
└── Flwr_20250227_123600
    ├── Client_172.16.48.13
    ├── Client_172.16.48.8
    ├── Client_172.16.48.9
    ├── Server_172.16.48.1
    └── training_results_fedAvg_15_20250227_123607.csv
...
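The timestamp matching mentioned in the listing above can be sketched as follows. In this listing, the Flwr_<YYYYMMDD_HHMMSS> folder names, read as UTC+1 local time, fall within about one second of the Unix epoch suffix of the mojitos and power files (for example, Flwr_20250227_102234 corresponds to 1740648154). The UTC+1 offset, the 5-second tolerance, and the function names are assumptions made for this illustration. The notebook may match the data differently.

```python
from collections import defaultdict
from datetime import datetime, timedelta, timezone

# Assumption: Flwr folder names use local time at UTC+1 (CET on 2025-02-27).
LOCAL_TZ = timezone(timedelta(hours=1))


def flwr_epoch(folder_name: str) -> int:
    """Convert 'Flwr_<YYYYMMDD_HHMMSS>' to a Unix timestamp."""
    if not folder_name.startswith("Flwr_"):
        raise ValueError(f"not a Flwr folder name: {folder_name!r}")
    stamp = folder_name[len("Flwr_"):]
    local = datetime.strptime(stamp, "%Y%m%d_%H%M%S").replace(tzinfo=LOCAL_TZ)
    return int(local.timestamp())


def group_by_run(file_names):
    """Group '<host>_flower_<epoch>' file names by their epoch suffix."""
    runs = defaultdict(list)
    for name in file_names:
        host, _, epoch = name.rpartition("_flower_")
        runs[int(epoch)].append(host)
    return dict(runs)


def match_run(folder_name, epochs, tolerance_s=5):
    """Return the epoch closest to the Flwr folder time, or None if too far."""
    target = flwr_epoch(folder_name)
    best = min(epochs, key=lambda e: abs(e - target), default=None)
    if best is None or abs(best - target) > tolerance_s:
        return None
    return best


mojitos_files = [
    "taurus-1.lyon.grid5000.fr_flower_1740648154",
    "taurus-13.lyon.grid5000.fr_flower_1740648154",
    "taurus-1.lyon.grid5000.fr_flower_1740652602",
    "taurus-13.lyon.grid5000.fr_flower_1740652602",
]
runs = group_by_run(mojitos_files)
print(match_run("Flwr_20250227_102234", runs))  # 1740648154
print(match_run("Flwr_20250227_113643", runs))  # 1740652602 (1 s apart)
```

A small tolerance is used instead of exact equality because one of the listed pairs differs by one second.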

Next step: see ./UsageOfData.ipynb to understand how to use the data and to view our analysis, which reproduces the figure in our article.

[Figure: visualization]