"According to the system specifications given in the [corresponding page in Parallel Workload Archive](https://www.cs.huji.ac.il/labs/parallel/workload/l_metacentrum2/index.html), if we exclude nodes with >16 cores, there are $\\#cores_{total} = 6416$ cores on May 1st 2014.(1)\n",
"According to the system specifications given in the [corresponding page in Parallel Workload Archive](https://www.cs.huji.ac.il/labs/parallel/workload/l_metacentrum2/index.html): from June 1st 2014 to Nov 30th 2014 there is no change in the platform for the clusters considered in our study (<16 cores). There is a total of **6304 cores**.(1)\n",
"\n",
"\n",
"We build a platform file adapted to the remaining workload. We choose to make it homogeneous with 16-core nodes. To have a coherent number of nodes, we count:\n",
"We build a platform file adapted to the remaining workload. We see above that the second selection cut 73.7\\% of core-hours from the original workload. We choose to make an homogeneous with 16-core nodes. To have a coherent number of nodes, we count:\n",
"The corresponding SimGrid platform file can be found in `platform/average_metacentrum.xml`.\n",
"The corresponding SimGrid platform file can be found in `platform/average_metacentrum.xml`.\n",
"\n",
"\n",
"(1) clusters decomissionned before or comissionned after May 1st 2014 have also been removed: $8+480+160+1792+256+576+88+416+108+168+752+112+588+152+160+160+192+24+224 = 6416$"
"(1) clusters decomissionned before or comissionned after the 6-month period have been removed: $8+480+160+1792+256+576+88+416+108+168+752+112+588+48+152+160+192+24+224 = 6304$"
]
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "north-meeting",
"metadata": {},
"outputs": [],
"source": []
}
}
],
],
"metadata": {
"metadata": {
...
@@ -207,7 +243,7 @@
...
@@ -207,7 +243,7 @@
"name": "python",
"name": "python",
"nbconvert_exporter": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"pygments_lexer": "ipython3",
"version": "3.8.10"
"version": "3.8.9"
}
}
},
},
"nbformat": 4,
"nbformat": 4,
...
...
%% Cell type:markdown id:forced-resolution tags:
%% Cell type:markdown id:forced-resolution tags:
# Downloading and preparing the workload and platform
# Downloading and preparing the workload and platform
## Workload
## Workload
We use the reconverted log `METACENTRUM-2013-3.swf` available on [Parallel Workload Archive](https://www.cs.huji.ac.il/labs/parallel/workload/l_metacentrum2/index.html).
We use the reconverted log `METACENTRUM-2013-3.swf` available on [Parallel Workload Archive](https://www.cs.huji.ac.il/labs/parallel/workload/l_metacentrum2/index.html).
It is a 2-year-long trace from MetaCentrum, the national grid of the Czech Republic. As mentioned in the [original paper releasing the log](https://www.cs.huji.ac.il/~feit/parsched/jsspp15/p5-klusacek.pdf), the platform is **very heterogeneous** and underwent major changes during the logging period. For the purpose of our study, we perform the following selection.
It is a 2-year-long trace from MetaCentrum, the national grid of the Czech Republic. As mentioned in the [original paper releasing the log](https://www.cs.huji.ac.il/~feit/parsched/jsspp15/p5-klusacek.pdf), the platform is **very heterogeneous** and underwent major changes during the logging period. For the purpose of our study, we perform the following selection.
First:
First:
- we remove from the workload all the clusters whose nodes have **more than 16 cores**
- we remove from the workload all the clusters whose nodes have **more than 16 cores**
- we truncate the workload to keep only 6 months (June to November 2014) during which no major change was made to the infrastructure (no cluster with at most 16 cores per node added or removed, no reconfiguration of the scheduling system)
- we truncate the workload to keep only 6 months (June to November 2014) during which no major change was made to the infrastructure (no cluster with at most 16 cores per node added or removed, no reconfiguration of the scheduling system)
Second:
Second:
- we remove from the workload the jobs with an **execution time greater than one day**
- we remove from the workload the jobs with an **execution time greater than one day**
- we remove from the workload the jobs with a **number of requested cores greater than 16**
- we remove from the workload the jobs with a **number of requested cores greater than 16**
To do so, we use a homemade SWF parser.
To do so, we use a homemade SWF parser.
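The time-window and job-level criteria listed above can be sketched in a few lines of Python. This is only an illustration, not the parser used in this notebook: `select_jobs` and `to_unix` are hypothetical names, dates are parsed as UTC (the notebook itself uses local time through `mktime`), and the core count is read from the "requested number of processors" field of the Standard Workload Format, which may differ from what the parser calls `nb_res`. The cluster-level selection is left out because it depends on how clusters are encoded in the log.

```python
import calendar
import time

# 0-based positions of the fields used here in an SWF job line
# (submit time, run time, requested number of processors).
SUBMIT_TIME, RUN_TIME, REQ_PROCS = 1, 3, 7


def to_unix(date_str):
    """Parse 'YYYY-MM-DD HH:MM:SS' as UTC (assumption; the notebook uses local time)."""
    return calendar.timegm(time.strptime(date_str, "%Y-%m-%d %H:%M:%S"))


def select_jobs(lines, begin_trace, window_start, window_end,
                max_cores=16, max_run_time=24 * 3600):
    """Yield the SWF job lines that fall in the window and respect both limits."""
    for line in lines:
        if not line.strip() or line.lstrip().startswith(";"):
            continue  # SWF header and comment lines start with ';'
        fields = line.split()
        # Submit times in SWF are relative to the start of the trace.
        submit = begin_trace + float(fields[SUBMIT_TIME])
        run_time = float(fields[RUN_TIME])
        req_procs = float(fields[REQ_PROCS])
        if not (window_start <= submit <= window_end):
            continue  # outside the selected months
        if run_time > max_run_time or req_procs > max_cores:
            continue  # too long or too wide
        yield line
```

With `begin_trace` taken from the SWF header, the window bounds would be `to_unix("2014-06-01 00:00:00")` and `to_unix("2014-11-30 23:59:59")`.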
%% Cell type:code id:ff40dcdd tags:
%% Cell type:code id:ff40dcdd tags:
``` python
``` python
# First selection
# Create a swf with only the selected clusters and the 6 selected months
from time import *
from time import *
begin_trace = 1356994806  # according to original SWF header
begin_trace = 1356994806  # according to original SWF header
jun1_unix_time, nov30_unix_time = mktime(strptime('Sun Jun 1 00:00:00 2014')), mktime(strptime('Sun Nov 30 23:59:59 2014'))
jun1_unix_time, nov30_unix_time = mktime(strptime('Sun Jun 1 00:00:00 2014')), mktime(strptime('Sun Nov 30 23:59:59 2014'))
--keep_only="nb_res <= 16 and run_time <= 24*3600"
```
%% Output
Processing swf line 100000
Processing swf line 200000
Processing swf line 300000
Processing swf line 400000
Processing swf line 500000
Processing swf line 600000
Processing swf line 700000
Processing swf line 800000
Processing swf line 900000
Processing swf line 1000000
Processing swf line 1100000
Processing swf line 1200000
Processing swf line 1300000
Processing swf line 1400000
Processing swf line 1500000
Processing swf line 1600000
-------------------
End parsing
Total 1604201 jobs and 546 users have been created.
Total number of core-hours: 4785357
44828 valid jobs were not selected (keep_only) for 13437365 core-hour
Jobs not selected: 2.7% in number, 73.7% in core-hour
0 out of 1649030 lines in the file did not match the swf format
1 jobs were not valid
%% Cell type:markdown id:afde35e8 tags:
%% Cell type:markdown id:afde35e8 tags:
## Platform
## Platform
According to the system specifications given in the [corresponding page in Parallel Workload Archive](https://www.cs.huji.ac.il/labs/parallel/workload/l_metacentrum2/index.html), if we exclude nodes with >16 cores, there are $\#cores_{total} = 6416$ cores on May 1st 2014.(1)
According to the system specifications given in the [corresponding page in Parallel Workload Archive](https://www.cs.huji.ac.il/labs/parallel/workload/l_metacentrum2/index.html), the platform does not change between June 1st 2014 and Nov 30th 2014 for the clusters considered in our study (nodes with at most 16 cores). There is a total of **6304 cores**.(1)
We build a platform file adapted to the remaining workload. We choose to make it homogeneous with 16-core nodes. To have a coherent number of nodes, we count:
We build a platform file adapted to the remaining workload. We see above that the second selection cut 73.7\% of core-hours from the original workload. We choose to make it homogeneous with 16-core nodes. To have a coherent number of nodes, we count:
The corresponding SimGrid platform file can be found in `platform/average_metacentrum.xml`.
The corresponding SimGrid platform file can be found in `platform/average_metacentrum.xml`.
(1) clusters decommissioned before or commissioned after May 1st 2014 have also been removed: $8+480+160+1792+256+576+88+416+108+168+752+112+588+152+160+160+192+24+224 = 6416$
(1) clusters decommissioned before or commissioned after the 6-month period have been removed: $8+480+160+1792+256+576+88+416+108+168+752+112+588+48+152+160+192+24+224 = 6304$
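
The figures quoted in this section can be re-derived with a short check. The sketch below only reuses numbers that appear in footnote (1) and in the parser output. The last quantity, the number of 16-core nodes needed to hold every core, is an illustrative value of our own and not necessarily the node count used in `platform/average_metacentrum.xml`.

```python
# Per-cluster core counts copied from footnote (1) for the 6-month period.
cluster_cores = [8, 480, 160, 1792, 256, 576, 88, 416, 108, 168,
                 752, 112, 588, 48, 152, 160, 192, 24, 224]
total_cores = sum(cluster_cores)

# Core-hours copied from the parser output of the second selection.
kept_core_hours = 4785357
dropped_core_hours = 13437365
dropped_share = dropped_core_hours / (kept_core_hours + dropped_core_hours)

# Illustrative only: number of 16-core nodes holding all cores, before any rescaling.
nodes_if_all_cores_kept = total_cores // 16

print(total_cores, round(100 * dropped_share, 1), nodes_if_all_cores_kept)
```

The sum reproduces the 6304 cores stated above, and the share of dropped core-hours rounds to the 73.7% reported by the parser.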