swf2userSessions
Python script to read a workload trace in the Standard Workload Format (SWF), decompose it into user sessions, analyse the dependencies between sessions and store the results in the Session Annotated Batsim JSON format (SABjson).
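As an illustration of the input side, here is a minimal sketch of reading the SWF fields that matter for session analysis. It is not the script's own parser: the function name `parse_swf` and the dictionary keys are hypothetical. It relies only on the SWF conventions that header and comment lines start with `;` and that each job line holds whitespace-separated fields, with the job number, submit time, wait time, run time and user ID in fields 1, 2, 3, 4 and 12.

```python
def parse_swf(lines):
    """Extract a few SWF fields per job (hypothetical helper, not the script's API)."""
    jobs = []
    for line in lines:
        line = line.strip()
        # Skip blank lines and SWF header/comment lines, which start with ';'
        if not line or line.startswith(";"):
            continue
        fields = line.split()
        jobs.append({
            "job_id": int(fields[0]),    # field 1: job number
            "submit": float(fields[1]),  # field 2: submit time, in seconds
            "wait": float(fields[2]),    # field 3: wait time, in seconds
            "run": float(fields[3]),     # field 4: run time, in seconds
            "user": int(fields[11]),     # field 12: user ID
        })
    return jobs


# Made-up two-job trace, for illustration only
example = """\
; example header comment
1 0 10 100 4 -1 -1 4 200 -1 1 3 -1 -1 -1 -1 -1 -1
2 600 0 50 1 -1 -1 1 60 -1 1 3 -1 -1 -1 -1 -1 -1
""".splitlines()

jobs = parse_swf(example)
```

A job's finish time can then be derived as submit + wait + run.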
What is a session?
Analysing the workload trace of a parallel infrastructure to identify user sessions and their dependencies was introduced by Zackay and Feitelson (Zackay and Feitelson 2013). The idea is to preserve the logic of user submissions rather than the exact submission times. For example, in the image below, the workload has been split into 4 sessions following the "Arrival" delimitation approach:
- Job1 and job2 belong to the same session because their inter-arrival time is lower than a threshold value (here: 60 minutes).
- Job3, however, started a new session.
- Session4 depends on sessions 2 and 3 because its first job was submitted after the termination of all jobs in sessions 2 and 3.
- Session3 depends on session1, but not on session2.
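The rules above can be sketched in a few lines of Python. This is a simplified illustration for the jobs of a single user, not the script's actual implementation: the `Job` class and both function names are hypothetical, and the sketch assumes that a session depends on every earlier session whose jobs had all finished before the session's first submission. The script may record or prune dependencies differently.

```python
from dataclasses import dataclass


@dataclass
class Job:
    job_id: int
    submit: float  # submission time, in seconds
    finish: float  # completion time, in seconds


def split_sessions_arrival(jobs, threshold_minutes):
    """Group jobs into sessions: a job joins the current session when the
    inter-arrival time with the previous job is below the threshold."""
    threshold = threshold_minutes * 60
    sessions = []
    for job in sorted(jobs, key=lambda j: j.submit):
        if sessions and job.submit - sessions[-1][-1].submit < threshold:
            sessions[-1].append(job)
        else:
            sessions.append([job])
    return sessions


def session_dependencies(sessions):
    """Map each session number (1-based) to the earlier sessions whose jobs
    had all finished before its first job was submitted."""
    deps = {}
    for i, session in enumerate(sessions):
        start = session[0].submit
        deps[i + 1] = [
            k + 1
            for k, earlier in enumerate(sessions[:i])
            if max(j.finish for j in earlier) <= start
        ]
    return deps


# Made-up jobs for illustration; they do not reproduce workloads/example.swf
jobs = [
    Job(1, submit=0, finish=1800),
    Job(2, submit=600, finish=3000),
    Job(3, submit=7200, finish=20000),
    Job(4, submit=14400, finish=15000),
]
sessions = split_sessions_arrival(jobs, 60)
deps = session_dependencies(sessions)
```

With these made-up jobs, the first two jobs share a session, and the third session depends on the first but not on the second, which was still running when the third began.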
Usage
Requirements:
- python3
- networkx (optional, for the graph outputs):
  pip3 install networkx
- pytest (optional, for the tests):
  pip3 install pytest
To run the session decomposition on the workload workloads/example.swf illustrated above, with the "Arrival" delimitation approach and a threshold of 60 minutes:
python3 swf2userSessions.py -a 60 workloads/example.swf out/
For more documentation, see: python3 swf2userSessions.py -h
Tests
Integration tests for this script are stored in the test/ folder. To run them on your machine, type pytest at the root of the project.