swf2userSessions
Python script to read a workload trace in the Standard Workload Format (SWF), decompose it into user sessions, analyse the dependencies between sessions and store the results in the Session Annotated Batsim JSON format (SABjson).
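As a rough illustration of the first step (reading the trace), here is a minimal sketch of an SWF reader. It is not the script's actual code: the helper name read_swf and the dictionary layout are hypothetical. It relies only on the standard SWF layout, in which comment lines start with ";" and each job line holds 18 whitespace-separated fields (field 1: job number, 2: submit time, 3: wait time, 4: run time, 12: user ID).

```python
from collections import defaultdict


def read_swf(lines):
    """Parse SWF lines into per-user job lists (hypothetical helper)."""
    jobs_by_user = defaultdict(list)
    for line in lines:
        line = line.strip()
        # Skip blank lines and header comments, which start with ';'.
        if not line or line.startswith(";"):
            continue
        fields = line.split()
        job_id, submit, wait, run = (int(fields[i]) for i in range(4))
        user_id = int(fields[11])
        # Finish time is derived as submit + wait + run; it is unreliable
        # when the trace marks wait or run time as unknown (-1).
        jobs_by_user[user_id].append(
            {"job_id": job_id, "submit": submit, "finish": submit + wait + run}
        )
    return dict(jobs_by_user)


# Made-up sample lines, for illustration only.
sample = [
    "; Comment line from the trace header",
    "1 0 10 100 4 -1 -1 4 200 -1 1 7 -1 -1 -1 -1 -1 -1",
    "2 50 0 30 1 -1 -1 1 60 -1 1 8 -1 -1 -1 -1 -1 -1",
]
jobs = read_swf(sample)
```

Grouping by user at parse time is a convenience choice for this sketch, since sessions are defined per user.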
What is a session?
The analysis of workload traces from parallel infrastructures to identify user sessions and their dependencies was introduced by Zackay and Feitelson (Zackay and Feitelson 2013). The idea is to preserve the logic of user submissions rather than the exact submission times. For example, in the image below, the workload has been split into 4 sessions following the "Arrival" delimitation approach:
- Job1 and job2 belong to the same session because their inter-arrival time is lower than a threshold value (here: 60 minutes).
- Job3, however, started a new session.
- Session4 depends on sessions 2 and 3 because its first job was submitted after the termination of all jobs in sessions 2 and 3.
- Session3 depends on session1, but not on session2.
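The "Arrival" delimitation and the dependency rule described above can be sketched as follows for the jobs of a single user. This is a minimal illustration under stated assumptions, not the script's actual implementation: the Job tuple, the function names split_sessions and session_dependencies, and the sample timestamps are all hypothetical. Dropping dependencies that are already implied by another dependency is also an assumption, made so that the result matches the example (session4 depends on sessions 2 and 3 only).

```python
from typing import NamedTuple


class Job(NamedTuple):
    job_id: int
    submit: int  # submission time, in seconds
    finish: int  # completion time, in seconds


def split_sessions(jobs, threshold):
    """Group one user's jobs into sessions: a job opens a new session when
    the time since the previous submission reaches the threshold."""
    sessions = []
    previous_submit = None
    for job in sorted(jobs, key=lambda j: j.submit):
        if previous_submit is None or job.submit - previous_submit >= threshold:
            sessions.append([])
        sessions[-1].append(job)
        previous_submit = job.submit
    return sessions


def session_dependencies(sessions):
    """Return, for each session, the set of (0-based) indices of the earlier
    sessions it depends on."""
    start = [s[0].submit for s in sessions]
    finish = [max(j.finish for j in s) for s in sessions]
    # A session depends on every earlier session whose jobs had all finished
    # before its first job was submitted.
    full = [{p for p in range(i) if finish[p] <= start[i]}
            for i in range(len(sessions))]
    # Assumption: drop dependencies already implied by another dependency.
    reduced = []
    for deps in full:
        implied = set().union(*(full[q] for q in deps))
        reduced.append(deps - implied)
    return reduced


# Made-up timestamps (in seconds) chosen to mimic the example above.
jobs = [
    Job(1, 0, 1800),
    Job(2, 1200, 3000),
    Job(3, 6000, 24000),
    Job(4, 12000, 15000),
    Job(5, 30000, 31200),
]
sessions = split_sessions(jobs, threshold=60 * 60)
dependencies = session_dependencies(sessions)
# sessions group the job ids as [[1, 2], [3], [4], [5]]
# dependencies == [set(), {0}, {0}, {1, 2}]
```

With these timestamps, the third session depends on the first but not on the second, because job 3 is still running when job 4 is submitted.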
Installation
Installation with pip
Without cloning:
pip install git+https://gitlab.irit.fr/sepia-pub/mael/swf2userSessions
Clone and install:
git clone https://gitlab.irit.fr/sepia-pub/mael/swf2userSessions
cd swf2userSessions
pip install .
You can then run the app from the CLI by simply calling swf2userSessions.
You can also clone the repository and directly execute the main script: python3 swf2userSessions/swf2sessions.py
Requirements:
- python3
- networkx (optional, for the graph outputs): pip3 install networkx
- pytest (optional, for the tests): pip3 install pytest
Usage
To run the session decomposition on the workload workloads/example.swf illustrated above, with the "Arrival" delimitation approach and a threshold of 60 minutes:
swf2userSessions -a 60 workloads/example.swf out/
For more documentation, see: swf2userSessions -h
Example
For example uses of the script, have a look at these two notebooks:
- session_stats.ipynb performs the session decomposition on three traces with different threshold values and compares statistical properties such as the number of sessions or the session length.
- graph_viz.ipynb showcases the simple graph visualization tool for session graphs that is included in the script (option --graph).
Tests
Some integration tests have been written for this script and are stored in the test/ folder. To run them on your machine, run pytest at the root of the project after installing the package.