Maël Madon authored

swf2userSessions

Python script to read a workload trace in the Standard Workload Format (SWF), decompose it into user sessions, analyse the dependencies between sessions and store the results in the Session Annotated Batsim JSON format (SABjson).
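
To illustrate the input side, here is a minimal sketch of reading SWF records. It is not the script's own parser: `parse_swf` and the dictionary keys are hypothetical names chosen for this example. It relies only on the SWF convention that comment lines start with `;` and that each job line has 18 whitespace-separated fields, with job number, submit time, wait time and run time in fields 1 to 4 and the user ID in field 12. Unknown values (`-1` in SWF) are not handled here.

```python
def parse_swf(lines):
    """Return one dict per job with the fields needed to build sessions."""
    jobs = []
    for line in lines:
        line = line.strip()
        # Skip blank lines and header comments, which start with ';'
        if not line or line.startswith(";"):
            continue
        fields = line.split()
        submit, wait, run = (float(fields[i]) for i in (1, 2, 3))
        jobs.append({
            "job_id": int(fields[0]),
            "user": int(fields[11]),
            "submit": submit,
            # A job terminates after waiting in the queue and then running
            "finish": submit + wait + run,
        })
    return jobs


# Two made-up job records, for illustration only
sample = [
    "; Version: 2.2",
    "1 0 10 100 4 -1 -1 4 200 -1 1 7 -1 -1 -1 -1 -1 -1",
    "2 50 0 30 1 -1 -1 1 60 -1 1 8 -1 -1 -1 -1 -1 -1",
]
jobs = parse_swf(sample)
```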

What is a session?

The analysis of workload traces from parallel infrastructures to identify user sessions and their dependencies was introduced by Zackay and Feitelson (Zackay and Feitelson 2013). The idea is to preserve the logic of user submissions rather than the exact submission times. For example, in the image below, the workload has been split into 4 sessions following the "Arrival" delimitation approach:

diagram

  • Job1 and job2 belong to the same session because their inter-arrival time is lower than a threshold value (here: 60 minutes).
  • Job3, however, started a new session.
  • Session4 depends on sessions 2 and 3 because its first job was submitted after the termination of all jobs in sessions 2 and 3.
  • Session3 depends on session1, but not on session2.
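
The rules above can be sketched in a few lines of Python. This is an illustrative sketch, not the script's actual implementation: `Job`, `split_sessions` and `dependencies` are hypothetical names, the jobs are assumed to belong to a single user, and times are assumed to be in seconds. The sketch lists every earlier session that has fully terminated before a session starts, and makes no attempt to prune dependencies that are implied transitively.

```python
from dataclasses import dataclass


@dataclass
class Job:
    job_id: int
    submit: float  # submission time, in seconds
    finish: float  # termination time, in seconds


def split_sessions(jobs, threshold_s):
    """'Arrival' delimitation: start a new session whenever the
    inter-arrival time reaches the threshold."""
    sessions = []
    for job in sorted(jobs, key=lambda j: j.submit):
        if sessions and job.submit - sessions[-1][-1].submit < threshold_s:
            sessions[-1].append(job)
        else:
            sessions.append([job])
    return sessions


def dependencies(sessions):
    """Map each session index to the earlier sessions whose jobs had all
    terminated when the session's first job was submitted."""
    deps = {}
    for i, session in enumerate(sessions):
        start = session[0].submit
        deps[i] = [
            j for j in range(i)
            if max(job.finish for job in sessions[j]) <= start
        ]
    return deps


# Made-up jobs for illustration (not the workload shown in the diagram)
jobs = [
    Job(1, submit=0, finish=1000),
    Job(2, submit=1800, finish=3000),     # 30 min after job 1: same session
    Job(3, submit=6000, finish=20000),    # 70 min after job 2: new session
    Job(4, submit=10000, finish=11000),   # submitted while job 3 still runs
]
sessions = split_sessions(jobs, threshold_s=60 * 60)
deps = dependencies(sessions)
```

Here the third session depends on the first one only, because job 3 had not terminated when job 4 was submitted.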

Installation

Installation with pip without cloning:

pip install git+https://gitlab.irit.fr/sepia-pub/mael/swf2userSessions

Clone and install:

git clone https://gitlab.irit.fr/sepia-pub/mael/swf2userSessions
cd swf2userSessions
pip install .

You can then run the app from the command line by calling swf2userSessions.

You can also clone the repository and directly execute the main script python3 swf2userSessions/swf2sessions.py.

Requirements:

  • python3
  • networkx (optional, for the graph outputs): pip3 install networkx
  • pytest (optional, for the tests): pip3 install pytest

Usage

To run the session decomposition on the workload workloads/example.swf illustrated above, with the "Arrival" delimitation approach and a threshold of 60 minutes:

swf2userSessions -a 60 workloads/example.swf out/

For more documentation, see: swf2userSessions -h

Example

For example uses of the script, have a look at these two notebooks:

  • session_stats.ipynb performs the session decomposition on three traces with different threshold values and compares statistical properties such as the number of sessions or the session length
  • graph_viz.ipynb showcases the simple graph visualization tool for session graphs that is included in the script (option --graph)

Tests

Some integration tests have been written for this script, and are stored in the test/ folder. To run them on your machine, just type pytest at the root of the project after having installed the package.