Batmen

Batmen is a plugin for the scientific simulator Batsim. It makes it possible to simulate the users of large-scale distributed systems. Batmen started as a fork of batsched.

How does it work?

[Figure: Batmen diagram]

A simulation with Batsim consists of two processes:

  • Batsim itself, in charge of reading an input workload and simulating the platform (job arrival, job termination, energy consumed...)
  • a scheduler, in charge of making the scheduling decisions. The scheduler communicates with Batsim over a ZeroMQ socket, using the messages defined in the Batsim protocol.
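
To picture the exchange between the two processes, here is a minimal in-process sketch in Python. It involves no ZeroMQ socket at all: the functions `toy_scheduler` and `toy_simulator` are hypothetical, and the message fields (`now`, `events`, `JOB_SUBMITTED`, `EXECUTE_JOB`, `alloc`) only imitate the JSON style of the Batsim protocol in a simplified form, so check them against the protocol documentation before relying on them.

```python
import json


def toy_scheduler(request: str) -> str:
    """Hypothetical stand-in for the scheduler process.

    It answers each message by starting every submitted job
    immediately on machine 0.
    """
    message = json.loads(request)
    decisions = []
    for event in message["events"]:
        if event["type"] == "JOB_SUBMITTED":
            decisions.append({
                "timestamp": message["now"],
                "type": "EXECUTE_JOB",
                "data": {"job_id": event["data"]["job_id"], "alloc": "0"},
            })
    return json.dumps({"now": message["now"], "events": decisions})


def toy_simulator(scheduler) -> list:
    """Hypothetical stand-in for the simulator: send events, collect replies."""
    arrivals = [(0.0, "w0!1"), (5.0, "w0!2")]  # (submission time, job id)
    started = []
    for now, job_id in arrivals:
        request = json.dumps({
            "now": now,
            "events": [{"timestamp": now, "type": "JOB_SUBMITTED",
                        "data": {"job_id": job_id}}],
        })
        reply = json.loads(scheduler(request))  # one request, one reply
        started += [e["data"]["job_id"] for e in reply["events"]]
    return started


print(toy_simulator(toy_scheduler))
```

In the real setup, the function call is replaced by a message sent over the socket, and the simulator only advances simulated time once the scheduler has replied.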

Batmen adds a layer on top of this: the simulation of users.

User simulation

The interaction with users happens through a "broker". The broker manages a pool of simulated users and dynamically registers the new jobs they submit. These jobs come on top of the input workload read by Batsim.

Implementation, in the folder src/users:

  • class Broker: implements the broker, submitting the jobs on behalf of the users
  • class DynScheduler: superclass managing the interaction with the broker from the scheduler
  • class User: an individual user, submitting her jobs to the broker
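
As a rough picture of how these classes relate, here is a minimal Python sketch (Batmen itself is not written in Python). All names and signatures (`SketchUser`, `SketchBroker`, `jobs_to_submit`, `collect`) are hypothetical and do not mirror the actual code in src/users. The only point is the direction of the calls: the scheduler side asks the broker, and the broker asks each user in its pool.

```python
from dataclasses import dataclass, field


@dataclass
class SketchUser:
    """Hypothetical user submitting one job every `period` seconds."""
    name: str
    period: float
    next_date: float = 0.0
    submitted: int = 0

    def jobs_to_submit(self, now: float) -> list:
        jobs = []
        # Catch up on every submission date reached since the last call.
        while self.next_date <= now:
            jobs.append(f"{self.name}!{self.submitted}")
            self.submitted += 1
            self.next_date += self.period
        return jobs


@dataclass
class SketchBroker:
    """Hypothetical broker: gathers the jobs of every user in its pool."""
    users: list = field(default_factory=list)

    def collect(self, now: float) -> list:
        jobs = []
        for user in self.users:
            jobs.extend(user.jobs_to_submit(now))
        return jobs


broker = SketchBroker([SketchUser("alice", 10.0), SketchUser("bob", 25.0)])
# The scheduler side would call the broker whenever simulated time advances.
for now in (0.0, 20.0):
    print(now, broker.collect(now))
```

Keeping the users behind a single broker means the scheduler never needs to know how many users exist or how each of them decides what to submit.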

Types of user

For now, we have developed three types of users.

  • modeled users (class User): using a model to generate the jobs to submit
  • replay users (class ReplayUser): replaying a workload trace given as input
  • feedback users (class FeedbackUser): taking feedback on the status of previous jobs into account when submitting the next one
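
The difference between the three types can be sketched as follows. This is an illustration in Python under our own assumptions, not the code of the classes above: the class names, the `submit` signature, the status strings `RUNNING` and `KILLED`, and the reaction of the feedback user (doubling the requested walltime after a killed job) are all hypothetical.

```python
import random


class ModeledUserSketch:
    """Generates submission dates from a model (here: exponential gaps)."""

    def __init__(self, mean_gap: float, seed: int = 0):
        self.rng = random.Random(seed)
        self.mean_gap = mean_gap
        self.next_date = 0.0

    def submit(self, now, last_status=None):
        if now < self.next_date:
            return None
        self.next_date = now + self.rng.expovariate(1.0 / self.mean_gap)
        return {"submit": now, "walltime": 60.0}


class ReplayUserSketch:
    """Replays a trace of (submission date, walltime) pairs as is."""

    def __init__(self, trace):
        self.trace = sorted(trace)

    def submit(self, now, last_status=None):
        if self.trace and self.trace[0][0] <= now:
            date, walltime = self.trace.pop(0)
            return {"submit": date, "walltime": walltime}
        return None


class FeedbackUserSketch:
    """Adapts the next job to the status of the previous one."""

    def __init__(self, walltime: float):
        self.walltime = walltime

    def submit(self, now, last_status=None):
        if last_status == "RUNNING":
            return None  # wait for the previous job to end
        if last_status == "KILLED":
            self.walltime *= 2  # previous job ran out of time: ask for more
        return {"submit": now, "walltime": self.walltime}


replay = ReplayUserSketch([(5.0, 30.0), (0.0, 10.0)])
print(replay.submit(0.0), replay.submit(1.0), replay.submit(6.0))

feedback = FeedbackUserSketch(walltime=100.0)
print(feedback.submit(0.0), feedback.submit(50.0, "RUNNING"),
      feedback.submit(100.0, "KILLED"))
```

Only the feedback user looks at `last_status`; the modeled and replay users decide from the current date alone.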

Available schedulers

Dynamic submission is supported by every scheduler that inherits from the class DynScheduler, i.e., all the schedulers located in the folder src/scheds. Some of them support multicore machines, while others only support monocore machines.

  • monocore schedulers:
    • fcfs: jobs scheduled in their order of arrival ("first come first served")
    • easy-bf: jobs scheduled in fcfs order, with the possibility to execute ("backfill") small jobs that were submitted later, as long as they do not postpone the starting time of the next job
  • multicore schedulers:
    • bin_packing: the scheduler packs the jobs onto the smallest possible number of multicore machines. Each job runs on a single host but can use several cores
    • bin_packing_energy: same as bin_packing, but saves power by switching off the machines as soon as they are idle and powering them on again when needed
    • multicore_filler: for test purposes. Schedules only one job at a time on only one multicore host.
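
To make the backfilling rule of easy-bf concrete, here is a minimal Python sketch of one scheduling decision on monocore machines. It is a textbook-style version of EASY backfilling written for illustration, not the code found in src/scheds; the `Job` class and the `easy_backfill` function are hypothetical names, and walltimes are assumed to be reliable upper bounds on run times.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    machines: int    # number of machines requested
    walltime: float  # upper bound on the run time


def easy_backfill(now, free, running, queue):
    """Return the names of the jobs to start at time `now`.

    free:    number of idle machines
    running: list of (end_time, machines) for the jobs already running
    queue:   waiting jobs in arrival order (left unmodified)
    """
    running = list(running)
    queue = list(queue)
    started = []

    # Plain FCFS part: start jobs from the head of the queue while they fit.
    while queue and queue[0].machines <= free:
        job = queue.pop(0)
        free -= job.machines
        running.append((now + job.walltime, job.machines))
        started.append(job.name)
    if not queue:
        return started

    # The head job is blocked: find when enough machines will be free for it
    # ("shadow time") and how many machines will be left over at that time.
    head, available = queue[0], free
    shadow, extra = float("inf"), 0
    for end_time, machines in sorted(running):
        available += machines
        if available >= head.machines:
            shadow, extra = end_time, available - head.machines
            break

    # Backfill later jobs only if they cannot delay the head job.
    for job in queue[1:]:
        if job.machines > free:
            continue
        ends_in_time = now + job.walltime <= shadow
        if ends_in_time or job.machines <= extra:
            free -= job.machines
            if not ends_in_time:
                extra -= job.machines
            started.append(job.name)
    return started


queue = [Job("big", 2, 10.0), Job("long", 1, 50.0), Job("short", 1, 5.0)]
# 2 machines in total, one busy until t=8: "big" must wait until then.
print(easy_backfill(now=0.0, free=1, running=[(8.0, 1)], queue=queue))
```

In this example, fcfs would start nothing because "big" blocks the queue. easy-bf starts "short", which ends before "big" can start anyway, but not "long", which would still hold a machine at t=8.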

For guidance on how to add a new dynamic scheduler, refer to this readme.

Install

With the Nix package manager, at the root of the project:

nix-shell -A batmen # enter a Nix environment with all dependencies managed
meson build # generate a build directory
ninja -C build # compile the project into the build directory

Manually: TBD...

Run a small instance

First follow the installation steps above. The resulting executable is build/batmen. We assume that Batsim is also installed on your machine (follow the installation guide here).

To run a simulation, we will need two terminals:

  • one (Batmen) to simulate the users and the scheduler, and
  • one (Batsim) to simulate the underlying infrastructure

Create the output directory:

mkdir -p out/first_simu

Launch Batmen with:

  • the scheduler FCFS
  • one simulated user (described in the user description file test/schedconf/routine_greedy_simple.json)

build/batmen -v fcfs \
    --variant_options_filepath test/schedconf/routine_greedy_simple.json

Launch Batsim with:

  • the platform file test/platforms/monocore/2machines.xml
  • an empty workload file test/workloads/static/empty.json (jobs are submitted dynamically by the simulated user)
  • the output directory
  • all the options required for Batmen to work properly (see their description in Batsim CLI)

path/to/batsim -p test/platforms/monocore/2machines.xml \
    -w test/workloads/static/empty.json \
    -e out/first_simu/ \
    --energy --enable-compute-sharing \
    --enable-dynamic-jobs --acknowledge-dynamic-jobs \
    --enable-profile-reuse
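
If you prefer a single terminal, the two commands above can be wrapped in a small launcher. The following Python sketch is a convenience we suggest, not a script shipped with Batmen: the function names are hypothetical, the command lines are copied from the steps above, and we assume that Batmen exits on its own once the simulation is over (hence the timeout as a safety net).

```python
import subprocess
from pathlib import Path


def batmen_cmd(user_file: str) -> list:
    """Batmen command line from the step above (FCFS scheduler, verbose)."""
    return ["build/batmen", "-v", "fcfs",
            "--variant_options_filepath", user_file]


def batsim_cmd(batsim: str, out_dir: str) -> list:
    """Batsim command line from the step above."""
    return [batsim,
            "-p", "test/platforms/monocore/2machines.xml",
            "-w", "test/workloads/static/empty.json",
            "-e", out_dir,
            "--energy", "--enable-compute-sharing",
            "--enable-dynamic-jobs", "--acknowledge-dynamic-jobs",
            "--enable-profile-reuse"]


def run(batsim: str, out_dir: str = "out/first_simu/") -> int:
    """Start both processes, wait for them, and return Batsim's exit code."""
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    user_file = "test/schedconf/routine_greedy_simple.json"
    # Batmen runs in the background, Batsim in the foreground.
    batmen = subprocess.Popen(batmen_cmd(user_file))
    try:
        status = subprocess.run(batsim_cmd(batsim, out_dir)).returncode
        if status == 0:
            batmen.wait(timeout=60)  # assumption: Batmen stops by itself
    finally:
        if batmen.poll() is None:
            batmen.terminate()  # e.g. Batsim failed before connecting
    return status
```

Call `run("path/to/batsim")` from the root of the project, replacing the argument with the actual location of your Batsim executable.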

Once the simulation has finished, the Batsim output files are available in the output directory (out/first_simu/).

Run the tests

Batmen has a set of integration tests, located in the test directory. To run all the tests:

nix-shell -A test --command "pytest"

For more information about the tests, read the test README