Stop sending CALL_ME_LATER when workload too big
I tried to run batmen with a large input and got an error from batsim:
Line 1269232 of batsim.log:
[70055051.000466] ../src/kernel/EngineImpl.cpp:718: [ker_engine/CRITICAL] Oops! Deadlock detected, some activities are still around but will never complete. This usually happens when the user code is not perfectly clean.
Looking at sched.err.log, I notice that the last CALL_ME_LATER message is sent at line 666377 of 713393, which is weird...
Experiment command file: robinfile.yaml
TODO:
- run with --debug? (problem: this will make the log file explode, it is already 221M)
- try to isolate a MWE with a user sending a million CALL_ME_LATER...
- use valgrind to see if memory management is OK
[Edit] The problem seems to come from fb_user_think_time_only. We can try:
- rounding the dates in the CALL_ME_LATER up to the nearest second,
- isolating a MWE with fb_user_think_time_only and a SABjson of 500k jobs.
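To illustrate the rounding idea, here is a minimal sketch in Python. It is not batmen code: the function names `round_up_to_second` and `coalesce_wakeups` are hypothetical, and it assumes that many users request wake-ups at fractional dates that fall within the same second, so that rounding up lets several requests share one CALL_ME_LATER.

```python
import math


def round_up_to_second(date: float) -> int:
    """Round a simulation date (in seconds) up to the next whole second."""
    return math.ceil(date)


def coalesce_wakeups(dates):
    """Round every requested date up and keep one wake-up per distinct second.

    Hypothetical helper: the real scheduler may track requests differently.
    """
    return sorted({round_up_to_second(d) for d in dates})


# Four requested wake-ups collapse into two CALL_ME_LATER dates.
requested = [10.2, 10.7, 11.0, 12.000466]
wakeups = coalesce_wakeups(requested)
print(wakeups)
```

Rounding up rather than to the nearest value means a user is never woken before the date it asked for, at the cost of a delay of less than one second.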