Description
Search before creating an issue
- I have searched existing issues and confirmed this is not a duplicate
Bug Description
It looks like the FillingModeFlag can no longer be disabled.
In the Pilot code, it is set to True whenever we have a PoolCE (i.e. when multiple jobs can be executed in parallel in different processes):
Line 1042 in 10330ec:
"-o FillingModeFlag=True",
In the JobAgent, it is also set to True by default: https://github.com/DIRACGrid/DIRAC/blob/b0bc9cfb83af19215870cee0541a4e404471c53b/src/DIRAC/WorkloadManagementSystem/Agent/JobAgent.py#L71
This would be correct if we could reliably compute the CPU time left in the allocations, which is currently an open problem (see DIRACGrid/DIRAC#8416).
Steps to Reproduce
No response
Expected Behavior
Time calculations can sometimes be wrong, so I think we should be able to configure the FillingModeFlag for a given queue.
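To make the expected behavior concrete, here is a minimal sketch of a per-queue override. It is only an illustration: `resolve_filling_mode`, `pilot_option` and the `queue_options` mapping are hypothetical names, not existing Pilot or DIRAC functions, and it assumes the queue parameters are available as a plain dictionary of strings.

```python
from typing import Mapping

# Default mirrors the behavior described in this issue (flag set to True).
DEFAULT_FILLING_MODE = True


def parse_bool(value: str) -> bool:
    """Interpret a configuration string as a boolean."""
    return value.strip().lower() in {"true", "yes", "y", "1"}


def resolve_filling_mode(
    queue_options: Mapping[str, str], default: bool = DEFAULT_FILLING_MODE
) -> bool:
    """Return the queue-level FillingModeFlag if set, else the default.

    `queue_options` is a hypothetical mapping of per-queue parameters.
    """
    raw = queue_options.get("FillingModeFlag")
    if raw is None:
        return default
    return parse_bool(raw)


def pilot_option(queue_options: Mapping[str, str]) -> str:
    """Build the "-o" option string using the resolved value."""
    return f"-o FillingModeFlag={resolve_filling_mode(queue_options)}"
```

With something like this, a queue where time calculations are known to be unreliable could set `FillingModeFlag = False`, while all other queues keep the current default.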
Actual Behavior
With the current configuration, we have no way to work around time calculation issues.
Environment
No response
Relevant Log Output
Additional Context
I am wondering whether the filling mode should simply always be disabled (single-core pilots would execute 1 job and stop, multi-core pilots would fill their slots once and stop), given that we are moving towards more multi-core jobs.
- we should see what we would gain/lose from it (in LHCb, I see that pilots generally execute more than 1 job, but I also see a significant number of stalled pilots/jobs)
- filling mode could work fine if we had an accurate estimate of both the time left in the allocation and the CPU power, which is not the case now (e.g. we can't get the time left in HTCondor allocations)
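The trade-off above can be sketched as a small decision function. This is a hypothetical illustration, not the JobAgent logic: `should_request_another_job`, its parameters and the one-hour threshold are all assumptions made for the example.

```python
from typing import Optional


def should_request_another_job(
    filling_mode: bool,
    jobs_started: int,
    slots: int,
    time_left_seconds: Optional[float],
    min_time_left_seconds: float = 3600.0,  # arbitrary threshold for the sketch
) -> bool:
    """Decide whether a pilot should ask for one more job.

    - The slots are always filled once (1 slot for a single-core pilot).
    - Without filling mode, the pilot stops after that first fill.
    - With filling mode, the pilot continues only if the time left is known
      and large enough; an unknown estimate (None) is treated conservatively.
    """
    if jobs_started < slots:
        return True
    if not filling_mode:
        return False
    if time_left_seconds is None:
        return False
    return time_left_seconds >= min_time_left_seconds
```

In this sketch, an allocation where the time left cannot be determined behaves the same as a pilot with filling mode disabled, which is the simplification proposed above.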