Overview
Request 1110259 accepted
- Updated to 23.02.4 with the following changes:
* Bug Fixes:
+ Fix main scheduler loop not starting after a failover to backup
controller. Avoid slurmctld segfault when specifying
`AccountingStorageExternalHost` (bsc#1214983).
+ Fix sbatch return code when `--wait` is requested on a job array.
+ Fix collected `GPUUtilization` values for `acct_gather_profile` plugins.
+ Fix `slurmrestd` handling of job hold/release operations.
+ Fix step running indefinitely when slurmctld takes more than
`MessageTimeout` to respond. Now, `slurmctld` will cancel the step when
detected, preventing following steps from getting stuck waiting for
resources to be released.
+ Fix regression to make `job_desc.min_cpus` accurate again in `job_submit`
when requesting a job with `--ntasks-per-node`.
+ Fix handling of `ArrayTaskThrottle` in backfill.
+ Fix regression in 23.02.2 when checking gres state on `slurmctld`
startup or reconfigure. Gres changes in the configuration were not
updated on slurmctld startup. On startup or reconfigure, these messages
were present in the log: `"error: Attempt to change gres/gpu Count"`.
+ Fix potential double count of gres when dealing with limits.
+ Fix `slurmstepd` segfault when `ContainerPath` is not set in `oci.conf`.
+ Fix an issue where jobs requesting licenses were incorrectly rejected.
+ `scrontab` - Fix cutting off the final character of quoted variables.
+ `smail` - Fix issues where e-mails at job completion were not being sent.
+ `scontrol/slurmctld` - Fix comma parsing when updating a reservation's
nodes.
+ Fix `--gpu-bind=single` binding tasks to wrong gpus, leading to some gpus
having more tasks than they should and other gpus being unused.
+ Fix regression in 23.02 that causes slurmstepd to crash when `srun`
requests more than `TreeWidth` nodes in a step and uses the pmi2 or
Request History
eeich created request
anag+factory added as a reviewer
Being evaluated by staging project "openSUSE:Factory:Staging:adi:31"
anag+factory accepted review
Picked "openSUSE:Factory:Staging:adi:31"
factory-auto added opensuse-review-team as a reviewer
Please review sources
factory-auto accepted review
skipping the staging process since only .changes modifications
factory-auto accepted review
Check script succeeded
licensedigger accepted review
ok
darix accepted review
Accepted review for by_group opensuse-review-team request 1110259 from user factory-auto
anag+factory accepted review
Staging Project openSUSE:Factory:Staging:adi:31 got accepted.
anag+factory approved review
Staging Project openSUSE:Factory:Staging:adi:31 got accepted.
anag+factory accepted request
Staging Project openSUSE:Factory:Staging:adi:31 got accepted.