I've recently started seeing a new failure mode.
In mldsa-native/mlkem-native, we start 4 relatively fast jobs on the RISE runners for each PR. Recently (maybe since yesterday), I often see 3 of the 4 jobs picked up very quickly, while the fourth is not picked up for a long time, even after the other 3 have finished. Cancelling and restarting the job sometimes helps.
For example, here the 4th job wasn't picked up for almost 2 hours, until I cancelled and restarted it. The restarted job has still not been picked up after 30 minutes, even though, according to the dashboard, you don't appear to be under heavy load right now.
The dashboard shows the job as pending and a corresponding worker as "active":
