#36678 closed Bug (fixed)
Infinite retries in parallel test runner if _init_worker fails
| Reported by: | Jacob Walls | Owned by: | Jacob Walls |
|---|---|---|---|
| Component: | Testing framework | Version: | dev |
| Severity: | Release blocker | Keywords: | |
| Cc: | | Triage Stage: | Ready for checkin |
| Has patch: | yes | Needs documentation: | no |
| Needs tests: | no | Patch needs improvement: | no |
| Easy pickings: | no | UI/UX: | no |
Description
Recent GitHub Actions runs were timing out after 6 hours because the parallel test runner was retrying indefinitely.
To reproduce, raise an exception at the top of _init_worker.
The specific source of the error will be fixed in #36677, but the parallel test runner should also place a bound on retries.
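For illustration only, a bound on retries could look roughly like the sketch below. It is not the patch for this ticket and does not use the test runner's real code: `start_worker_with_limit`, `WorkerInitError`, and `max_attempts` are hypothetical names, and the worker setup is modelled as a plain callable rather than a pool initializer.

```python
class WorkerInitError(RuntimeError):
    """Raised when worker setup keeps failing (hypothetical name)."""


def start_worker_with_limit(init_worker, max_attempts=3):
    """Call init_worker until it succeeds, giving up after max_attempts.

    Returns the number of the attempt that succeeded.
    """
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            init_worker()
        except Exception as exc:
            # Remember the failure so the final error can point at its cause.
            last_error = exc
            continue
        return attempt
    # Every attempt failed: stop retrying and surface the original error.
    raise WorkerInitError(
        f"worker initialization failed {max_attempts} times"
    ) from last_error
```

With a limit like this, a setup function that always raises produces a `WorkerInitError` chained to the original exception instead of a run that never finishes.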
Change History (9)
comment:1 by , 3 weeks ago
| Summary: | Parallel test runner retries indefinitiely if _init_worker fails → Infinite retries in parallel test runner if _init_worker fails |
|---|
comment:2 by , 3 weeks ago
| Has patch: | set |
|---|
comment:3 by , 3 weeks ago
| Triage Stage: | Unreviewed → Accepted |
|---|
comment:4 by , 3 weeks ago
| Patch needs improvement: | set |
|---|
The patch needs to return to an earlier approach that tracked whether the failure originated in _init_worker, so that we avoid introducing an arbitrary time limit on tests.
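A rough sketch of the "track the origin of the failure" idea, as opposed to a time limit, is shown below. It is an assumption about the general shape of such an approach, not the code in the PR: `guarded_init`, `run_task`, and `_init_failure` are hypothetical names. The setup wrapper catches the exception and records it, and each task checks the record before running, so the failure is reported with its real cause.

```python
import traceback

# Hypothetical per-process record of a failed setup. In a process pool,
# each worker would have its own copy of this module-level variable.
_init_failure = None


def guarded_init(init_worker):
    """Run init_worker, recording the traceback instead of letting it escape."""
    global _init_failure
    try:
        init_worker()
    except Exception:
        _init_failure = traceback.format_exc()


def run_task(task):
    """Run task, unless setup failed, in which case report why."""
    if _init_failure is not None:
        raise RuntimeError("worker setup failed:\n" + _init_failure)
    return task()
```

Because the setup wrapper never raises, the worker stays alive and there is nothing to retry; the error surfaces on the first task, and no timeout has to be chosen.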
comment:5 by , 3 weeks ago
| Patch needs improvement: | unset |
|---|
comment:6 by , 3 weeks ago
| Severity: | Normal → Release blocker |
|---|
I think we need to handle this before 6.0 final: this was only a theoretical failure point before, but #36083 is going to make it a realistic source of errors.
comment:7 by , 13 days ago
| Triage Stage: | Accepted → Ready for checkin |
|---|
PR