Opened 8 years ago

Closed 8 years ago

#27301 closed Cleanup/optimization (fixed)

Better handling of errors that are not pickleable when testing in parallel

Reported by: Adam Wróbel Owned by: Tim Graham <timograham@…>
Component: Testing framework Version: 1.10
Severity: Normal Keywords: testrunner parallel pickle
Cc: Aymeric Augustin, chris.jerdonek@… Triage Stage: Ready for checkin
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

When running test in parallel the tests results are communicated between processes in serialized form. Unfortunately not all test errors are serializable using pickle module.

Django currently detects errors that fail pickle.dumps, reports them to STDERR and raises exception that interrupts test runner.

This ticket has two parts:

The first is a minor enhancement. In addition to detecting pickle.dumps errors also detect pickle.loads errors. I've implemented it in a Github PR #7325.

The second is a proposal to instead of raising pickling exception that interrupts tests and displays useless multiprocessing.pool.RemoteTraceback wrap the error type and message in some kind of UnpickleableError container and pass it to parent process with the usual addError/addFailure/addSubTest/addExpectedFailure methods. This will let the test suite continue, display final error statistics and perform teardown. This is especially important for addExpectedFailure which currently stops the entire suite if the expected error is unpickleable (my actual use case).

Change History (13)

comment:1 by Tim Graham, 8 years ago

Owner: Tim Graham removed
Status: assignednew
Type: UncategorizedCleanup/optimization

comment:2 by Tim Graham, 8 years ago

Triage Stage: UnreviewedAccepted

comment:3 by Chris Jerdonek, 8 years ago

Cc: chris.jerdonek@… added

comment:4 by Chris Jerdonek, 8 years ago

In the PR, you say--

One example of error like that is botocore.exceptions.ClientError which pickles as "ClientError(message)", but its initializer is actually ClientError(response, operation_name).

In addition to creating this ticket, have you also tried reporting this to the botocore project?

This ticket has two parts:

This sounds like it should be taken care of by two separate PR's (and indeed you created one PR to solve the first part). Does this mean you should have two separate tickets? This way the two issues can be discussed in isolation without confusing the two.

in reply to:  4 comment:5 by Adam Wróbel, 8 years ago

Replying to Chris Jerdonek:

In the PR, you say--

One example of error like that is botocore.exceptions.ClientError which pickles as "ClientError(message)", but its initializer is actually ClientError(response, operation_name).

In addition to creating this ticket, have you also tried reporting this to the botocore project?

I've started writing a PR for botocore, but consider it a lower priority: we can try to fix thousands of unpickleable errors, but with a robust test runner it might not matter at all.

This ticket has two parts:

This sounds like it should be taken care of by two separate PR's (and indeed you created one PR to solve the first part). Does this mean you should have two separate tickets? This way the two issues can be discussed in isolation without confusing the two.

Indeed I could try to write a second PR if the second part of this ticket was accepted.

As for two tickets, I was really hoping to have that minor adjustment from first PR merged without any ticket at all. That one-direction-only check seems like a trivial omission. I think @aaugustin would agree.

comment:6 by Chris Jerdonek, 8 years ago

I've started writing a PR for botocore, but consider it a lower priority:

Can you post a link to the issue?

we can try to fix thousands of unpickleable errors, but with a robust test runner it might not matter at all.


Thousands seems like an exaggeration to me. A dozen seems like a more reasonable upper bound.

In any case, if a particular exception isn't picklable, it seems like that should be reported as a bug each time we notice it -- regardless of the state of Django's test runner, and regardless of the number of instances we notice. Also, as long as Django relies on the picklability of exceptions for the nicest form of error messages, it seems we should want those exceptions to be picklable.

Version 0, edited 8 years ago by Chris Jerdonek (next)

in reply to:  6 comment:7 by Adam Wróbel, 8 years ago

Replying to Chris Jerdonek:

I've started writing a PR for botocore, but consider it a lower priority:

Can you post a link to the issue?

Well, I said I started writing PR, not that I posted it. But hey, here's me hoping to please you:

https://github.com/boto/botocore/pull/1044

In either case botocore issue shouldn't in any way delay our progress here. They might fix it or not, but we should be able to handle problems like theirs.

comment:8 by Adam Wróbel, 8 years ago

I am beginning to think that making botocore errors picklable is a lost cause. They support python 2.6 and 2.7 which is even worse at pickling (my current botocore PR only passes on 3.X). They also inherit exceptions from requests package , similarly rewriting the __init__ signature and affecting picklability (I've accidentally omitted that one error class in my PR, but now I don't know how to tackle it).

comment:9 by Chris Jerdonek, 8 years ago

Thanks. Re: botocore, it would seem reasonable to me for them to accept a PR that fixes it only for 3.x and/or future versions of botocore if fixing it for 2.x is too difficult and/or would break backwards compatibility.

Also, on a related note, if it turns out that picklability of exceptions is too much to expect in general, then I think we should consider not having Django's test runner error out if the exception isn't picklable. But that can be handled in the "part 2" ticket when that is created.

comment:10 by Tim Graham, 8 years ago

Has patch: set
Triage Stage: AcceptedReady for checkin

PR looks good to me. Wouldn't mind a +1 from someone else.

comment:11 by Chris Jerdonek, 8 years ago

I can take a look, but it would be later today.

comment:12 by Chris Jerdonek, 8 years ago

I added three comments to the PR.

comment:13 by Tim Graham <timograham@…>, 8 years ago

Owner: set to Tim Graham <timograham@…>
Resolution: fixed
Status: newclosed

In 52188a5:

Fixed #27301 -- Prevented exceptions that fail unpickling from crashing the parallel test runner.

Note: See TracTickets for help on using tickets.
Back to Top