Opened 7 years ago
Closed 7 years ago
#28099 closed Bug (duplicate)
Reuse of UpdateQuery breaks complex delete updates
Reported by: | milosu | Owned by: | nobody |
---|---|---|---|
Component: | Database layer (models, ORM) | Version: | 1.11 |
Severity: | Normal | Keywords: | |
Cc: | Triage Stage: | Accepted | |
Has patch: | yes | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | yes |
Easy pickings: | no | UI/UX: | no |
Description
I've discovered some obscure bug, which triggers randomly and only under certain circumstances, based on the address space layout.
Try to apply attached delete_on_update_bug.py and run delete_regress multiple times.
The test case should fail with:
====================================================================== FAIL: test_delete_with_custom_handlers (delete_regress.tests.CollectorUpdateTests) ---------------------------------------------------------------------- Traceback (most recent call last): File "C:\data\django\tests\delete_regress\tests.py", line 381, in test_delete_with_custom_handlers self.assertEqual(cost_claim_in_report.threatened_id, t_org_in_report.pk) AssertionError: None != 2 ---------------------------------------------------------------------- Ran 19 tests in 0.970s
I can reproduce this bug with latest Django when I make several tweaks, that change the address space layout of the Python process, besides others:
- use psycopg2 backend while testing instead of the dummy default
- try to put pdb near the random shuffle
- add somehow additional DLLs to the Python process etc.
The issue can be fixed by applying delete_on_update_fix.diff.
The root cause is the reuse of sql.UpdateQuery across the inner for-loop, due to the way update_batch is implemented (by calling add_update_values etc.).
Caveats:
The print statement in delete_on_update_bug.diff prints the order of the items when iterating over the field_updates dictionary.
The order that breaks the test case is one of the following:
organization None set([<ThreatenedOrganization: reported threatened org>])
threatened None set([<CostClaim: cost claim>])
organization None set([<CostClaim: reported cost claim>])
i.e. update of "organization" fkey on <CostClaim: reported cost claim> must follow after the update of the "threatened" attr.
Attachments (3)
Change History (12)
by , 7 years ago
Attachment: | delete_on_update_bug.diff added |
---|
comment:1 by , 7 years ago
Summary: | Global use of UpdateQuery breaks complex delete updates → Reuse of UpdateQuery breaks complex delete updates |
---|
comment:2 by , 7 years ago
Patch needs improvement: | set |
---|
I haven't investigated this is detail but it would be nice to understand the root cause of the issue a bit more to confirm that the fix is the best one. The patch should be reworked to apply to master (which supports Python 3 only) and the test simplified as much as possible. Existing test models should be used as much as possible. If any new models are needed they should have as few fields as possible. The test should use create()
rather than get_or_create()
and I don' think the DoesNotExist
exception catching is necessary.
comment:3 by , 7 years ago
Triage Stage: | Unreviewed → Accepted |
---|
comment:4 by , 7 years ago
The root cause is the reuse of the sql.UpdateQuery(model) instance in the inner forloop.
Repeated call to add_update_values called from update_batch is essentially adding value items to self.values array of the Query object.
This can not be tested explicitly in the test case due to the way how the delete method on the collector is structured.
The best way how to deal with this bug is to prevent the reuse of UpdateQuery.
I can polish the testcase somehow per your instructions, but keep in mind that the reproducibility of the issue is hard.
This is a critical ORM bug, which is present in Django since ages, with ability to randomly corrupt data in the database.
comment:5 by , 7 years ago
Hi,
When you say a bug in Django is triggered by process memory layout, it sounds suspicious -- Python code should generally not be affected by such low-level concerns. So, it sounds more likely that Django's use triggers a bug in the underlying components which are written in C -- Python itself, Psycopg2, or the PostgreSQL client libraries. Such bugs are, indeed, more likely on Windows, where these components are less used and tested.
It may be the case that there is such a bug, and your suggested fix makes Django work around it -- which may make the fix worth while, but the upstream projects should be notified.
Or, unlikely as it seems to me, it may be that Django is actually doing something wrong here. But to show that, in my opinion, you need to show not just that something bad can happen when using Django -- you need to show where Django is violating some contract of the APIs it is using.
comment:6 by , 7 years ago
Hi, guys, I'm attaching new simplified patch. Fails against Django 1.11 on Windows in my test setup. Even against SQL Lite.
Clear enough now?
by , 7 years ago
Attachment: | delete_on_update_bug_update.diff added |
---|
comment:8 by , 7 years ago
Depends on what you mean by custom on_delete handler.
Collector.add_field_update is called from models.CASCADE - this sets None => can not trigger this bug. Same holds true for models.SET_NULL.
Than there are handlers like models.SET() and models.SET_DEFAULT => bug could be triggered, if callable with variable output is used in these handlers.
I do not have real-world use cases using these "default" handlers, but I can produce silly looking regression test cases using specially crafted field.default or models.SET callables that will also trigger assertion failure.
The bug report use case is the real-world one, anyway.
comment:9 by , 7 years ago
Resolution: | → duplicate |
---|---|
Status: | new → closed |
Closing as a duplicate of #29016 which has the same fix and simple steps to reproduce.
code to reproduce the bug