#32433 closed Bug (fixed)
Delete distinct produces an unexpected and potentially harmful SQL
| Reported by: | egism | Owned by: | egism |
|---|---|---|---|
| Component: | Database layer (models, ORM) | Version: | 3.1 |
| Severity: | Normal | Keywords: | orm, delete, distinct |
| Cc: | Triage Stage: | Ready for checkin | |
| Has patch: | yes | Needs documentation: | no |
| Needs tests: | no | Patch needs improvement: | no |
| Easy pickings: | no | UI/UX: | no |
Description (last modified by )
I was looking for a way to delete the first Comment of each Post (a sample domain). Since I know that every new Post starts with a system generated comment I decided to go with:
Comment.objects.order_by('post_id', 'created_at').distinct('post_id').delete()
Before proceeding I tested it with:
Comment.objects.order_by('post_id', 'created_at').distinct('post_id').count()
Made sure the result actually contains what I needed and proceeded with the delete(). The result was rather surprising. I was notified that the whole table was wiped clean. I then checked the actual SQL that was executed and it was a simple DELETE FROM comments;.
As an ORM user, I would say it is the worst outcome possible and I would at least expect an error in such a case or ideally a SQL of what I was trying to achieve. At the same time, count and delete produces an inconsistent result which is even more mistaking.
Potential solutions:
- raise an error with a decent explanation
- produce a desired SQL according to the query
Since I have never submitted a change to Django, I have a very limited knowledge of the ORM and its intricacies. I could give it a try and issue a patch for this with some guidance.
Change History (10)
comment:1 by , 5 years ago
| Description: | modified (diff) |
|---|
follow-up: 3 comment:2 by , 5 years ago
| Keywords: | postgresql removed |
|---|---|
| Triage Stage: | Unreviewed → Accepted |
| Type: | Uncategorized → Bug |
comment:3 by , 5 years ago
Hi Simon,
Sounds very simple, I will give it a go. Will be a good opportunity to make my first contribution. Thanks for directions.
comment:4 by , 5 years ago
| Owner: | changed from to |
|---|---|
| Status: | new → assigned |
comment:6 by , 5 years ago
| Has patch: | set |
|---|
comment:7 by , 5 years ago
| Triage Stage: | Accepted → Ready for checkin |
|---|
I think we should start by failing loudly when
distinct().delete()is used as adding support for it would require a subquery pushdown (DELETE FROM table WHERE field IN (SELECT DISTINCT field FROM table) which would require a non negligible amount of work and additional complexity for the rare cases where this is useful and a manual__insubquery lookup can be used to achieve the same results e.g.egism if you want to work on a patch you should raise a
TypeErrorinQuerySet.deleteifself.query.distinct or self.query.distinct_fields. A regression test can then be added todelete_regress/tests(source).