Opened 5 years ago

Closed 2 years ago

Last modified 2 years ago

#13724 closed Bug (fixed)

Calling QuerySet.delete() through a relation the DB router is ignored.

Reported by: Chris Targett <chris@…> Owned by: paulcollins
Component: Database layer (models, ORM) Version: 1.3
Severity: Normal Keywords: multidb
Cc: paul.collins.iii@… Triage Stage: Accepted
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: yes
Easy pickings: no UI/UX: no

Description

I'm using django-multidb-router from here: http://github.com/jbalogh/django-multidb-router With two database definitions, a read_only_user and a read_write_user (with the intentions of having multiple read-only definitions).

When calling QuerySet.delete() through a relation query is sent to the read_only user. Upon inspection it appears MasterSlaveRouter.allow_relation() isn't even called.

Example:

from django.db import models

class ModelA(models.Model):
    value = models.IntegerField(default=0)

class ModelB(models.Model):
    value = models.IntegerField(default=0)
    a = models.ManyToManyField(ModelA)


>>> a = ModelA()
>>> a.save()
>>> b = ModelB()
>>> b.a.add(a)
>>> b.a.all()
[<ModelA: ModelA object>]
>>> b.a.all().delete()
Traceback (most recent call last):
  File "<console>", line 1, in ?
  File "/home/ctargett/d/proj/code/django/db/models/query.py", line 445, in delete
    delete_objects(seen_objs, del_query.db)
  File "/home/ctargett/d/proj/code/django/db/models/query.py", line 1335, in delete_objects
    del_query.delete_batch(pk_list, using=using)
  File "/home/ctargett/d/proj/code/django/db/models/sql/subqueries.py", line 41, in delete_batch
    self.do_query(self.model._meta.db_table, where, using=using)
  File "/home/ctargett/d/proj/code/django/db/models/sql/subqueries.py", line 27, in do_query
    self.get_compiler(using).execute_sql(None)
  File "/home/ctargett/d/proj/code/django/db/models/sql/compiler.py", line 727, in execute_sql
    cursor.execute(sql, params)
  File "/home/ctargett/d/proj/code/django/db/backends/util.py", line 15, in execute
    return self.cursor.execute(sql, params)
  File "/home/ctargett/d/proj/code/django/db/backends/mysql/base.py", line 86, in execute
    return self.cursor.execute(query, args)
  File "/usr/lib64/python2.4/site-packages/MySQLdb/cursors.py", line 166, in execute
    self.errorhandler(self, exc, value)
  File "/usr/lib64/python2.4/site-packages/MySQLdb/connections.py", line 35, in defaulterrorhandler
    raise errorclass, errorvalue
OperationalError: (1142, "DELETE command denied to user 'read_user'@'localhost' for table 'testapp_modelb_a'")

Attachments (2)

13724.diff (887 bytes) - added by barnardo 2 years ago.
13724.routing_test.diff (962 bytes) - added by paulcollins 2 years ago.
Test case showing the problem

Download all attachments as: .zip

Change History (15)

comment:1 Changed 5 years ago by mariarchi

  • Needs documentation unset
  • Needs tests unset
  • Patch needs improvement unset
  • Resolution set to invalid
  • Status changed from new to closed

I'm pretty sure it belongs to django-multidb-router ticket tracker. Please reopen if I'm wrong.

comment:2 Changed 2 years ago by paulcollins

  • Cc paul.collins.iii@… added
  • Easy pickings unset
  • Resolution invalid deleted
  • Severity set to Normal
  • Status changed from closed to new
  • Type set to Bug
  • UI/UX unset
  • Version changed from 1.2 to 1.3

So we're not using the django-multidb-router, and we're seeing this kind of odd behavior as well. Allow_relation is not being called, which makes sense but what doesn't make sense is
.all().delete()
The Router is being called for the .all() (db_for_read), but then the QuerySet is being changed to a DELETE. Since db_for_write is never called, and our router automatically picks a read-only slave for read-only queries, thus the expected explosion happens.

Last edited 2 years ago by paulcollins (previous) (diff)

comment:3 Changed 2 years ago by timo

@paulcollins, could you provide a test for Django's test suite that demonstrates the problem?

Changed 2 years ago by barnardo

comment:4 Changed 2 years ago by barnardo

Attached is a test case for 1.3.7. I was not able to duplicate. Please verify.

Changed 2 years ago by paulcollins

Test case showing the problem

comment:5 Changed 2 years ago by paulcollins

  • Has patch set
  • Owner changed from nobody to paulcollins
  • Status changed from new to assigned
  • Triage Stage changed from Unreviewed to Accepted

PR created with unit tests showing several cases where this bug manifests
https://github.com/django/django/pull/1597

Marking ticket as accepted per conversations with Jacob Kaplan-Moss and Russell Keith-Magee

comment:6 Changed 2 years ago by paulcollins

  • Triage Stage changed from Accepted to Ready for checkin

yes badform I know but the code is in a good state per reviews with Russell in person. The docs I think are okay but a second set of eyes is always appreciated. Flagging this as ready for checkin just to get a core devs attention =)

comment:7 Changed 2 years ago by russellm

To back up @paulcollins here -- I'm completely OK with this being bumped to RFC. There's still a need for a final review, but I've been reviewing the patch as it was being developed, and I'm happy with the direction it's been heading.

comment:8 Changed 2 years ago by akaariai

  • Patch needs improvement set
  • Triage Stage changed from Ready for checkin to Accepted

I added a question to the PR about how db_for_write should work in related manager methods. For example in someobj.related_set.add() (fields/related.py:L418 in current master), should there be one call to db_for_write, and then all objects are saved to that DB? This is how m2m field's add works for example. And, if so, should this be tested. As is, each time save() is called it will individually call db_for_write() and this could lead splitting the save into different databases.

Maybe the related manager write routing doesn't need to be addressed in this ticket, and maybe I just don't understand this issue fully (I don't know multidb routing that well). Still, I will mark "patch needs improvement" until this question is answered.

comment:9 Changed 2 years ago by paulcollins

@akaariai

That's a good point, the save could end up splitting it across multiple databases. Depending on how complex the router is maybe that's desired?

In any this case, the point of this ticket was the related manager write routing. For example
A.objects.get(pk=something).related_set.filter(name__in=[str(range(12))]).delete()
A ... get is fine.
related_set.filter says get db_for_read
.delete should call db_for_write, but since a db has already been set on the QuerySet object it doesn't. At that point it's locked into trying to send the delete to the read database. In the case of sharding db_for_read and db_for_write are probably the same thing, but in the case of master / slave (which is where I ran into this issue) they're not. In a master slave setup, trying to do a delete on db_for_read will (one hopes) fail.

In the case of things being split up because of multiple calls to get db_for_write I believe the hint object is passed down the chain so the implementer of the router could make a decision based on that hint. I'll add a check in the tests for the hint and report back.

comment:10 Changed 2 years ago by Russell Keith-Magee <russell@…>

  • Resolution set to fixed
  • Status changed from assigned to closed

In 9595183d03cfd0d94ae2dd506a3d2b86cf5c74a7:

Fixed #13724: Corrected routing of write queries involving managers.

Previously, if a database request spanned a related object manager, the
first manager encountered would cause a request to the router, and this
would bind all subsequent queries to the same database returned by the
router. Unfortunately, the first router query would be performed using
a read request to the router, resulting in bad routing information being
used if the subsequent query was actually a write.

This change defers the call to the router until the final query is acutally
made.

It includes a small *BACKWARDS INCOMPATIBILITY* on an edge case - see the
release notes for details.

Thanks to Paul Collins (@paulcollinsiii) for the excellent debugging
work and patch.

comment:11 Changed 2 years ago by russellm

@akaariai - FYI - I implemented your suggested change in the test suite; the router tests now validate that the hints and model passed to the router are valid.

comment:12 Changed 2 years ago by Tim Graham <timograham@…>

In f93896344abfdfd288e0d8249b37e2b799fd7d53:

Fixed Python 3.2 syntax errors; refs #13724.

comment:13 Changed 2 years ago by Tim Graham <timograham@…>

In 4745ea1d27a154a44ed2cacb66b01d5de1f9e3d3:

Added hints argument to GeoQuerySet; refs #13724.

Note: See TracTickets for help on using tickets.
Back to Top