Opened 4 years ago

Closed 3 years ago

#17459 closed Bug (wontfix)

Subquery fails across multiple databases

Reported by: dgouldin Owned by: nobody
Component: Database layer (models, ORM) Version: 1.3
Severity: Normal Keywords:
Cc: dgouldin@…, lrekucki@… Triage Stage: Design decision needed
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no


Given the following databases:

    'default': {
        'ENGINE': 'django.db.backends.sqlite3',
        'NAME': 'db1',
    'db2': {
        'ENGINE': 'django.db.backends.sqlite3',
        'NAME': 'db2',

database router:

class FooBarRouter(object):
    def db_for_read(self, model, **hints):
        if model._meta.app_label == 'bar':
            return 'db2'
        return None

    def db_for_write(self, model, **hints):
        if model._meta.app_label == 'bar':
            return 'db2'
        return None

    def allow_syncdb(self, db, model):
        if model._meta.app_label == 'bar':
            return db == 'db2'
            return db == 'default'

and models:

class FooModel(models.Model):
    name = models.CharField(max_length=100)
class BarModel(models.Model):
    name = models.CharField(max_length=100)
    foo_id = models.IntegerField()

The following will produce "DatabaseError: no such table: foo_foomodel":

from foo.models import FooModel
from bar.models import BarModel

foo1 = FooModel.objects.create(name='foo1')
bar1 = BarModel.objects.create(name='bar1',

BarModel.objects.filter(foo_id__in=FooModel.objects.values_list('id', flat=True))

Instead of trying to merge the 2 querysets into a single sql statement, the inner queryset should be found to use a different database and be evaluated separately from the outer one.

Change History (5)

comment:1 Changed 4 years ago by lrekucki

  • Needs documentation unset
  • Needs tests unset
  • Patch needs improvement unset
  • Resolution set to wontfix
  • Status changed from new to closed

After thinking about this for a while, I think it's not something Django should do implicitly.

First of all, because this results in some serious performance drawback that's hidden in a place you normally wouldn't expect it to be and it gives a false impression of Django supporting cross-database queries.

Second, because it's not always the right thing to do. Taking the (naive) example of master-slave configuration from the docs, the fact that the router returns different slaves for Bar and Foo, shouldn't prevent Django from generating the subquery. As Django has no other way of knowing, which dbs contain which tables, we can't assume that the table is or isn't there.

Explicitly using list() fixes the problem and is more explicit about performance/memory overhead of the query.

Last edited 4 years ago by lrekucki (previous) (diff)

comment:2 Changed 4 years ago by jdunck

  • Resolution wontfix deleted
  • Status changed from closed to reopened

Well, even if the table did exist, you couldn't assume that the result would be correct in a multi-db context because it's pretty often the case that table A on db-0 has a subset and table B on db-1 has a different subset.

I think the fact that a multi-db query could successfully run quietly on a subset of the intended data is the worse bug here. I agree that list'ing the subquery is a fair solution, but first you have to realize that you have the problem at all.

The problem is that the inlined queryset doesn't call db_for_read at all, so that the router can't advise the ORM not to use the default/parent-qs's DB.

I agree that inlining is useful, even in a multi-db context, but the child qs's router.db_for_read should still get a chance to advise and not be inlined.

Reopening - please wontfix again if you disagree.

Last edited 4 years ago by jdunck (previous) (diff)

comment:3 Changed 4 years ago by lrekucki

  • Cc lrekucki@… added
  • Triage Stage changed from Unreviewed to Design decision needed

I'm not sure calling db_for_read(Foo) will help here even if we include a hint like subquery_of=Bar. If parent query is already bound to some db, the only thing the router can do for subquery is either return the same DB (thus allowing the subquery to perform) or throw an exception because it's impossible to perform the query (i'm still -1 on doing list() implicitly).

How about adding a new method to the router allow_subquery (which would basically work the same as allow_relation, but for models instead of instances) ?

comment:4 Changed 3 years ago by aaugustin

  • Status changed from reopened to new

comment:5 Changed 3 years ago by jacob

  • Resolution set to wontfix
  • Status changed from new to closed

Yeah I agree with lrekucki's initial analysis: doing some magic cross-database join emulation isn't something I want Django to do. It's possible the error message could be improved -- feel free to open a new ticket for that -- but I don't want this sort of silent join emulation happening.

Note: See TracTickets for help on using tickets.
Back to Top