Opened 22 months ago

Last modified 11 months ago

#26019 new Bug

Incorrect query generated when combining querysets refering to different fields under the same alias.

Reported by: Suriya Subramanian Owned by: nobody
Component: Database layer (models, ORM) Version: master
Severity: Normal Keywords:
Cc: Simon Charette Triage Stage: Accepted
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

The Django ORM generates incorrect queries for a QuerySet union that has an annotate. Here's output from a Python shell session illustrating this error.

I am trying to do a union of two QuerySets, one which gets the first_name field (renamed as name_field), and another one which gets the last_name field (also renamed as name_field). The unioned QuerySet gets only the first_name field values.

>>> from django.contrib.auth.models import User
>>> from django.db.models import F
>>> first_name_qs = User.objects.order_by().annotate(name_field=F('first_name')).values('name_field').distinct()
>>> last_name_qs = User.objects.order_by().annotate(name_field=F('last_name')).values('name_field').distinct()
>>> print(first_name_qs.query)
SELECT DISTINCT "auth_user"."first_name" AS "name_field" FROM "auth_user"
>>> print(last_name_qs.query)
SELECT DISTINCT "auth_user"."last_name" AS "name_field" FROM "auth_user"
>>> print((first_name_qs | last_name_qs).query)
SELECT DISTINCT "auth_user"."first_name" AS "name_field" FROM "auth_user"

The above behavior is with Django 1.9.

Change History (9)

comment:1 Changed 22 months ago by Simon Charette

Are you expecting the ORM to issue an UNION like the following query?

SELECT "auth_user"."first_name" AS "name_field" FROM "auth_user"
UNION
SELECT "auth_user"."last_name" AS "name_field" FROM "auth_user"

I think the ORM only combines the queries filters when the | operator is used.

Last edited 22 months ago by Simon Charette (previous) (diff)

comment:2 Changed 22 months ago by Suriya Subramanian

That (or a similar query) is what I am expecting. I have used the | operator.

comment:3 in reply to:  2 Changed 22 months ago by Simon Charette

Replying to mssnlayam:

That (or a similar query) is what I am expecting. I have used the | operator.

What I meant here is that the ORM only considers the queries filters and ignore the annotate() aliases and the fact distinct() was used.

comment:4 Changed 22 months ago by Simon Charette

Summary: Incorrect query generated for QuerySet Union with annotateIncorrect query generated when combining querysets refering to different fields under the same alias.
Triage Stage: UnreviewedAccepted
Version: 1.9master

I changed the title to reflect what I think is the actual issue.

If we could use kwargs to specify aliases (#16735) the issue could be expressed by the following query combination:

User.object.values(name='first_name') | User.objects.values(name='last_name')

Still considering this a bug because a TypeError should be raised in Query._merge_sanity_checks to warn the user about this case.

Once we land this bug fix we should consider reorienting this ticket as a feature request to teach the ORM it should use UNION ALL in such case.

SELECT "auth_user"."first_name" AS "name" FROM "auth_user"
UNION ALL
SELECT "auth_user"."last_name" AS "name" FROM "auth_user"
Last edited 22 months ago by Shai Berger (previous) (diff)

comment:5 Changed 22 months ago by Shai Berger

If and when we do teach it to use UNION ALL, and since distinct() was mentioned in the OP's example, we should probably do something to get distinct queries to use UNION instead of UNION ALL.

comment:6 Changed 22 months ago by Simon Charette

@shaib I agree but I assumed teaching the ORM to use UNION instead of UNION ALL if combined queries use distinct could be done in an later patch.

Given

first_names = User.object.values(name='first_name')
last_names = User.object.values(name='last_name')

I would expect

first_names.distinct() | last_names.distinct()

To result in

SELECT DISTINCT "auth_user"."first_name" AS "name" FROM "auth_user"
UNION ALL
SELECT DISTINCT "auth_user"."last_name" AS "name" FROM "auth_user"

And

(first_names | last_names).distinct()

To result in

SELECT "auth_user"."first_name" AS "name" FROM "auth_user"
UNION
SELECT "auth_user"."last_name" AS "name" FROM "auth_user"

or the equivalent

SELECT DISTINCT "name" FROM (
    SELECT "auth_user"."first_name" AS "name" FROM "auth_user"
    UNION ALL
    SELECT "auth_user"."last_name" AS "name" FROM "auth_user"
)

comment:7 Changed 22 months ago by Shai Berger

Yes, exactly.

comment:8 Changed 22 months ago by Simon Charette

Cc: Simon Charette added

comment:9 Changed 11 months ago by Matthew Schinckel

It seems to me that from this, it _could_ be possible to UNION two ValuesQuerySets from different models, as long as they have the same list of columns.

Yes, I'm aware that ValuesQuerySet went away, but I'm hoping that this may still be possible, at some level ;)

Note: See TracTickets for help on using tickets.
Back to Top