Opened 8 years ago

Closed 19 months ago

#26019 closed Bug (fixed)

Incorrect query generated when combining querysets refering to different fields under the same alias.

Reported by: Suriya Subramanian Owned by: nobody
Component: Database layer (models, ORM) Version: dev
Severity: Normal Keywords:
Cc: Simon Charette Triage Stage: Accepted
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

The Django ORM generates incorrect queries for a QuerySet union that has an annotate. Here's output from a Python shell session illustrating this error.

I am trying to do a union of two QuerySets, one which gets the first_name field (renamed as name_field), and another one which gets the last_name field (also renamed as name_field). The unioned QuerySet gets only the first_name field values.

>>> from django.contrib.auth.models import User
>>> from django.db.models import F
>>> first_name_qs = User.objects.order_by().annotate(name_field=F('first_name')).values('name_field').distinct()
>>> last_name_qs = User.objects.order_by().annotate(name_field=F('last_name')).values('name_field').distinct()
>>> print(first_name_qs.query)
SELECT DISTINCT "auth_user"."first_name" AS "name_field" FROM "auth_user"
>>> print(last_name_qs.query)
SELECT DISTINCT "auth_user"."last_name" AS "name_field" FROM "auth_user"
>>> print((first_name_qs | last_name_qs).query)
SELECT DISTINCT "auth_user"."first_name" AS "name_field" FROM "auth_user"

The above behavior is with Django 1.9.

Change History (10)

comment:1 by Simon Charette, 8 years ago

Are you expecting the ORM to issue an UNION like the following query?

SELECT "auth_user"."first_name" AS "name_field" FROM "auth_user"
UNION
SELECT "auth_user"."last_name" AS "name_field" FROM "auth_user"

I think the ORM only combines the queries filters when the | operator is used.

Last edited 8 years ago by Simon Charette (previous) (diff)

comment:2 by Suriya Subramanian, 8 years ago

That (or a similar query) is what I am expecting. I have used the | operator.

in reply to:  2 comment:3 by Simon Charette, 8 years ago

Replying to mssnlayam:

That (or a similar query) is what I am expecting. I have used the | operator.

What I meant here is that the ORM only considers the queries filters and ignore the annotate() aliases and the fact distinct() was used.

comment:4 by Simon Charette, 8 years ago

Summary: Incorrect query generated for QuerySet Union with annotateIncorrect query generated when combining querysets refering to different fields under the same alias.
Triage Stage: UnreviewedAccepted
Version: 1.9master

I changed the title to reflect what I think is the actual issue.

If we could use kwargs to specify aliases (#16735) the issue could be expressed by the following query combination:

User.object.values(name='first_name') | User.objects.values(name='last_name')

Still considering this a bug because a TypeError should be raised in Query._merge_sanity_checks to warn the user about this case.

Once we land this bug fix we should consider reorienting this ticket as a feature request to teach the ORM it should use UNION ALL in such case.

SELECT "auth_user"."first_name" AS "name" FROM "auth_user"
UNION ALL
SELECT "auth_user"."last_name" AS "name" FROM "auth_user"
Last edited 8 years ago by Shai Berger (previous) (diff)

comment:5 by Shai Berger, 8 years ago

If and when we do teach it to use UNION ALL, and since distinct() was mentioned in the OP's example, we should probably do something to get distinct queries to use UNION instead of UNION ALL.

comment:6 by Simon Charette, 8 years ago

@shaib I agree but I assumed teaching the ORM to use UNION instead of UNION ALL if combined queries use distinct could be done in an later patch.

Given

first_names = User.object.values(name='first_name')
last_names = User.object.values(name='last_name')

I would expect

first_names.distinct() | last_names.distinct()

To result in

SELECT DISTINCT "auth_user"."first_name" AS "name" FROM "auth_user"
UNION ALL
SELECT DISTINCT "auth_user"."last_name" AS "name" FROM "auth_user"

And

(first_names | last_names).distinct()

To result in

SELECT "auth_user"."first_name" AS "name" FROM "auth_user"
UNION
SELECT "auth_user"."last_name" AS "name" FROM "auth_user"

or the equivalent

SELECT DISTINCT "name" FROM (
    SELECT "auth_user"."first_name" AS "name" FROM "auth_user"
    UNION ALL
    SELECT "auth_user"."last_name" AS "name" FROM "auth_user"
)

comment:7 by Shai Berger, 8 years ago

Yes, exactly.

comment:8 by Simon Charette, 8 years ago

Cc: Simon Charette added

comment:9 by Matthew Schinckel, 7 years ago

It seems to me that from this, it _could_ be possible to UNION two ValuesQuerySets from different models, as long as they have the same list of columns.

Yes, I'm aware that ValuesQuerySet went away, but I'm hoping that this may still be possible, at some level ;)

comment:10 by Simon Charette, 19 months ago

Resolution: fixed
Status: newclosed

Now that both QuerySet.union (#27718) and QuerySet.values(alias:str=exr:Expression) (#16735) support has landed I think we can close this ticket as QuerySet combination through | is likely going to remain unchanged.

Note: See TracTickets for help on using tickets.
Back to Top