Opened 11 years ago

Closed 11 years ago

Last modified 7 years ago

#7142 closed (invalid)

QuerySet: Distinct + Values + order by doesn't behave as one would expect

Reported by: bo Owned by: nobody
Component: Database layer (models, ORM) Version: master
Severity: Keywords: qsrf-cleanup distinct order_by values
Cc: Triage Stage: Design decision needed
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no


best shown by example .. effects the Queryset Refactor

## a simple model
class Thing(models.Model):
   is_ok = models.BooleanField()
   name = models.CharField()

## This works just fine
q = Thing.objects.filter(is_ok = True).values('name').distinct()
##  SQL
##  SELECT DISTINCT name from thing where is_ok = 1

## this fails

q = Thing.objects.filter(is_ok = True).extra(select={'short_name' : "SUBSTRING(name, 1, 3) "}.values('short_name').distinct()

# it does return a dict of just short_name
#  q  = [{ 'short_name': 'bla'}, { 'short_name': 'bla2'}, ...]
#  however the SQL generated is
# SELECT DISTINCT name, (SUBSTRING(name, 1, 3)) as short_name from thing where is_ok = 1
#  Note that 'Name' is still included in the query thus 'DISTINCT' 
#  is NOT distinct on 'short_name' but the name - short_name combo
# it should be
# SELECT DISTINCT (SUBSTRING(name, 1, 3)) as short_name from thing where is_ok = 1

Change History (5)

comment:1 Changed 11 years ago by Honza Král

Component: UncategorizedDatabase wrapper
Keywords: order_by values added; Distinct + Extra + Values = not so removed
Summary: QuerySet: Distinct + Extra + Values = Not distinct on values if values contains an 'extra select' fieldQuerySet: Distinct + Values + order by doesn't behave as one would expect
Triage Stage: UnreviewedDesign decision needed

The "problem" is introduced in [7455], it is due to PostgreSQL's inability to order by on a column not present in a distinct query. I am flagging this Design Decision needed and will bring it up on django-dev.

The hotfix is .order_by() - turn of ordering for the given query, then it will work, it has nothing to do with extra .

The problem is a bit cryptic, because the fields needed for ordering are added to SELECT so the value is counted upon when doing the distinct in the DB, but is not returned to the user so he is puzzled for quite some time why distinct doesn't return distinct values.

comment:2 Changed 11 years ago by George Vilches

Keywords: qsrf-cleanup added

comment:3 Changed 11 years ago by Jacob

milestone: 1.0

comment:4 Changed 11 years ago by Malcolm Tredinnick

Resolution: invalid
Status: newclosed

I gather there's some ordering on the model that's not mentioned here (or in the queryset construction that isn't mentioned)? Because I see no ordering in the examples above.

If that is the case (that there is ordering involved), then it's neither unexpected nor incorrect behaviour. It's correct SQL. It's also documented (see the note in that section of the docs).

If there's no ordering involved, then please reopen (although I can't repeat the problem), but as far as I can work out at the moment, this isn't a bug in Django.

comment:5 Changed 7 years ago by Jacob

milestone: 1.0

Milestone 1.0 deleted

Note: See TracTickets for help on using tickets.
Back to Top