Opened 16 years ago

Closed 16 years ago

Last modified 13 years ago

#7142 closed (invalid)

QuerySet: Distinct + Values + order by doesn't behave as one would expect

Reported by: bo Owned by: nobody
Component: Database layer (models, ORM) Version: dev
Severity: Keywords: qsrf-cleanup distinct order_by values
Cc: Triage Stage: Design decision needed
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

best shown by example .. effects the Queryset Refactor

## a simple model
class Thing(models.Model):
   is_ok = models.BooleanField()
   name = models.CharField()


## This works just fine
q = Thing.objects.filter(is_ok = True).values('name').distinct()
##
##  SQL
##
##  SELECT DISTINCT name from thing where is_ok = 1
##

## this fails
#

q = Thing.objects.filter(is_ok = True).extra(select={'short_name' : "SUBSTRING(name, 1, 3) "}.values('short_name').distinct()

#
# it does return a dict of just short_name
# 
#  q  = [{ 'short_name': 'bla'}, { 'short_name': 'bla2'}, ...]
#
#  however the SQL generated is
#
#
# SELECT DISTINCT name, (SUBSTRING(name, 1, 3)) as short_name from thing where is_ok = 1
#
#  Note that 'Name' is still included in the query thus 'DISTINCT' 
#  is NOT distinct on 'short_name' but the name - short_name combo
#
# it should be
#
# SELECT DISTINCT (SUBSTRING(name, 1, 3)) as short_name from thing where is_ok = 1
#

Change History (5)

comment:1 by Honza Král, 16 years ago

Component: UncategorizedDatabase wrapper
Keywords: order_by values added; Distinct + Extra + Values = not so removed
Summary: QuerySet: Distinct + Extra + Values = Not distinct on values if values contains an 'extra select' fieldQuerySet: Distinct + Values + order by doesn't behave as one would expect
Triage Stage: UnreviewedDesign decision needed

The "problem" is introduced in [7455], it is due to PostgreSQL's inability to order by on a column not present in a distinct query. I am flagging this Design Decision needed and will bring it up on django-dev.

The hotfix is .order_by() - turn of ordering for the given query, then it will work, it has nothing to do with extra .

The problem is a bit cryptic, because the fields needed for ordering are added to SELECT so the value is counted upon when doing the distinct in the DB, but is not returned to the user so he is puzzled for quite some time why distinct doesn't return distinct values.

comment:2 by George Vilches, 16 years ago

Keywords: qsrf-cleanup added

comment:3 by Jacob, 16 years ago

milestone: 1.0

comment:4 by Malcolm Tredinnick, 16 years ago

Resolution: invalid
Status: newclosed

I gather there's some ordering on the model that's not mentioned here (or in the queryset construction that isn't mentioned)? Because I see no ordering in the examples above.

If that is the case (that there is ordering involved), then it's neither unexpected nor incorrect behaviour. It's correct SQL. It's also documented (see the note in that section of the docs).

If there's no ordering involved, then please reopen (although I can't repeat the problem), but as far as I can work out at the moment, this isn't a bug in Django.

comment:5 by Jacob, 13 years ago

milestone: 1.0

Milestone 1.0 deleted

Note: See TracTickets for help on using tickets.
Back to Top