Opened 4 years ago

Closed 4 years ago

#15709 closed Cleanup/optimization (fixed)

Duplicated group_by condition

Reported by: ziangsong Owned by: nobody
Component: Database layer (models, ORM) Version: 1.3
Severity: Normal Keywords: group_by, annotate
Cc: ziangsong Triage Stage: Ready for checkin
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: UI/UX:

Description

So I want to implement this SQL:

selelct id, count(*), max(insert_date) as m_d
from Book
group by id

Here is the Django ORM query:

q = Book.objects.values('id').annotate(c = Count('id'), m_d = Max('insert_date')).order_by()

However, the translated sql is like this:

selelct id, count(*), max(insert_date) as m_d
from Book
group by id, id <-here is another id! It messed up things!

Btw, the id in Book is a foreign key to another table and I am using MySql database.

Attachments (1)

15709.diff (1.6 KB) - added by mk 4 years ago.

Download all attachments as: .zip

Change History (10)

comment:1 Changed 4 years ago by ziangsong

  • Cc ziangsong added
  • Component changed from Uncategorized to Database layer (models, ORM)
  • milestone set to 1.3
  • Needs documentation unset
  • Needs tests unset
  • Patch needs improvement unset

comment:2 Changed 4 years ago by mk

  • Triage Stage changed from Unreviewed to Accepted

Some quick testing shows that the issue even exists when the Count('id') is left away:

In [4]: Page.objects.values('parent').annotate(m_d=Max('title')).order_by()
Out[4]: [{'parent': None, 'm_d': u'Test page 2'}, {'parent': 1, 'm_d': u'Maybe a subpage'}, {'parent': 2, 'm_d': u'Test page 2 wants a subpage too'}]

In [5]: connection.queries
Out[5]: 
[{'sql': u'SELECT "page_page"."parent_id", MAX("page_page"."title") AS "m_d" FROM "page_page" GROUP BY "page_page"."parent_id", "page_page"."parent_id" LIMIT 21',
  'time': '0.001'}]

Inside django.db.models.sql.compiler.SQLCompiler.get_grouping, both group_by and self.query.select contain ('page_page', 'parent_id') which causes the duplicated GROUP BY clause.

comment:3 Changed 4 years ago by mk

  • Has patch set
  • milestone 1.3 deleted

The attached patch fixes this problem (hopefully) by not letting group_by clause duplicates through. I don't think duplicated entries in GROUP BY ever make sense.

Test included.

Changed 4 years ago by mk

comment:4 Changed 4 years ago by ziangsong

Thank you! I really appreciate it!

comment:5 Changed 4 years ago by mk

Could you promote this issue to "Ready for checkin" if it fixes the problem for you? Thanks!

comment:6 Changed 4 years ago by ziangsong

  • Triage Stage changed from Accepted to Ready for checkin

comment:7 Changed 4 years ago by lukeplant

  • Type set to Cleanup/optimization

comment:8 Changed 4 years ago by lukeplant

  • Severity set to Normal

comment:9 Changed 4 years ago by lukeplant

  • Resolution set to fixed
  • Status changed from new to closed

In [16180]:

Fixed #15709 - Duplicated group_by condition

Thanks to ziangsong for report, and to mk for the patch

Note: See TracTickets for help on using tickets.
Back to Top