Opened 8 years ago

Closed 8 years ago

Last modified 8 years ago

#26225 closed Cleanup/optimization (wontfix)

Coalesce Multiple Calls to order_by with Same Arguments

Reported by: Alex Rothberg Owned by: nobody
Component: Database layer (models, ORM) Version: 1.9
Severity: Normal Keywords:
Cc: Anssi Kääriäinen, Marc Tamlyn, Josh Smeaton, Shai Berger Triage Stage: Unreviewed
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

Currently multiple calls to order_by with the same arguments will invalidate the query cache. While the the documentation notes, "Each order_by() call will clear any previous ordering" (it returns a new QuerySet), I see no reason to clear the cache / not return self in the case of calling order_by with the currently sort order.

Here is an example in action:

>>> children = Child.objects.order_by('saved_dt').all()
>>> list(children)
[<Child: Child object>, <Child: Child object>]
(0.001) SELECT "prefetch_child"."id", "prefetch_child"."saved_dt", "prefetch_child"."parent_id" FROM "prefetch_child" ORDER BY "prefetch_child"."saved_dt" ASC; args=()
>>> list(children)
[<Child: Child object>, <Child: Child object>]
>>> children.order_by('saved_dt') 
[<Child: Child object>, <Child: Child object>]
(0.000) SELECT "prefetch_child"."id", "prefetch_child"."saved_dt", "prefetch_child"."parent_id" FROM "prefetch_child" ORDER BY "prefetch_child"."saved_dt" ASC LIMIT 21; args=()

I would hope that the second call to children.order_by('saved_dt') can return self since the queryset is already sorted by the desired key.

This is related to ticket: https://code.djangoproject.com/ticket/26211

Change History (6)

comment:1 by Tim Graham, 8 years ago

Cc: Anssi Kääriäinen Marc Tamlyn Josh Smeaton added
Component: UncategorizedDatabase layer (models, ORM)
Type: UncategorizedCleanup/optimization

I'm a bit nervous this change might cause subtle backwards incompatibilities in case someone is relying on a redundant order_by() to reevaluate the queryset. I wonder if you can make your reasoning any more persuasive than "I see no reason against it." The reasons against it I see:

  1. additional complexity
  2. inconsistency with other queyset methods (e.g. should any of them also try to detect if they are applied redundantly?)

I'll leave it open for opinions from ORM experts.

comment:2 by Marc Tamlyn, 8 years ago

Yes, this would make significant unnecessary complexity in my opinion. The ORM currently abides by a simple rule that chaining on another call creates a new queryset. We even explicitly use this with all() in forms.ModelChoiceField - and that's an API which can never change the queryset.

It's worth noting that you could add this kind of behaviour to a custom queryset class, but you would be inspecting private properties of the underlying query object to get what you want. Looking at the original ticket you opened, how I would approach that sort of situation where a you have code which may have a prefetched query or may not is to use to_attr - prefetch to _ordered_children and then have a ordered_children method which returns _ordered_children if it exists, or runs the query if not.

comment:3 by Josh Smeaton, 8 years ago

I agree with Marc. Too complex for very little gain. Imagine the case where you have multiple complex ordering expressions:

qs.order_by(Upper('field_a').asc(), Lower('field_b').desc(), Concat(F('field_c'), F('field_d')).asc())

If another ordering queryset method is applied, to determine if that was "the same" as last time, you could only really do identity checks which fails unless the objects are reused, or to generate what the ordering sql would be, and comparing that against the existing. Neither of those options are fool proof, and generating the sql would be fairly expensive for very little gain.

comment:4 by Marc Tamlyn, 8 years ago

Resolution: wontfix
Status: newclosed

in reply to:  2 ; comment:5 by Shai Berger, 8 years ago

Cc: Shai Berger added

Replying to mjtamlyn:

[Y]ou could add this kind of behaviour to a custom queryset class, but you would be inspecting private properties of the underlying query object to get what you want.

Perhaps we should consider making such properties public? It could help with making querysets more composable and reusable, I think.

in reply to:  5 comment:6 by Alex Rothberg, 8 years ago

Note: See TracTickets for help on using tickets.
Back to Top