﻿id	summary	reporter	owner	description	type	status	component	version	severity	resolution	keywords	cc	stage	has_patch	needs_docs	needs_tests	needs_better_patch	easy	ui_ux
31904	Django queryset cannot call distinct() and order_by() on separate fields.	Logan Knecht	nobody	"== Introduction

Hello there!

I want to start by saying thank you so much for this library! It is absolutely wonderful!


== Data Model

There are `Learners`
There are `Schools`
A `School` has many `Courses`
A `Course has many `Lessons`
A `School` has many `Resources`
A `Lesson` has one `Resource`


== Problem

I want all the `Schools` ordered by the most `Learners` that are enrolled, and I want the returned `Schools` to be distinct.

I have an issue where I've got to do a really complex query across multiple models in order to generate the full data of the school and return it. I have to use a heavily `prefetched` and `select_related` query to get what I need so that it's performant. When I query the `School` model that's returned it is duplicated multiple times.

I can de-duplicate the `School` by calling `distinct(""id"")`.

However, now they're not in the correct order, so I want to call `order_by(""-learner_enrolled_count"")`

When I do this the query looks like this:


{{{
user_query_set = (UserModel.objects.all())
# ---
school_board_headmaster_query_set = (SchoolBoardHeadmasterModel.objects.all()
                                 .select_related(""learner"",
                                                 ""school_board""))
school_board_member_query_set = (SchoolBoardMemberModel.objects.all()
                             .select_related(""learner"",
                                             ""school_board""))
# ---
flash_card_set_query_set = (FlashCardSetModel.objects.all()
                        .select_related(""school""))
flash_card_query_set = (FlashCardModel.objects.all()
                    .select_related(""flash_card_set"",
                                    ""playsheetmusicflashcardmodel"",
                                    ""school"",
                                    ""trueorfalseflashcardmodel""))
play_sheet_music_flash_card_query_set = (PlaySheetMusicFlashCardModel.objects.all()
                                     .select_related(""flash_card"",
                                                     ""school"",
                                                     ""sheet_music""))
true_or_false_flash_card_query_set = (TrueOrFalseFlashCardModel.objects.all()
                                  .select_related(""flash_card"",
                                                  ""school""))
sheet_music_query_set = (SheetMusicModel.objects.all()
                     .select_related(""school""))
# ---
school_course_query_set = (SchoolCourseModel.objects.all()
                       .select_related(""school""))
# ---
school_lesson_query_set = (SchoolLessonModel.objects.all()
                       .select_related(""course"",
                                       ""flash_card_set"",
                                       ""school""))

# ----------------------------------------------------------------------
# Prefetch
# ----------------------------------------------------------------------
school_models = (super()
             .get_queryset()
             .select_related(""school_board"")
             .prefetch_related(
# ------------------------------------------------------------------
# School Board
# ------------------------------------------------------------------
Prefetch(""school_board__school_board_headmaster"",
         school_board_headmaster_query_set),
Prefetch(""school_board__school_board_headmaster__learner__user"",
         user_query_set),
Prefetch(""school_board__school_board_member"",
         school_board_member_query_set),
Prefetch(""school_board__school_board_member__learner__user"",
         user_query_set),
# ------------------------------------------------------------------
# School Course
# ------------------------------------------------------------------
Prefetch(""school_course"",
         school_course_query_set),
# ------------------------------------------------------------------
# School Course -> School Lesson
# ------------------------------------------------------------------
Prefetch(""school_course__school_lesson"",
         school_lesson_query_set),
# ------------------------------------------------------------------
# School Course -> School Lesson -> Flash Card Set
# ------------------------------------------------------------------
Prefetch(""school_course__school_lesson__flash_card_set"",
         flash_card_set_query_set),
Prefetch(""school_course__school_lesson__flash_card_set__flash_card"",
         flash_card_query_set),
Prefetch(""school_course__school_lesson__flash_card_set__flash_card__playsheetmusicflashcardmodel"",
         play_sheet_music_flash_card_query_set),
Prefetch(""school_course__school_lesson__flash_card_set__flash_card__playsheetmusicflashcardmodel__sheet_music"",
         sheet_music_query_set),
Prefetch(""school_course__school_lesson__flash_card_set__flash_card__trueorfalseflashcardmodel"",
         true_or_false_flash_card_query_set),
# ------------------------------------------------------------------
# School Resources
# ------------------------------------------------------------------
Prefetch(""flashcardsetmodel_set"",
         flash_card_set_query_set),
# ---
Prefetch(""flashcardsetmodel_set__flash_card"",
         flash_card_query_set),
# ---
Prefetch(""flashcardsetmodel_set__flash_card__playsheetmusicflashcardmodel"",
         play_sheet_music_flash_card_query_set),
Prefetch(""flashcardsetmodel_set__flash_card__playsheetmusicflashcardmodel__sheet_music"",
         sheet_music_query_set),
# ---
Prefetch(""flashcardsetmodel_set__flash_card__trueorfalseflashcardmodel"",
         true_or_false_flash_card_query_set),
# ---
Prefetch(""sheetmusicmodel_set"",
         sheet_music_query_set)
))

all_school_models = school_models.annotate(
learner_enrolled_count=Case(
    When(learner_enrolled_school__learner_enrolled_course__learner_enrolled_lesson__is_enrolled=True,
         then=1),
    default=0,
    output_field=IntegerField()
))
}}}

So I then call
{{{
all_school_models.distinct(""id"").order_by(""-learner_enrolled_count"")
}}}


Which then gives me this output:

{{{
Internal Server Error: /api/v1/schools/
Traceback (most recent call last):
  File ""/Users/hugbot/Repositories/piano-gym-api-web-server/piano_gym_back_end/venv/lib/python3.7/site-packages/django/db/backends/utils.py"", line 86, in _execute
    return self.cursor.execute(sql, params)
psycopg2.errors.InvalidColumnReference: SELECT DISTINCT ON expressions must match initial ORDER BY expressions
LINE 1: SELECT DISTINCT ON (""piano_gym_api_schoolmodel"".""id"") ""piano...
                            ^


The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File ""/Users/hugbot/Repositories/piano-gym-api-web-server/piano_gym_back_end/venv/lib/python3.7/site-packages/django/core/handlers/exception.py"", line 34, in inner
    response = get_response(request)
  File ""/Users/hugbot/Repositories/piano-gym-api-web-server/piano_gym_back_end/venv/lib/python3.7/site-packages/django/core/handlers/base.py"", line 115, in _get_response
    response = self.process_exception_by_middleware(e, request)
  File ""/Users/hugbot/Repositories/piano-gym-api-web-server/piano_gym_back_end/venv/lib/python3.7/site-packages/django/core/handlers/base.py"", line 113, in _get_response
    response = wrapped_callback(request, *callback_args, **callback_kwargs)
  File ""/Users/hugbot/Repositories/piano-gym-api-web-server/piano_gym_back_end/venv/lib/python3.7/site-packages/django/views/decorators/csrf.py"", line 54, in wrapped_view
    return view_func(*args, **kwargs)
  File ""/Users/hugbot/Repositories/piano-gym-api-web-server/piano_gym_back_end/venv/lib/python3.7/site-packages/rest_framework/viewsets.py"", line 114, in view
    return self.dispatch(request, *args, **kwargs)
  File ""/Users/hugbot/Repositories/piano-gym-api-web-server/piano_gym_back_end/venv/lib/python3.7/site-packages/rest_framework/views.py"", line 505, in dispatch
    response = self.handle_exception(exc)
  File ""/Users/hugbot/Repositories/piano-gym-api-web-server/piano_gym_back_end/venv/lib/python3.7/site-packages/rest_framework/views.py"", line 465, in handle_exception
    self.raise_uncaught_exception(exc)
  File ""/Users/hugbot/Repositories/piano-gym-api-web-server/piano_gym_back_end/venv/lib/python3.7/site-packages/rest_framework/views.py"", line 476, in raise_uncaught_exception
    raise exc
  File ""/Users/hugbot/Repositories/piano-gym-api-web-server/piano_gym_back_end/venv/lib/python3.7/site-packages/rest_framework/views.py"", line 502, in dispatch
    response = handler(request, *args, **kwargs)
  File ""/Users/hugbot/Repositories/piano-gym-api-web-server/piano_gym_back_end/piano_gym_api/versions/v1/views/school_view.py"", line 102, in list
    request)
  File ""/Users/hugbot/Repositories/piano-gym-api-web-server/piano_gym_back_end/piano_gym_api/versions/v1/pagination/header_link_pagination.py"", line 91, in paginate_queryset
    total_result_count = len(queryset)
  File ""/Users/hugbot/Repositories/piano-gym-api-web-server/piano_gym_back_end/venv/lib/python3.7/site-packages/django/db/models/query.py"", line 258, in __len__
    self._fetch_all()
  File ""/Users/hugbot/Repositories/piano-gym-api-web-server/piano_gym_back_end/venv/lib/python3.7/site-packages/django/db/models/query.py"", line 1261, in _fetch_all
    self._result_cache = list(self._iterable_class(self))
  File ""/Users/hugbot/Repositories/piano-gym-api-web-server/piano_gym_back_end/venv/lib/python3.7/site-packages/django/db/models/query.py"", line 57, in __iter__
    results = compiler.execute_sql(chunked_fetch=self.chunked_fetch, chunk_size=self.chunk_size)
  File ""/Users/hugbot/Repositories/piano-gym-api-web-server/piano_gym_back_end/venv/lib/python3.7/site-packages/django/db/models/sql/compiler.py"", line 1144, in execute_sql
    cursor.execute(sql, params)
  File ""/Users/hugbot/Repositories/piano-gym-api-web-server/piano_gym_back_end/venv/lib/python3.7/site-packages/debug_toolbar/panels/sql/tracking.py"", line 198, in execute
    return self._record(self.cursor.execute, sql, params)
  File ""/Users/hugbot/Repositories/piano-gym-api-web-server/piano_gym_back_end/venv/lib/python3.7/site-packages/debug_toolbar/panels/sql/tracking.py"", line 133, in _record
    return method(sql, params)
  File ""/Users/hugbot/Repositories/piano-gym-api-web-server/piano_gym_back_end/venv/lib/python3.7/site-packages/django/db/backends/utils.py"", line 100, in execute
    return super().execute(sql, params)
  File ""/Users/hugbot/Repositories/piano-gym-api-web-server/piano_gym_back_end/venv/lib/python3.7/site-packages/django/db/backends/utils.py"", line 68, in execute
    return self._execute_with_wrappers(sql, params, many=False, executor=self._execute)
  File ""/Users/hugbot/Repositories/piano-gym-api-web-server/piano_gym_back_end/venv/lib/python3.7/site-packages/django/db/backends/utils.py"", line 77, in _execute_with_wrappers
    return executor(sql, params, many, context)
  File ""/Users/hugbot/Repositories/piano-gym-api-web-server/piano_gym_back_end/venv/lib/python3.7/site-packages/django/db/backends/utils.py"", line 86, in _execute
    return self.cursor.execute(sql, params)
  File ""/Users/hugbot/Repositories/piano-gym-api-web-server/piano_gym_back_end/venv/lib/python3.7/site-packages/django/db/utils.py"", line 90, in __exit__
    raise dj_exc_value.with_traceback(traceback) from exc_value
  File ""/Users/hugbot/Repositories/piano-gym-api-web-server/piano_gym_back_end/venv/lib/python3.7/site-packages/django/db/backends/utils.py"", line 86, in _execute
    return self.cursor.execute(sql, params)
django.db.utils.ProgrammingError: SELECT DISTINCT ON expressions must match initial ORDER BY expressions
LINE 1: SELECT DISTINCT ON (""piano_gym_api_schoolmodel"".""id"") ""piano...
                            ^

[15/Aug/2020 06:42:27] ""GET /api/v1/schools/?order_by=-learner_enrolled_count HTTP/1.1"" 500 27740
}}}

== Understanding

My understanding of this is that this is caused by PostgresSQL AND Django

If you read the documentation for Django it says
{{{
When you specify field names, you must provide an order_by() in the QuerySet, and the fields in order_by() must start with the fields in distinct(), in the same order.
}}}

- https://docs.djangoproject.com/en/3.0/ref/models/querysets/#django.db.models.query.QuerySet.order_by
- https://docs.djangoproject.com/en/3.0/ref/models/querysets/#distinct

So in this case the solution, if I'm lazy is just to do


{{{
all_school_models.distinct(""learner_enrolled_count"", ""id"").order_by(""-learner_enrolled_count"", ""id"")
}}}

But that doesn't return the correct information because now the results are distinct for the `School` models that have a distinct `learner_enrolled_count` AND `id` when I just want the distinct `Schools` and then now they are first ordered by `learner_enrolled_count` and immediately by `id` which is also incorrect.

== Additional References

I'm struggling with this in the `django-filter` library and created an to issue to track it:
- https://github.com/carltongibson/django-filter/issues/1255

== Conclusion

I have no idea how to achieve the ability to select `distinct` schools and `order_by` a separate column, and it doesn't seem like an unreasonable request? I have looked all over and see this issue repeated, but no work arounds provided.

Do you have guidance on how to achieve this?

This seems like an oversight of Django to not support the ability to distinctly select fields and order arbitrarily."	Bug	closed	Database layer (models, ORM)	3.1	Normal	invalid	PostgreSQL, Django, order_by, distinct, queryset		Unreviewed	0	0	0	0	0	0
