Opened 3 years ago

Last modified 3 years ago

#19493 new Bug

annotate(Count()) does not work properly with django.db.backends.oracle

Reported by: kimvais@… Owned by: nobody
Component: Database layer (models, ORM) Version: 1.4
Severity: Normal Keywords: oracle
Cc: Triage Stage: Accepted
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description (last modified by akaariai)

An QuerySet operation like:
models.SomeModel.objects.filter(somecharfield=value).values("someintegerfield").annotate(Count("someintegerfield"))

return a QuerySet with no aggregate count, but multiple rows with someintegerfield__count=1

The correct SQL would be:

SELECT tablespace_somemodel.someintegerfield, COUNT(tablespace_somemodel.someintegerfield) AS someintegerfield__count
  FROM tablespace_somemodel WHERE tablespace_somemodel.somecharfield = 'value'
 GROUP BY tablespace_somemodel.someintegerfield;

This works fine on other backends.

Change History (5)

comment:1 Changed 3 years ago by akaariai

  • Description modified (diff)
  • Needs documentation unset
  • Needs tests unset
  • Patch needs improvement unset

What SQL do you get? settings.DEBUG = True + connection.queries should tell you the executed query easily.

Quickly checking it seems a QuerySet with multiple rows having someintegerfield__count=1 seems the correct answer for the query.

comment:2 Changed 3 years ago by kimvais@…

The SQL produced by Django, which returns incorrect results (1):

SELECT "TABLESPACE_SOMEMODEL"."SOMEINTEGERFIELD", COUNT("TABLESPACE_SOMEMODEL"."SOMEINTEGERFIELD") AS
"SOMEINTEGERFIELD__COUNT" FROM "TABLESPACE_SOMEMODEL" WHERE "TABLESPACE_SOMEMODEL"."SOMECHARFIELD" = active
GROUP BY "TABLESPACE_SOMEMODEL"."SOMEINTEGERFIELD", "TABLESPACE_SOMEMODEL"."ID" ORDER BY "TABLESPACE_SOMEMODEL"."ID" DESC

Which returns a QuerySet containing lots of rows with same values, eg:
[(1,1), (1, 1), (2, 1), (2, 1), (2, 1)] when correct would be [(1, 2), (2, 3)]
where each item is (someintegerfield, someintegerfield_count)

Query (2) that produces correct results on MySQL:

SELECT `tablespace_somemodel`.`someintegerfield`, COUNT(`tablespace_somemodel`.`someintegerfield`) AS `someintegerfield__count` FROM `tablespace_somemodel` WHERE `tablespace_somemodel`.`somecharfield` = active  GROUP BY `tablespace_somemode
l`.`someintegerfield` ORDER BY `tablespace_somemodel`.`id` DESC

Correctly working query (3) for Oracle (tested on sqlplus command line)

SELECT tablespace_somemodel.someintegerfield, COUNT(tablespace_somemodel.someintegerfield) AS someintegerfield__count
  FROM tablespace_somemodel WHERE tablespace_somemodel.somecharfield = 'value'
 GROUP BY tablespace_somemodel.someintegerfield;

However, query 3 cannot be passed to SomeModel.objects.raw() as "raw queries
must contain primary key".

Furthermore,

SomeModel.objects.raw("SELECT tablespace_somemodel.id, tablespace_somemodel.someintegerfield, "
    "COUNT(tablespace_somemodel.someintegerfield) AS someintegerfield__count "
    "FROM tablespace_somemodel WHERE tablespace_somemodel.somecharfield = 'active' "
    "GROUP BY tablespace_somemodel.someintegerfield, tablespace_somemodel.id")

goes in to infinite recursion loop of

Exception RuntimeError: 'maximum recursion depth exceeded while calling a
Python object' in <type 'exceptions.KeyError'> ignored

But that probably is not a big issues, since I guess the results would in any case be the same as the "wrong" query 1.
There probably isn't a simple fix for this, as it seems to me that there is an inherent requirement for the primary key in order to construct a QuerySet, but in oracle GROUP BY statement MUST contain all the selected fields - which results the Count() to aggregate the results based on both type and id.

comment:3 Changed 3 years ago by russellm

  • Triage Stage changed from Unreviewed to Accepted
  • Type changed from Uncategorized to Bug

I'm guessing an important, but unstated part of this bug report is that your model needs to have an implicit ordering on the primary key; I haven't been able to reproduce the "working" SQL without that important element.

The "wrong" results are being generated because the PK is included in the GROUP BY clause; I'm guessing this is because the ORDER BY clause exists. PKs aren't required for all querysets -- values() querysts can drop them under most circumstances, but they will be added back under certain ordering or join situations. I'm guessing the Oracle backend's logic is slightly different, and isn't pruning the PK as needed.

Regarding the call to raw() -- Yes, it needs to have a PK field present; however, the fact that you were able to get an infinite recursion is alarming. I haven't been able to reproduce on a non-Oracle backend; if this problem isn't fixed by the fix for the COUNT, then it should be opened as a separate ticket.

comment:4 Changed 3 years ago by kimvais@…

You are correct, I have "ordering = ('-pk')" in the model Meta - (to make sure that the ordering is correct on MySQL - which doesn't store fractions of second in date fields). I guess I have to pick lesser of two evils :)

comment:5 Changed 3 years ago by kimvais@…

Just tested and verified that a query with explicit removal of ordering works correctly on Oracle, e.g.

models.SomeModel.objects.order_by().filter(somecharfield=value).values("someintegerfield").annotate(Count("someintegerfield"))
Note: See TracTickets for help on using tickets.
Back to Top