Opened 2 years ago

Last modified 4 weeks ago

#28553 new Bug

Querysets: annotate() columns are forced into a certain position which may disrupt union()

Reported by: David Sanders Owned by: nobody
Component: Database layer (models, ORM) Version: 1.11
Severity: Normal Keywords:
Cc: Ole Laursen Triage Stage: Accepted
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

(Reporting possible issue found by a user on #django)

Using values() to force selection of certain columns in a certain order proved useful unioning querysets with union() for the aforementioned user. The positioning of columns added with annotate() is not controllable with values() and has the potential to disrupt union() unless this fact is known and the ordering done in a certain way to accommodate it.

I'm reporting this mainly for posterity but also as a highlight that perhaps this should be mentioned in the documentation. I'm sure there are reasons why the annotations are appended to the select but if someone feels that this is changeable then it would be a bonus outcome.

Change History (5)

comment:1 Changed 2 years ago by Tim Graham

Triage Stage: UnreviewedAccepted
Type: UncategorizedCleanup/optimization

(The ticket component should change to "Documentation" if there aren't any code changes to make here. I'm not sure.)

comment:2 Changed 22 months ago by Sergey Fedoseev

Probably duplicate of #28900.

comment:3 Changed 18 months ago by Flávio Juvenal

I've stumbled upon a case in production where this limitation prevents me for making a useful query. I've been able to create a test to reproduce this problem.
It works with sqlite but fails with postgres.

Add this test to tests/queries/test_qs_combinators.py:

from django.db.models import F, IntegerField, TextField, Value

def test_union_with_two_annotated_values_on_different_models(self):
    qs1 = Number.objects.annotate(
        text_annotation=Value('Foo', TextField())
    ).values('text_annotation', 'num')
    qs2 = ReservedName.objects.annotate(
        int_annotation=Value(1, IntegerField()),
    ).values('name', 'int_annotation')
    self.assertEqual(qs1.union(qs2).count(), 10)

In current master (78f8b80f9b215e50618375adce4c97795dabbb84), running ./runtests.py --parallel=1 --settings=tests.test_postgres queries.test_qs_combinators.QuerySetSetOperationTests.test_union_with_two_annotated_values_on_different_models fails:

Testing against Django installed in 'django/django'
Creating test database for alias 'default'...
Creating test database for alias 'other'...
System check identified no issues (1 silenced).
E
======================================================================
ERROR: test_union_with_two_annotated_values_on_different_models (queries.test_qs_combinators.QuerySetSetOperationTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "django/django/db/backends/utils.py", line 85, in _execute
    return self.cursor.execute(sql, params)
psycopg2.ProgrammingError: UNION types integer and character varying cannot be matched
LINE 1: ..._annotation" FROM "queries_number") UNION (SELECT "queries_r...
                                                             ^


The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "django/tests/queries/test_qs_combinators.py", line 140, in test_union_with_two_annotated_values_on_different_models
    self.assertEqual(qs1.union(qs2).count(), 10)
  File "django/django/db/models/query.py", line 382, in count
    return self.query.get_count(using=self.db)
  File "django/django/db/models/sql/query.py", line 494, in get_count
    number = obj.get_aggregation(using, ['__count'])['__count']
  File "django/django/db/models/sql/query.py", line 479, in get_aggregation
    result = compiler.execute_sql(SINGLE)
  File "django/django/db/models/sql/compiler.py", line 1054, in execute_sql
    cursor.execute(sql, params)
  File "django/django/db/backends/utils.py", line 68, in execute
    return self._execute_with_wrappers(sql, params, many=False, executor=self._execute)
  File "django/django/db/backends/utils.py", line 77, in _execute_with_wrappers
    return executor(sql, params, many, context)
  File "django/django/db/backends/utils.py", line 85, in _execute
    return self.cursor.execute(sql, params)
  File "django/django/db/utils.py", line 89, in __exit__
    raise dj_exc_value.with_traceback(traceback) from exc_value
  File "django/django/db/backends/utils.py", line 85, in _execute
    return self.cursor.execute(sql, params)
django.db.utils.ProgrammingError: UNION types integer and character varying cannot be matched
LINE 1: ..._annotation" FROM "queries_number") UNION (SELECT "queries_r...
                                                             ^


----------------------------------------------------------------------
Ran 1 test in 0.007s

FAILED (errors=1)
Destroying test database for alias 'default'...
Destroying test database for alias 'other'...

My tests/test_postgres.py is:

DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.postgresql_psycopg2',
        'NAME': 'django-test',
        'HOST': '127.0.0.1',
    },
    'other': {
        'ENGINE': 'django.db.backends.postgresql_psycopg2',
        'NAME': 'django-test-other',
    }
}

SECRET_KEY = "django_tests_secret_key"

# Use a fast hasher to speed up tests.
PASSWORD_HASHERS = [
    'django.contrib.auth.hashers.MD5PasswordHasher',
]

comment:4 Changed 5 months ago by Ole Laursen

Cc: Ole Laursen added
Type: Cleanup/optimizationBug

There's a ticket with a test case in #30211.

That ticket was reported as a bug, and I think this ticket should be a bug too, so I'm changing the classification for now (apologies if that's inappropriate). I think it's a bug because the documentation in my reading implies that as long as the columns match, union will work. So it really comes as a bit of a surprise that Django overrides the order in values_list().

In my particular case, I'm using union() to combine two different tables that I need to get out in sorted order, and I was trying to use annotate() + values_list() to add a NULL filler to one table as it lacks a column from the other.

Also, I suppose the ORM could possibly also be a bit more efficient if it could return values_list() tuples directly from the select instead of having to rearrange them?

comment:5 in reply to:  4 Changed 4 months ago by Kal Sze

Actually there could be a different problem as well. We were lucky in that the ORM generated SQL where the data types of the columns do not match. But what happens when the data types of the columns match? I'm afraid you would get a query that doesn't throw an exception but is in fact subtly broken, especially if the values of the different columns happen to be similar, in which case it might take a long time for app developers to realize that the query is broken.

Last edited 4 weeks ago by Kal Sze (previous) (diff)
Note: See TracTickets for help on using tickets.
Back to Top