Opened 2 years ago
Closed 2 years ago
#34080 closed Bug (fixed)
__exact lookup on nested arrays with None values fails on PostgreSQL.
Reported by: | Ion Alberdi | Owned by: | Ion Alberdi |
---|---|---|---|
Component: | contrib.postgres | Version: | 4.1 |
Severity: | Normal | Keywords: | |
Cc: | Triage Stage: | Ready for checkin | |
Has patch: | yes | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description
PR:
https://github.com/pricemoov/django/pull/2/files
How to reproduce:
tox -e py39-postgres -- --settings=test_postgres postgres_tests.test_array.TestQuerying
django.db.utils.DataError: invalid input syntax for type integer: "{NULL,NULL}" LINE 1: ...ullableintegerarraymodel"."field_nested" = (ARRAY['{NULL,NUL...
The error seems to be due to the query being generated as
...ARRAY['{NULL,NULL}']...
instead of
...ARRAY[ARRAY[NULL,NULL]]...
Note that the bug is not reproduced if at least one of the elements in the nested array is not null.
Change History (15)
follow-up: 2 comment:1 by , 2 years ago
Component: | Database layer (models, ORM) → contrib.postgres |
---|---|
Summary: | Postgresql/ArrayField: exact filter on NULL nested arrays fails → __exact lookup on nested arrays with None values fails on PostgreSQL. |
Triage Stage: | Unreviewed → Accepted |
follow-up: 4 comment:2 by , 2 years ago
Replying to Mariusz Felisiak:
Thanks for the report (see related 84633905273fc916e3d17883810d9969c03f73c2 and #27808). It's an issue in ArrayRHSMixin.
Actually by running both test cases (bug=None nested array, fix=nested array with one element that is not null.
def test_exact_nested_null(self): instance = NullableIntegerArrayModel.objects.create(field_nested=[[None, None], [None, None]]) self.assertSequenceEqual( NullableIntegerArrayModel.objects.filter(field_nested=[[None, None], [None, None]]), [instance] ) def test_exact_nested_null_and_not_null(self): instance = NullableIntegerArrayModel.objects.create(field_nested=[[None, 1], [None, None]]) self.assertSequenceEqual( NullableIntegerArrayModel.objects.filter(field_nested=[[None, 1], [None, None]]), [instance] )
class ArrayRHSMixin: def __init__(self, lhs, rhs): if isinstance(rhs, (tuple, list)): expressions = [] for value in rhs: if not hasattr(value, "resolve_expression"): field = lhs.output_field value = Value(field.base_field.get_prep_value(value)) expressions.append(value) rhs = Func( *expressions, function="ARRAY", template="%(function)s[%(expressions)s]", ) super().__init__(lhs, rhs)
By looking at the rhs variable in both scenarios:
- test_exact_nested_null(bug)
Func(Value([None, None]), Value([None, None]), function=ARRAY, template=%(function)s[%(expressions)s])
- test_exact_nested_null_and_not_null(correct)
Func(Value([None, 1]), Value([None, None]), function=ARRAY, template=%(function)s[%(expressions)s])
The rhs variables look correct to me. I'll continue digging, so far I tend to go towards the psycopg2 bug hypothesis. Thanks a lot for these links Mariusz,
https://github.com/psycopg/psycopg2/issues/325 seems indeed to be pretty related to this one.
follow-up: 5 comment:3 by , 2 years ago
The rhs variables look correct to me. I'll continue digging, so far I tend to go towards the psycopg2 bug hypothesis.
As far as I'm aware it's not a psycopg2
issue but PostgreSQL-behavior. Regression tests work when we skip wrapping values in ARRAY()
for None
-only arrays, so this should be fixable on Django side.
comment:4 by , 2 years ago
Given the values above, Django generates queries as
'(ARRAY[%s, %s])::integer[][]', [[None, 1], [None, None]]'
with
def test_psycopg2_all_none(self): with connection.cursor() as cursor: cursor.execute("SELECT ARRAY[%s]::integer[][]", ([None, None],))
I reproduce the bug in psycopg2 (2.9.3) as it fails with
invalid input syntax for type integer: "{NULL,NULL}" LINE 1: SELECT ARRAY['{NULL,NULL}']::integer[][]
By removing the ArrayRHSMixin, the test added in the PR above passes but 12 others fail.
I'll open an issue in psycopg2 to get their point of view on
def test_psycopg2_all_none(self): with connection.cursor() as cursor: cursor.execute("SELECT ARRAY[%s]::integer[][]", ([None, None],))
Thanks again Mariusz!
comment:5 by , 2 years ago
Putting the link to https://github.com/psycopg/psycopg2/issues/1507 for the sake of tracking.
comment:7 by , 2 years ago
Has patch: | set |
---|---|
Owner: | changed from | to
Status: | new → assigned |
comment:8 by , 2 years ago
Thanks for the patch Ion :)
I have some time today to take a look at this otherwise Mariusz or Carlton will also take a look.
comment:9 by , 2 years ago
ok so just my 2¢ worth in investigating this:
Mix-n-matching array[]
constructor with array literals {}
seems to be the issue which I reckon has to do with postgres not understanding how to infer the appropriate types when nested (edit the following is inaccurate) and NULL
s are present. In my experience postgres generally won't do this for complex types including json; see my related ticket #33905. The solution often involves just being more explicit.
I tried manually altering the SQL Django has output for a sample model with nested ArrayField with the following problematic output:
# SELECT * FROM "ticket_34080_foo" WHERE "ticket_34080_foo"."field_nested" = (ARRAY['{NULL}'])::integer[][] LIMIT 21; ERROR: invalid input syntax for type integer: "{NULL}" LINE 3: WHERE "ticket_34080_foo"."field_nested" = (ARRAY['{NULL}'])... ^
Using array[]
at both levels works:
# SELECT * FROM "ticket_34080_foo" WHERE "ticket_34080_foo"."field_nested" = (ARRAY[ARRAY[NULL]])::integer[][] LIMIT 21; id | field | field_nested ----+-------+-------------- (0 rows)
Using an array literal for both levels works:
# SELECT * FROM "ticket_34080_foo" WHERE "ticket_34080_foo"."field_nested" = ('{{NULL}}')::integer[][] LIMIT 21; id | field | field_nested ----+-------+-------------- (0 rows)
Explicitly casting the literal also work when mixing with a constructor:
# SELECT * FROM "ticket_34080_foo" WHERE "ticket_34080_foo"."field_nested" = (ARRAY['{NULL}'::integer[]])::integer[][] LIMIT 21; id | field | field_nested ----+-------+-------------- (0 rows)
comment:10 by , 2 years ago
ok so further looking into this it appears that the array constructor doesn't try to infer types in literals at all even if the literal contains only int; using an array literal within a constructor should always be cast.
# SELECT * FROM "ticket_34080_foo" WHERE "ticket_34080_foo"."field_nested" = (ARRAY['{1}'])::integer[][] LIMIT 21; ERROR: invalid input syntax for type integer: "{1}" LINE 3: WHERE "ticket_34080_foo"."field_nested" = (ARRAY['{1}'])::i... ^
comment:11 by , 2 years ago
ok so…
after trailing through multiple threads here, psycopg2 and the ML I see the issue with ArrayRHSMixin being used in an incompatible way with psycopg2 :)
the stuff I reported above is interesting but can probably be ignored.
I've gone through the submitted PR and have some comments to make :)
comment:12 by , 2 years ago
Patch needs improvement: | set |
---|
comment:13 by , 2 years ago
Patch needs improvement: | unset |
---|---|
Triage Stage: | Accepted → Ready for checkin |
Thanks for the report (see related 84633905273fc916e3d17883810d9969c03f73c2 and #27808). It's an issue in ArrayRHSMixin.