Opened 3 years ago
Closed 3 years ago
#34080 closed Bug (fixed)
__exact lookup on nested arrays with None values fails on PostgreSQL.
| Reported by: | Ion Alberdi | Owned by: | Ion Alberdi |
|---|---|---|---|
| Component: | contrib.postgres | Version: | 4.1 |
| Severity: | Normal | Keywords: | |
| Cc: | Triage Stage: | Ready for checkin | |
| Has patch: | yes | Needs documentation: | no |
| Needs tests: | no | Patch needs improvement: | no |
| Easy pickings: | no | UI/UX: | no |
Description
PR:
https://github.com/pricemoov/django/pull/2/files
How to reproduce:
tox -e py39-postgres -- --settings=test_postgres postgres_tests.test_array.TestQuerying
django.db.utils.DataError: invalid input syntax for type integer: "{NULL,NULL}"
LINE 1: ...ullableintegerarraymodel"."field_nested" = (ARRAY['{NULL,NUL...
The error seems to be due to the query being generated as
...ARRAY['{NULL,NULL}']...
instead of
...ARRAY[ARRAY[NULL,NULL]]...
Note that the bug is not reproduced if at least one of the elements in the nested array is not null.
Change History (15)
follow-up: 2 comment:1 by , 3 years ago
| Component: | Database layer (models, ORM) → contrib.postgres |
|---|---|
| Summary: | Postgresql/ArrayField: exact filter on NULL nested arrays fails → __exact lookup on nested arrays with None values fails on PostgreSQL. |
| Triage Stage: | Unreviewed → Accepted |
follow-up: 4 comment:2 by , 3 years ago
Replying to Mariusz Felisiak:
Thanks for the report (see related 84633905273fc916e3d17883810d9969c03f73c2 and #27808). It's an issue in ArrayRHSMixin.
Actually by running both test cases (bug=None nested array, fix=nested array with one element that is not null.
def test_exact_nested_null(self):
instance = NullableIntegerArrayModel.objects.create(field_nested=[[None, None], [None, None]])
self.assertSequenceEqual(
NullableIntegerArrayModel.objects.filter(field_nested=[[None, None], [None, None]]), [instance]
)
def test_exact_nested_null_and_not_null(self):
instance = NullableIntegerArrayModel.objects.create(field_nested=[[None, 1], [None, None]])
self.assertSequenceEqual(
NullableIntegerArrayModel.objects.filter(field_nested=[[None, 1], [None, None]]), [instance]
)
class ArrayRHSMixin:
def __init__(self, lhs, rhs):
if isinstance(rhs, (tuple, list)):
expressions = []
for value in rhs:
if not hasattr(value, "resolve_expression"):
field = lhs.output_field
value = Value(field.base_field.get_prep_value(value))
expressions.append(value)
rhs = Func(
*expressions,
function="ARRAY",
template="%(function)s[%(expressions)s]",
)
super().__init__(lhs, rhs)
By looking at the rhs variable in both scenarios:
- test_exact_nested_null(bug)
Func(Value([None, None]), Value([None, None]), function=ARRAY, template=%(function)s[%(expressions)s])
- test_exact_nested_null_and_not_null(correct)
Func(Value([None, 1]), Value([None, None]), function=ARRAY, template=%(function)s[%(expressions)s])
The rhs variables look correct to me. I'll continue digging, so far I tend to go towards the psycopg2 bug hypothesis. Thanks a lot for these links Mariusz,
https://github.com/psycopg/psycopg2/issues/325 seems indeed to be pretty related to this one.
follow-up: 5 comment:3 by , 3 years ago
The rhs variables look correct to me. I'll continue digging, so far I tend to go towards the psycopg2 bug hypothesis.
As far as I'm aware it's not a psycopg2 issue but PostgreSQL-behavior. Regression tests work when we skip wrapping values in ARRAY() for None-only arrays, so this should be fixable on Django side.
comment:4 by , 3 years ago
Given the values above, Django generates queries as
'(ARRAY[%s, %s])::integer[][]', [[None, 1], [None, None]]'
with
def test_psycopg2_all_none(self):
with connection.cursor() as cursor:
cursor.execute("SELECT ARRAY[%s]::integer[][]", ([None, None],))
I reproduce the bug in psycopg2 (2.9.3) as it fails with
invalid input syntax for type integer: "{NULL,NULL}"
LINE 1: SELECT ARRAY['{NULL,NULL}']::integer[][]
By removing the ArrayRHSMixin, the test added in the PR above passes but 12 others fail.
I'll open an issue in psycopg2 to get their point of view on
def test_psycopg2_all_none(self):
with connection.cursor() as cursor:
cursor.execute("SELECT ARRAY[%s]::integer[][]", ([None, None],))
Thanks again Mariusz!
comment:5 by , 3 years ago
Putting the link to https://github.com/psycopg/psycopg2/issues/1507 for the sake of tracking.
comment:7 by , 3 years ago
| Has patch: | set |
|---|---|
| Owner: | changed from to |
| Status: | new → assigned |
comment:8 by , 3 years ago
Thanks for the patch Ion :)
I have some time today to take a look at this otherwise Mariusz or Carlton will also take a look.
comment:9 by , 3 years ago
ok so just my 2¢ worth in investigating this:
Mix-n-matching array[] constructor with array literals {} seems to be the issue which I reckon has to do with postgres not understanding how to infer the appropriate types when nested (edit the following is inaccurate) and NULLs are present. In my experience postgres generally won't do this for complex types including json; see my related ticket #33905. The solution often involves just being more explicit.
I tried manually altering the SQL Django has output for a sample model with nested ArrayField with the following problematic output:
# SELECT *
FROM "ticket_34080_foo"
WHERE "ticket_34080_foo"."field_nested" = (ARRAY['{NULL}'])::integer[][]
LIMIT 21;
ERROR: invalid input syntax for type integer: "{NULL}"
LINE 3: WHERE "ticket_34080_foo"."field_nested" = (ARRAY['{NULL}'])...
^
Using array[] at both levels works:
# SELECT * FROM "ticket_34080_foo" WHERE "ticket_34080_foo"."field_nested" = (ARRAY[ARRAY[NULL]])::integer[][] LIMIT 21; id | field | field_nested ----+-------+-------------- (0 rows)
Using an array literal for both levels works:
# SELECT *
FROM "ticket_34080_foo"
WHERE "ticket_34080_foo"."field_nested" = ('{{NULL}}')::integer[][]
LIMIT 21;
id | field | field_nested
----+-------+--------------
(0 rows)
Explicitly casting the literal also work when mixing with a constructor:
# SELECT *
FROM "ticket_34080_foo"
WHERE "ticket_34080_foo"."field_nested" = (ARRAY['{NULL}'::integer[]])::integer[][]
LIMIT 21;
id | field | field_nested
----+-------+--------------
(0 rows)
comment:10 by , 3 years ago
ok so further looking into this it appears that the array constructor doesn't try to infer types in literals at all even if the literal contains only int; using an array literal within a constructor should always be cast.
# SELECT *
FROM "ticket_34080_foo"
WHERE "ticket_34080_foo"."field_nested" = (ARRAY['{1}'])::integer[][]
LIMIT 21;
ERROR: invalid input syntax for type integer: "{1}"
LINE 3: WHERE "ticket_34080_foo"."field_nested" = (ARRAY['{1}'])::i...
^
comment:11 by , 3 years ago
ok so…
after trailing through multiple threads here, psycopg2 and the ML I see the issue with ArrayRHSMixin being used in an incompatible way with psycopg2 :)
the stuff I reported above is interesting but can probably be ignored.
I've gone through the submitted PR and have some comments to make :)
comment:12 by , 3 years ago
| Patch needs improvement: | set |
|---|
comment:13 by , 3 years ago
| Patch needs improvement: | unset |
|---|---|
| Triage Stage: | Accepted → Ready for checkin |
Thanks for the report (see related 84633905273fc916e3d17883810d9969c03f73c2 and #27808). It's an issue in ArrayRHSMixin.