#31894 closed Cleanup/optimization (fixed)
JSONField filter() and JSONField exclude() are sometimes not inverses of each other
Reported by: | Mikuael | Owned by: | Johnny Pribyl |
---|---|---|---|
Component: | Documentation | Version: | 3.1 |
Severity: | Normal | Keywords: | |
Cc: | Sage Abdullah, Johnny Pribyl | Triage Stage: | Ready for checkin |
Has patch: | yes | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description
Filtering based on a JSONField key-value pair seems to have some unexpected behavior when involving a key that not all records have. Strangely, filtering on an optional property key will not return the inverse result set that an exclude on the same property key will return.
In my database, I have:
- 2250 total records
- 49 records where jsonfieldname = {'propertykey': 'PropertyValue'}
- 296 records where jsonfieldname has a 'propertykey' key with some other value
- 1905 records where jsonfieldname does not have a 'propertykey' key at all
The following code:
q = Q(jsonfieldname__propertykey="PropertyValue") total_records = Record.objects.count() filtered_records = Record.objects.filter(q).count() excluded_records = Record.objects.exclude(q).count() filtered_plus_excluded_records = filtered_records + excluded_records print('Total: %d' % total_records) print('Filtered: %d' % filtered_records) print('Excluded: %d' % excluded_records) print('Filtered Plus Excluded: %d' % filtered_plus_excluded_records)
Will output this:
Total: 2250 Filtered: 49 Excluded: 296 Filtered Plus Excluded: 345
It is surprising that the filtered+excluded value is not equal to the total record count. It's surprising that the union of a expression plus its inverse does not equal the sum of all records. I am not aware of any other queries in Django that would return a result like this. I realize adding a check that the key exists would return a more expected results, but that doesn't stop the above from being surprising.
I'm not sure what a solution would be - either a note in the documentation that this behavior should be expected, or take a look at how this same expression is applied for both the exclude() and filter() queries and see why they are not opposites.
Change History (12)
comment:1 by , 4 years ago
Component: | Uncategorized → Documentation |
---|---|
Triage Stage: | Unreviewed → Accepted |
Type: | Uncategorized → Cleanup/optimization |
comment:2 by , 4 years ago
Cc: | added |
---|
comment:3 by , 4 years ago
Cc: | added |
---|---|
Owner: | changed from | to
Status: | new → assigned |
I can look at this one! It looks pretty straightforward.
comment:4 by , 4 years ago
I think we can change the existing Oracle note about exclude()
(here) to make it more general (not just with None
rhs) and remove the Oracle users
note title to just Note
or Warning
. Like what Carlton said, it would be great if we also add test cases for that. I think we should also explicitly say that in order to get the proper inverse of filter()
, users should also consider querying with __isnull
.
comment:6 by , 4 years ago
Sage, sorry just now saw your comment! I wound up adding a new section but am happy to combine the two if you think that makes more sense.
comment:7 by , 4 years ago
Looks like there are some changes to be made - I think we must only run the tests when the db supports json fields
comment:8 by , 4 years ago
Patch needs improvement: | set |
---|
comment:9 by , 4 years ago
Patch needs improvement: | unset |
---|---|
Triage Stage: | Accepted → Ready for checkin |
Hi Mikuael. Thanks for the report. Yes, let's take this as a Documentation issue: it's expected behaviour but worth calling out explicitly.
(If you wanted to add a few additional test cases for
exclude
that might be cool too, documenting and verifying behaviour, on each of the backend DBs.)