Opened 4 years ago

Closed 4 years ago

Last modified 4 years ago

#31894 closed Cleanup/optimization (fixed)

JSONField filter() and JSONField exclude() are sometimes not inverses of each other

Reported by: Mikuael Owned by: Johnny Pribyl
Component: Documentation Version: 3.1
Severity: Normal Keywords:
Cc: Sage Abdullah, Johnny Pribyl Triage Stage: Ready for checkin
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

Filtering based on a JSONField key-value pair seems to have some unexpected behavior when involving a key that not all records have. Strangely, filtering on an optional property key will not return the inverse result set that an exclude on the same property key will return.

In my database, I have:

  • 2250 total records
  • 49 records where jsonfieldname = {'propertykey': 'PropertyValue'}
  • 296 records where jsonfieldname has a 'propertykey' key with some other value
  • 1905 records where jsonfieldname does not have a 'propertykey' key at all

The following code:

q = Q(jsonfieldname__propertykey="PropertyValue")

total_records = Record.objects.count()
filtered_records = Record.objects.filter(q).count()
excluded_records = Record.objects.exclude(q).count()
filtered_plus_excluded_records = filtered_records + excluded_records
    
print('Total: %d' % total_records)
print('Filtered: %d' % filtered_records)
print('Excluded: %d' % excluded_records)
print('Filtered Plus Excluded: %d' % filtered_plus_excluded_records)

Will output this:

Total: 2250
Filtered: 49
Excluded: 296
Filtered Plus Excluded: 345

It is surprising that the filtered+excluded value is not equal to the total record count. It's surprising that the union of a expression plus its inverse does not equal the sum of all records. I am not aware of any other queries in Django that would return a result like this. I realize adding a check that the key exists would return a more expected results, but that doesn't stop the above from being surprising.

I'm not sure what a solution would be - either a note in the documentation that this behavior should be expected, or take a look at how this same expression is applied for both the exclude() and filter() queries and see why they are not opposites.

Change History (12)

comment:1 by Carlton Gibson, 4 years ago

Component: UncategorizedDocumentation
Triage Stage: UnreviewedAccepted
Type: UncategorizedCleanup/optimization

Hi Mikuael. Thanks for the report. Yes, let's take this as a Documentation issue: it's expected behaviour but worth calling out explicitly.

(If you wanted to add a few additional test cases for exclude that might be cool too, documenting and verifying behaviour, on each of the backend DBs.)

comment:2 by Sage Abdullah, 4 years ago

Cc: Sage Abdullah added

comment:3 by Johnny Pribyl, 4 years ago

Cc: Johnny Pribyl added
Owner: changed from nobody to Johnny Pribyl
Status: newassigned

I can look at this one! It looks pretty straightforward.

comment:4 by Sage Abdullah, 4 years ago

I think we can change the existing Oracle note about exclude() (here) to make it more general (not just with None rhs) and remove the Oracle users note title to just Note or Warning. Like what Carlton said, it would be great if we also add test cases for that. I think we should also explicitly say that in order to get the proper inverse of filter(), users should also consider querying with __isnull.

comment:6 by Johnny Pribyl, 4 years ago

Sage, sorry just now saw your comment! I wound up adding a new section but am happy to combine the two if you think that makes more sense.

comment:7 by Johnny Pribyl, 4 years ago

Looks like there are some changes to be made - I didn't realize quite how dependent this behavior was on db backend! I will have to look at this a bit more.

Version 1, edited 4 years ago by Johnny Pribyl (previous) (next) (diff)

comment:8 by Mariusz Felisiak, 4 years ago

Patch needs improvement: set

comment:9 by Mariusz Felisiak, 4 years ago

Patch needs improvement: unset
Triage Stage: AcceptedReady for checkin

comment:10 by Mariusz Felisiak <felisiak.mariusz@…>, 4 years ago

In 94e22381:

Refs #31894 -- Added tests for JSONField key lookups with QuerySet.exclude().

comment:11 by Mariusz Felisiak <felisiak.mariusz@…>, 4 years ago

Resolution: fixed
Status: assignedclosed

In 17407eca:

Fixed #31894 -- Added note about using JSONField key lookups with QuerySet.exclude() in docs.

comment:12 by Mariusz Felisiak <felisiak.mariusz@…>, 4 years ago

In 3fc63624:

[3.1.x] Fixed #31894 -- Added note about using JSONField key lookups with QuerySet.exclude() in docs.

Backport of 17407eca59b0572228067cdee51433f49a1e0adb from master

Note: See TracTickets for help on using tickets.
Back to Top