Opened 2 months ago

Closed 2 months ago

Last modified 2 months ago

#36607 closed Uncategorized (invalid)

Improve the documentation by mentioning that "filter(a,b)" may not be the same as "filter(a).filter(b)" on querysets.

Reported by: Aaron Owned by:
Component: Database layer (models, ORM) Version: 5.2
Severity: Normal Keywords:
Cc: Aaron, Egor R Triage Stage: Unreviewed
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

Dear community,

I just got informed about the fact that "filter(a,b)" might not be the same as
"filter(a).filter(b)". The following example may be useful for illustrating
this. It uses some code that is inspired by our code and won't run without
context, but I hope it is illustrating enough.

The intend is to filter for MaintenanceOperations. Therefore, we create one:

op = MaintenanceOperationFactory()

These objects are children of the MaintenanceOperation:

maint_a = UnscheduledMaintenanceFactory(operation=op, execution_start="2030-03-10T12:00:00Z")
maint_b = UnscheduledMaintenanceFactory(operation=op, execution_start="2030-05-10T12:00:00Z")

We construct a first queryset like this:

qs1 = MaintenanceOperation.objects.filter(maintenances__execution_start__gt="2030-04-10T12:00:00Z", maintenances__execution_start__lt="2030-04-20T12:00:00Z",)

This qs evaluates to: <MaintenanceOperationQuerySet []> (aka the empty list). This happens because
neither UnscheduledMaintenance meets both criteria. The underlying sql is this:

SELECT "maintenance_maintenanceoperation"."id" FROM
"maintenance_maintenanceoperation" INNER JOIN "maintenance_maintenance" ON
("maintenance_maintenanceoperation"."id" =
"maintenance_maintenance"."operation_id") WHERE
("maintenance_maintenanceoperation"."deleted_at" IS NULL AND
"maintenance_maintenance"."execution_start" > 2030-04-10 12:00:00+00:00 AND
"maintenance_maintenance"."execution_start" < 2030-04-20 12:00:00+00:00)

To demonstrate the second case, we build the following queryset:

qs2 = MaintenanceOperation.objects.filter(
maintenances__execution_start__gt="2030-04-10T12:00:00Z",).filter(
maintenances__execution_start__lt="2030-04-20T12:00:00Z",)

This qs evaluates to:<MaintenanceOperationQuerySet [<MaintenanceOperation: #1:
beauftragt>]>(This list contains the MaintenanceOperation we created at the start)
This is different from the first one, because the sql is done with another join:

SELECT "maintenance_maintenanceoperation"."id"
FROM "maintenance_maintenanceoperation" INNER JOIN "maintenance_maintenance" ON
("maintenance_maintenanceoperation"."id" =
"maintenance_maintenance"."operation_id") INNER JOIN "maintenance_maintenance"
T3 ON ("maintenance_maintenanceoperation"."id" = T3."operation_id") WHERE
("maintenance_maintenanceoperation"."deleted_at" IS NULL AND
"maintenance_maintenance"."execution_start" > 2030-04-10 12:00:00+00:00 AND
T3."execution_start" < 2030-04-20 12:00:00+00:00)

I think this should be mentioned in the filter documentation, particularly in the
section on joining filters. If this is met with approval I am open to sketching a
suggestion.

Change History (4)

comment:1 by Egor R, 2 months ago

Cc: Egor R added

comment:2 by Tanishq, 2 months ago

Hey Aaron,

the example you shared does a great job highlighting the difference between filter(a, b) and filter(a).filter(b)
—especially how SQL joins come into play. I agree it’s worth documenting this nuance, probably in the “Joining and Filtering” section of docs/topics/db/queries.txt. A quick note about how chained filters can trigger extra joins and potentially change result sets (like in your UnscheduledMaintenanceexample) would be super helpful.
I can work on this Issue and even update the docs to perfectly cover this section.

comment:3 by Aaron, 2 months ago

Hi Tanishq,

I appreciate the feedback and the interest to work on this. Let me know when you have done something and I will have a look at it, looking forward to it :)

comment:4 by Simon Charette, 2 months ago

Resolution: invalid
Status: newclosed

Hello Aaron, thank you for your report.

The behaviour difference between chaining and combining filter calls when spanning multi-valued relationships is already documented under the Making queries > Spanning multi-valued relationships section with a concrete example so I'm going to assume you missed it before opening the ticket.

Last edited 2 months ago by Simon Charette (previous) (diff)
Note: See TracTickets for help on using tickets.
Back to Top