Opened 4 years ago

Closed 3 years ago

#32704 closed Bug (fixed)

QuerySet.defer() doesn't clear deferred field when chaining with only().

Reported by: Manuel Baclet Owned by: David Wobrock
Component: Database layer (models, ORM) Version: 3.1
Severity: Normal Keywords: defer only
Cc: Manuel Baclet, David Wobrock Triage Stage: Accepted
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no
Pull Requests:14667 merged

Description

Considering a simple Company model with four fields: id, name, trade_number and country. If we evaluate a queryset containing a .defer() following a .only(), the generated sql query selects unexpected fields. For example:

Company.objects.only("name").defer("name")

loads all the fields with the following query:

SELECT "company"."id", "company"."name", "company"."trade_number", "company"."country" FROM "company"

and

Company.objects.only("name").defer("name").defer("country")

also loads all the fields with the same query:

SELECT "company"."id", "company"."name", "company"."trade_number", "company"."country" FROM "company"

In those two cases, i would expect the sql query to be:

SELECT "company"."id" FROM "company"

In the following example, we get the expected behavior:

Company.objects.only("name", "country").defer("name")

only loads "id" and "country" fields with the following query:

SELECT "company"."id", "company"."country" FROM "company" 

Change History (10)

comment:1 by Manuel Baclet, 4 years ago

Cc: Manuel Baclet added

in reply to:  description comment:2 by Mariusz Felisiak, 4 years ago

Summary: Unexpected behavior of .defer() and .only()QuerySet.defer() doesn't clear deferred field when chaining with only().
Triage Stage: UnreviewedAccepted

Replying to Manuel Baclet:

Considering a simple Company model with four fields: id, name, trade_number and country. If we evaluate a queryset containing a .defer() following a .only(), the generated sql query selects unexpected fields. For example:

Company.objects.only("name").defer("name")

loads all the fields with the following query:

SELECT "company"."id", "company"."name", "company"."trade_number", "company"."country" FROM "company"

This is an expected behavior, defer() removes fields from the list of fields specified by the only() method (i.e. list of fields that should not be deferred). In this example only() adds name to the list, defer() removes name from the list, so you have empty lists and all fields will be loaded. It is also documented:

# Final result is that everything except "headline" is deferred.
Entry.objects.only("headline", "body").defer("body")
Company.objects.only("name").defer("name").defer("country")

also loads all the fields with the same query:

SELECT "company"."id", "company"."name", "company"."trade_number", "company"."country" FROM "company"

I agree you shouldn't get all field, but only pk, name, and trade_number:

SELECT "ticket_32704_company"."id", "ticket_32704_company"."name", "ticket_32704_company"."trade_number" FROM "ticket_32704_company"

this is due to the fact that defer() doesn't clear the list of deferred field when chaining with only(). I attached a proposed patch.

Last edited 4 years ago by Mariusz Felisiak (previous) (diff)

by Mariusz Felisiak, 4 years ago

Attachment: ticket-32704.diff added

Draft.

comment:3 by Manuel Baclet, 4 years ago

After reading the documentation carefully, i cannot say that it is clearly stated that deferring all the fields used in a previous .only() call performs a reset of the deferred set.

Moreover, in the .defer() section, we have:

You can make multiple calls to defer(). Each call adds new fields to the deferred set:

and this seems to suggest that:

  • calls to .defer() cannot remove items from the deferred set and evaluating qs.defer("some_field") should never fetch the column "some_field" (since this should add "some_field" to the deferred set)
  • the querysets qs.defer("field1").defer("field2") and qs.defer("field1", "field2") should be equivalent

IMHO, there is a mismatch between the doc and the actual implementation.

Last edited 4 years ago by Manuel Baclet (previous) (diff)

in reply to:  3 comment:4 by Mariusz Felisiak, 4 years ago

  • calls to .defer() cannot remove items from the deferred set and evaluating qs.defer("some_field") should never fetch the column "some_field" (since this should add "some_field" to the deferred set)

Feel-free to propose a docs clarification (see also #24048).

  • the querysets qs.defer("field1").defer("field2") and qs.defer("field1", "field2") should be equivalent

That's why I accepted Company.objects.only("name").defer("name").defer("country") as a bug.

comment:5 by Manuel Baclet, 4 years ago

I think that what is described in the documentation is what users are expected and it is the implementation that should be fixed!

With your patch proposal, i do not think that:
Company.objects.only("name").defer("name").defer("country") is equivalent to Company.objects.only("name").defer("name", "country")

in reply to:  5 ; comment:6 by Mariusz Felisiak, 4 years ago

Replying to Manuel Baclet:

I think that what is described in the documentation is what users are expected and it is the implementation that should be fixed!

I don't agree, and there is no need to shout. As documented: "The only() method is more or less the opposite of defer(). You call it with the fields that should not be deferred ...", so .only('name').defer('name') should return all fields. You can start a discussion on DevelopersMailingList if you don't agree.

With your patch proposal, i do not think that:
Company.objects.only("name").defer("name").defer("country") is equivalent to Company.objects.only("name").defer("name", "country")

Did you check this? with proposed patch country is the only deferred fields in both cases. As far as I'm aware that's an intended behavior.

comment:7 by Manuel Baclet, 4 years ago

With the proposed patch, I think that:

Company.objects.only("name").defer("name", "country")

loads all fields whereas:

Company.objects.only("name").defer("name").defer("country")

loads all fields except "country".

Why is that?
In the first case:

existing.difference(field_names) == {"name"}.difference(["name", "country"]) == empty_set

and we go into the if branch clearing all the deferred fields and we are done.
In the second case:

existing.difference(field_names) == {"name"}.difference(["name"]) == empty_set

and we go into the if branch clearing all the deferred fields. Then we add "country" to the set of deferred fields.

in reply to:  6 comment:8 by David Wobrock, 3 years ago

Cc: David Wobrock added
Has patch: set
Owner: changed from nobody to David Wobrock
Status: newassigned

Hey all,

Replying to Mariusz Felisiak:

Replying to Manuel Baclet:

With your patch proposal, i do not think that:
Company.objects.only("name").defer("name").defer("country") is equivalent to Company.objects.only("name").defer("name", "country")

Did you check this? with proposed patch country is the only deferred fields in both cases. As far as I'm aware that's an intended behavior.

I believe Manuel is right. This happens because the set difference in one direction gives you the empty set that will clear out the deferred fields - but it is missing the fact that we might also be adding more defer fields than we had only fields in the first place, so that we actually switch from an .only() to a .defer() mode.

See the corresponding PR that should fix this behaviour https://github.com/django/django/pull/14667

comment:9 by Mariusz Felisiak <felisiak.mariusz@…>, 3 years ago

Resolution: fixed
Status: assignedclosed

In fd999318:

Fixed #32704 -- Fixed list of deferred fields when chaining QuerySet.defer() after only().

Note: See TracTickets for help on using tickets.
Back to Top