Opened 3 years ago

Closed 3 years ago

#22023 closed Bug (fixed)

.values() followed by defer() or only() results invalid data or crash

Reported by: jtiai Owned by: wiget
Component: Database layer (models, ORM) Version: master
Severity: Normal Keywords: orm
Cc: Triage Stage: Accepted
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no


Given model

class MyModel(models.Model):
    field_a = models.CharField(max_length=100)
    field_b = models.IntegerField()
    field_c = models.TextField()

Query MyModel.objects.values().only('field_b')
Returns totally incorrect set of fields

Query MyModel.objects.values().defer('field_b')
Returns all fields but data is shifted starting from field_b

Query MyModel.objects.values('field_a').only('field_b')
Results a crash due malformed SQL.

Change History (11)

comment:1 Changed 3 years ago by bmispelon

  • Needs documentation unset
  • Needs tests unset
  • Patch needs improvement unset
  • Triage Stage changed from Unreviewed to Accepted
  • Version changed from 1.5 to master

I can reproduce this on sqlite and it seems to be an old problem (I tested all versions of Django until 1.4 and they're all affected).

comment:2 Changed 3 years ago by Lauxley

I was curious so i tracked down the issue,
the iterator() method of the ValuesQuerySet doesn't use the deferred_loading attribute of the query, even though the docstring of Query.deferred_to_data says that :
"This [self.deferred_loading] is used to compute the columns to select from the database and also by the QuerySet class to work out which fields are being initialised on each model."

So a quick patch (django 1.4.5)

@@ -956,7 +956,14 @@
     def iterator(self):
         # Purge any extra columns that haven't been explicitly asked for
         extra_names = self.query.extra_select.keys()
-        field_names = self.field_names
+        deferred_fields, defer = self.query.deferred_loading
+        if deferred_fields:
+            if not defer:
+                field_names = list(deferred_fields)
+            else:
+                field_names = list(set(self.field_names) - deferred_fields)
+        else:
+            field_names = self.field_names
         aggregate_names = self.query.aggregate_select.keys()
         names = extra_names + field_names + aggregate_names

This is probably not the 'right' way to do it, only an hint.
I strongly feel that this should not be computed in the iterator method, i'm not even sure why the QuerySet class would need to do it itself, but i'm probably missing something.

.values('field_a').only('field_b') and .values('field_a').defer('field_a')
will still raise a DatabaseError, but i think it's fine in this case.

comment:3 Changed 3 years ago by Lauxley


--- db/models/
+++ db/models/
@@ -956,7 +956,14 @@
     def iterator(self):

comment:4 Changed 3 years ago by akaariai

I wonder if we should just disallow .only() and .defer() after .values(). Deferred loading for .values() doesn't seem sensible to me.

comment:5 Changed 3 years ago by Lauxley

I guess it depends if you want to keep the same API for a QuerySet and a ValuesQuerySet.
The thing is .only() before .values() does nothing at all today,
so raising a NotImplementedError in ValuesQuerySet's .only() and .defer() would only create a weird case where
.values(...).only(...) != .only(...).values(...)
I know that not all query methods are commutable, but is there a good reason for it in this case ?

Also, why does the QuerySet/ValuesQuerySet instance has to compute the fields to set in the object/dict, in what case is it different from the fields fetched by the Query instance ?

comment:6 Changed 3 years ago by wiget

  • Owner changed from nobody to wiget
  • Status changed from new to assigned

comment:7 Changed 3 years ago by bmispelon

For what it's worth, the current documentation [1] states that "[...] a ValuesQuerySet is a subclass of QuerySet, so it has all methods of QuerySet".

If we start changing the behavior of ValuesQuerySet significantly, we may want to reorganize the documentation a bit and introduce proper reference documentation for it. Currently, it's only documented inside the reference for Queryset.values().


comment:8 Changed 3 years ago by timo

I have to agree with @akaariai -- I don't understand what the use case for using values() along with only() and defer() would be? The proposal of having only() override the fields in values() seems confusing. I would expect any fields not already selected by values() not to be available later in the QuerySet.

comment:9 Changed 3 years ago by loic84

IMO "deferring" only makes sense if you get actual objects from which you can fetch deferred fields. While only() could conceptually modify the list of fields used by values() (but shouldn't since we made it the antonym of defer()), the "deferring" concept and therefore defer() are completely at odd with values()/values_list().

I recommend raising NotImplementedError for both ValuesQuerySet.defer() and ValuesQuerySet.only().

comment:11 Changed 3 years ago by Tim Graham <timograham@…>

  • Resolution set to fixed
  • Status changed from assigned to closed

In faf6a911ad7357c8dd1defa3be0317a8418febf7:

Fixed #22023 -- Raised an error for values() followed by defer() or only().

Previously, doing so resulted in invalid data or crash.

Thanks jtiai for the report and Karol Jochelson,
Jakub Nowak, Loic Bistuer, and Baptiste Mispelon for reviews.

Note: See TracTickets for help on using tickets.
Back to Top