Opened 10 years ago

Closed 10 years ago

#22023 closed Bug (fixed)

.values() followed by defer() or only() results invalid data or crash

Reported by: Jani Tiainen Owned by: wiget
Component: Database layer (models, ORM) Version: dev
Severity: Normal Keywords: orm
Cc: Triage Stage: Accepted
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

Given model

class MyModel(models.Model):
    field_a = models.CharField(max_length=100)
    field_b = models.IntegerField()
    field_c = models.TextField()

Query MyModel.objects.values().only('field_b')
Returns totally incorrect set of fields

Query MyModel.objects.values().defer('field_b')
Returns all fields but data is shifted starting from field_b

Query MyModel.objects.values('field_a').only('field_b')
Results a crash due malformed SQL.

Change History (11)

comment:1 by Baptiste Mispelon, 10 years ago

Triage Stage: UnreviewedAccepted
Version: 1.5master

I can reproduce this on sqlite and it seems to be an old problem (I tested all versions of Django until 1.4 and they're all affected).

comment:2 by Lauxley, 10 years ago

I was curious so i tracked down the issue,
the iterator() method of the ValuesQuerySet doesn't use the deferred_loading attribute of the query, even though the docstring of Query.deferred_to_data says that :
"This [self.deferred_loading] is used to compute the columns to select from the database and also by the QuerySet class to work out which fields are being initialised on each model."

So a quick patch (django 1.4.5)

@@ -956,7 +956,14 @@
     def iterator(self):
         # Purge any extra columns that haven't been explicitly asked for
         extra_names = self.query.extra_select.keys()
-        field_names = self.field_names
+        deferred_fields, defer = self.query.deferred_loading
+        if deferred_fields:
+            if not defer:
+                field_names = list(deferred_fields)
+            else:
+                field_names = list(set(self.field_names) - deferred_fields)
+        else:
+            field_names = self.field_names
         aggregate_names = self.query.aggregate_select.keys()
 
         names = extra_names + field_names + aggregate_names

This is probably not the 'right' way to do it, only an hint.
I strongly feel that this should not be computed in the iterator method, i'm not even sure why the QuerySet class would need to do it itself, but i'm probably missing something.

Note:
.values('field_a').only('field_b') and .values('field_a').defer('field_a')
will still raise a DatabaseError, but i think it's fine in this case.

comment:3 by Lauxley, 10 years ago

woops

--- db/models/query.py
+++ db/models/query.py
@@ -956,7 +956,14 @@
     def iterator(self):

comment:4 by Anssi Kääriäinen, 10 years ago

I wonder if we should just disallow .only() and .defer() after .values(). Deferred loading for .values() doesn't seem sensible to me.

comment:5 by Lauxley, 10 years ago

I guess it depends if you want to keep the same API for a QuerySet and a ValuesQuerySet.
The thing is .only() before .values() does nothing at all today,
so raising a NotImplementedError in ValuesQuerySet's .only() and .defer() would only create a weird case where
.values(...).only(...) != .only(...).values(...)
I know that not all query methods are commutable, but is there a good reason for it in this case ?

Also, why does the QuerySet/ValuesQuerySet instance has to compute the fields to set in the object/dict, in what case is it different from the fields fetched by the Query instance ?

comment:6 by wiget, 10 years ago

Owner: changed from nobody to wiget
Status: newassigned

comment:7 by Baptiste Mispelon, 10 years ago

For what it's worth, the current documentation [1] states that "[...] a ValuesQuerySet is a subclass of QuerySet, so it has all methods of QuerySet".

If we start changing the behavior of ValuesQuerySet significantly, we may want to reorganize the documentation a bit and introduce proper reference documentation for it. Currently, it's only documented inside the reference for Queryset.values().

[1] https://docs.djangoproject.com/en/1.6/ref/models/querysets/#values

comment:8 by Tim Graham, 10 years ago

I have to agree with @akaariai -- I don't understand what the use case for using values() along with only() and defer() would be? The proposal of having only() override the fields in values() seems confusing. I would expect any fields not already selected by values() not to be available later in the QuerySet.

comment:9 by loic84, 10 years ago

IMO "deferring" only makes sense if you get actual objects from which you can fetch deferred fields. While only() could conceptually modify the list of fields used by values() (but shouldn't since we made it the antonym of defer()), the "deferring" concept and therefore defer() are completely at odd with values()/values_list().

I recommend raising NotImplementedError for both ValuesQuerySet.defer() and ValuesQuerySet.only().

comment:11 by Tim Graham <timograham@…>, 10 years ago

Resolution: fixed
Status: assignedclosed

In faf6a911ad7357c8dd1defa3be0317a8418febf7:

Fixed #22023 -- Raised an error for values() followed by defer() or only().

Previously, doing so resulted in invalid data or crash.

Thanks jtiai for the report and Karol Jochelson,
Jakub Nowak, Loic Bistuer, and Baptiste Mispelon for reviews.

Note: See TracTickets for help on using tickets.
Back to Top