Code

Opened 2 months ago

Closed 2 months ago

#22023 closed Bug (fixed)

.values() followed by defer() or only() results invalid data or crash

Reported by: jtiai Owned by: wiget
Component: Database layer (models, ORM) Version: master
Severity: Normal Keywords: orm
Cc: Triage Stage: Accepted
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

Given model

class MyModel(models.Model):
    field_a = models.CharField(max_length=100)
    field_b = models.IntegerField()
    field_c = models.TextField()

Query MyModel.objects.values().only('field_b')
Returns totally incorrect set of fields

Query MyModel.objects.values().defer('field_b')
Returns all fields but data is shifted starting from field_b

Query MyModel.objects.values('field_a').only('field_b')
Results a crash due malformed SQL.

Attachments (0)

Change History (11)

comment:1 Changed 2 months ago by bmispelon

  • Needs documentation unset
  • Needs tests unset
  • Patch needs improvement unset
  • Triage Stage changed from Unreviewed to Accepted
  • Version changed from 1.5 to master

I can reproduce this on sqlite and it seems to be an old problem (I tested all versions of Django until 1.4 and they're all affected).

comment:2 Changed 2 months ago by Lauxley

I was curious so i tracked down the issue,
the iterator() method of the ValuesQuerySet doesn't use the deferred_loading attribute of the query, even though the docstring of Query.deferred_to_data says that :
"This [self.deferred_loading] is used to compute the columns to select from the database and also by the QuerySet class to work out which fields are being initialised on each model."

So a quick patch (django 1.4.5)

@@ -956,7 +956,14 @@
     def iterator(self):
         # Purge any extra columns that haven't been explicitly asked for
         extra_names = self.query.extra_select.keys()
-        field_names = self.field_names
+        deferred_fields, defer = self.query.deferred_loading
+        if deferred_fields:
+            if not defer:
+                field_names = list(deferred_fields)
+            else:
+                field_names = list(set(self.field_names) - deferred_fields)
+        else:
+            field_names = self.field_names
         aggregate_names = self.query.aggregate_select.keys()
 
         names = extra_names + field_names + aggregate_names

This is probably not the 'right' way to do it, only an hint.
I strongly feel that this should not be computed in the iterator method, i'm not even sure why the QuerySet class would need to do it itself, but i'm probably missing something.

Note:
.values('field_a').only('field_b') and .values('field_a').defer('field_a')
will still raise a DatabaseError, but i think it's fine in this case.

comment:3 Changed 2 months ago by Lauxley

woops

--- db/models/query.py
+++ db/models/query.py
@@ -956,7 +956,14 @@
     def iterator(self):

comment:4 Changed 2 months ago by akaariai

I wonder if we should just disallow .only() and .defer() after .values(). Deferred loading for .values() doesn't seem sensible to me.

comment:5 Changed 2 months ago by Lauxley

I guess it depends if you want to keep the same API for a QuerySet and a ValuesQuerySet.
The thing is .only() before .values() does nothing at all today,
so raising a NotImplementedError in ValuesQuerySet's .only() and .defer() would only create a weird case where
.values(...).only(...) != .only(...).values(...)
I know that not all query methods are commutable, but is there a good reason for it in this case ?

Also, why does the QuerySet/ValuesQuerySet instance has to compute the fields to set in the object/dict, in what case is it different from the fields fetched by the Query instance ?

comment:6 Changed 2 months ago by wiget

  • Owner changed from nobody to wiget
  • Status changed from new to assigned

comment:7 Changed 2 months ago by bmispelon

For what it's worth, the current documentation [1] states that "[...] a ValuesQuerySet is a subclass of QuerySet, so it has all methods of QuerySet".

If we start changing the behavior of ValuesQuerySet significantly, we may want to reorganize the documentation a bit and introduce proper reference documentation for it. Currently, it's only documented inside the reference for Queryset.values().

[1] https://docs.djangoproject.com/en/1.6/ref/models/querysets/#values

comment:8 Changed 2 months ago by timo

I have to agree with @akaariai -- I don't understand what the use case for using values() along with only() and defer() would be? The proposal of having only() override the fields in values() seems confusing. I would expect any fields not already selected by values() not to be available later in the QuerySet.

comment:9 Changed 2 months ago by loic84

IMO "deferring" only makes sense if you get actual objects from which you can fetch deferred fields. While only() could conceptually modify the list of fields used by values() (but shouldn't since we made it the antonym of defer()), the "deferring" concept and therefore defer() are completely at odd with values()/values_list().

I recommend raising NotImplementedError for both ValuesQuerySet.defer() and ValuesQuerySet.only().

comment:11 Changed 2 months ago by Tim Graham <timograham@…>

  • Resolution set to fixed
  • Status changed from assigned to closed

In faf6a911ad7357c8dd1defa3be0317a8418febf7:

Fixed #22023 -- Raised an error for values() followed by defer() or only().

Previously, doing so resulted in invalid data or crash.

Thanks jtiai for the report and Karol Jochelson,
Jakub Nowak, Loic Bistuer, and Baptiste Mispelon for reviews.

Add Comment

Modify Ticket

Change Properties
<Author field>
Action
as closed
as The resolution will be set. Next status will be 'closed'
The resolution will be deleted. Next status will be 'new'
Author


E-mail address and user name can be saved in the Preferences.

 
Note: See TracTickets for help on using tickets.