Opened 7 years ago

Closed 7 years ago

Last modified 7 years ago

#27535 closed New feature (wontfix)

RFE: Exclude pk from queries

Reported by: Alexander Todorov Owned by: nobody
Component: Database layer (models, ORM) Version: 1.10
Severity: Normal Keywords:
Cc: Triage Stage: Unreviewed
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

Hello guys,
in models/sql/query.py there is this snippet:

   def deferred_to_data(self, target, callback):
       ... skip ...
       must_include = {orig_opts.concrete_model: {orig_opts.pk}}

introduced in https://github.com/django/django/commit/29050ef999e1931efb6c62471c7e07d2ea5e96ea, which is causing the following difference between queries:

>>> print(Book.objects.values('genre').query)
SELECT "demoproject_book"."genre_id" FROM "demoproject_book"
>>>

is different from

>>> print(Book.objects.only('genre').query)
SELECT "demoproject_book"."id", "demoproject_book"."genre_id" FROM "demoproject_book"

Because the "id" field is included in the second query it produces different values when .annotate()-ed.

In the environment I work with I don't want to use .values() b/c when I iterate over the results I need to have the objects, not a dict of values. I'm requesting a method for excluding the PK field from the query. Alternatively I'd like for the .values() method to have an alternative which returns model objects, not dictionary.

Can you point me in the right direction? I'm willing to provide a patch.

Change History (5)

comment:1 by Tim Graham, 7 years ago

Query construction in the ORM isn't my area of expertise, but I think it would help if you could describe the use case a little better with a psuedo-queryset and the resulting SQL you desire.

comment:2 by Shai Berger, 7 years ago

Hi Alexander, thanks for this suggestion.

The value of the PK is the link between the object in memory and the record in the database. An object with deferred fields must have it, or else any access to the deferred fields would be an error.

It seems like you can get what you want with some variation on the following (caveat: untested)

def values_as_objects(values_qset, model, *fields):
    for values in values_qset:
          obj = model(**{field: values[field] for field in fields})
          for k,v in values.items():
                if k not in fields:
                    setattr(obj, k, v)
          yield obj
    # or if you want to make it a little more complicated to write and easier to use,
    # you can get the model from the qset and the fields from the model)

objs = values_as_objects(Book.objects.values('genre').annotate(whatever=...), Book, 'genre')

while adding any of the requested features to Django querysets in general would, IMO, make the API much less clear and consistent.

(my original opinion was that this should be wontfix'ed, but in view of Tim's comment I am waiting for more input)

comment:3 by Simon Charette, 7 years ago

Resolution: wontfix
Status: newclosed

I agree with all of what Shai said, I think this should be wontfix'ed.

comment:4 by Alexander Todorov, 7 years ago

To answer the question from Tim. I want to be able to do the following:

Book.objects.values('genre').annotate(Avg('rating'))

SELECT "demoproject_book"."genre_id", AVG("demoproject_book"."rating") AS "rating__avg" FROM "demoproject_book" GROUP BY "demoproject_book"."genre_id"

However iterate over objects not dict. I'm the new maintainer of django-chartit, where the user passes query sets as the source for charts and lists of terms for the X and Y axes. The terms are usually field names. Recently users have requested to be able to pass model attributes as terms and be able to use RawQuerySet as the chart source. Internally it is easier to work only with objects instead of special casing the value query set and work with dictionaries.

I will the helper method proposed above as well. Thanks for the tip.

comment:5 by Marten Kenbeek, 7 years ago

.values() does a GROUP BY, so each record in the resulting queryset can represent multiple rows in the original table. It doesn't make much sense to represent these records as model instances: all model instance methods assume that an instance represent a single row, and model instances promise that all attributes are accessible. If a model instance represents multiple rows, it's unclear what value deferred instance attributes should have or what instance methods should do.

.only() is different because it does not remove information from the instance, it simply defers retrieving that information from the database. Each model instance still represent a single row in the database.

I'm -1 on this. If a method needs to support both kinds of querysets, it should explicitly support both regular querysets and .values() querysets. It's a tiny bit of extra code, but representing .values() querysets with model instances would simply be incorrect.

Note: See TracTickets for help on using tickets.
Back to Top