Opened 6 years ago

Last modified 14 months ago

#15648 assigned New feature

Allow QuerySet.values_list() to return a namedtuple

Reported by: Paul Miller Owned by: Paul Miller
Component: Database layer (models, ORM) Version: master
Severity: Normal Keywords: namedtuple, tuple, queryset
Cc: kmike84@…, cg@…, dougal85@…, ShawnMilo, paulmillr@… Triage Stage: Accepted
Has patch: yes Needs documentation: yes
Needs tests: no Patch needs improvement: yes
Easy pickings: no UI/UX: no

Description

Python 2.6 supports named tuples. Information about field names is stored in the tuple class, so there's no overhead like in dictionaries.
I propose to use them in querysets instead of values() / values_list().

qs = Items.objects.filter(...).namedtuples('title', 'amount', 'price')
for item in qs:
    print item.title, item.amount
    total += item.amount * item.price
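
The overhead claim can be checked with a small standalone sketch (Python 3, no Django). The `ItemTuple` name and the sample values are made up for illustration. The field names live once on the generated class, while every dict carries its own keys:

```python
import sys
from collections import namedtuple

# Hypothetical row type; the field names match the example above.
ItemTuple = namedtuple('ItemTuple', ['title', 'amount', 'price'])

row = ItemTuple(title='pen', amount=3, price=2)
as_dict = {'title': 'pen', 'amount': 3, 'price': 2}

# Field names are stored on the class, not on each instance.
print(ItemTuple._fields)

# Per-row container size in bytes (CPython-specific; exact values vary by version).
print(sys.getsizeof(row), sys.getsizeof(as_dict))
```

The sizes cover only the container itself, not the values it refers to, so this is a rough indication rather than a full memory profile.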

Patch:

from itertools import imap
from collections import namedtuple

# python 2.5 doesn't support named tuples, so we can use this http://code.activestate.com/recipes/500261/

from django.db.models.query import ValuesQuerySet

class NamedTuplesQuerySet(ValuesQuerySet):
    def iterator(self):
        # get field names 
        extra_names = self.query.extra_select.keys()
        field_names = self.field_names
        aggregate_names = self.query.aggregate_select.keys()
        names = extra_names + field_names + aggregate_names
       
        # create named tuple class
        tuple_cls = namedtuple('%sTuple' % self.model.__name__, names)

        results_iter = self.query.get_compiler(self.db).results_iter()
        # wrap every row in our named tuple class
        return imap(tuple_cls._make, results_iter)

from django.db.models.query import QuerySet

def namedtuples(self, *fields):
    return self._clone(klass=NamedTuplesQuerySet, setup=True, _fields=fields)
QuerySet.namedtuples = namedtuples
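
The core of the patch above is wrapping each raw row with `tuple_cls._make`. A minimal sketch of just that step, assuming Python 3 and no Django: `rows_as_namedtuples` and `raw_rows` are hypothetical stand-ins for the queryset machinery and for `results_iter()`.

```python
from collections import namedtuple

def rows_as_namedtuples(model_name, names, rows):
    """Wrap each raw row in a namedtuple class built from the column names."""
    tuple_cls = namedtuple('%sTuple' % model_name, names)
    # map() is lazy in Python 3, like itertools.imap in Python 2.
    return map(tuple_cls._make, rows)

# Made-up rows standing in for what the SQL compiler would yield.
raw_rows = [('pen', 3, 2), ('ink', 1, 5)]
items = list(rows_as_namedtuples('Item', ['title', 'amount', 'price'], raw_rows))

total = sum(item.amount * item.price for item in items)
print(total)
```

The tuple class is built once per query rather than once per row, which is what keeps the per-row cost down to a single `_make` call.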

Attachments (2)

namedtuples.patch (11.2 KB) - added by Paul Miller <paulmillr@…> 5 years ago.
Named tuples patch
values_list_namedtuples.diff (5.1 KB) - added by Anssi Kääriäinen 5 years ago.


Change History (21)

comment:1 Changed 6 years ago by Mikhail Korobov

Cc: kmike84@… added
Needs documentation: unset
Needs tests: unset
Patch needs improvement: unset

comment:2 Changed 6 years ago by me@…

Why not

Item.objects.all().only(*names_list)

?

comment:3 in reply to:  2 Changed 6 years ago by Paul Miller

Replying to me@…:

Why not

Item.objects.all().only(*names_list)

?

  1. repr(): values() and values_list() results are much easier to debug.
  2. Deferred objects (which is what only() creates) are not iterable.
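
Both points can be seen on a plain namedtuple, outside Django. This is a minimal Python 3 sketch with a made-up `ItemTuple` type; it illustrates the namedtuple side only and does not reproduce how deferred model instances behave.

```python
from collections import namedtuple

# Hypothetical row type for illustration.
ItemTuple = namedtuple('ItemTuple', ['title', 'amount'])
row = ItemTuple('pen', 3)

print(repr(row))      # the repr shows field names next to their values
title, amount = row   # rows unpack like any other tuple
print(list(row))      # and can be iterated directly
```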

comment:4 Changed 6 years ago by Adrian Holovaty

Needs tests: set
Patch needs improvement: set
Triage Stage: Unreviewed → Accepted

I like the idea of introducing a QuerySet that returns named tuples! Not crazy about the name, but I can't think of a better one at the moment.

Can you create an actual .patch file, include unit tests and include a fallback namedtuple implementation for our Python 2.4/2.5 users? A good place for that would be in django/utils/datastructures.py.

comment:5 Changed 6 years ago by Luke Plant

Type: New feature

comment:6 Changed 6 years ago by Luke Plant

Severity: Normal

comment:7 Changed 6 years ago by Christopher Grebs

Cc: cg@… added

comment:8 Changed 5 years ago by Dougal Matthews

Cc: dougal85@… added
Easy pickings: unset
Needs documentation: set

comment:9 Changed 5 years ago by ShawnMilo

Cc: ShawnMilo added

Changed 5 years ago by Paul Miller <paulmillr@…>

Attachment: namedtuples.patch added

Named tuples patch

comment:10 Changed 5 years ago by Paul Miller <paulmillr@…>

Cc: paulmillr@… added
Needs documentation: unset
Needs tests: unset
Patch needs improvement: unset
UI/UX: unset

I've added the necessary tests and a fallback implementation for Python 2.5 users.

comment:11 Changed 5 years ago by Paul Miller

Owner: changed from nobody to Paul Miller
Status: new → assigned

comment:12 Changed 5 years ago by Paul Miller

Triage Stage: Accepted → Ready for checkin

I don't know much about ticket triaging, but I think that «ready for checkin» would be more correct for this

comment:13 Changed 5 years ago by Jannis Leidel

Needs documentation: set
Patch needs improvement: set
Triage Stage: Ready for checkin → Accepted

This definitely isn't ready for checkin yet as there are no docs.

comment:14 Changed 5 years ago by Jacob

milestone: 1.4

Milestone 1.4 deleted

comment:15 Changed 5 years ago by Anssi Kääriäinen

Why the namedtuples API? Is there some reason values_list should not return namedtuples directly? It should be backwards compatible, as index-based fetching still works. I believe there will not be any serious performance impact.

If the performance is acceptable, then raw cursors should return named tuples, too. That would be a really nice addition for raw SQL users. But of course, this is another ticket's problem.
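
The backwards-compatibility argument rests on namedtuples being tuple subclasses. A minimal Python 3 sketch of what existing index-based code would see, using a made-up `Row` type (this checks plain Python behaviour only, not Django's test suite):

```python
from collections import namedtuple

# Hypothetical row type for illustration.
Row = namedtuple('Row', ['title', 'amount'])
row = Row('pen', 3)

# Code written against plain tuples keeps working.
is_tuple = isinstance(row, tuple)
first, last = row[0], row[-1]
same_as_plain = (row == ('pen', 3))
head = row[:1]  # slicing returns a plain tuple

print(is_tuple, first, last, same_as_plain, head)
```

One difference that could still surface is code that checks `type(row) is tuple` exactly, since that comparison is false for a subclass.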

comment:16 Changed 5 years ago by Anssi Kääriäinen

I have a very quick implementation of returning namedtuples from .values_list(). The patch seems to be backwards compatible, that is, all tests pass.

The performance penalty seems to be about 20% for a one-field values_list() call. It is a good question whether that is acceptable. I guess yes, as namedtuples are _much_ nicer than regular tuples. On the other hand, the main use case for values_list is performance.

The results are from using SQLite to fetch 200 objects with one field containing a four-char value, and then additional fields containing NULLs. values_list with one field gives the 20% performance penalty; 5 fields gives a 15% penalty. Notably, in-memory SQLite with fields containing practically no data is pretty much the worst case. I guess the real-world penalty will be in the 5%-20% range depending on the use case. My opinion is that getting namedtuples is worth it.

I guess a similar performance penalty would be paid if cursors were to return named tuples directly, and that is an even tougher call to make. Cursors are even more performance critical, and every query made by Django would need to pay the penalty. Although when creating a model instance, for example, that penalty would be hidden by the overhead of model.__init__.

It is of course possible to add the .namedtuples() API, but then we have three things doing nearly the same thing, and two doing exactly the same thing from user perspective.
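
A rough way to get a feel for the construction cost in isolation is a `timeit` comparison. This is a sketch under stated assumptions, not the benchmark described above: it is Python 3, involves no database or Django code, and the `Row` type and `raw_rows` data are made up to mirror the 200-row, one-field case.

```python
import timeit
from collections import namedtuple

# Hypothetical one-field row type and 200 rows holding a four-char value.
Row = namedtuple('Row', ['code'])
raw_rows = [('abcd',)] * 200

def plain():
    # Baseline: build plain tuples from the raw rows.
    return [tuple(r) for r in raw_rows]

def named():
    # Candidate: build namedtuples from the same rows.
    return list(map(Row._make, raw_rows))

plain_s = timeit.timeit(plain, number=2000)
named_s = timeit.timeit(named, number=2000)
print('plain: %.4fs  named: %.4fs' % (plain_s, named_s))
```

Because query execution and row fetching are left out, the relative difference printed here will not match the percentages quoted in the comment; it only isolates the tuple-building step.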

Changed 5 years ago by Anssi Kääriäinen

Attachment: values_list_namedtuples.diff added

comment:17 Changed 4 years ago by anonymous

This would be really, really nice.

comment:18 Changed 3 years ago by suprzer0@…

Perhaps add a keyword argument to values_list() for namedtuples? Something like

Item.objects.values_list(*names_list, named=True)

With named defaulting to False.

Calling .values_list(*names_list, named=True, flat=True) would need to throw some kind of exception I'd imagine.
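
The keyword-argument idea, including the `flat`/`named` conflict, can be sketched as a standalone function. Everything here is hypothetical: this `values_list` takes rows and field names directly instead of being a QuerySet method, and the choice of `TypeError` and the error messages are assumptions, since the comment only says "some kind of exception".

```python
from collections import namedtuple

def values_list(rows, field_names, flat=False, named=False):
    """Hypothetical stand-in for the proposed keyword; not Django's code."""
    if flat and named:
        # Assumed behaviour: the two options are mutually exclusive.
        raise TypeError("'flat' and 'named' cannot be used together.")
    if flat:
        if len(field_names) != 1:
            raise TypeError("'flat' requires exactly one field.")
        return [row[0] for row in rows]
    if named:
        row_cls = namedtuple('Row', field_names)
        return [row_cls._make(row) for row in rows]
    # Default: plain tuples, as before.
    return [tuple(row) for row in rows]

rows = [('pen', 3), ('ink', 1)]
named_rows = values_list(rows, ['title', 'amount'], named=True)
print(named_rows[0].title, named_rows[1].amount)
```

With `named` defaulting to False, callers that never pass it get plain tuples, so the performance cost is opt-in.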

comment:19 Changed 14 months ago by Tim Graham

Summary: [Feature request] NamedTupleQuerySet → Allow QuerySet.values_list() to return a namedtuple