Opened 16 years ago

Closed 9 years ago

Last modified 9 years ago

#9619 closed Bug (fixed)

to_python not called when fetching data with .values(...)

Reported by: Valera Grishin Owned by: nobody
Component: Database layer (models, ORM) Version: dev
Severity: Normal Keywords: values, custom field, datetime
Cc: amlau@…, olau@…, Chris Chambers, Kevin Christopher Henry, django@… Triage Stage: Accepted
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

Consider following models:

from django.db import models

class BinaryField(models.Field):
    __metaclass__ = models.SubfieldBase

    def encode(self):
        # some encoding here

    def decode(self):
        # some decoding here

    def db_type(self):
        return 'text'

    def get_db_prep_value(self, value):
        return super(BinaryField, self).get_db_prep_value(BinaryField.__encode(value))

    def to_python(self, value):
        return super(BinaryField, self).to_python(BinaryField.__decode(value))

class Logo(models.Model):
    channel = models.CharField(max_length=100)
    logo = BinaryField(editable=False)

Now assume you want to fetch values of "logo" field:

print Logo.objects.values('logo')[1]['logo']

This will print encoded value as it is get stored in the database. That is to_python not called.

Change History (22)

comment:1 by , 16 years ago

Cc: amlau@… added

comment:2 by , 16 years ago

Cc: amlau@… added; amlau@… removed

comment:3 by Jacob, 16 years ago

milestone: 1.1
Triage Stage: UnreviewedAccepted

comment:4 by Malcolm Tredinnick, 16 years ago

Triage Stage: AcceptedDesign decision needed

This is a bit of a line-ball, since it's called values, not Python objects, so returning the data representation isn't a completely bad idea -- leaving it up to the user to decide what they want to do with it. For example, complete consistency here would then be converting foreign key values to their corresponding Python objects (kind of defeating the purpose of calling values() instead of a normal queryset). I've just wontfixed #8144 about that particular issue (largely because of the huge impact it would have on existing code).

I'm kind of in favour of staying away from django.db.models.fields.Field entirely when working with values() and values_list(), keeping it as the fast path for raw data.

I'm going to risk the wrath of Jacob and DDN this for a little bit. I think we need a discussion when there's a quiet moment.

comment:5 by Malcolm Tredinnick, 16 years ago

Also occurs to me that a new parameter to values() and values_list() might not be out of the realms of possibility here.

Something like do_the_extra_work_required_to_convert_to_python_even_though_I_could_use_a_normal_queryset_with_defer = True, perhaps. :-)

comment:6 by Alex Gaynor, 16 years ago

Malcolm, you mentions defer. Got something you want to share with the class :)

comment:7 by Malcolm Tredinnick, 16 years ago

Keep the ticket comments on-topic, please.

comment:8 by Malcolm Tredinnick, 16 years ago

milestone: 1.1

Having thought about this a lot more, it isn't really a bug in existing functionality since values() has always behaved this way and the name is pretty descriptive. Worthwhile having a discussion about adding some feature to this in the future, but I can't see that it's a bug in the 1.1-release sense.

comment:9 by benwilber@…, 15 years ago

Keywords: values custom field datetime added
Version: 1.0SVN

This seems like a bug to me. Please see my post for a better case:
http://groups.google.com/group/django-users/browse_thread/thread/7f6f75334ef1653c/0174e8ce8e1e514b?hl=en%C2%AEe8ce8e1e514b

It seems to me that 'values()' should return data as it's _meant_ to be consumed, not as it's stored. If you want raw data, get at something like a 'raw_values()' method or similar. As it stands now, I'm not sure how to use date_hierarchy and maybe others with a custom datetime field when 'to_python()' is not called without digging into the internals. Are there other standard apps that access API'd data 'through the back door' as well?

comment:10 by Ole Laursen, 14 years ago

Hey, while we're waiting for a design decision, could we please get a quick fix on this that updates the documentation? It never has occurred to me that values() wouldn't deserialize, I've always thought it would just be an easy way to get plain dicts/tuples instead of the heavier objects, and the documentation doesn't mention raw values at all:

"Returns a ValuesQuerySet -- a QuerySet that returns dictionaries when used as an iterable, rather than model-instance objects.

Each of those dictionaries represents an object, with the keys corresponding to the attribute names of model objects."

I have been using values() for a couple of years without noticing the difference before today, where it broke on me because I've defined a custom field.

Here's a suggestion for a wording: "Note that the values returned are directly from the database, no field-specific conversions are performed, e.g. for custom fields with to_python()."

Otherwise, +1 for getting this fixed, with raw=True/False or whatever. :)

comment:11 by Ole Laursen, 14 years ago

Cc: olau@… added

comment:12 by Chris Chambers, 14 years ago

Cc: Chris Chambers added

comment:13 by Luke Plant, 14 years ago

Severity: Normal
Type: Bug

comment:14 by sam@…, 13 years ago

Easy pickings: unset
UI/UX: unset

IMHO, if this isn't a bug it's definitely a gotcha (and the documentation should be updated). I spent the last several hours trying to debug why values_list was not returning the python objects defined by my custom model field's to_python method.

comment:15 by Aymeric Augustin, 13 years ago

I probably haven't been here long enough, but I'm finding this behavior is extremely unexpected.

I'll leave this ticket in DDN because I'm not sure I grasp all the consequences.

But my gut feeling is that we should swallow the backwards incompatibility and fix this at some point — or at the very least make it clear in the documentation that QuerySet.values() returns raw database values, as spit by the database adapter.

comment:16 by Anssi Kääriäinen, 12 years ago

Triage Stage: Design decision neededAccepted

I vote for raw kwarg, if True no conversion happens. One day the default should be False, but likely with backwards compatibility period where it default still to True.

It might be worth it to inspect if there is a way to skip those method calls that do not actually do anything, maybe just if 'col_field.to_python == Field.to_python'. The standard Field.to_python just returns the value as is. There might be need to ask the field explicitly if the given connection's values needs conversion, most of the to_python() implementations seem non-necessary (IntegerField does int(value) for example, but I don't believe this is necessary on any DB's values).

The overhead of calling no-ops in trivial test case is 0.7s vs 0.07s. The test is:

def noop(value):
    return value

myvals = [
    (0, 1, 2, 3, 4),
] * 100
def values():
    for row in myvals:
        yield row  # alternate: yield map(noop, row)

from datetime import datetime
start = datetime.now()
for _ in range(0, 10000):
    for val in values():
        pass
print datetime.now() - start

Still, the test passes in under a second even with the map call, so it might be that the overhead doesn't matter in practice.

Anyways, marking as accepted, it would be good to end up in a situation where you can opt out from conversion, but the default is to use to_python() for the values.

comment:17 by Kevin Christopher Henry, 11 years ago

Cc: Kevin Christopher Henry added

comment:18 by jedie, 9 years ago

Cc: django@… added

comment:19 by jedie, 9 years ago

If i see it right, this has been fixed with Django v1.8 and the change from to_python() to from_db_value() isn't it?

The docs at https://docs.djangoproject.com/en/1.8/howto/custom-model-fields/#converting-values-to-python-objects says:

If present for the field subclass, from_db_value() will be called in all circumstances when the data is loaded from the database, including in aggregates and values() calls.
Version 0, edited 9 years ago by jedie (next)

comment:20 by Tim Graham <timograham@…>, 9 years ago

In 9aac99e:

Refs #9619 -- Added a test that Field.from_db_value() is called on QuerySet.values()

comment:21 by Tim Graham, 9 years ago

Resolution: fixed
Status: newclosed

comment:22 by Markus Holtermann <info@…>, 9 years ago

In 64a4211:

Refs #9619 -- Fixed failing test case

Regression introduced in 9aac99e96d100b34b6daa3a137531eff4f17076e

Note: See TracTickets for help on using tickets.
Back to Top