Opened 8 years ago

Closed 16 months ago

Last modified 16 months ago

#9619 closed Bug (fixed)

to_python not called when fetching data with .values(...)

Reported by: Valera Grishin Owned by: nobody
Component: Database layer (models, ORM) Version: master
Severity: Normal Keywords: values, custom field, datetime
Cc: amlau@…, olau@…, Chris Chambers, Kevin Christopher Henry, django@… Triage Stage: Accepted
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

Consider following models:

from django.db import models

class BinaryField(models.Field):
    __metaclass__ = models.SubfieldBase

    def encode(self):
        # some encoding here

    def decode(self):
        # some decoding here

    def db_type(self):
        return 'text'

    def get_db_prep_value(self, value):
        return super(BinaryField, self).get_db_prep_value(BinaryField.__encode(value))

    def to_python(self, value):
        return super(BinaryField, self).to_python(BinaryField.__decode(value))

class Logo(models.Model):
    channel = models.CharField(max_length=100)
    logo = BinaryField(editable=False)

Now assume you want to fetch values of "logo" field:

print Logo.objects.values('logo')[1]['logo']

This will print encoded value as it is get stored in the database. That is to_python not called.

Change History (22)

comment:1 Changed 8 years ago by

Cc: amlau@… added
Needs documentation: unset
Needs tests: unset
Patch needs improvement: unset

comment:2 Changed 8 years ago by

Cc: amlau@… added; amlau@… removed

comment:3 Changed 8 years ago by Jacob

milestone: 1.1
Triage Stage: UnreviewedAccepted

comment:4 Changed 8 years ago by Malcolm Tredinnick

Triage Stage: AcceptedDesign decision needed

This is a bit of a line-ball, since it's called values, not Python objects, so returning the data representation isn't a completely bad idea -- leaving it up to the user to decide what they want to do with it. For example, complete consistency here would then be converting foreign key values to their corresponding Python objects (kind of defeating the purpose of calling values() instead of a normal queryset). I've just wontfixed #8144 about that particular issue (largely because of the huge impact it would have on existing code).

I'm kind of in favour of staying away from django.db.models.fields.Field entirely when working with values() and values_list(), keeping it as the fast path for raw data.

I'm going to risk the wrath of Jacob and DDN this for a little bit. I think we need a discussion when there's a quiet moment.

comment:5 Changed 8 years ago by Malcolm Tredinnick

Also occurs to me that a new parameter to values() and values_list() might not be out of the realms of possibility here.

Something like do_the_extra_work_required_to_convert_to_python_even_though_I_could_use_a_normal_queryset_with_defer = True, perhaps. :-)

comment:6 Changed 8 years ago by Alex Gaynor

Malcolm, you mentions defer. Got something you want to share with the class :)

comment:7 Changed 8 years ago by Malcolm Tredinnick

Keep the ticket comments on-topic, please.

comment:8 Changed 7 years ago by Malcolm Tredinnick

milestone: 1.1

Having thought about this a lot more, it isn't really a bug in existing functionality since values() has always behaved this way and the name is pretty descriptive. Worthwhile having a discussion about adding some feature to this in the future, but I can't see that it's a bug in the 1.1-release sense.

comment:9 Changed 7 years ago by benwilber@…

Keywords: values custom field datetime added
Version: 1.0SVN

This seems like a bug to me. Please see my post for a better case:
http://groups.google.com/group/django-users/browse_thread/thread/7f6f75334ef1653c/0174e8ce8e1e514b?hl=en%C2%AEe8ce8e1e514b

It seems to me that 'values()' should return data as it's _meant_ to be consumed, not as it's stored. If you want raw data, get at something like a 'raw_values()' method or similar. As it stands now, I'm not sure how to use date_hierarchy and maybe others with a custom datetime field when 'to_python()' is not called without digging into the internals. Are there other standard apps that access API'd data 'through the back door' as well?

comment:10 Changed 6 years ago by Ole Laursen

Hey, while we're waiting for a design decision, could we please get a quick fix on this that updates the documentation? It never has occurred to me that values() wouldn't deserialize, I've always thought it would just be an easy way to get plain dicts/tuples instead of the heavier objects, and the documentation doesn't mention raw values at all:

"Returns a ValuesQuerySet -- a QuerySet that returns dictionaries when used as an iterable, rather than model-instance objects.

Each of those dictionaries represents an object, with the keys corresponding to the attribute names of model objects."

I have been using values() for a couple of years without noticing the difference before today, where it broke on me because I've defined a custom field.

Here's a suggestion for a wording: "Note that the values returned are directly from the database, no field-specific conversions are performed, e.g. for custom fields with to_python()."

Otherwise, +1 for getting this fixed, with raw=True/False or whatever. :)

comment:11 Changed 6 years ago by Ole Laursen

Cc: olau@… added

comment:12 Changed 6 years ago by Chris Chambers

Cc: Chris Chambers added

comment:13 Changed 5 years ago by Luke Plant

Severity: Normal
Type: Bug

comment:14 Changed 5 years ago by sam@…

Easy pickings: unset
UI/UX: unset

IMHO, if this isn't a bug it's definitely a gotcha (and the documentation should be updated). I spent the last several hours trying to debug why values_list was not returning the python objects defined by my custom model field's to_python method.

comment:15 Changed 5 years ago by Aymeric Augustin

I probably haven't been here long enough, but I'm finding this behavior is extremely unexpected.

I'll leave this ticket in DDN because I'm not sure I grasp all the consequences.

But my gut feeling is that we should swallow the backwards incompatibility and fix this at some point — or at the very least make it clear in the documentation that QuerySet.values() returns raw database values, as spit by the database adapter.

comment:16 Changed 4 years ago by Anssi Kääriäinen

Triage Stage: Design decision neededAccepted

I vote for raw kwarg, if True no conversion happens. One day the default should be False, but likely with backwards compatibility period where it default still to True.

It might be worth it to inspect if there is a way to skip those method calls that do not actually do anything, maybe just if 'col_field.to_python == Field.to_python'. The standard Field.to_python just returns the value as is. There might be need to ask the field explicitly if the given connection's values needs conversion, most of the to_python() implementations seem non-necessary (IntegerField does int(value) for example, but I don't believe this is necessary on any DB's values).

The overhead of calling no-ops in trivial test case is 0.7s vs 0.07s. The test is:

def noop(value):
    return value

myvals = [
    (0, 1, 2, 3, 4),
] * 100
def values():
    for row in myvals:
        yield row  # alternate: yield map(noop, row)

from datetime import datetime
start = datetime.now()
for _ in range(0, 10000):
    for val in values():
        pass
print datetime.now() - start

Still, the test passes in under a second even with the map call, so it might be that the overhead doesn't matter in practice.

Anyways, marking as accepted, it would be good to end up in a situation where you can opt out from conversion, but the default is to use to_python() for the values.

comment:17 Changed 3 years ago by Kevin Christopher Henry

Cc: Kevin Christopher Henry added

comment:18 Changed 16 months ago by jedie

Cc: django@… added

comment:19 Changed 16 months ago by jedie

If i see it right, this has been fixed with Django v1.8 and the change from to_python() to from_db_value() isn't it?

The docs at https://docs.djangoproject.com/en/1.8/howto/custom-model-fields/#converting-values-to-python-objects says:

If present for the field subclass, from_db_value() will be called in all circumstances when the data is loaded from the database, including in aggregates and values() calls.

EDIT: I add a unittest in https://github.com/django/django/pull/4875 to test it.

Last edited 16 months ago by jedie (previous) (diff)

comment:20 Changed 16 months ago by Tim Graham <timograham@…>

In 9aac99e:

Refs #9619 -- Added a test that Field.from_db_value() is called on QuerySet.values()

comment:21 Changed 16 months ago by Tim Graham

Resolution: fixed
Status: newclosed

comment:22 Changed 16 months ago by Markus Holtermann <info@…>

In 64a4211:

Refs #9619 -- Fixed failing test case

Regression introduced in 9aac99e96d100b34b6daa3a137531eff4f17076e

Note: See TracTickets for help on using tickets.
Back to Top