Opened 6 years ago

Closed 13 months ago

#14518 closed Bug (worksforme)

Field.to_python not called on foreign key IDs

Reported by: David Wolever Owned by: nobody
Component: Database layer (models, ORM) Version: master
Severity: Normal Keywords:
Cc: David Wolever, kitsunde@…, info@…, anubhav9042@… Triage Stage: Accepted
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no


Given a subclass of Field like this:

class MyIDField(Field):
    def to_python(self, value):
        return to_base_36(value)

Used like this:

class Foo(m.Model):
    id = MyIDField(primary_key=True)

class Bar(m.Model):
    foo = m.ForeignKey(Foo)

Then the MyIDField.to_python function isn't called when Bar.foo_id is accessed:

>>> b = Bar.objects.all()[0]
>>> b.foo_id

This is annoying for a few reasons, but most significantly because it breaks dropdown menus in admin: even if a value is set for a foreign key, the menu selects the default item (usually ------). This is because the menu items *have* been to_python'd, but the default value has *not* been to_python'd).

I'm seeing this in Django 1.2.1.

If this is confirmed as a bug, I can submit test cases.

Attachments (1)

14518.diff (1.1 KB) - added by ANUBHAV JOSHI 2 years ago.

Download all attachments as: .zip

Change History (17)

comment:1 Changed 6 years ago by Russell Keith-Magee

Component: UncategorizedDatabase layer (models, ORM)
Needs documentation: unset
Needs tests: unset
Patch needs improvement: unset
Triage Stage: UnreviewedAccepted

Accepting, because it's clearly an inconsistency.

However, I have a sneaking suspicion that this won't be easy to fix without breaking backwards compatibility. For historical reasons, it's possible to instantiate an object with string values, and those values aren't converted into field-appropriate values (integers, etc) until the object is saved to the database. This is an artefact of early-days Django form processing.

comment:2 Changed 6 years ago by David Wolever

Cc: David Wolever added

It may also be worth noting: I have a StackOverflow question asking the same thing here:

comment:3 Changed 5 years ago by Julien Phalip

Severity: Normal
Type: Bug

comment:4 Changed 5 years ago by avnimahajan@…

Easy pickings: unset
UI/UX: unset

I am also facing same problem. We have changed the default behavior of having primary keys. Throughout our code it has been replaced with custom UUIDFiled(given below). Till django 1.1 we were not facing any problem but while migrating to django1.3, I started facing problem during syncdb(duplicate key error while inserting in auth_permissions).
Since there is change in "django/contrib/auth/management/" create_permission code. Now it is referring to "content_type" which is a Foreign key. In my case its value is not being to_python'd. Thus it fails in comparison with a uuid.UUID type value (returned when referred as pk which is always to_python'd).

Please update by when this bug will be solved?

class UUIDField(models.Field):

    __metaclass__ = models.SubfieldBase
    empty_strings_allowed = False

    def __init__(self, *args, **kwargs):
        if kwargs.get('primary_key', False):
            kwargs['editable'] = False
        kwargs['db_index'] = True
        super(UUIDField, self).__init__(*args, **kwargs)

    def db_type(self, connection):
        return 'uuid'

    def get_internal_type(self):
        return 'UUIDField'

    def to_python(self, value):
        if (value is None) or isinstance(value, uuid.UUID):
            return value
            return uuid.UUID(value)
        except (AttributeError, TypeError, ValueError):
            return value

    def pre_save(self, obj, add):
        old_val = getattr(obj, self.attname)
        if (self.primary_key and add) and (old_val is None):
            value = uuid.uuid4()
            setattr(obj, self.attname, value)
            return value
            return old_val

    def get_db_prep_lookup(self, lookup_type, value,connection=None, prepared=False):
        if lookup_type == 'exact':
            return [self.get_db_prep_value(value,connection=None, prepared=False)]
        elif lookup_type == 'in':
            return [self.get_db_prep_value(v,connection=None, prepared=False) for v in value]
        elif lookup_type == 'isnull':
            return []
        raise TypeError(u"Field has invalid lookup type: %s" % lookup_type)

    def value_to_string(self, obj):
        value = getattr(obj, self.attname)
        if value is None:
            return value
            return unicode(value)

    def formfield(self, **kwargs):
        defaults = {
            'form_class': forms.RegexField,
            'regex': UUID_REGEX,
            'max_length': 47,
            'error_messages': {'invalid': u"Enter only valid UUID."}
        return super(UUIDField, self).formfield(**defaults)
Last edited 17 months ago by Tim Graham (previous) (diff)

comment:5 Changed 5 years ago by Kit Sunde

Cc: kitsunde@… added

comment:6 Changed 3 years ago by Markus Holtermann

Cc: info@… added

comment:7 Changed 2 years ago by ANUBHAV JOSHI

Well there have been some changes in the code and some things have changed.
Here is my

from django.db import models

class MyField(models.CharField):
	__metaclass__ = models.SubfieldBase

	def __init__(self, *args, **kwargs):
		super(MyField, self).__init__(*args, **kwargs)

	def to_python(self, value):
		return "Value: %s" % value

class Foo(models.Model):
   	field = MyField(max_length=30, primary_key=True)

class Bar(models.Model):
    foo = models.ForeignKey(Foo)

In shell, I get correct values here:

>>> m1 = Foo.objects.create(field="anubhavjoshi")
>>> m2 = Bar.objects.create(foo=m1)
>>> m2.foo_id
u'Value: Value: anubhavjoshi'
u'Value: Value: anubhavjoshi'

But if I do:

>>> m1 = Foo.objects.create(field="anubhavjoshi")
>>> m2 = Bar.objects.create(foo=m1)
>>> m2 = Bar.objects.all()[0]
>>> m2.foo_id
u'Value: Value: anubhavjoshi'
DoesNotExist: Foo matching query does not exist.
DoesNotExist: Foo matching query does not exist.

Following traceback:

DoesNotExist                              Traceback (most recent call last)
C:\Users\Anubhav\DjangoDev\apps\<ipython-input-7-3a4a21c18010> in <module>()
----> 1

c:\users\anubhav\django\django\db\models\fields\related.pyc in __get__(self, instance, instance_type)
    562                     qs = qs.filter(extra_filter, **params)
    563                 # Assuming the database enforces foreign keys, this won't fail.

--> 564                 rel_obj = qs.get()
    565                 if not self.field.rel.multiple:
    566                     setattr(rel_obj, self.field.related.get_cache_name(), instance)

c:\users\anubhav\django\django\db\models\query.pyc in get(self, *args, **kwargs)
    374             raise self.model.DoesNotExist(
    375                 "%s matching query does not exist." %
--> 376                 self.model._meta.object_name)
    377         raise self.model.MultipleObjectsReturned(
    378             "get() returned more than one %s -- it returned %s!" % (

comment:8 Changed 2 years ago by ANUBHAV JOSHI

Cc: anubhav9042@… added
Version: 1.2master

comment:9 Changed 2 years ago by ANUBHAV JOSHI

I am attaching a diff to fix what remains.

Changed 2 years ago by ANUBHAV JOSHI

Attachment: 14518.diff added

comment:10 Changed 2 years ago by ANUBHAV JOSHI

This might cause other tests to fail but might give some insight as to why this problem occured.
As queryset gets the correct value earlier on and is changed to empty qs after that condition.

comment:11 Changed 2 years ago by Harm Geerts <hgeerts@…>

The reason the first example works is because in Bar.objects.create(foo=m1) the m1 Foo instance is cached on the Bar instance, so does not perform a database query.

The second example fails because m2 has to perform a database lookup, and that's where things get fun.

Your MyField.to_python is called multiple times so the value stored in the database is actually "Value: Value: anubhavjoshi".
When you access the primary key value "Value: Value: anubhavjoshi" is passed through to_python again and becomes "Value: Value: Value: anubhavjoshi" and that primary key does not exist.

The reason your patch works is because you only had one Foo record in your dataset.
If you had multiple Foo records your test would raise MultipleObjectsReturned.
The only thing the patch does is disable related lookup filters which is not what you want ;)

It may be an idea to add a note to the documentation about custom field classes.

comment:12 Changed 2 years ago by ANUBHAV JOSHI

Thanks for your explanation. As I said my patch wasn't correct, it only was to tell which line caused errors.

Now that I tried the following:

def to_python(self, value):
	return ".".join(value)

So the lookup which failed earlier passes, but as explained by Harm above,

>>> m2.foo_id

But it should have been: u'a.n.u.b.h.a.v'

It seems that to_python is called too many times and that might be wrong when it is defined in a way like in my examples..

Also if we remove __metaclass__ = models.SubfieldBase from our custom field then to_python does gets called but something strange happens as it appears it does not serves the purpose as always u'anubhav' is returned by m1.field or m2.foo_id.
Didn't know if this metaclass is helpful or not, I just preferred to tell here.

Last edited 2 years ago by ANUBHAV JOSHI (previous) (diff)

comment:13 Changed 2 years ago by Nick Sandford

@anubhav9042: looks like this is a problem with the way you're subclassing the CharField. When to_python is called, it could well already be in the right format, or it might be unset. to_python gets called every time the field is assigned a value and you are responsible for ensuring it is in the right format ( Notice the two lines at the beginning of that example code. You will need to do something similar.

comment:14 Changed 2 years ago by antialiasis

Is there any chance this might be fixed in some way at some point?

Like others, we've been using UUIDs as primary keys on a PostgreSQL database, and this bug is completely breaking prefetch_related. The way prefetching is implemented (at least in 1.7), instances are collected into a dictionary with their to_python'd primary keys as the keys, and then it tries to access them using the non-to_python'd foreign key ID field. Needless to say, this results in a KeyError if to_python changes the representation of the field at all (in our case, to strip PostgreSQL's native hyphens).

comment:15 Changed 2 years ago by Tim Graham

Yes, once we have a working patch and someone reviews it.

comment:16 Changed 13 months ago by Tim Graham

Resolution: worksforme
Status: newclosed

I can't reproduce the behavior described in the ticket as far back as Django 1.2. b.foo_id and both return an integer. A look of the follow up comments use SubfieldBase which is deprecated in favor of Field.from_db_value(). If problems remain in newer versions of Django, please open a new ticket with details.

Note: See TracTickets for help on using tickets.
Back to Top