Opened 5 years ago

Closed 4 months ago

#14518 closed Bug (worksforme)

Field.to_python not called on foreign key IDs

Reported by: wolever Owned by: nobody
Component: Database layer (models, ORM) Version: master
Severity: Normal Keywords:
Cc: wolever, kitsunde@…, info@…, anubhav9042@… Triage Stage: Accepted
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no


Given a subclass of Field like this:

class MyIDField(Field):
    def to_python(self, value):
        return to_base_36(value)

Used like this:

class Foo(m.Model):
    id = MyIDField(primary_key=True)

class Bar(m.Model):
    foo = m.ForeignKey(Foo)

Then the MyIDField.to_python function isn't called when Bar.foo_id is accessed:

>>> b = Bar.objects.all()[0]
>>> b.foo_id

This is annoying for a few reasons, but most significantly because it breaks dropdown menus in admin: even if a value is set for a foreign key, the menu selects the default item (usually ------). This is because the menu items *have* been to_python'd, but the default value has *not* been to_python'd).

I'm seeing this in Django 1.2.1.

If this is confirmed as a bug, I can submit test cases.

Attachments (1)

14518.diff (1.1 KB) - added by anubhav9042 20 months ago.

Download all attachments as: .zip

Change History (17)

comment:1 Changed 5 years ago by russellm

  • Component changed from Uncategorized to Database layer (models, ORM)
  • Needs documentation unset
  • Needs tests unset
  • Patch needs improvement unset
  • Triage Stage changed from Unreviewed to Accepted

Accepting, because it's clearly an inconsistency.

However, I have a sneaking suspicion that this won't be easy to fix without breaking backwards compatibility. For historical reasons, it's possible to instantiate an object with string values, and those values aren't converted into field-appropriate values (integers, etc) until the object is saved to the database. This is an artefact of early-days Django form processing.

comment:2 Changed 5 years ago by wolever

  • Cc wolever added

It may also be worth noting: I have a StackOverflow question asking the same thing here:

comment:3 Changed 5 years ago by julien

  • Severity set to Normal
  • Type set to Bug

comment:4 Changed 4 years ago by avnimahajan@…

  • Easy pickings unset
  • UI/UX unset

I am also facing same problem. We have changed the default behavior of having primary keys. Throughout our code it has been replaced with custom UUIDFiled(given below). Till django 1.1 we were not facing any problem but while migrating to django1.3, I started facing problem during syncdb(duplicate key error while inserting in auth_permissions).
Since there is change in "django/contrib/auth/management/" create_permission code. Now it is referring to "content_type" which is a Foreign key. In my case its value is not being to_python'd. Thus it fails in comparison with a uuid.UUID type value (returned when referred as pk which is always to_python'd).

Please update by when this bug will be solved?

class UUIDField(models.Field):

    __metaclass__ = models.SubfieldBase
    empty_strings_allowed = False

    def __init__(self, *args, **kwargs):
        if kwargs.get('primary_key', False):
            kwargs['editable'] = False
        kwargs['db_index'] = True
        super(UUIDField, self).__init__(*args, **kwargs)

    def db_type(self, connection):
        return 'uuid'

    def get_internal_type(self):
        return 'UUIDField'

    def to_python(self, value):
        if (value is None) or isinstance(value, uuid.UUID):
            return value
            return uuid.UUID(value)
        except (AttributeError, TypeError, ValueError):
            return value

    def pre_save(self, obj, add):
        old_val = getattr(obj, self.attname)
        if (self.primary_key and add) and (old_val is None):
            value = uuid.uuid4()
            setattr(obj, self.attname, value)
            return value
            return old_val

    def get_db_prep_lookup(self, lookup_type, value,connection=None, prepared=False):
        if lookup_type == 'exact':
            return [self.get_db_prep_value(value,connection=None, prepared=False)]
        elif lookup_type == 'in':
            return [self.get_db_prep_value(v,connection=None, prepared=False) for v in value]
        elif lookup_type == 'isnull':
            return []
        raise TypeError(u"Field has invalid lookup type: %s" % lookup_type)

    def value_to_string(self, obj):
        value = getattr(obj, self.attname)
        if value is None:
            return value
            return unicode(value)

    def formfield(self, **kwargs):
        defaults = {
            'form_class': forms.RegexField,
            'regex': UUID_REGEX,
            'max_length': 47,
            'error_messages': {'invalid': u"Enter only valid UUID."}
        return super(UUIDField, self).formfield(**defaults)
Last edited 9 months ago by timgraham (previous) (diff)

comment:5 Changed 4 years ago by kitsunde

  • Cc kitsunde@… added

comment:6 Changed 2 years ago by MarkusH

  • Cc info@… added

comment:7 Changed 20 months ago by anubhav9042

Well there have been some changes in the code and some things have changed.
Here is my

from django.db import models

class MyField(models.CharField):
	__metaclass__ = models.SubfieldBase

	def __init__(self, *args, **kwargs):
		super(MyField, self).__init__(*args, **kwargs)

	def to_python(self, value):
		return "Value: %s" % value

class Foo(models.Model):
   	field = MyField(max_length=30, primary_key=True)

class Bar(models.Model):
    foo = models.ForeignKey(Foo)

In shell, I get correct values here:

>>> m1 = Foo.objects.create(field="anubhavjoshi")
>>> m2 = Bar.objects.create(foo=m1)
>>> m2.foo_id
u'Value: Value: anubhavjoshi'
u'Value: Value: anubhavjoshi'

But if I do:

>>> m1 = Foo.objects.create(field="anubhavjoshi")
>>> m2 = Bar.objects.create(foo=m1)
>>> m2 = Bar.objects.all()[0]
>>> m2.foo_id
u'Value: Value: anubhavjoshi'
DoesNotExist: Foo matching query does not exist.
DoesNotExist: Foo matching query does not exist.

Following traceback:

DoesNotExist                              Traceback (most recent call last)
C:\Users\Anubhav\DjangoDev\apps\<ipython-input-7-3a4a21c18010> in <module>()
----> 1

c:\users\anubhav\django\django\db\models\fields\related.pyc in __get__(self, instance, instance_type)
    562                     qs = qs.filter(extra_filter, **params)
    563                 # Assuming the database enforces foreign keys, this won't fail.

--> 564                 rel_obj = qs.get()
    565                 if not self.field.rel.multiple:
    566                     setattr(rel_obj, self.field.related.get_cache_name(), instance)

c:\users\anubhav\django\django\db\models\query.pyc in get(self, *args, **kwargs)
    374             raise self.model.DoesNotExist(
    375                 "%s matching query does not exist." %
--> 376                 self.model._meta.object_name)
    377         raise self.model.MultipleObjectsReturned(
    378             "get() returned more than one %s -- it returned %s!" % (

comment:8 Changed 20 months ago by anubhav9042

  • Cc anubhav9042@… added
  • Version changed from 1.2 to master

comment:9 Changed 20 months ago by anubhav9042

I am attaching a diff to fix what remains.

Changed 20 months ago by anubhav9042

comment:10 Changed 20 months ago by anubhav9042

This might cause other tests to fail but might give some insight as to why this problem occured.
As queryset gets the correct value earlier on and is changed to empty qs after that condition.

comment:11 Changed 20 months ago by Harm Geerts <hgeerts@…>

The reason the first example works is because in Bar.objects.create(foo=m1) the m1 Foo instance is cached on the Bar instance, so does not perform a database query.

The second example fails because m2 has to perform a database lookup, and that's where things get fun.

Your MyField.to_python is called multiple times so the value stored in the database is actually "Value: Value: anubhavjoshi".
When you access the primary key value "Value: Value: anubhavjoshi" is passed through to_python again and becomes "Value: Value: Value: anubhavjoshi" and that primary key does not exist.

The reason your patch works is because you only had one Foo record in your dataset.
If you had multiple Foo records your test would raise MultipleObjectsReturned.
The only thing the patch does is disable related lookup filters which is not what you want ;)

It may be an idea to add a note to the documentation about custom field classes.

comment:12 Changed 20 months ago by anubhav9042

Thanks for your explanation. As I said my patch wasn't correct, it only was to tell which line caused errors.

Now that I tried the following:

def to_python(self, value):
	return ".".join(value)

So the lookup which failed earlier passes, but as explained by Harm above,

>>> m2.foo_id

But it should have been: u'a.n.u.b.h.a.v'

It seems that to_python is called too many times and that might be wrong when it is defined in a way like in my examples..

Also if we remove __metaclass__ = models.SubfieldBase from our custom field then to_python does gets called but something strange happens as it appears it does not serves the purpose as always u'anubhav' is returned by m1.field or m2.foo_id.
Didn't know if this metaclass is helpful or not, I just preferred to tell here.

Last edited 20 months ago by anubhav9042 (previous) (diff)

comment:13 Changed 19 months ago by slurms

@anubhav9042: looks like this is a problem with the way you're subclassing the CharField. When to_python is called, it could well already be in the right format, or it might be unset. to_python gets called every time the field is assigned a value and you are responsible for ensuring it is in the right format ( Notice the two lines at the beginning of that example code. You will need to do something similar.

comment:14 Changed 16 months ago by antialiasis

Is there any chance this might be fixed in some way at some point?

Like others, we've been using UUIDs as primary keys on a PostgreSQL database, and this bug is completely breaking prefetch_related. The way prefetching is implemented (at least in 1.7), instances are collected into a dictionary with their to_python'd primary keys as the keys, and then it tries to access them using the non-to_python'd foreign key ID field. Needless to say, this results in a KeyError if to_python changes the representation of the field at all (in our case, to strip PostgreSQL's native hyphens).

comment:15 Changed 16 months ago by timgraham

Yes, once we have a working patch and someone reviews it.

comment:16 Changed 4 months ago by timgraham

  • Resolution set to worksforme
  • Status changed from new to closed

I can't reproduce the behavior described in the ticket as far back as Django 1.2. b.foo_id and both return an integer. A look of the follow up comments use SubfieldBase which is deprecated in favor of Field.from_db_value(). If problems remain in newer versions of Django, please open a new ticket with details.

Note: See TracTickets for help on using tickets.
Back to Top