﻿id	summary	reporter	owner	description	type	status	component	version	severity	resolution	keywords	cc	stage	has_patch	needs_docs	needs_tests	needs_better_patch	easy	ui_ux
33414	Diamond inheritance causes duplicated PK error when creating an object, if the primary key field has a default.	Yu Li	Akash Kumar Sen	"Hi, I'm not sure if this is a bug or an unsupported feature. But I looked into the django/db/models/base.py source code and now have a pretty good understanding of what is happening. 

My business code uses a diamond shape inheritance to model different types of user posts: UserPost, ImagePost, VideoPost, and MixedContentPost. The inheritance goes like this: both ImagePost and VideoPost extend from UserPost, and the MixedContentPost inherits from ImagePost and VideoPost. All of them are concrete models

I read the doc and expected it to work, similar to the example 


{{{
class Piece(models.Model):
    pass

class Article(Piece):
    article_piece = models.OneToOneField(Piece, on_delete=models.CASCADE, parent_link=True)
    ...

class Book(Piece):
    book_piece = models.OneToOneField(Piece, on_delete=models.CASCADE, parent_link=True)
    ...

class BookReview(Book, Article):
    pass
}}}

However, I found out that the doc's example only works when these models use a primary key field that does not have a default. In my case, we are using a UUIDField as the primary key with a default of uuid4. Trying to create a BookReview in our case, causes a django.db.utils.IntegrityError: UNIQUE constraint failed error, because django tries to create the Piece object twice, with the same uuid.

The same behavior is found if I used a AutoField on Piece, with a custom default function, such as

{{{
id = 99

def increment():
    global id
    id += 1
    return id
}}}

This default function makes no sense in practice, but I just use it here to illustrate the root cause of the problem:

The _save_table method in db/models/base.py has a like this:

{{{
        # Skip an UPDATE when adding an instance and primary key has a default.
        if (
            not raw
            and not force_insert
            and self._state.adding
            and meta.pk.default
            and meta.pk.default is not NOT_PROVIDED
        ):
            force_insert = True
}}}

When a default is not present, which is the case of the documentation example, Django will first insert the first instance of the Piece object, and then for the second one, since force_insert is False, _save_table tries an update and goes through. Therefore there is not duplicate.

However, if a default is present, then the second Piece becomes an insertion as well (because meta.pk.default and meta.pk.default is not NOT_PROVIDED, force_insert is True). This causes a duplication on the primary key.

On the other hand, _insert_parent does an in-order traversal calling _save_table on each node, so even in the no-default pk case, it is calling a redundant update on the root node after the insertion..

So which function is at fault? 

The source code _save_table assumes that if you are calling it with a default pk then you can skip an update. This assumption looks weird to me: why only when there IS a default pk you can skip update? Why not just skip update as long as we know we are inserting? (via self._state.adding) Is it just to make it special so that AutoField works? If _save_table's responsibility is to try updating before inserting, except when the params force it to do an update or insert, then it shouldn't override that behavior by this self-assumeption  within it.

I think the solution is to simply move the check to save_base. And don't do this check in _save_parents.

Like this:


{{{

    def save_base(
        self,
        raw=False,
        force_insert=False,
        force_update=False,
        using=None,
        update_fields=None,
    ):
        """"""
        Handle the parts of saving which should be done only once per save,
        yet need to be done in raw saves, too. This includes some sanity
        checks and signal sending.

        The 'raw' argument is telling save_base not to save any parent
        models and not to do any changes to the values before save. This
        is used by fixture loading.
        """"""
        using = using or router.db_for_write(self.__class__, instance=self)
        assert not (force_insert and (force_update or update_fields))
        assert update_fields is None or update_fields
        cls = origin = self.__class__
        # Skip proxies, but keep the origin as the proxy model.
        if cls._meta.proxy:
            cls = cls._meta.concrete_model
        meta = cls._meta
        if not meta.auto_created:
            pre_save.send(
                sender=origin,
                instance=self,
                raw=raw,
                using=using,
                update_fields=update_fields,
            )
        # A transaction isn't needed if one query is issued.
        if meta.parents:
            context_manager = transaction.atomic(using=using, savepoint=False)
        else:
            context_manager = transaction.mark_for_rollback_on_error(using=using)
        with context_manager:
            parent_inserted = False
            if not raw:
                parent_inserted = self._save_parents(cls, using, update_fields)

            # Skip an UPDATE when adding an instance and primary key has a default.
            if (
                not raw
                and not force_insert
                and self._state.adding
                and meta.pk.default
                and meta.pk.default is not NOT_PROVIDED
            ):
                force_insert = True
            updated = self._save_table(
                raw,
                cls,
                force_insert or parent_inserted,
                force_update,
                using,
                update_fields,
            )
        # Store the database on which the object was saved
        self._state.db = using
        # Once saved, this is no longer a to-be-added instance.
        self._state.adding = False

        # Signal that the save is complete
        if not meta.auto_created:
            post_save.send(
                sender=origin,
                instance=self,
                created=(not updated),
                update_fields=update_fields,
                raw=raw,
                using=using,
            )

    save_base.alters_data = True

}}}

I have never contributed to Django before. If you think I'm right on this one I'll look into creating a PR.

"	Bug	closed	Database layer (models, ORM)	4.0	Normal	fixed		Simon Charette Abhijeet Viswa Akash Kumar Sen	Ready for checkin	1	0	0	0	0	0
