Opened 7 years ago

Closed 7 years ago

#28272 closed Bug (wontfix)

PostgreSQL: AutoField sequences don't increment when inserting objects with an explicitely specified PK

Reported by: François Freitag Owned by: nobody
Component: Database layer (models, ORM) Version: dev
Severity: Normal Keywords: postgresql sequence
Cc: Mariusz Felisiak Triage Stage: Unreviewed
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description (last modified by François Freitag)

Developers are allowed to specify primary keys for newly created python instances [1]. However, on PostgreSQL, the sequence attached to the serial is not updated to take manually set PKs into account.

The following queries fail on PostgreSQL:

CREATE TABLE foo (id serial primary key, bar text);
INSERT INTO foo VALUES (1, 'thing');
INSERT INTO foo (bar) VALUES ('other');
-- ERROR:  duplicate key value violates unique constraint "foo_pkey"
-- DETAIL:  Key (id)=(1) already exists.

Hence, the following test case fails in Django:

# Using models from tests/bulk_create/models.py
def test_create_explicit_pk(self):
    Country.objects.create(name='France', iso_two_letter='FR')
    next_pk = Country.objects.latest('pk').pk + 1
    Country.objects.create(pk=next_pk, name='US', iso_two_letter='US')
    Country.objects.create(name='NL', iso_two_letter='NL')  # FAILS because next_pk is already in use.
    # The sequence should have yielded next_pk + 1

In my opinion, the sequence attached to the primary key should be manually set to the next available PK after inserting an object with an explicitly specified PK (e.g. using setval [2]). The same issue can be observed for bulk_create, where I think the highest primary key of the batch should be used to determine the next value in the DB sequence.

MariaDB [3] and SQLite [4] update the next value correctly when autoincrement is used.

Oracle might be also be affected by this issue. I was able to find this old ticket on the bug tracker[5].

[1] https://docs.djangoproject.com/en/dev/ref/models/instances/#explicitly-specifying-auto-primary-key-values
[2] https://www.postgresql.org/docs/current/static/functions-sequence.html
[3] https://mariadb.com/kb/en/mariadb/auto_increment/#setting-explicit-values
[4] https://sqlite.org/autoinc.html
[5] https://code.djangoproject.com/ticket/6598

Change History (6)

comment:1 by François Freitag, 7 years ago

Description: modified (diff)

comment:2 by François Freitag, 7 years ago

Description: modified (diff)

comment:3 by Mariusz Felisiak, 7 years ago

Cc: Mariusz Felisiak added

Oracle backend is not affected on master, because we use identity columns https://github.com/django/django/commit/924a89e135fe54bc7622aa6f03405211e75c06e9. I confirm that there is an issue in the 1.11 on Oracle.

comment:4 by Tim Graham, 7 years ago

I'm doubtful that trying to make Django automatically set the sequence is a good idea. In particular, I'm thinking about thread safety of such a solution. Quoting James Bennett from comments 1 & 3 of #11423,

When you do [manually specify a primary key] on Postgres, you need to reset the sequence from which automatic values are generated (since that sequence doesn't know that you manually inserted some values, and so those values are no longer available for use); this is a known feature of Postgres, and Django provides ​a management command (sqlsequencereset) which will output the appropriate SQL for you to run to reset the sequence.

Friendly or not, this is the way Postgres, Oracle, DB2 and many other databases work. MySQL and SQLite are really more the exceptions here, since sequences are part of the SQL standard.

comment:5 by François Freitag, 7 years ago

I do not think that manually specifying PK is a good idea, and I agree that resetting the sequence is subject to race conditions and might not be a good idea.

I think we should at least update the documentation to make it clear that specifying pk in create or bulk_create can lead to future issues on PostgreSQL. The documentation already states:

Explicitly specifying auto-primary-key values is mostly useful for bulk-saving objects, when you’re confident you won’t have primary-key collision.

I think manually setting primary keys on PostgreSQL should discouraged, or a link to sqlresetsequence should be added (even though using this management command seem like a very manual process).

comment:6 by Tim Graham, 7 years ago

Resolution: wontfix
Status: newclosed

Documenting the current behavior is a duplicate of #18485.

Note: See TracTickets for help on using tickets.
Back to Top