Opened 4 years ago

Closed 3 years ago

Last modified 3 years ago

#17415 closed Bug (fixed)

django.contrib.sites.management.create_default_site populates invalid data in DB

Reported by: niko@… Owned by: nobody
Component: contrib.sites Version: master
Severity: Release blocker Keywords: site admin create db orm
Cc: anssi.kaariainen@… Triage Stage: Accepted
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description (last modified by aaugustin)

With a PostgeSQL backend, in trunk revision 1.4 pre-alpha SVN-17189

During syncdb, in the function django.contrib.sites.management.create_default_site,
the default Site 'exemple.com' is created.
But for some reason, the postgres sequence 'django_site_id_seq' is then in an incorrect state.
The last value = 1 (this is ok)
but the flag "Will increment last value before returning next value (is_called)" = No
And this parameter is different in any other table generated by the ORM.

Name	Start value	Last value	Increment by	Max value	Min value	Cache value	Log count	Can cycle?	Will increment last value before returning next value (is_called)?
django_site_id_seq	1	1	1	9223372036854775807	1	1	1	No	No

What happens then?
If you go to the admin and want to add a second Site, when u save, you get the exception:

Exception Type: 	IntegrityError

duplicate key value violates unique constraint "django_site_pkey"
DETAIL:  Key (id)=(1) already exists.

At this point, the add in the DB failed, but the ORM just changed the flag in the Sequence django_site_id_seq.is_called = Yes

This will allow the next attempt to save this second Site (reload the page and resend the POST data, by exemple) to succeed.

How to reproduce the bug at this stage:
in the postgresql, delete the table django_site, run manage.py syncdb, this will create a new table and repopulate the exemple.com default Site, and the buggy is_called tag on the django_site_id_seq sequence.

Then, i dont know why the function django.contrib.sites.management.create_default_site is putting this tag to False.

I havent tested on mysql, andi dont know how increment index is working there.

Attachments (2)

17415-testcase.diff (1.4 KB) - added by aaugustin 4 years ago.
17415.diff (2.1 KB) - added by akaariai 4 years ago.

Download all attachments as: .zip

Change History (15)

comment:1 Changed 4 years ago by aaugustin

  • Description modified (diff)
  • Needs documentation unset
  • Needs tests unset
  • Patch needs improvement unset
  • Type changed from Uncategorized to Bug

comment:2 Changed 4 years ago by aaugustin

  • Triage Stage changed from Unreviewed to Accepted

I could reproduce this issue.

Here's what I did in a shell:

% PYTHONPATH=../django-trunk ./manage.py syncdb
Creating tables ...
Creating table auth_permission
Creating table auth_group_permissions
Creating table auth_group
Creating table auth_user_user_permissions
Creating table auth_user_groups
Creating table auth_user
Creating table django_content_type
Creating table django_session
Creating table django_site
Creating table django_admin_log

You just installed Django's auth system, which means you don't have any superusers defined.
Would you like to create one now? (yes/no): yes
Username (leave blank to use 'myk'): admin
E-mail address: admin@example.com
Password: 
Password (again): 
Superuser created successfully.
Installing custom SQL ...
Installing indexes ...
Installed 0 object(s) from 0 fixture(s)

% PYTHONPATH=../django-trunk ./manage.py dbshell
psql (9.0.4)
Type "help" for help.

django=# drop table django_site;
DROP TABLE
django=# \q

% PYTHONPATH=../django-trunk ./manage.py syncdb
Creating tables ...
Creating table django_site
Installing custom SQL ...
Installing indexes ...
Installed 0 object(s) from 0 fixture(s)

Then I logged in to the admin, tried to create a new Site object, and got this traceback:

Environment:


Request Method: POST
Request URL: http://localhost:8000/admin/sites/site/add/

Django Version: 1.4 pre-alpha SVN-17229
Python Version: 2.7.2
Installed Applications:
('django.contrib.auth',
 'django.contrib.contenttypes',
 'django.contrib.sessions',
 'django.contrib.sites',
 'django.contrib.messages',
 'django.contrib.staticfiles',
 'django.contrib.admin')
Installed Middleware:
('django.middleware.common.CommonMiddleware',
 'django.contrib.sessions.middleware.SessionMiddleware',
 'django.middleware.locale.LocaleMiddleware',
 'django.contrib.auth.middleware.AuthenticationMiddleware',
 'django.contrib.messages.middleware.MessageMiddleware')


Traceback:
File "/Users/myk/Documents/dev/django-trunk/django/core/handlers/base.py" in get_response
  111.                         response = callback(request, *callback_args, **callback_kwargs)
File "/Users/myk/Documents/dev/django-trunk/django/contrib/admin/options.py" in wrapper
  367.                 return self.admin_site.admin_view(view)(*args, **kwargs)
File "/Users/myk/Documents/dev/django-trunk/django/utils/decorators.py" in _wrapped_view
  91.                     response = view_func(request, *args, **kwargs)
File "/Users/myk/Documents/dev/django-trunk/django/views/decorators/cache.py" in _wrapped_view_func
  88.         response = view_func(request, *args, **kwargs)
File "/Users/myk/Documents/dev/django-trunk/django/contrib/admin/sites.py" in inner
  192.             return view(request, *args, **kwargs)
File "/Users/myk/Documents/dev/django-trunk/django/utils/decorators.py" in _wrapper
  25.             return bound_func(*args, **kwargs)
File "/Users/myk/Documents/dev/django-trunk/django/utils/decorators.py" in _wrapped_view
  91.                     response = view_func(request, *args, **kwargs)
File "/Users/myk/Documents/dev/django-trunk/django/utils/decorators.py" in bound_func
  21.                 return func(self, *args2, **kwargs2)
File "/Users/myk/Documents/dev/django-trunk/django/db/transaction.py" in inner
  209.                 return func(*args, **kwargs)
File "/Users/myk/Documents/dev/django-trunk/django/contrib/admin/options.py" in add_view
  955.                 self.save_model(request, new_object, form, False)
File "/Users/myk/Documents/dev/django-trunk/django/contrib/admin/options.py" in save_model
  709.         obj.save()
File "/Users/myk/Documents/dev/django-trunk/django/contrib/sites/models.py" in save
  51.         super(Site, self).save(*args, **kwargs)
File "/Users/myk/Documents/dev/django-trunk/django/db/models/base.py" in save
  464.         self.save_base(using=using, force_insert=force_insert, force_update=force_update)
File "/Users/myk/Documents/dev/django-trunk/django/db/models/base.py" in save_base
  552.                 result = manager._insert([self], fields=fields, return_id=update_pk, using=using, raw=raw)
File "/Users/myk/Documents/dev/django-trunk/django/db/models/manager.py" in _insert
  203.         return insert_query(self.model, objs, fields, **kwargs)
File "/Users/myk/Documents/dev/django-trunk/django/db/models/query.py" in insert_query
  1569.     return query.get_compiler(using=using).execute_sql(return_id)
File "/Users/myk/Documents/dev/django-trunk/django/db/models/sql/compiler.py" in execute_sql
  848.             cursor.execute(sql, params)
File "/Users/myk/Documents/dev/django-trunk/django/db/backends/util.py" in execute
  36.             return self.cursor.execute(sql, params)
File "/Users/myk/Documents/dev/django-trunk/django/db/backends/postgresql_psycopg2/base.py" in execute
  52.             return self.cursor.execute(query, args)

Exception Type: IntegrityError at /admin/sites/site/add/
Exception Value: duplicate key value violates unique constraint "django_site_pkey"
DETAIL:  Key (id)=(1) already exists.

comment:3 Changed 4 years ago by aaugustin

The query sent to PostgreSQL is:

sql:     'INSERT INTO "django_site" ("domain", "name") VALUES (E\'foo.com\', E\'Foo\') RETURNING "django_site"."id"'
params:  (u'foo.com', u'Foo')

comment:4 Changed 4 years ago by aaugustin

This is reproducible with any model: MyModel.objects(pk=1, **kwargs).save() followed by MyModel(**kwargs).save() will produced this error.

Changed 4 years ago by aaugustin

comment:5 Changed 4 years ago by aaugustin

  • Component changed from contrib.sites to Database layer (models, ORM)
  • Severity changed from Release blocker to Normal
  • Summary changed from django.contrib.sites.management.create_default_site populates invalid data in DB to Creating an object with an explicit primary key value causes the next creation to fail under PostgreSQL

And here's the same sequence in a PostgreSQL shell:

django=# INSERT INTO "testapp_dummy" ("id") VALUES (1);
INSERT 0 1
django=# INSERT INTO "testapp_dummy" ("id") VALUES (DEFAULT) RETURNING "testapp_dummy"."id";
ERROR:  duplicate key value violates unique constraint "tz_app_dummy_pkey"
DETAIL:  Key (id)=(1) already exists.
django=# INSERT INTO "tz_app_dummy" ("id") VALUES (DEFAULT) RETURNING "tz_app_dummy"."id";
 id 
----
  2
(1 row)

INSERT 0 1

To sum up:

  • under PostgreSQL, if you create a new object where you force the primary key, and then create another object where you leave the auto-generated primary key, the second operation will fail; this is the root cause of the bug;
  • this problem has probably existed for a long time, but is more likely to bite users on Site objects since #16353;
  • even though it's annoying, I wouldn't qualify this as a release blocker, as the workaround is trivial and intuitive: just retry.

I'm changing the title and setting flags to reflect this.

Last edited 4 years ago by aaugustin (previous) (diff)

comment:6 Changed 4 years ago by russellm

I just saw this ticket go past as part of the 1.4 rundown, and it occurred to me that the fix is obvious, and has precedence in Django's source tree. Once you've manually created *any* object with a primary key, you need to invoke the SQL generated by the sqlsequencereset management command. This finds the largest PK value, and sets the sequence to select the next available PK value. This is an issue on Postgres and Oracle, because they both use sequences for autogenerated PKs; MySQL and SQLite both use "largest PK + 1" internal mechanisms, IIRC.

We already do this as part of fixture loading. Loading a fixture involves manually specifying a whole lot of primary key values; one of the last things the loaddata command does before committing is to reset the sequences. This ensures that the next object with an autogenerated PK value will have a valid PK value.

From my reading, this bug would have existed for as long as the sites framework has existed; although it would have been masked on any site that loaded fixtures (since loading a single fixture would have reset the sequence as a side effect).

comment:7 Changed 4 years ago by kmtracey

  • Severity changed from Normal to Release blocker

Prior to r16868 create_default_site did not specify the pk when adding default site object, so I think this sequence reset behavior is new with that changeset. Moving back to release blocker since it is a regression (which doesn't imply it needs to be fixed before alpha, necessarily, but I do think we should fix it before 1.4 final).

comment:8 Changed 4 years ago by kmtracey

  • Component changed from Database layer (models, ORM) to contrib.sites

Also switching back to sites component; I believe fixing the general issue at the ORM level (detecting specified PKs and resetting sequences) is out of scope for Django, but in this case where the sites code is specifying a PK we should ensure the sequence value is correct afterwards.

comment:9 Changed 4 years ago by akaariai

There are some concurrency problems when setting the sequence to "largest PK + 1". The problem is that you will not see objects created in other concurrent transactions, and thus largest PK is not guaranteed to be actually the largest PK. This could cause subtle bugs where everything seems to work, except in rare concurrent cases where it doesn't. This would be very hard to debug. It is worth considering that creating an object with autofield PK set should be an error, except if force_insert is defined, or in raw mode (that is, in fixture loading). Document that if you use force_insert for a model with set autofield PK, then you are responsible for resetting the sequence (or do that automatically in force_insert case?).

The underlying problem is that on SQL level, you should not insert your own values to a column which gets its value from a sequence. It is OK to do this in non-concurrent setting (as in restore from backup, load fixtures). You should not expect it to work in concurrent setting. If Django automatically resets the sequence, it makes it look like it is OK to insert values to a serial field. But it is not. Unless you lock the table for inserts. The whole idea of sequences is that they allow concurrent inserts to a table without using max(id) + 1, which just doesn't work. (SQLite has one transaction at a time, so there are no concurrency problems, MySQL probably does some tricks under the hood, so that it can get the actual largest id even in concurrent cases).

Consider this case, with two transactions T1 and T2, and id_seq starting from 10. Largest ID in the table is 10:

T1: begin;
T1: Obj.save() # gets nextval('id_seq') => 11
T1: Obj.save() # gets nextval('id_seq') => 12
T2: begin;
T2: Obj(pk=1).save() # where pk=1 does not exist
T2: reset_sequence('id_seq'): finds the largest visible id: 10 sets sequence to 11.
T1: commit; T2: commit;

Now you have: largest id in table: 12, id_seq starts from 11.

Yes, this is a rare condition. But I believe Django should not expose this possible concurrency problem. So, I suggest this is fixed by "disallow insert into autofield with user defined value, unless it is asked explicitly". If reset_sequence is going to be used, then also lock the table (reset sequence should probably do this in any case...).

comment:10 Changed 4 years ago by akaariai

  • Cc anssi.kaariainen@… added
  • Has patch set

Having re-read the posts above, it seems I misunderstood what was proposed. I somehow managed to read Russell's comment suggesting that reset should be done automatically in save() if pk is set and there isn't a matching value in the DB. I now see this is not the case. Sorry for the wasted time...

Attached is a first-draft patch which solves the create_default_site() issue by resetting the sequence in the function.

Changed 4 years ago by akaariai

comment:11 Changed 3 years ago by aaugustin

  • Summary changed from Creating an object with an explicit primary key value causes the next creation to fail under PostgreSQL to django.contrib.sites.management.create_default_site populates invalid data in DB

Restoring the original title, per Russell's and Karen's comments.

comment:12 Changed 3 years ago by aaugustin

  • Resolution set to fixed
  • Status changed from new to closed

In [17343]:

Fixed #17415 -- Reset database sequence for Site's pk after creating the default site with an explicit pk. Thanks niko AT neagee net for the report, Russell and Karen for describing the fix, and Anssi for drafting the patch.

comment:13 Changed 3 years ago by aaugustin

In [17344]:

[1.3.X] Fixed #17415 -- Reset database sequence for Site's pk after creating the default site with an explicit pk. Backport of r17343 from trunk.

Note: See TracTickets for help on using tickets.
Back to Top