Opened 15 years ago

Closed 15 years ago

Last modified 15 years ago

#9841 closed (wontfix)

Determine UPDATE or INSERT without running a SELECT statement

Reported by: sp.hogan@… Owned by: nobody
Component: Core (Other) Version: 1.0
Severity: Keywords:
Cc: Triage Stage: Unreviewed
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

Right now, every time one calls save() on a model that has a primary key set, Django must execute two SQL statements: a SELECT to see if an item exists in the database with that primary key and then the appropriate UPDATE or INSERT. So, every UPDATE operation becomes a two SQL call process. For most purposes, it probably isn't a big deal, but I like efficiency.

Clearly, you can't just take the presence/absence of a primary key to determine whether you should update/insert. But Rails actually determines it without doing a select statement. Basically, each ActiveRecord object in Rails has a boolean new_record flag that tells you whether it's a new record or not. If new_record is true, insert, else update.

So, basically, it could work like this:

>>>p = Person.objects.get(pk=23)
>>>p._new_record
False

>>>p2 = Person(name="Sean", fav_ice_cream="Mint")
>>>p2._new_record
True

This wouldn't be a small change and *would* change one part of the API. Currently the docs read: "The one gotcha here is that you should be careful not to specify a primary-key value explicitly when saving new objects, if you cannot guarantee the primary-key value is unused." This would change it so that if you specified a pk that was a duplicate, it would be caught when obj._new_record = True and it tried an insert and got a duplicate key error. Currently, the system silently updates whatever has the duplicate pk.

While this would change the behavior, the documentation on it seems to indicate that it would be desirable for an error to occur with many warnings that you'll get an unintended update rather than an error should you duplicate a key.

So, this change would save a SQL call for all UPDATES and correct what the documentation indicates is a bad situation that should (but doesn't) give you a duplicate key error. It doesn't look like a small change though. save_base() would need to be changed in db/models/base.py and the _new_record attribute would need to be properly set based on whether the object was initialized from the database or not. Someone else should probably comment on the feasibility of the code changes.

Relavant documentation sections:
http://docs.djangoproject.com/en/dev/ref/models/instances/#how-django-knows-to-update-vs-insert
http://docs.djangoproject.com/en/dev/ref/models/instances/#explicitly-specifying-auto-primary-key-values

P.S. If one wanted to maintain backward compatibility, it could be implemented so that it uses the _new_record boolean to determine update/insert, but if an insert gives a duplicate key error, it then just runs the update. Since this case shouldn't be the norm, it would still cut down on a lot of SQL executions.

Change History (2)

comment:1 by Jacob, 15 years ago

Resolution: wontfix
Status: newclosed

This has been discussed before -- check django-dev for details -- and decided that the current behavior is by design.

comment:2 by Malcolm Tredinnick, 15 years ago

When/if we add support for "only save fields that have been changed", this behaviour will also fall out by default.

Note: See TracTickets for help on using tickets.
Back to Top