#16865 closed Bug (wontfix)
get_or_create defaults to _for_write even when it's just reading
Reported by: | Owned by: | nobody | |
---|---|---|---|
Component: | Database layer (models, ORM) | Version: | 1.3 |
Severity: | Normal | Keywords: | |
Cc: | v.oostveen@… | Triage Stage: | Accepted |
Has patch: | yes | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | yes | UI/UX: | no |
Description
First a little background information. I am currently using an asynchronous master slave(s) setup which requires me to pin a request to the master as soon as Django does any writes.
This results in all code reading from the slaves when possible, and switching to the master as soon as data is written.
The problem comes with the get_or_create
method. Most of the code that uses get_or_create
expects to get a result and uses the create
as a fallback if it doesn't exist. The result of this is that (atleast in all the cases where we use get_or_create
), the get_or_create
method is 99% reads and only a tiny bit of writes. With the current code however, Django defaults to using the write database as soon as get_or_create
is called before ever executing an insert.
The link to the line that is causing this:
https://code.djangoproject.com/browser/django/trunk/django/db/models/query.py#L407
Was this done by design? If so, please explain the reasoning behind it.
Otherwise I would classify this as a bug and in that case it would be a good idea to move the self._for_write = True
line to the except self.model.DoesNotExist:
block where the actual writes happen.
Attachments (4)
Change History (22)
comment:1 by , 13 years ago
Triage Stage: | Unreviewed → Accepted |
---|
comment:2 by , 13 years ago
I think this would be a bad idea. get_or_create is inherently a write-operation and as such is currently doing the correct thing. Changing this behavior would break a lot of existing code and damage the stability and usefulness of get_o_create.
comment:3 by , 13 years ago
There's too great a chance that the slave DB is a little out of date, get_or_create sees no record, tries to create it, and then fails because it already exists in the master DB.
get_or_create() is more designed for data imports than to be used in every request. If 99% of the time the record is expected to be there, and performance is critical, why not use a get() and catch the exception and call get_or_create() in those rare cases.
comment:4 by , 13 years ago
I think the suggestion here has been misunderstood (at any rate I misunderstood it when overhearing part of a conversation about it...sorry!) and that the suggested change would not break anything. The suggestion is to simply move the _for_write=True
down to where a write is actually going to be attempted. Yes, that means the initial attempt to get
may falsely say "not here!" when in fact the instance has already been created in one DB but not propagated to the DB chosen for the initial get
attempt. But the subsequent code that tries the create already handles the case where the create
fails due to the object already existing: it re-attempts the get
, which at that point will query the same database as the write was attempted on and will succeed (assuming this code works at all, which it doesn't on MySQL/InnoDB with repeatable read transaction isolation level, but that's an entirely different ticket).
comment:5 by , 13 years ago
@calvinspealman: it is called get_or_create isn't it? All of the Django code that uses it makes the assumption that it is always available and just falls back to creating if it is indeed new. And as you can see in the code, it is implemented as such. It does a get with a fallback to create. In your case it would be a create with a fallback to get if you get an integrity error.
@kmtracey: You are right, it all depends on your replication method which it why this code has been working flawlessly for me so far. Regardless, the current system definately breaks any kind of master slave setup which is just silly.
Perhaps the current database routers need an extra method for that. Right now we have the db_for_write
and db_for_read
but in this case it might be useful to have a db_for_realtime_read
or something. Assuming that the database which you write to is also the one you want to read from is not in all cases right.
comment:6 by , 13 years ago
@carljm has correctly hit the reason this was done -- get_or_create needs to get a reliable answer as to whether the object exists, and due to the potential for replication lag, the most reliable answer comes from the write source, not the read source.
That said, I can certainly see the value in moving the *initial* read load off the maste, and from the looks of it, @kmtracey's analysis is correct -- as long as the fallback read is performed on the master, we should get the benefits of a distributed initial read, while retaining a reliable write.
by , 13 years ago
Attachment: | patch_16865_get_or_create_on_multidb_stuff.diff added |
---|
comment:7 by , 13 years ago
Has patch: | set |
---|
This patch does what we can expect:
if the object does not exist on the first db (read db) and if the read db is different than the write db,
then we retry the get() on the write db,
else we re-raise a new DoesNotExist exception.
I prefer if someone who is 'experimented' could check this.
comment:8 by , 13 years ago
Cc: | added |
---|
Isn't this patch a bit more complicated then it's need to be ?
When we just move down self._for_write = True
from the outer most try-except one level then we would have the same effect right ?
It will try:
- a read from the slave (or whatever choice is made by the router)
- try to insert the new record on the master
- retry the get operation but this time from the master explicitly since self._for_write is set.
And we save doing second get() on the master since it would make sense that most of the time the slaves are up-to-date and so that there is no master record.
The drawback of that approach is that a transaction is made and executed in the case that the slave was lagging and the object already exists. (this could easily happen in a situation where a context processes or something is executing some code that does the get_or_create and when the page loads you request multiple pages doing the same get_or_create. First request actually creates the object but the slaves have not caught up a ms later that the other request for the same page come in. If you have a high traffic site you might want to safe on transactions and opt for more selects instead.)
by , 12 years ago
Attachment: | patch_16865_get_or_create_move_for_write.diff added |
---|
comment:9 by , 12 years ago
Added patch, patch_16865_get_or_create_move_for_write.diff, which moves the _for_write to where changes are (potentially) are made to the database.
Since this patch is extremely simple I hope it can be accepted so we can some progress on this ticket.
by , 12 years ago
Attachment: | patch_16865_get_or_create_move_for_write-2.diff added |
---|
comment:11 by , 12 years ago
This patch is the exact same as we have been using at Fashiolista.com for a year now. This should indicate at least some stability since Fashiolista.com is probably one of the largest Django sites around.
by , 12 years ago
Attachment: | patch_16865_get_or_create_move_for_write-3.diff added |
---|
comment:12 by , 12 years ago
Updated the patch with some documentation to docs/refs/models/queryset.txt about this.
FYI: I have been using this patch in production for about 7 months now, which include some e-commerce and high traffic gaming sites.
comment:13 by , 12 years ago
Resolution: | → fixed |
---|---|
Status: | new → closed |
comment:15 by , 12 years ago
Resolution: | fixed |
---|---|
Status: | closed → reopened |
follow-up: 17 comment:16 by , 12 years ago
Resolution: | → wontfix |
---|---|
Status: | reopened → closed |
Closing as wontfix, as after discussion with jdunck I no longer see how we can do this and still guarantee correctness. We can reopen if somebody can make a proposal that addresses the type of problem outlined above.
comment:17 by , 12 years ago
Replying to carljm:
Closing as wontfix, as after discussion with jdunck I no longer see how we can do this and still guarantee correctness. We can reopen if somebody can make a proposal that addresses the type of problem outlined above.
I agree I don't see an easy way out of this problem. Pretty sure me and the reporter of this ticket only use get_or_create() in situation where we never delete the object that is potentially created by get_or_create().
Could we at least put a little hint in the documentation of get_or_create() that notes that this function will always use the for-write database via db routers for consistency ? That way future people are warned not to expect this to scale over multiple databases.
comment:18 by , 12 years ago
If this case is such a problem, wouldn't a setting fix it? Or a parameter for the get_or_create
method?
There has to be some way to enable this within the ORM. Having to write your own try/except
every time you use it (especially with the required savepoints) is quite the hassle.
This seems reasonable; the downside is if you first read from the read slave it might increase the likelihood of collisions (the
IntegrityError
case further down) due to replication lag. Still seems like that might be an OK exchange for getting the majority of reads onto the read slave; it's a reasonable assumption that the common case for most uses ofget_or_create
is that the object already exists.Tentatively accepting, but I'd like to hear from Alex or Russell, who are more familiar with the multi-db stuff.