Opened 16 years ago
Closed 11 years ago
#8892 closed Bug (duplicate)
ForeignKey relation not saved as expected
Reported by: | Julien Phalip | Owned by: | blacklwhite |
---|---|---|---|
Component: | Database layer (models, ORM) | Version: | 1.0 |
Severity: | Normal | Keywords: | |
Cc: | blacklwhite | Triage Stage: | Accepted |
Has patch: | yes | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | yes |
Easy pickings: | no | UI/UX: | no |
Description
Here are two simple models:
class ModelA(models.Model): name = models.CharField(max_length=10) class ModelB(models.Model): name = models.CharField(max_length=10) a = models.ForeignKey(ModelA, blank=True, null=True)
Now, see the following sequence. When setting the relation *before* saving the related object, the relation itself is not properly saved:
>>> a = ModelA(name='foo') >>> b = ModelB(name='bar') >>> b.a = a # Set relation *before* saving related object >>> a.save() # Save related object >>> b.save() >>> b.a <ModelA: ModelA object> >>> b = ModelB.objects.get(name='bar') >>> b.a <NoneType: None>
Logically I'd expect that, since a
is saved before b
, the relation would be properly saved but it is not.
Yet, if setting the relation *after* saving the related object, then it works fine (in the below example I only swapped 2 lines: b.a = a
and a.save()
):
>>> a = ModelA(name='foo') >>> b = ModelB(name='bar') >>> a.save() # Save related object >>> b.a = a # Set relation *after* saving related object >>> b.save() >>> b.a <ModelA: ModelA object> >>> b = ModelB.objects.get(name='bar') >>> b.a <ModelA: ModelA object>
I've tested with MySQL and SQLite.
Attachments (2)
Change History (26)
follow-up: 3 comment:1 by , 16 years ago
Resolution: | → wontfix |
---|---|
Status: | new → closed |
comment:2 by , 16 years ago
Resolution: | wontfix |
---|---|
Status: | closed → reopened |
The documentation still needs to be clarified and we can look at raising a more explicit error message.
comment:3 by , 16 years ago
Replying to mrts:
This is quite expected as
a
doesn't have a primary key before saving. It can not have it theoretically. This is documented in http://docs.djangoproject.com/en/dev/ref/models/instances/#auto-incrementing-primary-keys .
I already knew that it's only after saving the object once that it gets a primary key. But in this case I just assumed that creating the relation (b.a = a
) would not be affected whether it was set before or after the related object was saved. I assumed that subsequently saving a
then b
would also save the relation, no matter the prior sequence of operations.
I've been using Django for a while now, and I've always been quite careful with this logic. But here I admit I was quite surprised by this behaviour (in fact, I was equally surprised that I didn't get hit by this before, which is why I identified it as a bug in Django).
Malcolm, you're talking about improving the error message, but the code above doesn't actually generate any error. In any case, I'd be happy to learn more about how that all works when the docs are updated.
follow-up: 5 comment:4 by , 16 years ago
Triage Stage: | Unreviewed → Accepted |
---|
I know it doesn't generate an error. It should be changed from showing no error to showing something.
Your code is incorrect and Django should tell you that.
comment:5 by , 16 years ago
Replying to mtredinnick:
I know it doesn't generate an error. It should be changed from showing no error to showing something.
Your code is incorrect and Django should tell you that.
That would be very helpful! Thanks Malcolm for this clarification.
comment:6 by , 16 years ago
I step at the same problem last week and think julien is right in his assumption. It is quite natural to instantiate models first, than fill them with appropriate data, and finally save models in bulk.
Imagine a case when ModelA instance is irrelevant without corresponding ModelB instance. Separated saves will cause one model, say ModelA, to be saved, then find that ModelB cannot be saved for some reason, thus causing ModelA to be deleted right after it has been saved.
comment:7 by , 14 years ago
Cc: | added |
---|---|
Owner: | changed from | to
Severity: | → Normal |
Status: | reopened → new |
Type: | → Uncategorized |
comment:8 by , 14 years ago
Type: | Uncategorized → Bug |
---|
by , 13 years ago
Attachment: | 8892.patch added |
---|
comment:9 by , 13 years ago
Easy pickings: | unset |
---|---|
Has patch: | set |
Triage Stage: | Accepted → Ready for checkin |
UI/UX: | unset |
I fixed the bug as follows. If you save a model with a foreign key to a model which is not saved until this point, django will show an error:
>>> a = ModelA(name="a") >>> b = ModelB(name="b") >>> a.b = b >>> a.save() Traceback (most recent call last): File "<console>", line 1, in <module> File "django\db\models\base.py", line 492, in save self.save_base(using=using, force_insert=force_insert, force_update=force_update) File "django\db\models\base.py", line 521, in save_base self.refresh_foreign_keys() File "django\db\models\base.py", line 475, in refresh_foreign_keys ". Have you already saved the " + str(fk_ref) + " object?") ValueError: Cannot find a primary key for b as a foreign key in an instance of ModelA. Have you already saved the b object?
If you save a related model after setting the related one to the first object, it works as expected.
>>> a = ModelA(name="a") >>> b = ModelB(name="b") >>> a.b = b >>> b.save() >>> a.save()
Problems I've considered (sometimes I use id as synonym for the foreign key of an object):
- Compatible to users accessing to the foreign key field as follows:
>>> a = ModelA(name="a") >>> b = ModelB(name="b") >>> b.save() >>> a.b = b >>> a.b_id
So it is neccessary to set the id to a_id during the set method of a.b = b. However the assertion that the id of b is the same as a.b_id must be before writing a to the database.
- Not loading the related object from the database.
>>> a = ModelA.objects.get(name="a") >>> a.name = "b" >>> a.save()
The instance of Object b will not be loaded from the database to reduce db-queries.
- This patch does only try to set the primary key of the related object, if no primary key is set and a related object exists. In this way it will never do an additional database query. Of course it produces an overhead for every call of the save() method, but by comparison to the following database-query it is negligible.
comment:10 by , 13 years ago
Resolution: | → fixed |
---|---|
Status: | new → closed |
comment:11 by , 13 years ago
Resolution: | fixed |
---|---|
Status: | closed → reopened |
comment:12 by , 13 years ago
Needs tests: | set |
---|---|
Triage Stage: | Ready for checkin → Accepted |
Please don't mark your own patches as RFC — see the contribution guidelines.
comment:14 by , 13 years ago
Patch needs improvement: | set |
---|
@blacklwhite:
Thanks for having a go at this. I think this patch is the wrong approach, however. It is unnecessarily complicated, and changes the semantics of saving an object in some significant ways, by adding a new point where the object's fields are set from the related object.
We want a smaller fix, which only changes existing behaviour for the case when a developer has made a mistake - as in the example above. We already have checking at the point of assignment, such as checking that the object is of the right type, which is in SingleRelatedObjectDescriptor
, and that's where the new check should be made. It should be a two liner.
Of course, this also needs tests.
On a point of style, you should use % interpolation for building up strings.
by , 13 years ago
Attachment: | t8892_2.patch added |
---|
new bugfix in (Reverse-)SingleRelatedObjectDescriptor and tests
comment:15 by , 13 years ago
Needs tests: | unset |
---|---|
Patch needs improvement: | unset |
Thank you for your feedback. Your hint made it much more simpler to find the correct piece of code to bugfix it. And - as you said - it fits formerly as a two liner into the existing code :-)
The patch will now raise an error and results in the following behaviour:
>>> a = ModelA(name="a") >>> b = ModelB(name="b") >>> a.b = b ValueError: Cannot assign "<ModelB: ModelB object>": "ModelB.b" must have a primary key.
It does the same for the reversed scenario.
>>> a = ModelA(name="a") >>> b = ModelB(name="b") >>> b.a_set.add(a)
Is the error message ok or would it be better to append "You have to save <ModelB: ModelB object> before assignment." or something like that?
Are the both tests in the right class / method?
What about compatibility to people using a workaround like follows; don't considering about that?
>>> a = ModelA(name="a") >>> b = ModelB(name="b") >>> a.b = b >>> print a.b.name >>> b.save() >>> a.b_id = a.b.id >>> a.save()
follow-up: 17 comment:16 by , 13 years ago
Patch needs improvement: | set |
---|
This is much better, but the tests don't test some of the new code. If you remove the hunk of the patch for SingleRelatedObjectDescriptor
, the tests added still pass. (Hint: the cycle described on Test-driven development would have prevented this - always test first and make sure the test fails).
comment:17 by , 13 years ago
Replying to lukeplant:
This is much better, but the tests don't test some of the new code. If you remove the hunk of the patch for
SingleRelatedObjectDescriptor
, the tests added still pass. (Hint: the cycle described on Test-driven development would have prevented this - always test first and make sure the test fails).
I used the following command runtests.py --settings=test_sqlite many_to_one
to test django. I've tried it again with the current django trunk and the first added assertion ends in a failed test. The second one ends in an errored test because of raising an IntegrityError.
After installing the patch the tests work fine.
I used TDD so your statement about the passing tests is curious for me. Where could my mistake be?
comment:18 by , 12 years ago
I wonder if the set is the proper place to raise the error. One could construct model instances without ever saving them. Isn't the error really happening at save time?
The check would be to go though foreign key fields in model.save(). If the instance's __dict__
contains a model for the foreign key field's name, but the value for field.attname in self.__dict__
is missing or is None, then raise an error.
comment:20 by , 12 years ago
Status: | reopened → new |
---|
comment:21 by , 11 years ago
I am a little confused why the original example isn't supported.
>>> a = ModelA(name='foo') >>> b = ModelB(name='bar') >>> b.a = a # Set relation *before* saving related object >>> a.save() # Save related object >>> a.pk 123 >>> b.a.pk 123 >>> b.save() >>> b = ModelB.objects.get(name='bar') >>> b.a <NoneType: None>
While I agree the docs suggests auto-increment keys can't be determined upfront, the django model above shows a.pk is already known BEFORE I save b. Hence I don't see why Django can't save object b correctly.
Instead, I have to use the below suggested hack in the StackOverlow question. Yet this seems cumbersome and unnecessary.
>>> b.a = b.a >>> b.save()
comment:22 by , 11 years ago
The original example isn't supported because, deep inside the ORM, b.a = a
is implemented as b.a_id = a.id
.
Changing this design would require rewriting most of the related fields code, which isn't for the faint of heart ;-)
comment:23 by , 11 years ago
What we could possibly do is check that b.a.id == b.a_id. If not, then throw an error. This would break some code for sure, but on the other hand it would be a nice protection. Instead of saving corrupted data the ORM layer forces you to write code where both b.a.id and b.a_id agree on the value.
There are a couple of other options, too, like when a.save() is called, then b.a_id is set (requires that a knows it is assigned to b), or on b.save() use the value of b.a.id instead of b.a_id (this will break code that relies on saving b.a_id).
Raising an exception whenever b.a_id doesn't agree on b.a.id's value is worth considering. I believe exception is a lot better than the possibility for bugs that lead to silent data corruption.
comment:24 by , 11 years ago
Resolution: | → duplicate |
---|---|
Status: | new → closed |
Looks like a duplicate of #10811.
This is quite expected as
a
doesn't have a primary key before saving. It can not have it theoretically. This is documented in http://docs.djangoproject.com/en/dev/ref/models/instances/#auto-incrementing-primary-keys .