Code

Opened 6 years ago

Last modified 13 months ago

#8892 new Bug

ForeignKey relation not saved as expected

Reported by: julien Owned by: blacklwhite
Component: Database layer (models, ORM) Version: 1.0
Severity: Normal Keywords:
Cc: blacklwhite Triage Stage: Accepted
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: yes
Easy pickings: no UI/UX: no

Description

Here are two simple models:

class ModelA(models.Model):
    name = models.CharField(max_length=10)

class ModelB(models.Model):
    name = models.CharField(max_length=10)
    a = models.ForeignKey(ModelA, blank=True, null=True)

Now, see the following sequence. When setting the relation *before* saving the related object, the relation itself is not properly saved:

>>> a = ModelA(name='foo')
>>> b = ModelB(name='bar')
>>> b.a = a # Set relation *before* saving related object
>>> a.save() # Save related object
>>> b.save()
>>> b.a
<ModelA: ModelA object>
>>> b = ModelB.objects.get(name='bar')
>>> b.a
<NoneType: None>

Logically I'd expect that, since a is saved before b, the relation would be properly saved but it is not.

Yet, if setting the relation *after* saving the related object, then it works fine (in the below example I only swapped 2 lines: b.a = a and a.save()):

>>> a = ModelA(name='foo')
>>> b = ModelB(name='bar')
>>> a.save() # Save related object
>>> b.a = a # Set relation *after* saving related object
>>> b.save()
>>> b.a
<ModelA: ModelA object>
>>> b = ModelB.objects.get(name='bar')
>>> b.a
<ModelA: ModelA object>

I've tested with MySQL and SQLite.

Attachments (2)

8892.patch (2.7 KB) - added by blacklwhite 3 years ago.
t8892_2.patch (2.7 KB) - added by blacklwhite 3 years ago.
new bugfix in (Reverse-)SingleRelatedObjectDescriptor and tests

Download all attachments as: .zip

Change History (22)

comment:1 follow-up: Changed 6 years ago by mrts

  • Needs documentation unset
  • Needs tests unset
  • Patch needs improvement unset
  • Resolution set to wontfix
  • Status changed from new to closed

This is quite expected as a doesn't have a primary key before saving. It can not have it theoretically. This is documented in http://docs.djangoproject.com/en/dev/ref/models/instances/#auto-incrementing-primary-keys .

comment:2 Changed 6 years ago by mtredinnick

  • Resolution wontfix deleted
  • Status changed from closed to reopened

The documentation still needs to be clarified and we can look at raising a more explicit error message.

comment:3 in reply to: ↑ 1 Changed 6 years ago by julien

Replying to mrts:

This is quite expected as a doesn't have a primary key before saving. It can not have it theoretically. This is documented in http://docs.djangoproject.com/en/dev/ref/models/instances/#auto-incrementing-primary-keys .

I already knew that it's only after saving the object once that it gets a primary key. But in this case I just assumed that creating the relation (b.a = a) would not be affected whether it was set before or after the related object was saved. I assumed that subsequently saving a then b would also save the relation, no matter the prior sequence of operations.

I've been using Django for a while now, and I've always been quite careful with this logic. But here I admit I was quite surprised by this behaviour (in fact, I was equally surprised that I didn't get hit by this before, which is why I identified it as a bug in Django).

Malcolm, you're talking about improving the error message, but the code above doesn't actually generate any error. In any case, I'd be happy to learn more about how that all works when the docs are updated.

comment:4 follow-up: Changed 6 years ago by mtredinnick

  • Triage Stage changed from Unreviewed to Accepted

I know it doesn't generate an error. It should be changed from showing no error to showing something.

Your code is incorrect and Django should tell you that.

comment:5 in reply to: ↑ 4 Changed 6 years ago by julien

Replying to mtredinnick:

I know it doesn't generate an error. It should be changed from showing no error to showing something.
Your code is incorrect and Django should tell you that.

That would be very helpful! Thanks Malcolm for this clarification.

comment:6 Changed 5 years ago by Valera Grishin

I step at the same problem last week and think julien is right in his assumption. It is quite natural to instantiate models first, than fill them with appropriate data, and finally save models in bulk.

Imagine a case when ModelA instance is irrelevant without corresponding ModelB instance. Separated saves will cause one model, say ModelA, to be saved, then find that ModelB cannot be saved for some reason, thus causing ModelA to be deleted right after it has been saved.

comment:7 Changed 3 years ago by blacklwhite

  • Cc blacklwhite added
  • Owner changed from nobody to blacklwhite
  • Severity set to Normal
  • Status changed from reopened to new
  • Type set to Uncategorized

comment:8 Changed 3 years ago by lukeplant

  • Type changed from Uncategorized to Bug

Changed 3 years ago by blacklwhite

comment:9 Changed 3 years ago by blacklwhite

  • Easy pickings unset
  • Has patch set
  • Triage Stage changed from Accepted to Ready for checkin
  • UI/UX unset

I fixed the bug as follows. If you save a model with a foreign key to a model which is not saved until this point, django will show an error:

>>> a = ModelA(name="a")
>>> b = ModelB(name="b")
>>> a.b = b
>>> a.save()
Traceback (most recent call last):
  File "<console>", line 1, in <module>
  File "django\db\models\base.py", line 492, in save
    self.save_base(using=using, force_insert=force_insert, force_update=force_update)
  File "django\db\models\base.py", line 521, in save_base
    self.refresh_foreign_keys()
  File "django\db\models\base.py", line 475, in refresh_foreign_keys
    ". Have you already saved the " + str(fk_ref) + " object?")
ValueError: Cannot find a primary key for b as a foreign key in an instance
 of ModelA. Have you already saved the b object?

If you save a related model after setting the related one to the first object, it works as expected.

>>> a = ModelA(name="a")
>>> b = ModelB(name="b")
>>> a.b = b
>>> b.save()
>>> a.save()

Problems I've considered:

  • Compatible to users accessing to the foreign key field as follows:
    >>> a = ModelA(name="a")
    >>> b = ModelB(name="b")
    >>> b.save()
    >>> a.b = b
    >>> a.b_id
    

So it is neccessary to set the id to a_id during the set method of a.b = b. However, the assertion that the fk of b is the same as a.b_id must be true before writing a to the database.

  • Not loading the related object from the database.
    >>> a = ModelA.objects.get(name="a")
    >>> a.name = "b"
    >>> a.save()
    

The instance of Object b will not be loaded from the database to reduce db-queries.

  • This patch does only try to set the primary key of the related object, if no primary key is set and a related object exists. In this way it will never do an additional database query. Of course it produces an overhead for every call of the save() method, but by comparison to the following database-query it is negligible.
Last edited 3 years ago by blacklwhite (previous) (diff)

comment:10 Changed 3 years ago by blacklwhite

  • Resolution set to fixed
  • Status changed from new to closed

comment:11 Changed 3 years ago by blacklwhite

  • Resolution fixed deleted
  • Status changed from closed to reopened

comment:12 Changed 3 years ago by aaugustin

  • Needs tests set
  • Triage Stage changed from Ready for checkin to Accepted

Please don't mark your own patches as RFC — see the contribution guidelines.

comment:13 Changed 3 years ago by blacklwhite

Sorry, I misinterpreted the guideline chart.

comment:14 Changed 3 years ago by lukeplant

  • Patch needs improvement set

@blacklwhite:

Thanks for having a go at this. I think this patch is the wrong approach, however. It is unnecessarily complicated, and changes the semantics of saving an object in some significant ways, by adding a new point where the object's fields are set from the related object.

We want a smaller fix, which only changes existing behaviour for the case when a developer has made a mistake - as in the example above. We already have checking at the point of assignment, such as checking that the object is of the right type, which is in SingleRelatedObjectDescriptor, and that's where the new check should be made. It should be a two liner.

Of course, this also needs tests.

On a point of style, you should use % interpolation for building up strings.

Changed 3 years ago by blacklwhite

new bugfix in (Reverse-)SingleRelatedObjectDescriptor and tests

comment:15 Changed 3 years ago by blacklwhite

  • Needs tests unset
  • Patch needs improvement unset

Thank you for your feedback. Your hint made it much more simpler to find the correct piece of code to bugfix it. And - as you said - it fits formerly as a two liner into the existing code :-)

The patch will now raise an error and results in the following behaviour:

>>> a = ModelA(name="a")
>>> b = ModelB(name="b")
>>> a.b = b
ValueError: Cannot assign "<ModelB: ModelB object>": "ModelB.b" must have
a primary key.

It does the same for the reversed scenario.

>>> a = ModelA(name="a")
>>> b = ModelB(name="b")
>>> b.a_set.add(a)

Is the error message ok or would it be better to append "You have to save <ModelB: ModelB object> before assignment." or something like that?
Are the both tests in the right class / method?
What about compatibility to people using a workaround like follows; don't considering about that?

>>> a = ModelA(name="a")
>>> b = ModelB(name="b")
>>> a.b = b
>>> print a.b.name
>>> b.save()
>>> a.b_id = a.b.id
>>> a.save()
Last edited 3 years ago by blacklwhite (previous) (diff)

comment:16 follow-up: Changed 3 years ago by lukeplant

  • Patch needs improvement set

This is much better, but the tests don't test some of the new code. If you remove the hunk of the patch for SingleRelatedObjectDescriptor, the tests added still pass. (Hint: the cycle described on Test-driven development would have prevented this - always test first and make sure the test fails).

comment:17 in reply to: ↑ 16 Changed 3 years ago by blacklwhite

Replying to lukeplant:

This is much better, but the tests don't test some of the new code. If you remove the hunk of the patch for SingleRelatedObjectDescriptor, the tests added still pass. (Hint: the cycle described on Test-driven development would have prevented this - always test first and make sure the test fails).

I used the following command runtests.py --settings=test_sqlite many_to_one to test django. I've tried it again with the current django trunk and the first added assertion ends in a failed test. The second one ends in an errored test because of raising an IntegrityError.

After installing the patch the tests work fine.

I used TDD so your statement about the passing tests is curious for me. Where could my mistake be?

comment:18 Changed 21 months ago by akaariai

I wonder if the set is the proper place to raise the error. One could construct model instances without ever saving them. Isn't the error really happening at save time?

The check would be to go though foreign key fields in model.save(). If the instance's __dict__ contains a model for the foreign key field's name, but the value for field.attname in self.__dict__ is missing or is None, then raise an error.

comment:19 Changed 18 months ago by anonymous

Here's a StackOverflow question about this

comment:20 Changed 13 months ago by aaugustin

  • Status changed from reopened to new

Add Comment

Modify Ticket

Change Properties
<Author field>
Action
as new
The owner will be changed from blacklwhite to anonymous. Next status will be 'assigned'
as The resolution will be set. Next status will be 'closed'
Author


E-mail address and user name can be saved in the Preferences.

 
Note: See TracTickets for help on using tickets.