Opened 6 years ago

Closed 2 years ago

Last modified 2 years ago

#11811 closed Bug (fixed)

No error raised on update(foreignkey=unsavedobject) on nullable fk

Reported by: Afief Owned by: aaugustin
Component: Database layer (models, ORM) Version: 1.1
Severity: Normal Keywords:
Cc: Triage Stage: Accepted
Has patch: yes Needs documentation: yes
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

Found this while coding at 2AM:
When there is a field that can be NULL, and you do something like this:
obj1=Obj1(...)
obj2=Obj2.objects.all().update(foreignobj=obj1)

django will assign the fields to NULL.

I'm not sure if this behaviour is made to be so by design, but it seems to me that if someone passes an unsaved object instead of None they made a mistake and django should raise an exception.

Attachments (2)

11811-patch-rev-3.diff (6.0 KB) - added by ashearer 6 years ago.
Patch
11811-patch-rev-5.diff (7.2 KB) - added by ashearer 6 years ago.
Patch (with error on save instead of on assignment)

Download all attachments as: .zip

Change History (19)

comment:1 Changed 6 years ago by russellm

  • milestone set to 1.2
  • Needs documentation unset
  • Needs tests unset
  • Patch needs improvement unset
  • Triage Stage changed from Unreviewed to Accepted

comment:2 Changed 6 years ago by ashearer

  • Owner changed from nobody to ashearer
  • Status changed from new to assigned

Changed 6 years ago by ashearer

Patch

comment:3 Changed 6 years ago by ashearer

The same surprising behavior occurs whether you assign to the ForeignKey field by attribute assignment, a model initializer, model.objects.create(), or queryset.update(). In any of the cases, passing an unsaved model acts the same as passing None. (That's assuming no manual pk assignment.) The first three use a different code path than the last.

I'm attaching a strawman patch that will raise a ValueError with a helpful message in all four cases. The docs had left the behavior undefined.

However, the patch needs further work:

  1. Raising an exception at assignment time may be a bad idea, even though the inconsistency between my_fk_field (an object) and my_fk_field_id (None) starts then. Everything would have worked OK while the same object remained in memory; only the copy saved in the DB will have no link at all to the other instance. Checking at save time may be better.
  2. To match the scope of the ticket, I'm only handling assignment for non-null fields. Non-nullable fields already raise an exception, though it occurs later (on save), and the error message is less specific as a result. But the queryset.update() code path now always throws the new exception regardless of whether the field is nullable, because field information isn't available to it.
  3. Two tests of multi-DB support now fail, because they make the same mistake the new exception is designed to guard against. Either putting explicit IDs in the tests or changing the timing of the exceptions as per point 1 may help. The rest of the test suite has no additional failures.

The patch includes tests but not docs. I'll attach a docs patch separately once the implementation settles down.

comment:4 Changed 6 years ago by ashearer

I shouldn't have said "passing an unsaved model acts the same as passing None". That's true only for the ForeignKeyField as saved in the DB. (It doesn't matter whether the unsaved foreign model itself is saved after the assignment.) In memory, the attribute still points to the unsaved model instance.

Changed 6 years ago by ashearer

Patch (with error on save instead of on assignment)

comment:5 Changed 6 years ago by ashearer

  • Has patch set

The updated patch (rev-5) doesn't raise exceptions for model instances that are never saved anyway, which fixes the multi-DB test failures.

The one remaining design issue is a backward compatibility vs. consistency tradeoff. The patch errs on the side of backward compatibility: if the foreign key field is declared non-null, assignment works as before, and lets the database raise its own integrity error instead of the new ValueError exception with a detailed error message. The queryset.update() patch can't tell how the field is declared, though, so it always raises ValueError. Raising ValueError in all cases would be more consistent and probably more helpful, but it could break compatibility for code that's already expecting to catch a particular integrity error.

Also, the patch could attempt to be smart. When it detects that the foreign key value is missing but there is a cached instance, it could fill the key in from that instance, in case the instance was saved after the assignment. But that's starting to get magical, so it doesn't do that.

comment:6 Changed 6 years ago by ashearer

Here are some examples to make this more concrete.

# Nullable foreign key field, without patch, previous surprising behavior:
>>> s = Site(name='My new site')
>>> f = Flatpage(id=1, text='I am being published on that site')
>>> f.site = s              # set a site for the Flatpage
>>> s.save()                # (site is saved now, but doesn't change outcome)
>>> f.site                  # site value is still there, looks good
<Site: My new site>

Old behavior from that point:

>>> f.save()                # save it to DB
>>> f = Flatpage.objects.get(id=1)  # read it back
>>> print f.site            # ...and the Flatpage's site value disappears!
None

Patched behavior from that point:

>>> f.save()                # will now raise an exception about the anomaly
Traceback (most recent call last):
...
ValueError: Cannot save a foreign key referring to the instance "<Site: My new site>" into the field "Flatpage.site", because the key value for the instance was not available at the time of assignment to the field. (For autonumbered keys, the related instance may need to be saved prior to the assignment.)

The behavior for assignment to non-null fields with the rev-5 patch is the same as before:

# Test a non-null field. Behavior is unchanged.
>>> d = Department(name='An Unsaved Department')
>>> w = Worker(id=1, name='Milton Waddams', department=d)
>>> w.save()
Traceback (most recent call last):
...
IntegrityError: model_regress_worker.department_id may not be NULL
or
Warning: Column 'department_id' cannot be null
or
other DB-specific error

However, the batch queryset.update() method raises a new ValueError exception regardless of nullability:

>>> d1 = Department(id=1, name='A Real Department')
>>> d1.save()
>>> d2 = Department(name='An Unsaved Department')
>>> Worker.objects.create(id=2, name='Milton Waddams', department=d1)
<Worker: Milton Waddams>
>>> Worker.objects.filter(id=2).update(department=d2)
Traceback (most recent call last):
...
ValueError: Cannot assign the instance "<Department: An Unsaved Department>" to a foreign key field, because its primary key is not available. (For autonumbered keys, the instance would need to be saved first.)

comment:7 Changed 6 years ago by ashearer

  • Needs documentation set

comment:8 Changed 5 years ago by russellm

  • milestone changed from 1.2 to 1.3

Not critical for 1.2

comment:9 Changed 4 years ago by julien

  • Severity set to Normal
  • Type set to Bug

comment:10 Changed 4 years ago by jacob

  • milestone 1.3 deleted

Milestone 1.3 deleted

comment:11 Changed 4 years ago by aaugustin

  • UI/UX unset

Change UI/UX from NULL to False.

comment:12 Changed 4 years ago by aaugustin

  • Easy pickings unset

Change Easy pickings from NULL to False.

comment:13 Changed 3 years ago by aaugustin

Related to #10811.

comment:14 Changed 3 years ago by aaugustin

  • Owner changed from ashearer to aaugustin

comment:15 Changed 2 years ago by aaugustin

In fact, this may be fixed independently from #10811, which turned out to be extremely hard.

While I can imagine use cases for the behavior #10811 is trying to prevent, I don't see any for this one. It's clearly a data-loss bug.

comment:16 Changed 2 years ago by Aymeric Augustin <aymeric.augustin@…>

  • Resolution set to fixed
  • Status changed from assigned to closed

In b4cd8169de604943f8aaee3666282c95e650e5f4:

Fixed #11811 -- Data-loss bug in queryset.update.

It's now forbidden to call queryset.update(field=instance) when instance
hasn't been saved to the database ie. instance.pk is None.

comment:17 Changed 2 years ago by Aymeric Augustin <aymeric.augustin@…>

In f855058c35db9d35ba7f310d8f9db6f4533b7424:

[1.6.x] Fixed #11811 -- Data-loss bug in queryset.update.

It's now forbidden to call queryset.update(field=instance) when instance
hasn't been saved to the database ie. instance.pk is None.

Conflicts:

tests/queries/tests.py

Backport of b4cd8169 from master.

Note: See TracTickets for help on using tickets.
Back to Top