Code

Opened 2 years ago

Closed 16 months ago

Last modified 14 months ago

#17541 closed Bug (fixed)

Unexpected ForeignKey Behavior with self.pk == None

Reported by: sheats Owned by: anonymous
Component: Database layer (models, ORM) Version: 1.3
Severity: Release blocker Keywords:
Cc: anssi.kaariainen@… Triage Stage: Accepted
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

I'm not sure if this is technically a bug but I wanted to report it because it really surprised me today.

class Manufacturer(models.Model):
    name = models.CharField(blank=True, max_length=100)

class Car(models.Model):
    manufacturer = models.ForeignKey(Manufacturer, blank=True, null=True)
    name = models.CharField(blank=True, max_length=100)


Car.objects.create(name='Accord')
Car.objects.create(name='Civic')

print Manufacturer().car_set.all().count()    
>>> 2

Even though neither Car is related to a Manufacturer, they are both returned through the related qs. I'm assuming this is because the query looks something like this:

manufacturer = Manufacturer()
Car.objects.filter(manufacturer=manufacturer.id)

I would say it is incorrect behavior to return anything through that related queryset that is not explicitly related.

Attachments (3)

17541.diff (837 bytes) - added by akaariai 2 years ago.
17541-2.diff (1.6 KB) - added by claudep 2 years ago.
Patch with test
17541-3.diff (2.1 KB) - added by nate_b 2 years ago.

Download all attachments as: .zip

Change History (26)

comment:1 Changed 2 years ago by akaariai

  • Cc anssi.kaariainen@… added
  • Has patch set
  • Needs documentation unset
  • Needs tests set
  • Patch needs improvement set
  • Triage Stage changed from Unreviewed to Accepted

While I don't think this is the worst bug ever, I do agree the behavior is incorrect.

Probably the cleanest solution is to check if manufacturer.id is None in car_set manager, and if it is, return an EmptyQuerySet. Attached patch contains the idea of the correction.

Changed 2 years ago by akaariai

Changed 2 years ago by claudep

Patch with test

comment:2 Changed 2 years ago by claudep

  • Needs tests unset
  • Patch needs improvement unset

comment:3 Changed 2 years ago by nate_b

  • Patch needs improvement set

Patched behavior as is does not behave as I would expect in a finer-grained fashion. Query as constructed should correctly reflect the expected results. For example, these two queries appear to be equivalent, but when run, have different results:

>>> print Manufacturer().car_set.all().query
SELECT "cars_car"."id", "cars_car"."manufacturer_id", "cars_car"."name" FROM "cars_car"
>>> Manufacturer().car_set.all()
[]

>>> print Car.objects.all().query
SELECT "cars_car"."id", "cars_car"."manufacturer_id", "cars_car"."name" FROM "cars_car"
>>> Car.objects.all()
[<Car: Car object>, <Car: Car object>]

This is a consequence of EmptyQuerySet inheriting the query during the clone().

In fact, I'm not even sure that the call to none() is the correct thing to do at all. Since getting related fields of unsaved model instances (the only ones whose id can be None) is such an odd edge case, it almost seems like it would only come up as a result of a programming error. Therefore, I would think it would be better treated as an error, by raising an exception, perhaps a ValueError.

comment:4 Changed 2 years ago by akaariai

About the EmptyQuerySet query string behaviour: That is how it works. That is a deficiency of .none(), and works just like any other query using .none(). Fixing that would be another ticket's problem.

However, if raising error here is the correct thing to do is an interesting question. It does have its plus-sides: it would allow us to change the implementation later if need be, and it would be very obvious to the programmer what is happening. On the other hand, in this case .car_set would not behave like queryset. This could be a problem when you have a model like:

class SomeModel:
    id = models.AutoField(pk=True)
    some_f = models.CharField(max_length=200, unique=True, null=True)

class RelatedModel:
    somemodel = models.ForeignKey(SomeModel, to_field='some_f', null=True)

Now, you could have an instance of somemodel in the DB, with some_f null and related models with null somemodel field. Now, that is a bit unlikely case, but here raising an error could be a little dangerous. I wonder what joys multicolumn primary keys and especially foreign keys will bring us in the future.

comment:5 Changed 2 years ago by claudep

I was preparing a response too, while akaariai posted his comment. So agreed on the peculiarity of the representation of the query attribute on EmptyQuerySet.

I still think that at a semantic level, an unsaved model should have empty related sets, and querying those should not raise errors.

Last edited 2 years ago by claudep (previous) (diff)

Changed 2 years ago by nate_b

comment:6 Changed 2 years ago by nate_b

  • Patch needs improvement unset

Okay, I agree. I had never noticed that behavior of EmptyQuerySet before.

Taking those considerations into account, what do you think about throwing the none() onto the filtered result instead, as in my suggested patch? That way, the query would at least no longer be peculiar. Then, if the EmptyQuerySet behavior is ever fixed, the correctness of this query will get fixed as a consequence.

So, whether you like my suggested change, or reject it, you've convinced me, and I no longer think this patch needs improvement.

comment:7 Changed 2 years ago by akaariai

Your approach is much better. That is the right spot to add the .none() call.

Would there be a need for similar handling for M2M and GenericRelations. I don't know and do not have time to check this right now.

As for .none() fixing: I think that could be fixed in an elegant way by introducing EmptyClause which you would add to the query's where when .none() is called. The EmptyClause would throw an EmptyResultSet from its .as_sql() method (which would be called by compiler.as_sql() -> where.as_sql() -> for each node, call as_sql()). Thus, you would not need the EmptyQuerySet class at all. You would get EmptyResultSet exception from print qs.query. I think that would work. Not sure, as I haven't tried. But as said, different ticket's problem.

comment:8 Changed 2 years ago by nate_b

I spent some time looking at the rest of related.py, namely ManyRelatedManager, and found that in that case, it raises a ValueError (and has for many years). Check out lines 538-539, inside ManyRelatedManager.__init__, and see #1745. One to one fields fail in this manner as well (this would need to be fixed in SingleRelatedObjectDescriptor.__get__, as it does not use a Manager).

So, I'm going to change my position on this slightly: for this bug to be closed, they should all be consistent. Either ValueError should be raised in all three cases, or it should silently succeed in all three cases.

comment:9 Changed 2 years ago by lukeplant

This is a duplicate of #15146. I don't know which one to close at this point

comment:10 Changed 2 years ago by nate_b

The two patches are similar, though the patch in #15146 also results in a peculiar query, so I would be partial to closing it, and leaving this ticket open. I'm not sure where the tests belong, though. Maybe include the tests from both patches, if that's not just redundant?

I will still encourage, though, that this ticket be used to address the behavior in all related object descriptors, making the behavior consistent across them all.

comment:11 Changed 2 years ago by oinopion

  • Triage Stage changed from Accepted to Ready for checkin

I've check the code. Looks good. Fixes the problem.

comment:12 Changed 2 years ago by akaariai

  • Triage Stage changed from Ready for checkin to Design decision needed

I don't think this is ready for checkin, as there is still the question of how to make this consistent across all the different related object descriptors.

My opinion is that the code should return or qs.none() in the null field case. It is a nice feature that if you fetch objects from the DB, and then fetch related objects to those objects, you will never get ValueError. There is a use-case here: you can have nullable unique field which is then referenced by some other model - in that case there can be no objects related to the instance, but throwing ValueError isn't a nice API either. Granted, this use case isn't that common, but we can make it work for free.

So, Im marking this as DDN.

comment:13 Changed 16 months ago by stavros

Please upgrade the severity of this. It has just caused entirely sane-looking code to delete over half our database rows. Please have this fixed ASAP, especially when the fix is rather trivial and the bug has been reported over two years ago.

comment:14 follow-ups: Changed 16 months ago by russellm

  • Severity changed from Normal to Release blocker

The reason this hasn't been addressed for 2 years is because it's an edge case, that to date, hasn't manifested as anything other than slightly unexpected query results. There is a *very* large Django users community, and there hasn't been a mass outrage at this edge case -- mostly because, as reported, it requires the user to do something somewhat odd to start with, and the only side effect are results that are slightly confusing, but are at least internally consistent (provided you accept that an unsaved Manufacturer is equivalent to "no manufacturer").

If this bug has caused data loss, then this is more concerning. However, you haven't provided any details of the "sane looking code" that caused the data loss, so we're not in a position to judge whether your definition of sanity is the same as ours. If this is a genuine data loss bug, then it would be a release-blocking issue.

I'm tentatively marking this as a release blocker on the basis that your report is of a data-loss related issue. However, we need specifics. Once we have those specifics, if our interpretation of "sane" doesn't match yours, the outcome may well be to downgrade it back to a non release-blocking bug.

comment:15 in reply to: ↑ 14 Changed 16 months ago by akaariai

  • Owner changed from nobody to anonymous
  • Status changed from new to assigned

I will likely go with slightly modified version of Nate's patch, plan is to commit to master + 1.5.x. We might want to commit to 1.4 too if this is seen as data loss bug.

In any case, I will post a patch for review before committing.

I think one scenario for data loss is this code:

for obj in someqs:
    obj.related_qs.all().delete()

Here the related_qs can return all null FK objects resulting in data loss. To me it seems like it is correct to blame this bug for data loss. This can only happen if you have to_field defined and the relation is nullable on both sides. Somewhat unlikely, but a real possibility.

comment:16 Changed 16 months ago by akaariai

  • Triage Stage changed from Design decision needed to Accepted

I have two versions of the patch. One is minimal change to fix the immediate bug, another is larger refactoring which makes m2m and fk relations work equivalently. The first is against 1.5, the second against master. The minimal patch is available from here and the "make relations consistent" patch is available from here.

The minimal patch is almost the same as Nate's latest patch. It has a bit better tests for nullable relations. The added test case shows how you can get corrupt data back using *saved* models. Deletion does work correctly, it is just the reverse related object querying which is failing.

I think the best plan is to push the minimal patch both to 1.5.x and master, then open another ticket for making reverse fk and m2m relations work consistently. There are some considerations for the consistency approach: Can we force such consistency without deprecation periods? Does the patch cover all needed relations (generic relations and o2o come to mind here)?

comment:17 Changed 16 months ago by akaariai

I am going with the minimalistic approach for 1.5 and master, and then open a new ticket for unification of m2m and fk relation handling in the nullable case.

Not going to backpatch to 1.4 at the same time. I am not yet sure of the data-loss potential, waiting for more info about that.

comment:18 Changed 16 months ago by Anssi Kääriäinen <akaariai@…>

  • Resolution set to fixed
  • Status changed from assigned to closed

In 55da775ce1bfba20db33b56c29957faa63917980:

Fixed #17541 -- Fixed non-saved/nullable fk querying

comment:19 Changed 16 months ago by Anssi Kääriäinen <akaariai@…>

In 9d6d0de7c1704fa39efb1f725bd8f80961ce3496:

[1.5.x] Fixed #17541 -- Fixed non-saved/nullable fk querying

Backpatch of 55da775ce1bfba20db33b56c29957faa63917980

comment:20 in reply to: ↑ 14 Changed 15 months ago by stavros

Replying to russellm:

I'm tentatively marking this as a release blocker on the basis that your report is of a data-loss related issue. However, we need specifics. Once we have those specifics, if our interpretation of "sane" doesn't match yours, the outcome may well be to downgrade it back to a non release-blocking bug.

Certainly. The bug is fixed now, but I can provide specifics for posterity. The application was running on AppEngine, so the datastore isn't relational, and I needed to delete related data manually. I had overridden a delete() method of a UserProfile and added a bunch of self.user.related_models.delete(). When a User object got deleted without the corresponding UserProfile object being deleted, I deleted it manually in code.

However, since this isn't a relational datastore, profile.user was None, and related_model.all() pointed to *over half the rows*, since it was an optional attribute.

As you can see, this is very ordinary code, with the "sin" of running on a non-relational datastore.

Apart from that, it has caused multiple bugs in our current codebase, which is using a relational DB. It hasn't caused data loss yet, but it has caused us multiple instances of incorrect permissions, users not being able to create objects due to allocation-changing code that misbehaves, etc.

I'm very glad to see this going into the next release, I can finally remove a bunch of overrides from throughout our codebase (not to mention avoid losing data on GAE in the future).

comment:21 Changed 15 months ago by Anssi Kääriäinen <akaariai@…>

In 9d6d0de7c1704fa39efb1f725bd8f80961ce3496:

[1.5.x] Fixed #17541 -- Fixed non-saved/nullable fk querying

Backpatch of 55da775ce1bfba20db33b56c29957faa63917980

comment:22 Changed 14 months ago by Anssi Kääriäinen <akaariai@…>

In 09fcb70c804b76fccc8fc0ac545873e5ab30c00a:

Fixed empty strings + to_field regression on Oracle

Querying the reverse side of nullable to_field relation, where both
sides can contain null values resulted in incorrect results. The reason
was not detecting as NULL.

Refs #17541

comment:23 Changed 14 months ago by Anssi Kääriäinen <akaariai@…>

In 8a99d718f7db2d018a37f78a8e5639ff2d1d2aa8:

[1.5.x] Fixed empty strings + to_field regression on Oracle

Querying the reverse side of nullable to_field relation, where both
sides can contain null values resulted in incorrect results. The reason
was not detecting as NULL.

Refs #17541, backpatch of 09fcb70c804b76fccc8fc0ac545873e5ab30c00a.

Add Comment

Modify Ticket

Change Properties
<Author field>
Action
as closed
as The resolution will be set. Next status will be 'closed'
The resolution will be deleted. Next status will be 'new'
Author


E-mail address and user name can be saved in the Preferences.

 
Note: See TracTickets for help on using tickets.