#17541 closed Bug (fixed)
Unexpected ForeignKey Behavior with self.pk == None
Reported by: | Peter Sheats | Owned by: | anonymous |
---|---|---|---|
Component: | Database layer (models, ORM) | Version: | 1.3 |
Severity: | Release blocker | Keywords: | |
Cc: | anssi.kaariainen@… | Triage Stage: | Accepted |
Has patch: | yes | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description
I'm not sure if this is technically a bug but I wanted to report it because it really surprised me today.
class Manufacturer(models.Model): name = models.CharField(blank=True, max_length=100) class Car(models.Model): manufacturer = models.ForeignKey(Manufacturer, blank=True, null=True) name = models.CharField(blank=True, max_length=100) Car.objects.create(name='Accord') Car.objects.create(name='Civic') print Manufacturer().car_set.all().count() >>> 2
Even though neither Car is related to a Manufacturer, they are both returned through the related qs. I'm assuming this is because the query looks something like this:
manufacturer = Manufacturer() Car.objects.filter(manufacturer=manufacturer.id)
I would say it is incorrect behavior to return anything through that related queryset that is not explicitly related.
Attachments (3)
Change History (26)
comment:1 by , 13 years ago
Cc: | added |
---|---|
Has patch: | set |
Needs tests: | set |
Patch needs improvement: | set |
Triage Stage: | Unreviewed → Accepted |
by , 13 years ago
Attachment: | 17541.diff added |
---|
comment:2 by , 13 years ago
Needs tests: | unset |
---|---|
Patch needs improvement: | unset |
comment:3 by , 13 years ago
Patch needs improvement: | set |
---|
Patched behavior as is does not behave as I would expect in a finer-grained fashion. Query as constructed should correctly reflect the expected results. For example, these two queries appear to be equivalent, but when run, have different results:
>>> print Manufacturer().car_set.all().query SELECT "cars_car"."id", "cars_car"."manufacturer_id", "cars_car"."name" FROM "cars_car" >>> Manufacturer().car_set.all() [] >>> print Car.objects.all().query SELECT "cars_car"."id", "cars_car"."manufacturer_id", "cars_car"."name" FROM "cars_car" >>> Car.objects.all() [<Car: Car object>, <Car: Car object>]
This is a consequence of EmptyQuerySet
inheriting the query during the clone()
.
In fact, I'm not even sure that the call to none()
is the correct thing to do at all. Since getting related fields of unsaved model instances (the only ones whose id
can be None
) is such an odd edge case, it almost seems like it would only come up as a result of a programming error. Therefore, I would think it would be better treated as an error, by raising an exception, perhaps a ValueError
.
comment:4 by , 13 years ago
About the EmptyQuerySet
query string behaviour: That is how it works. That is a deficiency of .none(), and works just like any other query using .none(). Fixing that would be another ticket's problem.
However, if raising error here is the correct thing to do is an interesting question. It does have its plus-sides: it would allow us to change the implementation later if need be, and it would be very obvious to the programmer what is happening. On the other hand, in this case .car_set would not behave like queryset. This could be a problem when you have a model like:
class SomeModel: id = models.AutoField(pk=True) some_f = models.CharField(max_length=200, unique=True, null=True) class RelatedModel: somemodel = models.ForeignKey(SomeModel, to_field='some_f', null=True)
Now, you could have an instance of somemodel in the DB, with some_f null and related models with null somemodel field. Now, that is a bit unlikely case, but here raising an error could be a little dangerous. I wonder what joys multicolumn primary keys and especially foreign keys will bring us in the future.
comment:5 by , 13 years ago
I was preparing a response too, while akaariai posted his comment. So agreed on the peculiarity of the representation of the query attribute on EmptyQuerySet.
I still think that at a semantic level, an unsaved model should have empty related sets, and querying those should not raise errors.
by , 13 years ago
Attachment: | 17541-3.diff added |
---|
comment:6 by , 13 years ago
Patch needs improvement: | unset |
---|
Okay, I agree. I had never noticed that behavior of EmptyQuerySet
before.
Taking those considerations into account, what do you think about throwing the none()
onto the filtered result instead, as in my suggested patch? That way, the query
would at least no longer be peculiar. Then, if the EmptyQuerySet
behavior is ever fixed, the correctness of this query will get fixed as a consequence.
So, whether you like my suggested change, or reject it, you've convinced me, and I no longer think this patch needs improvement.
comment:7 by , 13 years ago
Your approach is much better. That is the right spot to add the .none() call.
Would there be a need for similar handling for M2M and GenericRelations
. I don't know and do not have time to check this right now.
As for .none() fixing: I think that could be fixed in an elegant way by introducing EmptyClause
which you would add to the query's where when .none() is called. The EmptyClause
would throw an EmptyResultSet
from its .as_sql() method (which would be called by compiler.as_sql() -> where.as_sql() -> for each node, call as_sql()). Thus, you would not need the EmptyQuerySet
class at all. You would get EmptyResultSet
exception from print qs.query. I think that would work. Not sure, as I haven't tried. But as said, different ticket's problem.
comment:8 by , 13 years ago
I spent some time looking at the rest of related.py
, namely ManyRelatedManager
, and found that in that case, it raises a ValueError
(and has for many years). Check out lines 538-539, inside ManyRelatedManager.__init__
, and see #1745. One to one fields fail in this manner as well (this would need to be fixed in SingleRelatedObjectDescriptor.__get__
, as it does not use a Manager).
So, I'm going to change my position on this slightly: for this bug to be closed, they should all be consistent. Either ValueError
should be raised in all three cases, or it should silently succeed in all three cases.
comment:9 by , 13 years ago
This is a duplicate of #15146. I don't know which one to close at this point
comment:10 by , 13 years ago
The two patches are similar, though the patch in #15146 also results in a peculiar query, so I would be partial to closing it, and leaving this ticket open. I'm not sure where the tests belong, though. Maybe include the tests from both patches, if that's not just redundant?
I will still encourage, though, that this ticket be used to address the behavior in all related object descriptors, making the behavior consistent across them all.
comment:11 by , 13 years ago
Triage Stage: | Accepted → Ready for checkin |
---|
I've check the code. Looks good. Fixes the problem.
comment:12 by , 13 years ago
Triage Stage: | Ready for checkin → Design decision needed |
---|
I don't think this is ready for checkin, as there is still the question of how to make this consistent across all the different related object descriptors.
My opinion is that the code should return or qs.none() in the null field case. It is a nice feature that if you fetch objects from the DB, and then fetch related objects to those objects, you will never get ValueError
. There is a use-case here: you can have nullable unique field which is then referenced by some other model - in that case there can be no objects related to the instance, but throwing ValueError
isn't a nice API either. Granted, this use case isn't that common, but we can make it work for free.
So, Im marking this as DDN.
comment:13 by , 12 years ago
Please upgrade the severity of this. It has just caused entirely sane-looking code to delete over half our database rows. Please have this fixed ASAP, especially when the fix is rather trivial and the bug has been reported over two years ago.
follow-ups: 15 20 comment:14 by , 12 years ago
Severity: | Normal → Release blocker |
---|
The reason this hasn't been addressed for 2 years is because it's an edge case, that to date, hasn't manifested as anything other than slightly unexpected query results. There is a *very* large Django users community, and there hasn't been a mass outrage at this edge case -- mostly because, as reported, it requires the user to do something somewhat odd to start with, and the only side effect are results that are slightly confusing, but are at least internally consistent (provided you accept that an unsaved Manufacturer is equivalent to "no manufacturer").
If this bug has caused data loss, then this is more concerning. However, you haven't provided any details of the "sane looking code" that caused the data loss, so we're not in a position to judge whether your definition of sanity is the same as ours. If this is a genuine data loss bug, then it would be a release-blocking issue.
I'm tentatively marking this as a release blocker on the basis that your report is of a data-loss related issue. However, we need specifics. Once we have those specifics, if our interpretation of "sane" doesn't match yours, the outcome may well be to downgrade it back to a non release-blocking bug.
comment:15 by , 12 years ago
Owner: | changed from | to
---|---|
Status: | new → assigned |
I will likely go with slightly modified version of Nate's patch, plan is to commit to master + 1.5.x. We might want to commit to 1.4 too if this is seen as data loss bug.
In any case, I will post a patch for review before committing.
I think one scenario for data loss is this code:
for obj in someqs: obj.related_qs.all().delete()
Here the related_qs can return all null FK objects resulting in data loss. To me it seems like it is correct to blame this bug for data loss. This can only happen if you have to_field defined and the relation is nullable on both sides. Somewhat unlikely, but a real possibility.
comment:16 by , 12 years ago
Triage Stage: | Design decision needed → Accepted |
---|
I have two versions of the patch. One is minimal change to fix the immediate bug, another is larger refactoring which makes m2m and fk relations work equivalently. The first is against 1.5, the second against master. The minimal patch is available from here and the "make relations consistent" patch is available from here.
The minimal patch is almost the same as Nate's latest patch. It has a bit better tests for nullable relations. The added test case shows how you can get corrupt data back using *saved* models. Deletion does work correctly, it is just the reverse related object querying which is failing.
I think the best plan is to push the minimal patch both to 1.5.x and master, then open another ticket for making reverse fk and m2m relations work consistently. There are some considerations for the consistency approach: Can we force such consistency without deprecation periods? Does the patch cover all needed relations (generic relations and o2o come to mind here)?
comment:17 by , 12 years ago
I am going with the minimalistic approach for 1.5 and master, and then open a new ticket for unification of m2m and fk relation handling in the nullable case.
Not going to backpatch to 1.4 at the same time. I am not yet sure of the data-loss potential, waiting for more info about that.
comment:18 by , 12 years ago
Resolution: | → fixed |
---|---|
Status: | assigned → closed |
comment:20 by , 12 years ago
Replying to russellm:
I'm tentatively marking this as a release blocker on the basis that your report is of a data-loss related issue. However, we need specifics. Once we have those specifics, if our interpretation of "sane" doesn't match yours, the outcome may well be to downgrade it back to a non release-blocking bug.
Certainly. The bug is fixed now, but I can provide specifics for posterity. The application was running on AppEngine, so the datastore isn't relational, and I needed to delete related data manually. I had overridden a delete()
method of a UserProfile and added a bunch of self.user.related_models.delete()
. When a User object got deleted without the corresponding UserProfile object being deleted, I deleted it manually in code.
However, since this isn't a relational datastore, profile.user
was None, and related_model.all()
pointed to *over half the rows*, since it was an optional attribute.
As you can see, this is very ordinary code, with the "sin" of running on a non-relational datastore.
Apart from that, it has caused multiple bugs in our current codebase, which is using a relational DB. It hasn't caused data loss yet, but it has caused us multiple instances of incorrect permissions, users not being able to create objects due to allocation-changing code that misbehaves, etc.
I'm very glad to see this going into the next release, I can finally remove a bunch of overrides from throughout our codebase (not to mention avoid losing data on GAE in the future).
While I don't think this is the worst bug ever, I do agree the behavior is incorrect.
Probably the cleanest solution is to check if manufacturer.id is None in car_set manager, and if it is, return an
EmptyQuerySet
. Attached patch contains the idea of the correction.