Opened 7 years ago
Closed 5 years ago
#28297 closed Bug (duplicate)
Same queryset result in two different queries on ORM.
Reported by: | Marcus Renno | Owned by: | Marcus Renno |
---|---|---|---|
Component: | Database layer (models, ORM) | Version: | 1.11 |
Severity: | Normal | Keywords: | join, annotation, F |
Cc: | akaariai@…, tom@…, jeppe@… | Triage Stage: | Accepted |
Has patch: | no | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description (last modified by )
Sometimes when I run a set of filter/annotation the result is different for the same variables. I'm using Django 1.11.2
The models used (simplified)
class Recipe(models.Model): name = models.CharField(max_length=50) steps = models.ManyToManyField(StepRecipe) class StepRecipe(models.Model): ingredients = models.ManyToManyField(RecipeIngredient) class RecipeIngredient(models.Model): ingredient = models.ForeignKey(Ingredient) class Ingredient(models.Model): name = models.CharField(max_length=50)
This is the queryset command:
ingredients = ['tomato'] self.queryset = Recipe.objects.all() self.queryset = self.queryset.annotate(total=Count('steps__ingredients', distinct=True)) self.queryset = self.queryset.filter(steps__ingredients__ingredient__name__in=ingredients) self.queryset = self.queryset.annotate(available=Count('steps__ingredients', distinct=True)) self.queryset = self.queryset.filter(total=F('available'))
This is the wrong query result that comes often times:
SELECT recipebook_recipe.id, recipebook_recipe.name, recipebook_recipe.dificulty, recipebook_recipe.duration, COUNT(DISTINCT recipebook_steprecipe_ingredients.recipeingredient_id) AS total, COUNT(DISTINCT recipebook_steprecipe_ingredients.recipeingredient_id) AS available FROM recipebook_recipe LEFT OUTER JOIN recipebook_recipe_steps ON (recipebook_recipe.id = recipebook_recipe_steps.recipe_id) LEFT OUTER JOIN recipebook_steprecipe ON (recipebook_recipe_steps.steprecipe_id = recipebook_steprecipe.id) LEFT OUTER JOIN recipebook_steprecipe_ingredients ON (recipebook_steprecipe.id = recipebook_steprecipe_ingredients.steprecipe_id) INNER JOIN recipebook_recipe_steps T6 ON (recipebook_recipe.id = T6.recipe_id) INNER JOIN recipebook_steprecipe T7 ON (T6.steprecipe_id = T7.id) INNER JOIN recipebook_steprecipe_ingredients T8 ON (T7.id = T8.steprecipe_id) INNER JOIN recipebook_recipeingredient T9 ON (T8.recipeingredient_id = T9.id) INNER JOIN recipebook_ingredient ON (T9.ingredient_id = recipebook_ingredient.id) WHERE recipebook_ingredient.name IN ("tomato") GROUP BY recipebook_recipe.id HAVING COUNT(DISTINCT recipebook_steprecipe_ingredients.recipeingredient_id) = (COUNT(DISTINCT recipebook_steprecipe_ingredients.recipeingredient_id)) ORDER BY NULL
And this is the right query that shows up sometimes:
SELECT recipebook_recipe.id, recipebook_recipe.name, recipebook_recipe.dificulty, recipebook_recipe.duration, COUNT(DISTINCT recipebook_steprecipe_ingredients.recipeingredient_id) AS total, COUNT(DISTINCT T8.recipeingredient_id) AS available FROM recipebook_recipe LEFT OUTER JOIN recipebook_recipe_steps ON (recipebook_recipe.id = recipebook_recipe_steps.recipe_id) LEFT OUTER JOIN recipebook_steprecipe ON (recipebook_recipe_steps.steprecipe_id = recipebook_steprecipe.id) LEFT OUTER JOIN recipebook_steprecipe_ingredients ON (recipebook_steprecipe.id = recipebook_steprecipe_ingredients.steprecipe_id) INNER JOIN recipebook_recipe_steps T6 ON (recipebook_recipe.id = T6.recipe_id) INNER JOIN recipebook_steprecipe T7 ON (T6.steprecipe_id = T7.id) INNER JOIN recipebook_steprecipe_ingredients T8 ON (T7.id = T8.steprecipe_id) INNER JOIN recipebook_recipeingredient T9 ON (T8.recipeingredient_id = T9.id) INNER JOIN recipebook_ingredient ON (T9.ingredient_id = recipebook_ingredient.id) WHERE recipebook_ingredient.name IN ("tomato") GROUP BY recipebook_recipe.id HAVING COUNT(DISTINCT recipebook_steprecipe_ingredients.recipeingredient_id) = (COUNT(DISTINCT T8.recipeingredient_id)) ORDER BY NULL
As you can see they are different. The wrong one does not set the variable 'available' appropriately. I wonder if this is something on the method .query.join()
Change History (36)
comment:1 by , 7 years ago
Description: | modified (diff) |
---|
comment:2 by , 7 years ago
comment:3 by , 7 years ago
Resolution: | → needsinfo |
---|---|
Status: | new → closed |
comment:4 by , 7 years ago
Description: | modified (diff) |
---|---|
Resolution: | needsinfo |
Status: | closed → new |
comment:5 by , 7 years ago
Description: | modified (diff) |
---|
comment:6 by , 7 years ago
Description: | modified (diff) |
---|
comment:7 by , 7 years ago
I think annotating + filter on M2M field is the issue here.
Please check this thread on django-users: https://groups.google.com/forum/#!topic/django-users/RPU-PnygbLg
comment:8 by , 7 years ago
Type: | Uncategorized → Bug |
---|
comment:9 by , 7 years ago
This appears to be related to a dictionary iteration somewhere. If you set a fixed PYTHONHASHSEED
value (e.g PYTHONHASHSEED=1) then the query is stable, otherwise it does indeed switch between two types of SQL.
In my experiments PYTHONHASHSEED=1
produces the wrong query, whereas PYTHONHASHSEED=2
produces the correct one.
comment:10 by , 7 years ago
Owner: | changed from | to
---|---|
Status: | new → assigned |
follow-up: 12 comment:11 by , 7 years ago
It seems to boil down to this line: https://github.com/django/django/blob/stable/1.11.x/django/db/models/sql/query.py#L927
The alias_map.items()
isn't stable, with PYTHONHASHSEED=1
then reuse[0]
is recepie_steps
, which is incorrect, but with it set to 2
then reuse[0]
is (correctly) T6
.
If anyone knows how to fix this then please let me know, else I will try and find a suitable fix.
follow-up: 15 comment:12 by , 7 years ago
That seems to be right, Tom. join()
is relying on the order of the dictionary order resulting from .items()
.
It's weird that setting the hash seed to 2 would solve this problem, though. I feel like the framework should not rely on the user changing the hashing seed to a static number to fix this problem, because it breaks the purpose (security) of the the hashing being random.
Replying to Tom:
It seems to boil down to this line: https://github.com/django/django/blob/stable/1.11.x/django/db/models/sql/query.py#L927
The
alias_map.items()
isn't stable, withPYTHONHASHSEED=1
thenreuse[0]
isrecepie_steps
, which is incorrect, but with it set to2
thenreuse[0]
is (correctly)T6
.
If anyone knows how to fix this then please let me know, else I will try and find a suitable fix.
comment:13 by , 7 years ago
Triage Stage: | Unreviewed → Accepted |
---|
Query results depending on dict ordering is a bug.
comment:14 by , 7 years ago
Cc: | added |
---|
Looks like this was introduced in [ab89414f40db1598364a7fe4cfac1766cacd2668].
comment:15 by , 7 years ago
I wasn't suggesting setting the hash seed value as a fix, merely as a debugging aide for anyone else looking at the ticket.
I've made a MR with what I think is the correct fix: https://github.com/django/django/pull/8631
The issue was that there where two joins, with differing join_types
, that where considered equal in the Join.__eq__
method, and so randomly one would be chosen, which was sometimes incorrect.
Replying to Marcus Renno:
That seems to be right, Tom.
join()
is relying on the order of the dictionary resulting from.items()
.
It's weird that setting the hash seed to 2 would solve this problem, though. I feel like the framework should not rely on the user changing the hashing seed to a static number to fix this problem, because it breaks the purpose (security) of the the hashing being random.
Replying to Tom:
It seems to boil down to this line: https://github.com/django/django/blob/stable/1.11.x/django/db/models/sql/query.py#L927
The
alias_map.items()
isn't stable, withPYTHONHASHSEED=1
thenreuse[0]
isrecepie_steps
, which is incorrect, but with it set to2
thenreuse[0]
is (correctly)T6
.
If anyone knows how to fix this then please let me know, else I will try and find a suitable fix.
comment:16 by , 7 years ago
Has patch: | set |
---|
comment:17 by , 7 years ago
Patch needs improvement: | set |
---|
comment:18 by , 7 years ago
This looks a bit similar to #26522 which made alias_map
an OrderedDict
to prevent such deterministic failures.
I wonder if it wouldn't be better to figure out what's adding values to self.alias_map
in a non-deterministic order instead as it seems to be the underlying issue here.
comment:19 by , 7 years ago
Hmm, you're right Simon. I missed the OrderedDict change, and tested initially on 1.11 (where it's not applied). With this, on master, the results are added deterministically, but the first item is always the incorrect join.
So I don't think my ticket fixes the issue in a convincing way. If you extend the example in the ticket with another annotate and filter:
self.queryset = self.queryset.annotate(total=Count('steps__ingredients', distinct=True)) self.queryset = self.queryset.filter(steps__ingredients__pk=1) self.queryset = self.queryset.annotate(available=Count('steps__ingredients', distinct=True)) self.queryset = self.queryset.filter(steps__ingredients__pk=2) self.queryset = self.queryset.annotate(available2=Count('steps__ingredients', distinct=True)) self.queryset = self.queryset.filter(total=F('available')).filter(total=F('available2'))
Then reuse
will have *three* aliases found: [('test_app_recipe_steps', 'LEFT OUTER JOIN'), ('T6', 'INNER JOIN'), ('T10', 'INNER JOIN')]
In this case my patch will fetch T6, and we have the same issue.
follow-up: 22 comment:20 by , 7 years ago
I've dug into this a fair bit and I'm sorry to say I'm stuck. I also think there is a pretty big bug lurking here that is out of my league. Or maybe it's a feature we don't support?
As far as I can tell, if you annotate over foreign keys as well as m2m:
Publisher.objects .annotate(num_books=Count('book', distinct=True)) .filter(book__rating__gt=3.0) .annotate(num_rated=Count('book', distinct=True)) .filter(num_books=F('num_rated'))
Then the resulting SQL will have the following HAVING
clause:
HAVING COUNT(DISTINCT "publisher"."id") = (COUNT(DISTINCT "publisher"."id"))
Am I correct in thinking that this should be:
HAVING COUNT(DISTINCT "publisher"."id") = (COUNT(DISTINCT "T3"."id"))
, where T3
is the correct join?
In any case, either the generated query is incorrect or the documentation needs to be updated.
I think, at least in the case of a m2m join, one of the issues are that there seems to be no way to resolve which alias is used by two identical annotations which are affected by a previous .filter()
in the query.join
method.
comment:21 by , 7 years ago
Owner: | removed |
---|---|
Status: | assigned → new |
comment:22 by , 7 years ago
Tom, I agree with you that it should be the second query and I believe that the feature should be supported. It's just a bug and we will fix it ;)
I'll try to work on this issue when I get home. I think you did a fantastic job narrowing down the problem and it's close to be solved.
Replying to Tom:
I've dug into this a fair bit and I'm sorry to say I'm stuck. I also think there is a pretty big bug lurking here that is out of my league. Or maybe it's a feature we don't support?
As far as I can tell, if you annotate over foreign keys as well as m2m:
Publisher.objects .annotate(num_books=Count('book', distinct=True)) .filter(book__rating__gt=3.0) .annotate(num_rated=Count('book', distinct=True)) .filter(num_books=F('num_rated'))Then the resulting SQL will have the following
HAVING
clause:
HAVING COUNT(DISTINCT "publisher"."id") = (COUNT(DISTINCT "publisher"."id"))
Am I correct in thinking that this should be:
HAVING COUNT(DISTINCT "publisher"."id") = (COUNT(DISTINCT "T3"."id"))
, whereT3
is the correct join?
In any case, either the generated query is incorrect or the documentation needs to be updated.
I think, at least in the case of a m2m join, one of the issues are that there seems to be no way to resolve which alias is used by two identical annotations which are affected by a previous
.filter()
in thequery.join
method.
comment:23 by , 7 years ago
Cc: | added |
---|
comment:24 by , 7 years ago
Owner: | set to |
---|---|
Status: | new → assigned |
First patch I have ever done to Django here: https://github.com/django/django/pull/8640
comment:25 by , 7 years ago
Patch needs improvement: | unset |
---|---|
Triage Stage: | Accepted → Unreviewed |
follow-up: 27 comment:26 by , 7 years ago
Triage Stage: | Unreviewed → Accepted |
---|
"Triage Stage" isn't reset when adding a patch, see Triaging Tickets.
comment:27 by , 7 years ago
Sorry for that, Tim. Thanks for the useful information!
Replying to Tim Graham:
"Triage Stage" isn't reset when adding a patch, see Triaging Tickets.
comment:28 by , 7 years ago
Cc: | added |
---|
comment:29 by , 7 years ago
Patch needs improvement: | set |
---|
As noted on the PR, I'm not sure that the current approach is ideal.
comment:30 by , 7 years ago
I updated the code in the PR. If someone has some time to check, please go ahead and share your comments.
follow-up: 32 comment:31 by , 7 years ago
To me it seems that pre https://code.djangoproject.com/changeset/ab89414f40db1598364a7fe4cfac1766cacd2668 we always used the *first* join, not the last join possible in join(). I believe this has been that way for a long time, so changing that should be considered carefully.
I'm not exactly sure why do we get random order for self.alias_map in join. It's an OrderedDict, so that shouldn't be the case...
comment:32 by , 7 years ago
Replying to holvianssi:
To me it seems that pre https://code.djangoproject.com/changeset/ab89414f40db1598364a7fe4cfac1766cacd2668 we always used the *first* join, not the last join possible in join(). I believe this has been that way for a long time, so changing that should be considered carefully.
I'm not exactly sure why do we get random order for self.alias_map in join. It's an OrderedDict, so that shouldn't be the case...
Hm... I always had the impression it was the last join. For me, it makes sense to use the tables in sequence, because you use actions (aka operations) in sequence. If it always grab the first joined table this wouldn't be possible. From what I understood, here in this ticket, until recent changes the order was random.
comment:33 by , 7 years ago
Patch needs improvement: | unset |
---|
comment:35 by , 7 years ago
Patch needs improvement: | set |
---|
As noted on the PR, the change introduces a non-deterministic test failure.
comment:36 by , 5 years ago
Has patch: | unset |
---|---|
Patch needs improvement: | unset |
Resolution: | → duplicate |
Status: | assigned → closed |
Summary: | Same queryset result in two different queries on ORM → Same queryset result in two different queries on ORM. |
Duplicate of #29530. Fixed in 0e64e046a481c345358b1040df7c89480ad198e8.
Hello Marcus,
You'll have to provide a minimal models setup to allow us to identify the issue as it's almost impossible to figure out what could be wrong it. Please do so and reopen this ticket with the details.
Also, did you reproduce against Django 1.11 and the master branch?