#26522 closed Bug (fixed)
Non-deterministic crash in django.db.models.sql.Query.combine()
Reported by: | Ole Laursen | Owned by: | nobody |
---|---|---|---|
Component: | Database layer (models, ORM) | Version: | 1.9 |
Severity: | Normal | Keywords: | |
Cc: | Triage Stage: | Ready for checkin | |
Has patch: | yes | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description
Hi,
There's a logical error somewhere in Query.combine or friends.
I have a pretty complicated query with many-to-many self-joins and ORs, and after the upgrade to Django 1.9 and Python 3 combine() sometimes, but not always, crashes with the assertion error
assert set(change_map.keys()).intersection(set(change_map.values())) == set()
Inspecting the change_map, it does indeed contain a circular reference (BTW shouldn't you use "not" to test for an empty set rather than comparing with set()?).
At first, I didn't understand how it could crash sometimes and sometimes not - we're talking about the same query being executed through a view. But when I examine change_map, its content does indeed change from reload to reload - I suppose this may have something to do with the order of dicts/sets the combine() algorithm is using.
Here comes some code, I cut out a bunch of unused stuff, but otherwise it's the same:
class Invoice(models.Model): customer = models.ForeignKey(Customer) date_created = models.DateField(default=datetime.date.today, db_index=True) date_sent = models.DateField(null=True, blank=True) date_due = models.DateField(null=True, blank=True) date_paid = models.DateField(null=True, blank=True) date_credited = models.DateField(null=True, blank=True) date_collect = models.DateField(null=True, blank=True) invoice_type = models.CharField(default="invoice", max_length=32) reminders = models.ManyToManyField("Invoice", related_name="reminded_set", blank=True) reminder_counter = models.IntegerField(null=True, blank=True)
That's the model, and in the view:
import datetime from django.db.models import Q date = datetime.datetime.now() invoices = Invoice.objects.filter( Q(date_created__lte=date), Q(date_paid__gt=date) | Q(date_paid=None), Q(date_credited__gt=date) | Q(date_credited=None), customer=1, ) filtered_invoices = Invoice.objects.none() not_due = Q(date_due__gte=date) | Q(date_due=None) not_reminded_yet = ~Q(reminders__date_created__lte=date) not_collected = Q(date_collect__gt=date) | Q(date_collect=None) filtered_invoices |= invoices.filter(not_due, not_collected, date_sent__lte=date, invoice_type="invoice") filtered_invoices |= invoices.filter(not_collected, not_reminded_yet, date_sent__lte=date, date_due__lt=date, invoice_type="invoice") for r in [1, 2, 3]: qs = invoices.filter(not_collected, reminders__date_created__lte=date, reminders__reminder_counter=r, invoice_type="invoice") for i in range(r + 1, 3 + 1): qs = qs.filter(~Q(reminders__reminder_counter=i) | Q(reminders__reminder_counter=i, reminders__date_created__gt=date)) filtered_invoices |= qs
I realize it's pretty complicated but I'm not sure what's essential and what's not. I hope someone knowledgeable of how combine() works can figure out why ordering in some cases matters.
Attachments (1)
Change History (16)
comment:2 by , 8 years ago
Could you bisect the regression? If you can provide the test app that you use for that process, it'll be helpful.
comment:3 by , 8 years ago
I've attached a test project. Run it with
cd combinebug DJANGO_SETTINGS_MODULE=combinebug.settings PYTHONPATH=../Django-1.7.5:. python combinebug/triggerbug.py
If it crashes, congrats, you've reproduced it. If not, try running it multiple times.
I tested it with Python 2.7 and Django 1.7.5 and could reproduce the crash and Python 3 and Django 1.9.4 and could also reproduce it.
1.6.2 seems to be working - that was the version I upgraded the whole project from and it's been stable.
comment:4 by , 8 years ago
Summary: | Bug in django.db.models.sql.Query.combine → Non-determinstic crash in django.db.models.sql.Query.combine() |
---|---|
Triage Stage: | Unreviewed → Accepted |
comment:5 by , 8 years ago
#26959 looks like a duplicate and includes a more minimal model to reproduce.
comment:6 by , 8 years ago
Has patch: | set |
---|
PR at https://github.com/django/django/pull/8139
I ran into this bug yesterday and spent some time trying to figure out what's going on. It seems like the circular reference arises when alias_map
is iterated in Query.join
in order to find the next alias to reuse:
reuse = [a for a, j in self.alias_map.items() if (reuse is None or a in reuse) and j == join] if reuse: self.ref_alias(reuse[0]) return reuse[0]
Because alias_map
is a dict, the alias to reuse is chosen non-deterministically. Using an OrderedDict
for alias_map
solves this problem.
It can also by solved by changing the reuse code to
reuse = [a for a, j in self.alias_map.items() if (reuse is None or a in reuse) and j == join] if reuse: if join.table_alias in reuse: to_use = join.table_alias else: to_use = reuse[0] self.ref_alias(to_use) return to_use
But the OrderedDict
seems cleaner to me.
I don't understand much of the join internals here so I'd love to hear if someone has a better solution.
comment:7 by , 8 years ago
Triage Stage: | Accepted → Ready for checkin |
---|
comment:9 by , 7 years ago
I just ran into this problem as well after updating a project from 1.8 to 1.11.2. Unfortunately, implementing the fix by changing alias_map
to be an OrderedDict did not help. I managed to work around it by just rewriting the query, (because it was doing some dumb things which were obfuscated by chaining manager methods, as you can see below), but I thought I would share the smallest test case I could reproduce it with, just in case someone else runs into it as well.
My models:
class Assessment(models.Model): pass class AnswerGroup(models.Model): assessment = models.ForeignKey(Assessment) author = models.ForeignKey( User, related_name='created_answergroups') about = models.ForeignKey( User, blank=True, null=True, related_name='subject_answergroups)
And my test code where I get the same AssertionError:
u = User.objects.get(pk=1) assessments = Assessment.objects.all() groups = AnswerGroup.objects.filter(assessment__in=assessments) groups_1 = groups.filter(author=u, about=u) groups_2 = groups.filter(author=u).exclude(about=None).exclude(about=u) list(groups_1 | groups_2)
Note that when I remove the assessment__in
filter, the code works fine. Also note that change_map
is something like {u'auth_user': 'T4', 'T4': u'auth_user'}
both with and without the filter, and thus both when an error occurs and when it does not.
I've worked around it by just using one filter and ditching the combine by just using filter(author=u).exclude(about=None)
, but of course not everyone has the option to rewrite their query that easily.
comment:10 by , 7 years ago
Summary: | Non-determinstic crash in django.db.models.sql.Query.combine() → Non-deterministic crash in django.db.models.sql.Query.combine() |
---|
comment:12 by , 7 years ago
It's a very tricky issue. I think, it can damage data in some cases.
In my case Django ORM didn't fail with assertion error: it generated wrong SQL query so I got wrong data from my DB. On development environment I didn't get this issue because it appears randomly. I'm a lucky person - in my case I query for some data for UI, but if you perform some calculations using data from a queryset that silently returns wrong data - you are in trouble. It's quite possible that 4/5 of your processes work properly, but one gives wrong results.
Here is simplified version of my issue.
Models:
from django.db import models class Category(models.Model): title = models.CharField(max_length=255, unique=True) class Meta: verbose_name_plural = 'categories' def __str__(self): return self.title class Document(models.Model): title = models.CharField(blank=True, max_length=255) categories = models.ManyToManyField('Category', blank=True, related_name='+') def __str__(self): return self.title
Build a query set that could return wrong data:
from app_with_models.models import Document, Category queryset = Document.objects.all() # Exclude documents without taxonomy (Category) tags queryset = queryset.filter(categories__in=Category.objects.all()) # Apply predefined taxonomy filters category_predefined = [1] # Should be in fixtures queryset = queryset.filter(categories__in=category_predefined) queryset = queryset.distinct() used_category_ids = queryset.values_list('categories__pk', flat=True) print(used_category_ids.query)
This code should generate the following SQL query (formatted):
SELECT DISTINCT "app_with_models_document_categories"."category_id" FROM "app_with_models_document" LEFT OUTER JOIN "app_with_models_document_categories" ON ("app_with_models_document"."id" = "app_with_models_document_categories"."document_id") INNER JOIN "app_with_models_document_categories" T4 ON ("app_with_models_document"."id" = T4."document_id") WHERE ( "app_with_models_document_categories"."category_id" IN (SELECT U0."id" AS Col1 FROM "app_with_models_category" U0) AND T4."category_id" IN (1) );
But in some cases it generates this:
SELECT DISTINCT T4."category_id" FROM "app_with_models_document" LEFT OUTER JOIN "app_with_models_document_categories" ON ("app_with_models_document"."id" = "app_with_models_document_categories"."document_id") INNER JOIN "app_with_models_document_categories" T4 ON ("app_with_models_document"."id" = T4."document_id") WHERE ( "app_with_models_document_categories"."category_id" IN (SELECT U0."id" AS Col1 FROM "app_with_models_category" U0) AND T4."category_id" IN (1) );
This query generates absolutely wrong results. In my example it will always return 1
as category_id
.
Some more details:
- Can be reproduced on
Django==1.10.7
,Django==1.11.4
and, most likely, on order versions too. Fixed in 9bbb6e2d2536c4ac20dc13a94c1f80494e51f8d9 - Can be reproduced on Python 3.5, but not on Python 3.6 (at least, I didn't manage to do that). I guess it's because of new dict implementation.
- I've tested with Postgres 9.6.3 and
psycopg2==2.6.1
andpsycopg2==2.7.3.1
, but it looks like the issue is not DB engine/adapter specific.
Hope it helps.
Can it be backported into supported (or at least into LTS)? I would really appreciate it. Me an my colleague have spent 4 days debugging our code an I'm afraid that it could cause more serious issues for other people.
comment:13 by , 7 years ago
I guess it's okay to backport to 1.11. There's some indication based on the comments that this may be a regression. PR for the stable/1.11.x branch.
Oh, I forgot to mention - it crashes on the very last line
filtered_invoices |= qs
in the above with r = 2. So after having constructed a pretty complicated thing.