Opened 5 years ago
Closed 5 years ago
#30863 closed Cleanup/optimization (duplicate)
Queryset __repr__ can overload a database server in some cases
Reported by: | Matt Johnson | Owned by: | nobody |
---|---|---|---|
Component: | Database layer (models, ORM) | Version: | 2.2 |
Severity: | Normal | Keywords: | queryset repr __repr__ |
Cc: | Triage Stage: | Unreviewed | |
Has patch: | no | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | yes | UI/UX: | no |
Description
Consider a model like this:
class Result(models.Model):
    # A Result object represents someone who took a quiz
    result_id = models.AutoField(primary_key=True, ...)
    quiz = models.ForeignKey("Quiz", ...)  # assume this boils down to an integer field
    name = models.CharField(...)

    class Meta:
        ordering = ['name']
Assume it has hundreds of millions of records, and no index on the "name" column.
Typical usage might be something like
Result.objects.filter(quiz_id=123)
Now consider a bug in the usage, like:
Result.objects.filter(quiz_id="somestring") # notice we used a string to filter
Django will throw an exception (rightfully so).
As part of the usual error reporting process in debug mode, Django may eventually call repr() on the "base" queryset (that is essentially Result.objects.all()).
QuerySet.__repr__ tries to be helpful by evaluating the query and fetching up to 21 rows (it displays the first 20 and uses the extra row to detect truncation). Because the base queryset orders by the un-indexed "name" column, this can easily overload the database when it runs "SELECT ... FROM Result ORDER BY name LIMIT 21", which forces a sort of hundreds of millions of rows by an unindexed column.
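The mechanism can be illustrated without Django. The toy class below is only a stand-in under stated assumptions: FakeQuerySet, its executed_sql log, and the generated SQL string are hypothetical, and only the "display 20, fetch one extra row" rule mirrors the behavior described above.

```python
REPR_OUTPUT_SIZE = 20  # items shown; one extra row is fetched to detect truncation


class FakeQuerySet:
    """Toy stand-in for a lazy queryset (hypothetical, not Django's class)."""

    def __init__(self, table, order_by, rows):
        self.table = table
        self.order_by = order_by
        self._rows = rows        # pretend this is the table on the server
        self.executed_sql = []   # records every "query" sent to the database

    def _fetch(self, limit):
        # Simulates a round trip: the server has to sort the whole table
        # before it can return the first `limit` rows.
        self.executed_sql.append(
            f"SELECT ... FROM {self.table} ORDER BY {self.order_by} LIMIT {limit}"
        )
        return sorted(self._rows)[:limit]

    def __repr__(self):
        data = self._fetch(REPR_OUTPUT_SIZE + 1)
        if len(data) > REPR_OUTPUT_SIZE:
            data[-1] = "...(remaining elements truncated)..."
        return f"<{self.__class__.__name__} {data!r}>"


qs = FakeQuerySet("Result", "name", [f"name{i:03d}" for i in range(100)])
assert qs.executed_sql == []  # building the queryset costs nothing
text = repr(qs)               # what a debugger or error reporter does
print(qs.executed_sql[0])
```

The point of the sketch is that nothing is sent to the database until repr() is called, so an error reporter that merely describes local variables ends up issuing the expensive ordered query.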
Even with debug mode turned off, some error reporting tools like Sentry will call repr on the queryset, creating the same problem in production.
I suggest not showing any query data in QuerySet.__repr__.
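One possible shape for such a repr, sketched on a toy class rather than on Django itself: LazyRows, its loader argument, and query_count are hypothetical names, and this is not presented as what Django does or adopted. The idea is that repr() describes the object from already-cached state and never triggers a query on its own.

```python
class LazyRows:
    """Toy lazy collection (hypothetical); `loader` stands in for a database query."""

    def __init__(self, loader):
        self._loader = loader
        self._result_cache = None  # filled only on explicit evaluation
        self.query_count = 0       # how many times the "database" was hit

    def __iter__(self):
        if self._result_cache is None:
            self.query_count += 1
            self._result_cache = list(self._loader())
        return iter(self._result_cache)

    def __repr__(self):
        # Never trigger a query just to describe the object.
        if self._result_cache is None:
            return f"<{self.__class__.__name__} (unevaluated)>"
        return f"<{self.__class__.__name__} ({len(self._result_cache)} rows cached)>"


rows = LazyRows(lambda: ["alice", "bob"])
before = repr(rows)  # no query is issued here
names = list(rows)   # explicit evaluation issues the single query
after = repr(rows)   # now describes the cached result
```

The trade-off is that an interactive shell no longer shows a preview of the rows unless the collection has already been evaluated.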
Change History (1)
comment:1 by , 5 years ago
Resolution: | → duplicate |
---|---|
Status: | new → closed |
Type: | Uncategorized → Cleanup/optimization |
Duplicate of #20393 (see comment).