Opened 5 years ago

Closed 5 years ago

#30863 closed Cleanup/optimization (duplicate)

Queryset __repr__ can overload a database server in some cases

Reported by: Matt Johnson
Owned by: nobody
Component: Database layer (models, ORM)
Version: 2.2
Severity: Normal
Keywords: queryset repr __repr__
Cc:
Triage Stage: Unreviewed
Has patch: no
Needs documentation: no
Needs tests: no
Patch needs improvement: no
Easy pickings: yes
UI/UX: no

Description

Consider a model like this:

class Result(models.Model):
    # A Result object represents someone who took a quiz
    result_id = models.AutoField(primary_key=True, ...)
    quiz = models.ForeignKey("Quiz", ...) # assume this boils down to an integer field
    name = models.CharField(...)

    class Meta:
        ordering = ['name']

Assume it has hundreds of millions of records, and no index on the "name" column.

Typical usage might be something like

Result.objects.filter(quiz_id=123)

Now consider a bug in the usage, like:

Result.objects.filter(quiz_id="somestring") # notice we used a string to filter

Django will raise an exception (rightfully so).

As part of the usual error reporting process in debug mode, Django may eventually call repr() on the "base" queryset (that is essentially Result.objects.all()).

QuerySet.__repr__ tries to be helpful by evaluating the query and printing the first results, fetching up to 21 rows to do so. Because the base queryset orders by the unindexed "name" column, this can easily overload the database when it runs "SELECT ... FROM Result ORDER BY name LIMIT 21", which forces a sort of hundreds of millions of rows on an unindexed column.
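To make the mechanism concrete, here is a minimal pure-Python sketch of a lazy query object whose __repr__ triggers a fetch. It is not Django's code: LazyQuery, fetch, and queries_run are hypothetical names, and the callable stands in for the database. The 20-row display limit with one extra row fetched to detect truncation is modelled on how I understand QuerySet.__repr__ to behave.

```python
REPR_OUTPUT_SIZE = 20  # assumed display limit; one extra row is fetched to detect truncation


class LazyQuery:
    """Hypothetical stand-in for a lazy queryset (not Django's implementation)."""

    def __init__(self, fetch):
        self._fetch = fetch      # callable(limit) -> list of rows; stands in for the database
        self.queries_run = 0     # counts how many times the "database" was hit

    def __getitem__(self, s):
        # Slicing is what turns into "... LIMIT n" and runs the query.
        self.queries_run += 1
        return self._fetch(s.stop)

    def __repr__(self):
        # Merely printing the object runs a query for 21 rows.
        data = list(self[:REPR_OUTPUT_SIZE + 1])
        if len(data) > REPR_OUTPUT_SIZE:
            data[-1] = "...(remaining elements truncated)..."
        return "<%s %r>" % (self.__class__.__name__, data)


rows = ["row%d" % i for i in range(1000)]
q = LazyQuery(lambda limit: rows[:limit])
hits_before = q.queries_run
text = repr(q)               # e.g. what an error reporter does with local variables
hits_after = q.queries_run
```

In this sketch nothing touches the fake database until repr() is called, which is why an error reporter that formats local variables can end up issuing the expensive query.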

Even with debug mode turned off, some error reporting tools like Sentry will call repr on the queryset, creating the same problem in production.

I suggest not showing any query data in QuerySet.__repr__.
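As a rough illustration of that suggestion, the sketch below shows a repr that never runs a query on its own and only shows rows that were already fetched. Again this is pure Python with hypothetical names (NonEvaluatingQuery, fetch, description, queries_run); it is an assumption about what such a repr could look like, not a proposed patch.

```python
class NonEvaluatingQuery:
    """Hypothetical lazy query whose repr never fetches rows by itself."""

    def __init__(self, fetch, description):
        self._fetch = fetch              # callable() -> iterable of rows; stands in for the database
        self._description = description  # free-form text describing the query
        self._result_cache = None        # filled on first iteration
        self.queries_run = 0

    def __iter__(self):
        # Iteration is the only thing that hits the "database".
        if self._result_cache is None:
            self.queries_run += 1
            self._result_cache = list(self._fetch())
        return iter(self._result_cache)

    def __repr__(self):
        name = self.__class__.__name__
        if self._result_cache is None:
            # Not evaluated yet: describe the query instead of running it.
            return "<%s [unevaluated] %s>" % (name, self._description)
        # Already evaluated: showing cached rows costs no extra query.
        return "<%s %r>" % (name, self._result_cache[:20])


q = NonEvaluatingQuery(lambda: ["alice", "bob"], "Result ORDER BY name")
before = repr(q)     # no query is run here
rows = list(q)       # explicit evaluation
after = repr(q)      # uses the cached rows
```

The trade-off is that the repr of an unevaluated object is less informative in a shell session, in exchange for being safe to call from error-reporting code.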

Change History (1)

comment:1 by Mariusz Felisiak, 5 years ago

Resolution: → duplicate
Status: new → closed
Type: Uncategorized → Cleanup/optimization

Duplicate of #20393 (see comment).
