Opened 5 years ago

Closed 5 years ago

#30389 closed Bug (invalid)

Duplicate object when ordering through a foreign key

Reported by: Ajabep Owned by: nobody
Component: Database layer (models, ORM) Version:
Severity: Normal Keywords: ordering, foreign key
Cc: Triage Stage: Unreviewed
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

When we are using an ordering through a foreign key, a same object can be resolved several times.

PoC

Models

class Team(models.Model):
    name = models.CharField(max_length=255, primary_key=True)

    class Meta:
        ordering = ['-persons__creationtime']

class Person(models.Model):
    uuid = models.UUIDField(primary_key=True, default=uuid.uuid4, editable=False)
    creationtime = models.DateTimeField(auto_now_add=True)
    team = models.ForeignKey(Team, on_delete=models.CASCADE)

    class Meta:
        default_related_name = 'persons'

Query

Team.objects.filter(name="R&D")

Buggy result

<QuerySet [<Team: Team object (R&D)>, <Team: Team object (R&D)>]>

The same team is selected 2 times.

Expected result

<QuerySet [<Team: Team object (R&D)>]>

Each teams (here, only 1) only 1 time.

Small analysis

By dumping the SQL request, we observe that the ordering is translated by a join instruction.

SELECT "poc_team"."name" FROM "poc_team" LEFT OUTER JOIN "poc_person" ON ("poc_team"."name" = "poc_person"."team_id") WHERE "poc_team"."name" = R&D ORDER BY "poc_person"."creationtime" DESC

Thus, if a Team object is linked to two Person objects, the Team will be selected 2 times. If it is linked to 3 Person, the Team will be selected 3 times.

This bug occurred also when you are using listing some teams, joined by a ManyToMany relation.

Workaround, waiting a fix

To avoid this bug, while there is no official fix, use the distinct() method:

Team.objects.filter(name="R&D").distinct()

Change History (1)

comment:1 by Simon Charette, 5 years ago

Resolution: invalid
Status: newclosed

By dumping the SQL request, we observe that the ordering is translated by a join instruction.

Hello Ajabep, while this might be surprising if you are not familiar with the ORM I'm afraid this isn't a bug; Django will translates all multivaluefield__value lookups into LEFT JOIN and order_by is not an exception.

The only other way to express this query would be perform a subquery pushdown but it's unfortunately not possible to do it in a performant way of all support database backends.

e.g.

SELECT "poc_team"."name"
FROM "poc_team"
WHERE "poc_team"."name" = 'R&D'
ORDER BY (
    SELECT "poc_person"."creationtime"
    FROM "poc_person"
    WHERE "poc_team"."name" = "poc_person"."team_id"
    ORDER BY "poc_person"."creationtime" DESC
    LIMIT 1
)

If you really want to order by a multi valued relation without using distinct() I suggest you manually perform the pushdown by ordering by a Subquery expression.

e.g.

.order_by(
    Subquery(
        Person.objects.filter(
            team=OuterRef('pk')
        ).order_by('-creationtime').values('creationtime')
    )
)

Note that you might experience performance issues on some backends (older versions of MySQL for example). So another alternative might be order by a Max('persons__creationtime') annotation.

By the way please ensure the problem you are encountering is a valid bug before submitting a ticket through this tracker. You'll likely get a faster response through support channels and that'll reduce the ticket triaging burden of contributors, thanks!

Note: See TracTickets for help on using tickets.
Back to Top