Opened 5 years ago

Closed 5 years ago

#31004 closed Bug (invalid)

Using FilteredRelation on M2M relationship duplicates result rows.

Reported by: idemchenko-wrk Owned by: nobody
Component: Database layer (models, ORM) Version: 2.2
Severity: Normal Keywords: FilteredRelation
Cc: Triage Stage: Unreviewed
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

Using FilteredRelation on many to many relationship could unexpectedly duplicate result rows.

Models:

class Discount(models.Model):
    name = models.CharField(max_length=255)
    active_till = models.DateTimeField()


class Item(models.Model):
    name = models.CharField(max_length=255)
    discounts = models.ManyToManyField(Discount)

Query:

Item.objects
.annotate(
    available_discounts=FilteredRelation(
        'discounts', condition=Q(discounts__active_till__gte=now)
    )
).annotate(
    name_with_discount=Concat(F('name'), F('available_discounts__name'))
)

SQL:

SELECT 
    "sales_item"."id",
    "sales_item"."name",
    CONCAT("sales_item"."name", available_discounts."name") AS "name_with_discount"
FROM "sales_item"
    LEFT OUTER JOIN "sales_item_discounts" 
        ON ("sales_item"."id" = "sales_item_discounts"."item_id")
    LEFT OUTER JOIN "sales_discount" available_discounts 
        ON (
            "sales_item_discounts"."discount_id" = available_discounts."id"
            AND (available_discounts."active_till" >= '2019-11-17T21:32:42.501283+00:00'::timestamptz)
        );

The problem is that the intermediate table (sales_item_discounts) joins with the target table (sales_discount) using the left join.
So even if rows of the target table are filtered out, rows of the intermediate table are still there, and multiply the resulting rows.

Test:

class DuplicationTest(TestCase):

    def test_row_duplication(self):
        now = timezone.now()

        active_discounts = [
            Discount.objects.create(name='-5%', active_till=now + timedelta(days=1)),
            Discount.objects.create(name='-5$', active_till=now + timedelta(days=1)),
        ]

        old_discounts = [
            Discount.objects.create(name='Whatever', active_till=now - timedelta(days=1)),
            Discount.objects.create(name='Whatever', active_till=now - timedelta(days=1)),
        ]

        item_w_discounts = Item.objects.create(name='item1')
        item_w_discounts.discounts.set(active_discounts + old_discounts)

        item_w_old_discounts = Item.objects.create(name='item2')
        item_w_old_discounts.discounts.set(old_discounts)

        item_wo_discounts = Item.objects.create(name='item3')

        items = list(
            Item.objects
            .annotate(
                available_discounts=FilteredRelation(
                    'discounts', condition=Q(discounts__active_till__gte=now)
                )
            ).annotate(
                name_with_discount=Concat(F('name'), F('available_discounts__name'))
            )
        )

        self.assertEqual(items.count(item_wo_discounts), 1)     # Passed
        # expected to see one item, since it does not have active discounts
        self.assertEqual(items.count(item_w_old_discounts), 1)  # AssertionError: 2 != 1
        # expected to see two items, one for each active discount
        self.assertEqual(items.count(item_w_discounts), len(active_discounts))  # AssertionError: 4 != 2

Change History (2)

comment:1 by idemchenko-wrk, 5 years ago

Keywords: FilteredRelation added

comment:2 by Mariusz Felisiak, 5 years ago

Resolution: invalid
Status: newclosed

Thanks for this report. You're using a nullable relations i.e. you don't filter by available_discounts that's why we need to use LEFT OUTER JOIN because you want to get items with and without matching discounts. You can use distinct() or split your query. I don't think that's anything that we can change in Django to make it works better.

Closing per TicketClosingReasons/UseSupportChannels.

Note: See TracTickets for help on using tickets.
Back to Top