Opened 10 months ago

Last modified 9 months ago

#35099 closed Bug

Combining QuerySets with "|" or "&" produce side effects affecting further queries — at Initial Version

Reported by: Alan Owned by: nobody
Component: Database layer (models, ORM) Version: 5.0
Severity: Normal Keywords:
Cc: Triage Stage: Ready for checkin
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

Hello everyone.

Combining some queries with "|" or "&" somehow affects queries involved in the operation, leading to malformed SQL and unexpected results.

Here are details and steps to reproduce. Apologise, for maybe a bit confusing model names, I copied them from production.

class SiteUser(models.Model):
    pass

class Notification(models.Model):
    user = models.ForeignKey(to=SiteUser, on_delete=models.CASCADE)


class PayoutRequest(models.Model):
    requester = models.ForeignKey(to=SiteUser, on_delete=models.CASCADE)

Test:

from django.test import TestCase
from django.db.models import OuterRef, Exists
from reproduce.models import Notification, SiteUser, PayoutRequest


class Reproduce(TestCase):

    def test(self):
        u01 = SiteUser.objects.create()
        u02 = SiteUser.objects.create()
        u03 = SiteUser.objects.create()

        Notification.objects.create(user=u01)
        PayoutRequest.objects.create(requester=u01)
        Notification.objects.create(user=u02)
        PayoutRequest.objects.create(requester=u03)

        are_active = SiteUser.objects.all().distinct()
        got_money = SiteUser.objects.filter(
            Exists(PayoutRequest.objects.filter(requester=OuterRef('pk')))
        ).distinct()
        whatever_query = SiteUser.objects.all().distinct()

        # Execute queries first time
        need_help = are_active.exclude(pk__in=got_money)
        notified = Notification.objects.filter(user__in=need_help).values_list('user_id', flat=True)
        query_before = str(notified.query)

        self.assertEqual(len(notified), 1)  # correct

        whatever_query | got_money  # Touch "got_money" with any other query

        # Execute same queries second time
        need_help = are_active.exclude(pk__in=got_money)
        notified = Notification.objects.filter(user__in=need_help).values_list('user_id', flat=True)
        query_after = str(notified.query)

        print(query_before)
        print(query_after)
        self.assertEqual(len(notified), 1)  # expected 1, got 0
        self.assertEqual(query_before, query_after)  # false

As you can see, merely touching the got_money query with any other query leads to modifying the results of the same queries executed after that.
This test case probably may be simplified even further, but unfortunately, I have no more time resources to dig much deeper.

I had another queries built using simple .filter() and .exclude(). Those were not affected by combining.
I found only this query got_money using Exists() and OuterRef() to be affected. There might be more of which I am not aware of.

The reason for this I don't know, but query_before and query_after differs.
query_before correctly separates subqueries using W0, U0, V0 aliases, while the query_after uses a single U0 alias for all subqueries, leading to incorrect results.

Before

SELECT 
  "reproduce_notification"."user_id" 
FROM 
  "reproduce_notification" 
WHERE 
  "reproduce_notification"."user_id" IN (
    SELECT 
      DISTINCT W0."id" 
    FROM 
      "reproduce_siteuser" W0 
    WHERE 
      NOT (
        W0."id" IN (
          SELECT 
            DISTINCT V0."id" 
          FROM 
            "reproduce_siteuser" V0 
          WHERE 
            EXISTS(
              SELECT 
                1 AS "a" 
              FROM 
                "reproduce_payoutrequest" U0 
              WHERE 
                U0."requester_id" = (V0."id") 
              LIMIT 
                1
            )
        )
      )
  ) 

After

SELECT 
  "reproduce_notification"."user_id" 
FROM 
  "reproduce_notification" 
WHERE 
  "reproduce_notification"."user_id" IN (
    SELECT 
      DISTINCT U0."id" 
    FROM 
      "reproduce_siteuser" U0 
    WHERE 
      NOT (
        U0."id" IN (
          SELECT 
            DISTINCT U0."id" 
          FROM 
            "reproduce_siteuser" U0 
          WHERE 
            EXISTS(
              SELECT 
                1 AS "a" 
              FROM 
                "reproduce_payoutrequest" U0 
              WHERE 
                U0."requester_id" = (U0."id") 
              LIMIT 
                1
            )
        )
      )
  )

Change History (0)

Note: See TracTickets for help on using tickets.
Back to Top