Opened 8 years ago

Closed 8 years ago

Last modified 8 years ago

#22709 closed Uncategorized (duplicate)

Problem with negating Q object

Reported by: thierryb@… Owned by: nobody
Component: Database layer (models, ORM) Version: 1.6
Severity: Normal Keywords:
Cc: Triage Stage: Unreviewed
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

Hello,

I have 2 models. One is the computer model which has a M2M field linking to an application model.

If I want to know the computers that have a specific app I can for instance do:
Q(applicationsname='app name', applications_version=12)
Computer.objects.filter(q)

this will translate into the right where clause with both conditions.

Now if I want the opposite, I do:
Computer.objects.filter(~q)

that will generate something totally different and will consider the 2 conditions separately and generate a where clause roughly like that:

NOT "computer"."id" IN

(

SELECT U1."computer_id" FROM "computer" U1 INNER JOIN "generic_application" U2 ON ( U1."application_id" = U2."id" ) WHERE U2."name" = 'MyApp'

)

AND

"computer"."id" IN
(

SELECT U1."computer_id" FROM "computer" U1 INNER JOIN "generic_application" U2 ON ( U1."application_id" = U2."id" ) WHERE U2."version"::text LIKE 10.0%

)

So it treats the 2 conditions are totally separate and handed separately. Which will give me NOT all the computers that either have my app or an app which version starts with 10.0.

In django 1.4, that used to work as expected (i.e. that both conditions would be taken into account at once.

Now I understand that Django has to evolve I'm wonder how I can achieve what I want.

BTW I purposefully simplified the example. We have a system that generically generates those Q object instances and moving to Django 1.6 is currently not really possible for us, given that behavior.

Any help, workaround or advice is very welcome.

Thanks.

Change History (4)

comment:1 Changed 8 years ago by Anssi Kääriäinen

Could you include the details of the models used, and what query you got in Django 1.4?

It might be that the reason for this regression is that both Django 1.4 and Django 1.6 generate a query that isn't actually correct. However, the 1.4 version happened to work for your use case. If that is the case it will be hard to revert back to providing the 1.4 version of the query.

comment:2 Changed 8 years ago by anonymous

Thank you for your quick reply.
Looking deeper into the generate SQL for django 1.4, it looks like it worked by chance, so here I'm focusing on 1.6 (and the conclusion + solution I came to)

Here are a summary of the models I'm using:

class Application(models.Model):

name = models.TextField()
version = models.TextField()

class GenericClient(models.Model):

applications = models.ManyToManyField(Application)

Now, 1 example I'm doing is:
q=Q(applicationsname='app name', applicationsversion='1')
qs = GenericClient.objects.only('pk).filter(q)

# at this point I'm expecting qs will contain the clients that have 'app name' version '1'
# For reference if I print qs.query it gives me

SELECT "generic_genericclient"."id" FROM "generic_genericclient" INNER JOIN "generic_clientapplication" ON ( "generic_genericclient"."id" = "generic_clientapplication"."client_id" ) INNER JOIN "generic_application" ON ( "generic_clientapplication"."application_id" = "generic_application"."id" ) WHERE ("generic_application"."version" = 1 AND "generic_application"."name" = app name )

Which is what I'm expecting

Now the problem comes when I try to reverse that condition:
qs = GenericClient.objects.only('pk).filter(~q)

# here I'm expecting to get all the clients that don't have 'app name' in version '1'
But instead of that I'm getting the clients don't have either 'app_name' or any other application which version is '1'

print qs.query

SELECT "generic_genericclient"."id" FROM "generic_genericclient" WHERE NOT ("generic_genericclient"."id" IN (SELECT U1."client_id" FROM "generic_clientapplication" U1 INNER JOIN "generic_application" U2 ON ( U1."application_id" = U2."id" ) WHERE U2."version" = 1 ) AND "generic_genericclient"."id" IN (SELECT U1."client_id" FROM "generic_clientapplication" U1 INNER JOIN "generic_application" U2 ON ( U1."application_id" = U2."id" ) WHERE U2."name" = app name ))

I tried the Queryset.exclude route, but after having a look at the code I understood it was the same as my Q version.

More thinking about it, I figured that my expected behavior was just an execration from my side and that generalizing exclude is not easy at all.

So I decided to go with "sub-queryset", that would hopefully work better and be clearer.

So I just wanted to exclude the clients that have 'app name' version '1'.

I did:
qs = GenericClient.objects.only('pk').exclude(pkin=GenericClient.objects.filter(q))

That gives us:
SELECT "generic_genericclient"."id" FROM "generic_genericclient" WHERE NOT ("generic_genericclient"."id" IN (SELECT U0."id" FROM "generic_genericclient" U0 INNER JOIN "generic_clientapplication" U1 ON ( U0."id" = U1."client_id" ) INNER JOIN "generic_application" U2 ON ( U1."application_id" = U2."id" ) WHERE (U2."version" = 1 AND U2."name" = app name )))

...Which is finally what I wanted. YAY success.

So for the the conclusion is that exclude only does what I want in a few cases.
In general I think exclude is hard to implement on the django side. Not even sure it was not a design mistake in the first place.

I basically wanted to share my experience with using it. On the actual problem I consider it "FIXED" since I found a workaround and I will be super careful when using exclude in the future.

Any comment is of course welcome.

comment:3 Changed 8 years ago by Anssi Kääriäinen

Resolution: duplicate
Status: newclosed

Yes, implementing .exclude() correctly is hard - when multiple filter conditions target the same multivalued join, a single subquery needs to be generated in a way that it includes all the conditions. Doing this in a generic way is surprisingly hard, and often the fix is to manually use a subquery.

The exclude() behavior isn't a design mistake - the principle is very sound - when doing .exclude(condition) you get the complement of .filter(condition) set. The problem is that this is just really hard to achieve in some cases, generating (or just writing) correct SQL queries is sometimes just plain messy.

The upshot here is that I think there is some hope to use the style you did manually - detect a (possibly multipart condition) that needs to be pushed into a subquery (lets call this condition subq_condition), and then just generate the subquery as qs.filter(pk__in=subq.filter(subq_condition).values_list('pk')). This might not be the most efficient query, but in theory this should work for many currently broken cases.

I'll close this as duplicate of #14645.

comment:4 Changed 8 years ago by Tim Graham

Summary: Pproblem with negating Q objectProblem with negating Q object
Note: See TracTickets for help on using tickets.
Back to Top