Opened 22 months ago

Last modified 14 months ago

#21333 new Cleanup/optimization

Document queryset OR using a vertical bar

Reported by: EvilDMP Owned by: nobody
Component: Documentation Version: master
Severity: Normal Keywords:
Cc: Triage Stage: Accepted
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

https://docs.djangoproject.com/en/dev/ref/models/queries/#q-objects says (in effect): "you can't use OR in querysets without Q() objects" - but you can!

qs = items.filter(meal="supper") | items.filter(meal="lunch")

I can't see this construction mentioned in the documentation. It appears to work perfectly well. Is there anything wrong with it? Is there any good reason not to use it?

Change History (5)

comment:1 Changed 22 months ago by timo

  • Triage Stage changed from Unreviewed to Accepted

Notes from IRC discussion: "the 'how to combine querysets' deserve their own section in the docs -- filter(A, B), filter(Q(A) & A(B)) and now qs.filter(A) & qs.filter(B). are all of them totally equivalent? if no in what do they differ? which is the recommended way (if any)? they use different code so chances are it's not equivalent

comment:2 Changed 22 months ago by mjtamlyn

Whilst we shouldn't say this isn't possible, we do need to be careful with it as there is an obvious footgun - you could easily end up with additional queries if one of your earlier querysets got executed for some reason

Personally, I'm not completely sure why this is possible

comment:3 Changed 15 months ago by jorgecarleitao

This is because QuerySet has methods __and__ and __or__. Why they are defined I don't know, but they seem to be a correctly exposed API:

  • Both use Query.combine to merge their queries, a method only used by QuerySet.__and__ and QuerySet.__or__ that combines two queries (resolving joins and things like that)
  • Both merge the Querysets QuerySet._known_related_objects(?) using QuerySet._merge_known_related_objects(self, other), a method only used for these 2 methods. I didn't went deep enough to say if the merge is good or bad, but this parameter is only *modified* by db.models.field.create_foreign_related_manager function, a very far away function(!)

timo, AFAI can tell from the code, filter(A, B), filter(Q(A) & Q(B)) and qs.filter(A) & qs.filter(B) are equivalent.

AFAI searched, these methods are not documented (https://docs.djangoproject.com/en/1.6/ref/models/querysets/, https://docs.djangoproject.com/en/1.6/topics/db/queries/, and a regex on __and__ on docs). Yet, I did a quick check of monkey-removing them and running the full test and they are being tested and are used in 28 tests, for instance

  • basic.tests.ModelTest.test_object_creation
  • extra_regress.tests.ExtraRegressTests.test_regression_10847
  • extra_regress.tests.ExtraRegressTests.test_regression_7314_7372

I would say we remove these 2 methods altogether and replace them in the tests by lookup expressions. Here are my reasons:

  • Not documented, thus not part of the public API
  • From the footgun perspective, as mjtamlyn pointed out: if a queryset is already evaluated, you may hit the db again (e.g. qs1 | qs2 is valid and one could have already been evaluated)
  • From the user perspective, IMO is harder to write qs.filter(A) & qs.filter(B) than filter(Q(A) & Q(B)) or filter(A, B).
  • There are non trivial constraints (to the user) to Query.combine two queries:

1- they need to be on the same model;
2- lhs must be able to filter lhs.can_filter (why not both?);
3- both queries must have equal distinct and distinct fields
these constraints are being enforced with assert.

  • From the API perspective, joining conditional clauses in queries is already well posed by Q objects and chaining: IMO having a new entry point seems unnecessary.
  • From the design perspective, since Query does not have a __and__ or __or__ method, I don't see why a set of Querys - QuerySet - should have: after all, __and__ and __or__ make sense on conditional expressions (e.g. ExpressionNode and Q objects), not queries (for instance, is the __and__ being applied to the WHERE clause or to the HAVING clause?

In any case, this seems to require a decision on the API: should this be part and we fix it and document it, or should we remove it?
It seems to me this is actually a New feature instead of a Cleanup/optimization. In either case, I don't mind preparing a patch if you EvilDMP don't want/have time.

comment:4 Changed 15 months ago by akaariai

There are a couple of reasons why __and__ and __or__ are needed. First is that if you happen to have two existing querysets it is convenient to use queryset combining. The second reason is that if you have a query with .filter(m2m_field=foo) applied, then there is no way to add a condition to that same m2m_field - doing .filter(m2m_field=bar) creates another join for the same alias. The reason is that .filter(A).filter(B) isn't equivalent to .filter(A, B). So, if you have a queryset with .filter(A), then you need queryset combining to achieve .filter(A, B) for that queryset.

I have always considered queryset combining public API. I am very surprised if there isn't and hasn't ever been any documentation about this. Even if queryset combining hasn't ever been public API, I believe there to be enough code relying on queryset combining that we should then just publish the API. Also, the combining code has existed since forever (that is, pre 1.0).

comment:5 Changed 14 months ago by timo

  • Summary changed from Undocumented queryset OR should be documented, probably to Document queryset OR using a vertical bar
Note: See TracTickets for help on using tickets.
Back to Top