Opened 3 years ago

Closed 3 years ago

#18330 closed Bug (fixed)

core.cache.backends.db does not work with 3rd party db backends that lack limit + offset

Reported by: manfre Owned by: akaariai
Component: Database layer (models, ORM) Version: master
Severity: Normal Keywords: cache, orm, database, backend
Cc: Triage Stage: Accepted
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

The DatabaseCache must use raw SQL because it's not an app and doesn't have access to the ORM. Most of the raw SQL is standard SQL, except for a single query in _cull() that uses LIMIT + OFFSET. There is already special handling to support 'oracle' and similar, but different, handling is needed for django-mssql and most likely other 3rd party backends.

That attached patch adds DatabaseOperations.raw_limit_offset_select that is intended to allow a non-ORM way for database backends to provide a valid raw LIMIT + OFFSET query.

Attachments (1)

django-ticket18330.diff (5.6 KB) - added by manfre 3 years ago.
Adds raw_limit_offset_select with better documentation and more flexibility for limit and offset.

Download all attachments as: .zip

Change History (13)

comment:1 Changed 3 years ago by akaariai

  • Needs documentation unset
  • Needs tests unset
  • Patch needs improvement unset
  • Triage Stage changed from Unreviewed to Accepted

Seems like a good approach.

comment:2 Changed 3 years ago by manfre

  • Patch needs improvement set

I'm making a few improvements to the implementation.

comment:3 Changed 3 years ago by manfre

  • Patch needs improvement unset

Happier with this rendition of the patch. Read to be moved forward in the process. I don't have an Oracle backend to test against and only compared that backend's generated SQL against the previously known working syntax.

comment:4 Changed 3 years ago by manfre

  • Owner changed from nobody to manfre

raw_limit_offset_select will need a "where" argument because unit test "aggregation_regress.AggregationTests.test_annotate_with_extra" also has hand written limit query.

Changed 3 years ago by manfre

Adds raw_limit_offset_select with better documentation and more flexibility for limit and offset.

comment:5 Changed 3 years ago by manfre

Updated patch to add where argument to allow more useful queries and udpated the hand coded limit query in unit test aggregation_regress.AggregationTests.test_annotate_with_extra to use raw_limit_offset_select().

comment:6 Changed 3 years ago by akaariai

I am going to look if we could use the Django ORM directly in the dbcache case. The second use of limit in aggregation_regress can be easily removed. If the ORM idea works out (basically create a model in flight) then there is no need to run any LIMIT queries at all.

comment:7 Changed 3 years ago by akaariai

I created ticket #18401 to track the db cache backend issue. If that gets in core, then there is just the lone aggregation_regress problem left, and that should be easy to correct (by for example fetching the PK in separate query, then using where pk = pk in the raw SQL query). This way there is hopefully no need for a new backend method.

comment:8 Changed 3 years ago by aaugustin

This was suggested in #15580, before I fixed that problem inappropriately in #16481. It would be interesting to compare the patch on this ticket and on #15580 (which was written by a core dev but not committed).

comment:9 Changed 3 years ago by akaariai

In my opinion there are two ways forward for this ticket: the method suggested by Ian Kelly in #15580: that is, a database operation "sql for cache cull". Or, to go with ORM-based cache db backend, likely with raw SQL for the read-only operations for performance reasons (#18401).

The good thing about the #15580 is that it is simple and solves the current situation. The good thing about #18401 is that it reuses existing ORM code, and cleans up the implementation of cache table creation and cache operations.

If it isn't obvious by now I favor using the ORM for the cache tables. Of course, this decision isn't that important, and I can easily live with the #15580 approach, too.

My suggestion is to use the ORM for everything else except for the has_key() and get() operations in the cache backend. The hard parts of this ticket would be solved by that, and the cache backend implementation would be cleaner, too. There is no performance loss for any realistic use case.

I will wait for some time for comments to this plan.

comment:10 Changed 3 years ago by akaariai

I will go with the idea in #15580. I do think using the ORM for cache tables will make DRYer and cleaner code. We have an ORM which is designed to generate queries in cross-backend compatible way, so using the existing machinery here seems like a good idea. But as I haven't heard any other opinions here, I am going to go with what Russell suggested in #18401 (use #15580's approach).

comment:11 Changed 3 years ago by manfre

  • Owner changed from manfre to akaariai

comment:12 Changed 3 years ago by Anssi Kääriäinen <akaariai@…>

  • Resolution set to fixed
  • Status changed from new to closed

In [d3c2eb103f6682c029a850e60dc4cf85896b6aa2]:

Fixed #18330 - Made cache culling 3rd party db backend friendly

This is Ian Kelly's patch from #15580 with minor modifications.

Note: See TracTickets for help on using tickets.
Back to Top