Opened 3 weeks ago
Closed 14 hours ago
#36960 closed Cleanup/optimization (fixed)
Django never uses psycopg 3's optimised timestamptzloader
| Reported by: | Aarni Koskela | Owned by: | Aarni Koskela |
|---|---|---|---|
| Component: | Database layer (models, ORM) | Version: | 6.0 |
| Severity: | Normal | Keywords: | postgresql |
| Cc: | Triage Stage: | Ready for checkin | |
| Has patch: | yes | Needs documentation: | no |
| Needs tests: | no | Patch needs improvement: | no |
| Easy pickings: | no | UI/UX: | no |
Description
When available (e.g. psycopg3-c or psycopg3-binary installed), Psycopg implicitly registers a faster TimestamptzLoader implementation https://github.com/psycopg/psycopg/blob/master/psycopg_c/psycopg_c/types/datetime.pyx.
However, Django overrides this with a custom class that derives from the Python psycopg.types.datetime.TimestamptzLoader implementation. (You can't derive from the final-marked C speedup class, anyway.)
For performance, it would be pleasant if Django used the faster class when available. This is made slightly hairy by the fact that, as mentioned, you can't use the speedup class as a base, and there is no public API in psycopg to get the speedup class. context.adapters._get_optimised exists, but is not a public API. You could do something like (and I experimentally did)
try:
optimised_class = context.adapters._get_optimised(TimestamptzLoader)
if optimised_class is not TimestamptzLoader:
self.fast_load = optimised_class(oid, context).load
except Exception:
pass
and res = (self.fast_load or super().load)(data) but that's also hairy...
Change History (8)
comment:1 by , 3 weeks ago
| Resolution: | → wontfix |
|---|---|
| Status: | new → closed |
| Type: | Bug → Cleanup/optimization |
comment:2 by , 3 weeks ago
a public API from psycopg for retrieving the c-accelerated internals (unlikely...)
I'm not sure that's totally implausible; the author/maintainer of psycopg3, @dvarrazzo-on-GitHub, chimed in on the original issue porting his psycopg3 backend to Django in https://github.com/django/django/pull/15687#issuecomment-1348923713, and is responsive on the psycopg repo (I've been looking at some perf fixes on that side of the fence.)
I could make a PR there to propose making get_optimised a public API, and link this thread.
comment:3 by , 3 weeks ago
| Resolution: | wontfix |
|---|---|
| Status: | closed → new |
| Triage Stage: | Unreviewed → Someday/Maybe |
Great plan. I'll signal that by moving this to Someday/Maybe to represent we're waiting on upstream input.
comment:4 by , 3 weeks ago
:+1: Opened an issue for discussion for now instead of going full in with another PR: https://github.com/psycopg/psycopg/issues/1273
comment:5 by , 6 days ago
| Has patch: | set |
|---|---|
| Owner: | set to |
| Status: | new → assigned |
comment:6 by , 6 days ago
| Triage Stage: | Someday/Maybe → Accepted |
|---|
Accepting, now that we have guidance from psycopg about the proper way to do this.
comment:7 by , 14 hours ago
| Triage Stage: | Accepted → Ready for checkin |
|---|
Thanks for the idea. I played with the provided PR and added some timeit statements, and found about a 4x performance improvement. However, I don't think the provided solution is adequate, since, as you acknowledge, there is no public interface for this.
We could revisit with one of two things in hand:
load()method dynamically like this, although the cursor timezone registration needs investigation:django/db/backends/postgresql/base.py
django/db/backends/postgresql/psycopg_any.py
class BaseTzLoader(TimestamptzLoader):"""Load a PostgreSQL timestamptz using the a specific timezone.The timezone can be None too, in which case it will be chopped."""timezone = Nonedef load(self, data):res = super().load(data)return res.replace(tzinfo=self.timezone)I don't know how practical that sketched idea would be. It's not very great python, either, so even then I'm still uncertain.