Opened 9 months ago

Closed 9 months ago

#34874 closed Cleanup/optimization (wontfix)

Add table_exists() to optimize checks for django_migrations table.

Reported by: Pierre Owned by: nobody
Component: Database layer (models, ORM) Version: 4.2
Severity: Normal Keywords:
Cc: Triage Stage: Unreviewed
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

On databases with a lot of tables, even in several schemas (with django-tenant-schema for instance), the introspection.table_names() call can take tens of ms to run. For instance on a PG database with 8800 relations, the query takes 21ms.
table_names() is called at least by the migration recorder in order to verify the presence of the django_migrations table. It thus slows down every migration execution uselessly.
Fixing the migrations recorder to keep the information cached would be a quick work around this, but it would still execute the long query at least once, but especially with django-tenant-schema it would end up taking tens of seconds (at least one call per tenant).

I suggest adding an table_exists(table_name) function in db.introspection. Its base implementation could be simply a return table_name in self.table_names(), but specific DB engines like PostgreSQL could have an optimized variant.
The SQL query from table_names function modified to add a relname = 'XXX' criteria uses the index on relname and is thus over 20 times faster.

Change History (1)

comment:1 by Mariusz Felisiak, 9 months ago

Resolution: wontfix
Status: newclosed
Summary: the construction "table_name" in introspection.table_names() is suboptimalAdd table_exists() to optimize checks for django_migrations table.

For instance on a PG database with 8800 relations, the query takes 21ms.

As far as I'm aware, tables introspection is extremely fast on PostgreSQL. This is proven by 21ms for 8800 tables.

Fixing the migrations recorder to keep the information cached would be a quick work around this, but it would still execute the long query at least once, but especially with django-tenant-schema it would end up taking tens of seconds (at least one call per tenant).

This is cached in Django 5.0+ (ea8cbca579cc6742e119747fc1eb6ecf90638bce.)

I suggest adding an table_exists(table_name) function in db.introspection. Its base implementation could be simply a return table_name in self.table_names(), but specific DB engines like PostgreSQL could have an optimized variant.

I don't think it's worth adding for a single (already cached) call.

Note: See TracTickets for help on using tickets.
Back to Top