Opened 2 days ago

Closed 16 hours ago

#35793 closed New feature (wontfix)

Add support for atomic upserts

Reported by: Storm Heg Owned by:
Component: Database layer (models, ORM) Version:
Severity: Normal Keywords:
Cc: Storm Heg Triage Stage: Unreviewed
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

Hello,

I took over maintenance of a package called wagtail-ab-testing which performs a raw SQL query to insert or update (UPSERT) a model atomically (link to relevant source code)

This raw sql query confused me, but as it turns out it is there because this the Django ORM does not appear support atomic upserts without taking a lock on the row, which hurts performance

Specifically, the (performant) raw SQL for PostgreSQL looks like this:

INSERT INTO %s (ab_test_id, version, date, hour, participants, conversions)
VALUES (%ss, %ss, %%s, %%s, %%s, %%s)
ON CONFLICT (ab_test_id, version, date, hour)
  DO UPDATE SET participants = %s.participants + %%s, conversions = %s.conversions + %%s;

There are a few things to note here:

  • There is an unique constraint on combination of ab_test_id, version, date, and hour columns that prevents duplicate objects from being created.
  • The participants and conversions columns are updated atomically; That is: these columns are incremented by the given values, not set directly. This is important for atomicity. We don't want multiple concurrent database calls to overwrite each other.

The more common way to do this would be to use the `update_or_create` method, but this internally takes a lock on the row in case of updates, which is not acceptable because this is a hot code path with a lot of concurrent requests. We can’t afford to take a lock on the row.

I think it would be great if Django ORM supported atomic upserts without taking a lock on the row. I’m not sure what the api for that would look like, suggestions are welcome.

As for database support, PostgreSQL and SQLite support this syntax since PostgreSQL 9.5 and SQLite 3.24.0 respectively. MySQL and MariaDB do not support this natively, but it can be emulated using INSERT ... ON DUPLICATE KEY UPDATE syntax and creating a unique index on the columns.

Oracle apparently supports something similar using the vastly different MERGE statement syntax. That might be a bit of a challenge.

Looking for feedback on this idea. Is this something that would be worth supporting in Django? I’ll admit it’s a bit of a niche feature, but it’s a very useful one in certain cases. I would love to get rid of the raw SQL query in wagtail-ab-testing and have something that works across all supported databases.

  • Storm

Change History (1)

comment:1 by Sarah Boyce, 16 hours ago

Resolution: wontfix
Status: newclosed

Hi Storm!

I think it might be worth asking on the Django forum for help/input and then confirming if something new would need adding to Django (and what that would look like).
We also ask contributors to get feedback from the community through a forum discussion first, so we can confirm there is community consensus that we should add this feature

If you get positive feedback from the forum, can you link it back here and reopen the ticket?

It feels like what you need might be possible using a combination of force_insert and F expressions, but I'm not sure if I followed completely and I could be wrong

Note: See TracTickets for help on using tickets.
Back to Top