Opened 9 months ago

Closed 8 months ago

#30053 closed New feature (wontfix)

Allow for conditional QuerySet.update_or_create()

Reported by: Joshua Cannon Owned by: Nasir Hussain
Component: Database layer (models, ORM) Version: master
Severity: Normal Keywords: QuerySet update_or_create
Cc: Nasir Hussain, Adam (Chainz) Johnson Triage Stage: Unreviewed
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description (last modified by Joshua Cannon)

QuerySet.update_or_create() is great since it lifts the burden of thread-safety and repeated code from the client. However there exists some scenarios where the "update" logic should be hidden behind a condition. (E.g. only update if the value of a DateTimeField is less than some value). There isn't much help for clients in these scenarios, with the best solution being to copy+paste QuerySet.update_or_create() and wrap the setattr-and-save logic in a conditional (worst case being they roll their own code, which likely would lead to thread-safety issues)

The condition would most likely be a callable that accepts one argument, the gotten instance.

The solution should be trivial, either:

  1. Add a "condition" parameter to update_or_create() (along with logic to make the feature backwards compatible, likely checking if the value is callable with one argument or not, since "condition" could be a field)
  2. Add another method, and re-write update_or_create() in terms of the new method with a condition that is always true.

Change History (19)

comment:1 Changed 9 months ago by Joshua Cannon

Description: modified (diff)

comment:2 Changed 9 months ago by Simon Charette

Could you detail your use case a bit more? I'm not sure I understand how adding a condition to the if created statement would allow you to put updates behind a condition given it's only reached when the row gets created.

Also, have you experimented with chaining filter().update_or_create() to limit its application to a filtered domain?

Last edited 9 months ago by Simon Charette (previous) (diff)

comment:3 Changed 9 months ago by Joshua Cannon

Description: modified (diff)

comment:4 in reply to:  2 ; Changed 9 months ago by Joshua Cannon

Replying to Simon Charette:

Could you detail your use case a bit more? I'm not sure I understand how adding a condition to the if created statement would allow you to put updates behind a condition given it's only reached when the row gets created.

Also, have you experimented with chaining filter().update_or_create() to limit its application to a filtered domain?

Oohh. good observation (See, I'm already introducing bugs in my solution and I'm aware of the problem!). Modified the details.

Also, no I haven't. Assuming that works and behaves as expected that might be the solution I'm looking for. If that's the case, perhaps a note in the documentation would be beneficial to novice/intermediate developers like myself.

comment:5 in reply to:  4 Changed 9 months ago by Joshua Cannon

Replying to Joshua Cannon:

Replying to Simon Charette:

Could you detail your use case a bit more? I'm not sure I understand how adding a condition to the if created statement would allow you to put updates behind a condition given it's only reached when the row gets created.

Also, have you experimented with chaining filter().update_or_create() to limit its application to a filtered domain?

Oohh. good observation (See, I'm already introducing bugs in my solution and I'm aware of the problem!). Modified the details.

Also, no I haven't. Assuming that works and behaves as expected that might be the solution I'm looking for. If that's the case, perhaps a note in the documentation would be beneficial to novice/intermediate developers like myself.

OK, filter().create_or_update() isn't the same, as it would create a new instance if the condition fails, but the kwargs would've matched.
E.g.:

class MyModel(models.Model):
    magic_number = models.IntegerField(default=0)
    should_be_updated = models.BooleanField(default=True)

MyModel.objects.create(magic_number=42, should_be_updated=False)

# The following should be a no-op
MyModel.objects.update_or_create(magic_number=42, condition=lambda m: m.should_be_updated)

# The following creates a new instance
MyModel.objects.filter(should_be_updated=True).update_or_create(magic_number=42)

comment:6 Changed 9 months ago by Nasir Hussain

Hi, I've chained the condition in the query set and created a PR here https://github.com/django/django/pull/10800 . Kindly let me know if there are any bugs or need improvements.

Last edited 9 months ago by Nasir Hussain (previous) (diff)

comment:7 Changed 9 months ago by Nasir Hussain

Owner: changed from nobody to Nasir Hussain
Status: newassigned

comment:8 Changed 9 months ago by Nasir Hussain

Cc: Nasir Hussain added
Has patch: set

comment:9 Changed 9 months ago by Simon Charette

Hello there,

In order to avoid wasted efforts I'd suggest you try to gather consensus on the mailing list about whether or not this feature might be useful to other developers before commiting time to a final implementation. A PR can certainly help in backing up backward compatiblity and implementation claims but since this complexifies an already complex method I'd be great to confirm it's a desired feature first. It's also a great opportunity to get implementation feedback or identify a larger issue.

I'm personally not sold on the feature addition because it looks like it could be implemented with a prior update query and accepting a conditional callable to match a model instances is not a pattern used anywhere else.

e.g

with transaction.atomic():
    # .update() returns the number of updated rows.
    if not objects.filter(foo=bar, date__lt=minimal).update(**defaults):
        # Either the row doesn't exist or doesn't match the optimistic condition
        objects.update_or_create(foo=bar, default=defaults)

comment:10 in reply to:  9 Changed 9 months ago by Nasir Hussain

Hi,
The feature requested has the following cases if the update condition is provided:

  1. No objects matching foo=bar -> Create a new object
  2. Object matching foo=bar but doesn't satisfy the condition -> do nothing.
  3. Object matching foo=bar and satisfy the condition -> update the object.

In the code below If there is an object foo=bar and doesn't satisfy condition date__lt=minimal the object foo=bar will still get updated.

 
 with transaction.atomic():
     # .update() returns the number of updated rows.
     if not objects.filter(foo=bar, date__lt=minimal).update(**defaults):
         # Either the row doesn't exist or doesn't match the optimistic condition
         objects.update_or_create(foo=bar, default=defaults)

What we usually do is:

 
 with transaction.atomic():
     # .update() returns the number of updated rows.
     if not objects.filter(foo=bar, date__lt=minimal).update(**defaults) and not objects.filter(foo=bar).exists():
         # Either the row doesn't exist or doesn't match the optimistic condition
         objects.create

Which leads to 3 different queries to database.1st for the update, 2nd for exists and 3rd to save.
If we add an update condition parameter, the same could be achieved in 2 queries.

Replying to Simon Charette:

Hello there,

In order to avoid wasted efforts I'd suggest you try to gather consensus on the mailing list about whether or not this feature might be useful to other developers before commiting time to a final implementation. A PR can certainly help in backing up backward compatiblity and implementation claims but since this complexifies an already complex method I'd be great to confirm it's a desired feature first. It's also a great opportunity to get implementation feedback or identify a larger issue.

I'm personally not sold on the feature addition because it looks like it could be implemented with a prior update query and accepting a conditional callable to match a model instances is not a pattern used anywhere else.

e.g

with transaction.atomic():
    # .update() returns the number of updated rows.
    if not objects.filter(foo=bar, date__lt=minimal).update(**defaults):
        # Either the row doesn't exist or doesn't match the optimistic condition
        objects.update_or_create(foo=bar, default=defaults)
Last edited 9 months ago by Nasir Hussain (previous) (diff)

comment:11 Changed 9 months ago by Simon Charette

I see, thanks for the clarification. I still think this should be discussed on the mailing list though to gather feedback about whether or not there's a better approach to deal with this kind of scenario.

comment:12 in reply to:  9 Changed 9 months ago by Joshua Cannon

I'll make a post there and link it here. Thanks for the suggestion.

comment:13 Changed 9 months ago by Simon Charette

Also, can't this be achieved by using get_or_create instead?

with transaction.atomic():
    obj, created = objects.select_for_update().get_or_create(foo=bar, defaults=defaults)
    if not created and obj.date < minimal:
        # set obj attributes...
        obj.save(update_fields=...)

This would result in the same behaviour and only two queries.

comment:14 in reply to:  13 Changed 9 months ago by Joshua Cannon

True, and that's probably the closest we'll get to a client-side solution, but it does involve mechanics (atomic transactions) that a beginner (maybe even intermediate) Django programmer are likely to miss.
It also relies on the user re-implementing the setattr logic in that nice little # set obj attributes... comment ;)

I concede the unary callable condition is not something found elsewhere in Django, but I'd argue the benefit to client code (clean, readable, thread-safe code) is worth the new paradigm.

comment:15 Changed 9 months ago by Nasir Hussain

I second Joshua Cannon. Moreover get_or_create itself does 2 queries.

comment:17 Changed 9 months ago by Adam (Chainz) Johnson

Cc: Adam (Chainz) Johnson added

comment:18 Changed 9 months ago by Simon Charette

Moreover get_or_create itself does 2 queries.

get_or_create only does two queries by itself in the create case which the proposed solution through update_or_create(update_condition) does as well so this argument is moot.

In other words, both the solution in comment:13 and the one proposed here through update_or_create(update_condition) perform the same optimistic queries.

  1. SELECT FOR UPDATE ... WHERE
  2. CREATE if no matches else UPDATE
Last edited 9 months ago by Simon Charette (previous) (diff)

comment:19 Changed 8 months ago by Tim Graham

Resolution: wontfix
Status: assignedclosed

I don't see a consensus on the mailing list discussion to move forward with this.

Note: See TracTickets for help on using tickets.
Back to Top