Opened 6 years ago

Closed 6 years ago

#30053 closed New feature (wontfix)

Allow for conditional QuerySet.update_or_create()

Reported by: Joshua Cannon Owned by: Nasir Hussain
Component: Database layer (models, ORM) Version: dev
Severity: Normal Keywords: QuerySet update_or_create
Cc: Nasir Hussain, Adam Johnson Triage Stage: Unreviewed
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description (last modified by Joshua Cannon)

QuerySet.update_or_create() is great since it lifts the burden of thread-safety and repeated code from the client. However there exists some scenarios where the "update" logic should be hidden behind a condition. (E.g. only update if the value of a DateTimeField is less than some value). There isn't much help for clients in these scenarios, with the best solution being to copy+paste QuerySet.update_or_create() and wrap the setattr-and-save logic in a conditional (worst case being they roll their own code, which likely would lead to thread-safety issues)

The condition would most likely be a callable that accepts one argument, the gotten instance.

The solution should be trivial, either:

  1. Add a "condition" parameter to update_or_create() (along with logic to make the feature backwards compatible, likely checking if the value is callable with one argument or not, since "condition" could be a field)
  2. Add another method, and re-write update_or_create() in terms of the new method with a condition that is always true.

Change History (19)

comment:1 by Joshua Cannon, 6 years ago

Description: modified (diff)

comment:2 by Simon Charette, 6 years ago

Could you detail your use case a bit more? I'm not sure I understand how adding a condition to the if created statement would allow you to put updates behind a condition given it's only reached when the row gets created.

Also, have you experimented with chaining filter().update_or_create() to limit its application to a filtered domain?

Last edited 6 years ago by Simon Charette (previous) (diff)

comment:3 by Joshua Cannon, 6 years ago

Description: modified (diff)

in reply to:  2 ; comment:4 by Joshua Cannon, 6 years ago

Replying to Simon Charette:

Could you detail your use case a bit more? I'm not sure I understand how adding a condition to the if created statement would allow you to put updates behind a condition given it's only reached when the row gets created.

Also, have you experimented with chaining filter().update_or_create() to limit its application to a filtered domain?

Oohh. good observation (See, I'm already introducing bugs in my solution and I'm aware of the problem!). Modified the details.

Also, no I haven't. Assuming that works and behaves as expected that might be the solution I'm looking for. If that's the case, perhaps a note in the documentation would be beneficial to novice/intermediate developers like myself.

in reply to:  4 comment:5 by Joshua Cannon, 6 years ago

Replying to Joshua Cannon:

Replying to Simon Charette:

Could you detail your use case a bit more? I'm not sure I understand how adding a condition to the if created statement would allow you to put updates behind a condition given it's only reached when the row gets created.

Also, have you experimented with chaining filter().update_or_create() to limit its application to a filtered domain?

Oohh. good observation (See, I'm already introducing bugs in my solution and I'm aware of the problem!). Modified the details.

Also, no I haven't. Assuming that works and behaves as expected that might be the solution I'm looking for. If that's the case, perhaps a note in the documentation would be beneficial to novice/intermediate developers like myself.

OK, filter().create_or_update() isn't the same, as it would create a new instance if the condition fails, but the kwargs would've matched.
E.g.:

class MyModel(models.Model):
    magic_number = models.IntegerField(default=0)
    should_be_updated = models.BooleanField(default=True)

MyModel.objects.create(magic_number=42, should_be_updated=False)

# The following should be a no-op
MyModel.objects.update_or_create(magic_number=42, condition=lambda m: m.should_be_updated)

# The following creates a new instance
MyModel.objects.filter(should_be_updated=True).update_or_create(magic_number=42)

comment:6 by Nasir Hussain, 6 years ago

Hi, I've chained the condition in the query set and created a PR here https://github.com/django/django/pull/10800 . Kindly let me know if there are any bugs or need improvements.

Last edited 6 years ago by Nasir Hussain (previous) (diff)

comment:7 by Nasir Hussain, 6 years ago

Owner: changed from nobody to Nasir Hussain
Status: newassigned

comment:8 by Nasir Hussain, 6 years ago

Cc: Nasir Hussain added
Has patch: set

comment:9 by Simon Charette, 6 years ago

Hello there,

In order to avoid wasted efforts I'd suggest you try to gather consensus on the mailing list about whether or not this feature might be useful to other developers before commiting time to a final implementation. A PR can certainly help in backing up backward compatiblity and implementation claims but since this complexifies an already complex method I'd be great to confirm it's a desired feature first. It's also a great opportunity to get implementation feedback or identify a larger issue.

I'm personally not sold on the feature addition because it looks like it could be implemented with a prior update query and accepting a conditional callable to match a model instances is not a pattern used anywhere else.

e.g

with transaction.atomic():
    # .update() returns the number of updated rows.
    if not objects.filter(foo=bar, date__lt=minimal).update(**defaults):
        # Either the row doesn't exist or doesn't match the optimistic condition
        objects.update_or_create(foo=bar, default=defaults)

in reply to:  9 comment:10 by Nasir Hussain, 6 years ago

Hi,
The feature requested has the following cases if the update condition is provided:

  1. No objects matching foo=bar -> Create a new object
  2. Object matching foo=bar but doesn't satisfy the condition -> do nothing.
  3. Object matching foo=bar and satisfy the condition -> update the object.

In the code below If there is an object foo=bar and doesn't satisfy condition date__lt=minimal the object foo=bar will still get updated.

 
 with transaction.atomic():
     # .update() returns the number of updated rows.
     if not objects.filter(foo=bar, date__lt=minimal).update(**defaults):
         # Either the row doesn't exist or doesn't match the optimistic condition
         objects.update_or_create(foo=bar, default=defaults)

What we usually do is:

 
 with transaction.atomic():
     # .update() returns the number of updated rows.
     if not objects.filter(foo=bar, date__lt=minimal).update(**defaults) and not objects.filter(foo=bar).exists():
         # Either the row doesn't exist or doesn't match the optimistic condition
         objects.create

Which leads to 3 different queries to database.1st for the update, 2nd for exists and 3rd to save.
If we add an update condition parameter, the same could be achieved in 2 queries.

Replying to Simon Charette:

Hello there,

In order to avoid wasted efforts I'd suggest you try to gather consensus on the mailing list about whether or not this feature might be useful to other developers before commiting time to a final implementation. A PR can certainly help in backing up backward compatiblity and implementation claims but since this complexifies an already complex method I'd be great to confirm it's a desired feature first. It's also a great opportunity to get implementation feedback or identify a larger issue.

I'm personally not sold on the feature addition because it looks like it could be implemented with a prior update query and accepting a conditional callable to match a model instances is not a pattern used anywhere else.

e.g

with transaction.atomic():
    # .update() returns the number of updated rows.
    if not objects.filter(foo=bar, date__lt=minimal).update(**defaults):
        # Either the row doesn't exist or doesn't match the optimistic condition
        objects.update_or_create(foo=bar, default=defaults)
Last edited 6 years ago by Nasir Hussain (previous) (diff)

comment:11 by Simon Charette, 6 years ago

I see, thanks for the clarification. I still think this should be discussed on the mailing list though to gather feedback about whether or not there's a better approach to deal with this kind of scenario.

in reply to:  9 comment:12 by Joshua Cannon, 6 years ago

I'll make a post there and link it here. Thanks for the suggestion.

comment:13 by Simon Charette, 6 years ago

Also, can't this be achieved by using get_or_create instead?

with transaction.atomic():
    obj, created = objects.select_for_update().get_or_create(foo=bar, defaults=defaults)
    if not created and obj.date < minimal:
        # set obj attributes...
        obj.save(update_fields=...)

This would result in the same behaviour and only two queries.

in reply to:  13 comment:14 by Joshua Cannon, 6 years ago

True, and that's probably the closest we'll get to a client-side solution, but it does involve mechanics (atomic transactions) that a beginner (maybe even intermediate) Django programmer are likely to miss.
It also relies on the user re-implementing the setattr logic in that nice little # set obj attributes... comment ;)

I concede the unary callable condition is not something found elsewhere in Django, but I'd argue the benefit to client code (clean, readable, thread-safe code) is worth the new paradigm.

comment:15 by Nasir Hussain, 6 years ago

I second Joshua Cannon. Moreover get_or_create itself does 2 queries.

comment:17 by Adam Johnson, 6 years ago

Cc: Adam Johnson added

comment:18 by Simon Charette, 6 years ago

Moreover get_or_create itself does 2 queries.

get_or_create only does two queries by itself in the create case which the proposed solution through update_or_create(update_condition) does as well so this argument is moot.

In other words, both the solution in comment:13 and the one proposed here through update_or_create(update_condition) perform the same optimistic queries.

  1. SELECT FOR UPDATE ... WHERE
  2. CREATE if no matches else UPDATE
Last edited 6 years ago by Simon Charette (previous) (diff)

comment:19 by Tim Graham, 6 years ago

Resolution: wontfix
Status: assignedclosed

I don't see a consensus on the mailing list discussion to move forward with this.

Note: See TracTickets for help on using tickets.
Back to Top