Opened 3 years ago

Last modified 5 weeks ago

#19513 new Bug

update() after annotate(val=Sum(...)) crashes on PostgreSQL & Oracle

Reported by: mengzhuo1203@… Owned by: nobody
Component: Database layer (models, ORM) Version: 1.4
Severity: Normal Keywords:
Cc: Triage Stage: Accepted
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

An QuerySet operation like:
Combination.objects.all().update(tag='TagName')

The structure like:

class Combination:
    items = ForeignKey(Item)
CombinationManager:
    def get_query_set(self):
        qs = super(CombinationManager, self).get_query_set()
        qs = qs.annotate(price=Sum('items__price'))

The SQLite and MySQL runs fine with such structure, not with Oracle, it will encounter "ORA-00913: too many values"

Change History (9)

comment:1 Changed 3 years ago by mengzhuo

  • Needs documentation unset
  • Needs tests unset
  • Patch needs improvement unset

I think that I know why MySQL and SQlite runs fine, cause both of them will query first then update the queried sets, however Oracle backend will not, it simply puts everything in the same query.

Here is the buggy Oracle Query set

UPDATE "ITEM_COMBINATION" SET "ACTIVE" = :arg0 WHERE "ITEM_COMBINATION"."ID" IN (
    SELECT U0."ID", SUM(U8."PRICE") AS "PRICE" 
           FROM "ITEM_COMBINATION" U0 
           LEFT OUTER JOIN "ITEM_COMBINATION_ITEMS" U7 ON (U0."ID" = U7."COMBINATION_ID") 
           LEFT OUTER JOIN "ITEM_ITEM" U8 ON (U7."ITEM_ID" = U8."ID") WHERE U0."DURATION_ID" = :arg1  
           GROUP BY U0."ID", U0."ITEM_ID", U0."NAME", U0."SOMEDUMMYITEM", U0."SOMEDUMMYITEM", U0."SOMEDUMMYITEM", U0."SOMEDUMMYITEM", U0."SOMEDUMMYITEM", U0."SOMEDUMMYITEM", U0."ACTIVE", U0."SOMEDUMMYITEM", U0."SOMEDUMMYITEM", U0."CATEGORY_ID"); args=(False, 2)

Here is MySQL Query Set

SELECT U0.`id`, SUM(U8.`price`) AS `price`, MIN(U8.`store`) AS `store` FROM `item_combination` U0 LEFT OUTER JOIN `item_combination_items` U7 ON (U0.`id` = U7.`combination_id`) LEFT OUTER JOIN `item_item` U8 ON (U7.`item_id` = U8.`id`) GROUP BY U0.`id`, U0.`item_id`, U0.`SOMEDUMMYITEM`, U0.`SOMEDUMMYITEM`, U0.`SOMEDUMMYITEM`, U0.`SOMEDUMMYITEM`, U0.`SOMEDUMMYITEM`, U0.`SOMEDUMMYITEM`, U0.`SOMEDUMMYITEM`, U0.`active`, U0.`SOMEDUMMYITEM`, U0.`SOMEDUMMYITEM`, U0.`SOMEDUMMYITEM` ORDER BY NULL; args=()

UPDATE `item_combination` SET `active` = 1 WHERE `item_combination`.`id` IN (1, 3); args=(True, 1, 3)

comment:2 Changed 3 years ago by akaariai

  • Triage Stage changed from Unreviewed to Accepted

I had to use way too much time reproducing this. Your models do not actually work, the SQL in the comment doesn't match the queries in the description. The queries in the description doesn't match the models.

Here are the models & code to reproduce this (on master):

class Item(models.Model):
    price = models.IntegerField()
  
class CombinationManager(models.Manager):
    def get_query_set(self):
        qs = super(CombinationManager, self).get_query_set()
        qs = qs.annotate(price=models.Sum('items__price'))
        return qs
  
class Combination(models.Model):
    tag = models.CharField(max_length=32)
    items = models.ForeignKey(Item)
    objects = CombinationManager()

Combination.objects.all().update(tag='TagName')

The problem is the SUM() select in the subquery. Happens on PostgreSQL too.

comment:3 Changed 10 months ago by timgraham

  • Summary changed from Annotate broken django.db.backends.oracle while updating to update() after annotate(val=Sum(...)) crashes on PostgreSQL & Oracle

comment:4 Changed 10 months ago by jarshwah

See also #25171

comment:5 Changed 6 weeks ago by dsanders11

Has there been any recent progress on this issue? It really limits adding annotations in a Manager because those annotations may prevent updates. I'm seeing this with sqlite3 as well, so the original (3 year old) description is not accurate in that regard. I believe MySQL is OK for the reason in #1, it appears to do a SELECT first and then update only those IDs. The example code akaariai gave still replicates the issue on sqlite3 with the slight tweak that 'get_query_set' is now 'get_queryset'.

I have a fix that I believe nicely resolves the issue, but I don't know nearly enough about query and SQL generating codebase to know if it's missing something obvious or will cause other potential problems.

The fix is fairly simple I simply clear annotations (unsure if the set_annotation_mask call is necessary) on the query after it is cloned and the update values are added in QuerySet.update. That resolved 90% of the issue. On #25171 zauddelig made a comment that this approach of clearing annotations would mess with filters and and F functions using the annotations. At least with filters on sqlite3 this is not a problem, as the filters are applied to the query before the update occurs and simply inserts the resolved annotation into the WHERE clause. Aggregations are already not allowed in the update so really it's only situations where you have an annotation like annotate(range=F('max') - F('min')).update(foo=F('range')). I fixed this by resolving the update values in UpdateQuery. add_update_fields so that they're resolved before stripping the annotations from the UPDATE query.

I believe this fix to be 'safe' in the sense that annotations create new names and aren't allowed to shadow a field on the model so any potential problem from clearing the annotations should appear as a name failing to resolve in a query because the annotation was removed. I don't believe it could cause any subtle mischief or corruption.

Is there any risk in resolving expressions at the time they're added to the update query as an update field?

BTW, zauddelig's comment actually points out an edge case in the MySQL generated code (and for a different reason sqlite3, I tested both) where if you try to use an annotation which spans tables in an update, everything will resolve and attempt to execute but the SQL will fail because for MySQL the query is broken into a SELECT and then an UPDATE, but the necessary info is not available in the UPDATE.

For example:

class Bar(models.Model):
    name = models.CharField(max_length=32)

class Foo(models.Model):
    related_bar = models.ForeignKey(Bar)
    bar_name = models.CharField(max_length=32)

----------------------

Foo.objects.annotate(bar_name=F('related_bar__name')).update(name=F('bar_name'))

------------------------

SELECT `test_foo`.`id`, `test_bar`.`name` AS `bar_name` FROM `test_foo` LEFT OUTER JOIN `test_bar` ON ( `test_foo`.`related_bar` = `test_bar`.`id` );

UPDATE `test_foo` SET `bar_name` = `test_bar`.`name` WHERE `test_foo`.`id` IN (1);

The generated SQL for sqlite3 is similar with the SELECT inside the WHERE clause. Seems like if this case is allowed (which is a narrow use case since aggregates aren't allowed in the update at the moment) it would need to add the JOIN to the update query for everything to work as expected. Instead it fails currently during the SQL execution. Note, I only tested this on Django 1.8.7 but I haven't seen any recent changes in the related code to suggest that a similar result wouldn't happen on master.

comment:6 Changed 6 weeks ago by timgraham

There isn't much risk in adding a regression test to your patch and sending a pull request. If it passes the test suite, we'll at least know there aren't any known issues with the approach you suggested and query experts are more likely to provide feedback. Thanks!

comment:7 Changed 6 weeks ago by dsanders11

It may be a bit before I get the time to do a test for it, so I'll let it sit here for a bit and hopefully if it's a waste of time someone will chime in with that info before I spend the time writing a test. I can probably get to a test in a couple weeks.

comment:8 Changed 5 weeks ago by dsanders11

Opened a PR request with a regression test as requested. The PR should also fix #18580 which is the same underlying issue as this ticket.

comment:9 Changed 5 weeks ago by timgraham

  • Has patch set
Note: See TracTickets for help on using tickets.
Back to Top