Opened 11 years ago

Closed 9 years ago

Last modified 9 years ago

#16436 closed Bug (fixed)

`annotate()` + `select_related()` + `only()` = `ValueError: invalid literal for int() with base 10`

Reported by: Tai Lee Owned by: nobody
Component: Database layer (models, ORM) Version: dev
Severity: Release blocker Keywords: annotate select_related only defer ValueError empty list
Cc: slafs@… Triage Stage: Accepted
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: yes
Easy pickings: no UI/UX: no

Description

With some combinations of models and arguments to annotate(), select_related() and only(), I get ValueError: invalid literal for int() with base 10.

I think that SQLCompiler.results_iter() is not taking the deferred fields into account when calculating the aggregate_start offset.

To create a test, I started with a large number of models each with a large number of fields, taken from an app that we have in production, and pared it down by trial and error to as simple a form as I could and still reproduce the error.

As I did this (choosing models and fields to remove at random) the behaviour changed from raising a ValueError, to yielding no items when consuming the queryset (but with .count() still returning a positive number), to apparently working as expected.

This is probably related to (or the same issue) that was reported in the comments of #11890.

I had a crack at calculating the correct aggregate_start by looking at the deferred_loading attribute and the q.get_loaded_field_names() method on the Query object, but didn't have any luck.

I've put the test in it's own module, defer_annotate_select_related_only, just because it seems like such an obscure edge-case and I didn't want the other models and tests in defer_regress to have any influence.

Hopefully if someone can figure out how and why this is happening, we can integrate a more concise test into defer_regress.

Attachments (1)

16436-annotate-select_related-only-r16522.diff (3.1 KB) - added by Tai Lee 11 years ago.
Failing test case and fix.

Download all attachments as: .zip

Change History (20)

comment:1 Changed 11 years ago by Tai Lee

Has patch: set
Patch needs improvement: set

comment:2 Changed 11 years ago by Aymeric Augustin

Triage Stage: UnreviewedAccepted

comment:3 Changed 11 years ago by Tai Lee

Patch needs improvement: unset

I've updated the patch with a couple more tests and a fix. The fix calculates the aggregate_start offset from loaded fields, if any are deferred, instead of self.query.select which includes all fields on the model.

Changed 11 years ago by Tai Lee

Failing test case and fix.

comment:4 Changed 11 years ago by Tai Lee

Replaced tabs with spaces.

comment:5 Changed 11 years ago by Tai Lee

I also noticed that a TypeError raised in resolve_aggregate() is silenced (not sure where), which causes the queryset to yield no results, even though .count() returns a positive number. But other exceptions, such as ValueError are not silenced.

This is why the symptoms of this bug vary between raising a ValueError or yielding no results. If the aggregate_start was calculated in such a way that a datetime value was passed to resolve_aggregate() for an ordinal aggregate such as Count, a TypeError exception is silenced. If a non-numerical string value is passed to an ordinal aggregate, a ValueError exception is raised.

This patch fixes both cases by correctly calculating aggregate_start, but it might also be a good idea to track down where TypeError is being swallowed and fix that, too.

comment:6 Changed 11 years ago by Jacob

milestone: 1.4

Milestone 1.4 deleted

comment:7 Changed 10 years ago by Tai Lee

I've updated this patch on a branch at GitHub to apply cleanly.

https://github.com/thirstydigital/django/tree/tickets/16436-annotate-select_related-only

I think this is RFC. Does anyone have feedback or issues that would prevent this from being committed?

comment:8 Changed 9 years ago by Anssi Kääriäinen

Component: ORM aggregationDatabase layer (models, ORM)

comment:9 Changed 9 years ago by Tai Lee

Rebased onto master and opened a pull request.

https://github.com/django/django/pull/895

comment:10 Changed 9 years ago by FrankBie

Triage Stage: AcceptedReady for checkin

For me it seems to be ready to checkin.

comment:11 Changed 9 years ago by Anssi Kääriäinen <akaariai@…>

Resolution: fixed
Status: newclosed

In 69f7db153d8108dcef033207d49f4c80febf3d70:

Fixed #16436 -- defer + annotate + select_related crash

Correctly calculate the aggregate_start offset from loaded fields,
if any are deferred, instead of self.query.select which includes all
fields on the model.

Also made some PEP 8 fixes.

comment:12 Changed 9 years ago by Aymeric Augustin

Resolution: fixed
Severity: NormalRelease blocker
Status: closednew

Unfortunately the test doesn't pass under Oracle:
http://ci.djangoproject.com/job/Django%20Oracle/112/database=oracle,python=python2.7/testReport/junit/defer_regress.tests/DeferAnnotateSelectRelatedTest/test_defer_annotate_select_related/

Since it's a direct consequence of the patch, I'm re-opening the ticket and marking as a blocker.

comment:13 Changed 9 years ago by Tim Graham

Patch needs improvement: set
Triage Stage: Ready for checkinAccepted

comment:14 Changed 9 years ago by Anssi Kääriäinen <akaariai@…>

Resolution: fixed
Status: newclosed

In 728d3fe1bac6b5f23dbd088e11860cfba51cf7b5:

Fixed #16436 -- Oracle defer_regress test failure

Oracle doesn't like grouping by TextField, so use CharFields instead in
models.

comment:15 Changed 9 years ago by Anssi Kääriäinen <akaariai@…>

In b495c24375aa1fe0373727511cf81298337a1227:

[1.5.x] Fixed #16436 -- defer + annotate + select_related crash

Correctly calculate the aggregate_start offset from loaded fields,
if any are deferred, instead of self.query.select which includes all
fields on the model.

Backpatch of 69f7db153d8108dcef033207d49f4c80febf3d70 from master.

comment:16 Changed 9 years ago by Tim Graham <timograham@…>

In ed67184b3051c392a6cb722e889c062233730c2e:

Oracle defer test failure; refs #16436.

Oracle doesn't like grouping by TextField, so use CharFields instead in
models.

Backport of 728d3fe1bac6b5f23dbd088e11860cfba51cf7b5 from master

comment:17 Changed 9 years ago by Shai Berger

For anyone else who might be confused at first: Anssi backported the fix to the problem, Tim backported the fix to the test on Oracle.

comment:18 in reply to:  16 Changed 9 years ago by Sławek Ehlert

Cc: slafs@… added

Replying to Tim Graham <timograham@…>:

Oracle doesn't like grouping by TextField, so use CharFields instead in
models.

Actually this is true for all types of LOB columns.

I have some serious doubts about whether or not this should be marked as fixed. If I get this right, we are in a situation where on Oracle one can't do any annotations on a model that has a TextField or BinaryField.

We should probably document this here https://docs.djangoproject.com/en/dev/ref/databases/#textfield-limitations (and add a note also about BinaryField) or try to fix this, by maybe introducing some parameter on a Query class. I noticed that there is some serious work regarding refactoring aggregate support (#14030). Is this issue also applicable for that kind of refactoring?

comment:19 Changed 9 years ago by Anssi Kääriäinen

My understanding is that this can't be fixed in Django, this is a limitation in Oracle itself.

If documentation is needed for this issue then separate ticket seems like the way to go.

Note: See TracTickets for help on using tickets.
Back to Top