Opened 13 years ago

Closed 11 years ago

Last modified 10 years ago

#16436 closed Bug (fixed)

`annotate()` + `select_related()` + `only()` = `ValueError: invalid literal for int() with base 10`

Reported by: Tai Lee Owned by: nobody
Component: Database layer (models, ORM) Version: dev
Severity: Release blocker Keywords: annotate select_related only defer ValueError empty list
Cc: slafs@… Triage Stage: Accepted
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: yes
Easy pickings: no UI/UX: no

Description

With some combinations of models and arguments to annotate(), select_related() and only(), I get ValueError: invalid literal for int() with base 10.

I think that SQLCompiler.results_iter() is not taking the deferred fields into account when calculating the aggregate_start offset.

To create a test, I started with a large number of models each with a large number of fields, taken from an app that we have in production, and pared it down by trial and error to as simple a form as I could and still reproduce the error.

As I did this (choosing models and fields to remove at random) the behaviour changed from raising a ValueError, to yielding no items when consuming the queryset (but with .count() still returning a positive number), to apparently working as expected.

This is probably related to (or the same issue) that was reported in the comments of #11890.

I had a crack at calculating the correct aggregate_start by looking at the deferred_loading attribute and the q.get_loaded_field_names() method on the Query object, but didn't have any luck.

I've put the test in it's own module, defer_annotate_select_related_only, just because it seems like such an obscure edge-case and I didn't want the other models and tests in defer_regress to have any influence.

Hopefully if someone can figure out how and why this is happening, we can integrate a more concise test into defer_regress.

Attachments (1)

16436-annotate-select_related-only-r16522.diff (3.1 KB ) - added by Tai Lee 13 years ago.
Failing test case and fix.

Download all attachments as: .zip

Change History (20)

comment:1 by Tai Lee, 13 years ago

Has patch: set
Patch needs improvement: set

comment:2 by Aymeric Augustin, 13 years ago

Triage Stage: UnreviewedAccepted

comment:3 by Tai Lee, 13 years ago

Patch needs improvement: unset

I've updated the patch with a couple more tests and a fix. The fix calculates the aggregate_start offset from loaded fields, if any are deferred, instead of self.query.select which includes all fields on the model.

by Tai Lee, 13 years ago

Failing test case and fix.

comment:4 by Tai Lee, 13 years ago

Replaced tabs with spaces.

comment:5 by Tai Lee, 13 years ago

I also noticed that a TypeError raised in resolve_aggregate() is silenced (not sure where), which causes the queryset to yield no results, even though .count() returns a positive number. But other exceptions, such as ValueError are not silenced.

This is why the symptoms of this bug vary between raising a ValueError or yielding no results. If the aggregate_start was calculated in such a way that a datetime value was passed to resolve_aggregate() for an ordinal aggregate such as Count, a TypeError exception is silenced. If a non-numerical string value is passed to an ordinal aggregate, a ValueError exception is raised.

This patch fixes both cases by correctly calculating aggregate_start, but it might also be a good idea to track down where TypeError is being swallowed and fix that, too.

comment:6 by Jacob, 13 years ago

milestone: 1.4

Milestone 1.4 deleted

comment:7 by Tai Lee, 12 years ago

I've updated this patch on a branch at GitHub to apply cleanly.

https://github.com/thirstydigital/django/tree/tickets/16436-annotate-select_related-only

I think this is RFC. Does anyone have feedback or issues that would prevent this from being committed?

comment:8 by Anssi Kääriäinen, 11 years ago

Component: ORM aggregationDatabase layer (models, ORM)

comment:9 by Tai Lee, 11 years ago

Rebased onto master and opened a pull request.

https://github.com/django/django/pull/895

comment:10 by FrankBie, 11 years ago

Triage Stage: AcceptedReady for checkin

For me it seems to be ready to checkin.

comment:11 by Anssi Kääriäinen <akaariai@…>, 11 years ago

Resolution: fixed
Status: newclosed

In 69f7db153d8108dcef033207d49f4c80febf3d70:

Fixed #16436 -- defer + annotate + select_related crash

Correctly calculate the aggregate_start offset from loaded fields,
if any are deferred, instead of self.query.select which includes all
fields on the model.

Also made some PEP 8 fixes.

comment:12 by Aymeric Augustin, 11 years ago

Resolution: fixed
Severity: NormalRelease blocker
Status: closednew

Unfortunately the test doesn't pass under Oracle:
http://ci.djangoproject.com/job/Django%20Oracle/112/database=oracle,python=python2.7/testReport/junit/defer_regress.tests/DeferAnnotateSelectRelatedTest/test_defer_annotate_select_related/

Since it's a direct consequence of the patch, I'm re-opening the ticket and marking as a blocker.

comment:13 by Tim Graham, 11 years ago

Patch needs improvement: set
Triage Stage: Ready for checkinAccepted

comment:14 by Anssi Kääriäinen <akaariai@…>, 11 years ago

Resolution: fixed
Status: newclosed

In 728d3fe1bac6b5f23dbd088e11860cfba51cf7b5:

Fixed #16436 -- Oracle defer_regress test failure

Oracle doesn't like grouping by TextField, so use CharFields instead in
models.

comment:15 by Anssi Kääriäinen <akaariai@…>, 11 years ago

In b495c24375aa1fe0373727511cf81298337a1227:

[1.5.x] Fixed #16436 -- defer + annotate + select_related crash

Correctly calculate the aggregate_start offset from loaded fields,
if any are deferred, instead of self.query.select which includes all
fields on the model.

Backpatch of 69f7db153d8108dcef033207d49f4c80febf3d70 from master.

comment:16 by Tim Graham <timograham@…>, 11 years ago

In ed67184b3051c392a6cb722e889c062233730c2e:

Oracle defer test failure; refs #16436.

Oracle doesn't like grouping by TextField, so use CharFields instead in
models.

Backport of 728d3fe1bac6b5f23dbd088e11860cfba51cf7b5 from master

comment:17 by Shai Berger, 11 years ago

For anyone else who might be confused at first: Anssi backported the fix to the problem, Tim backported the fix to the test on Oracle.

in reply to:  16 comment:18 by Sławek Ehlert, 10 years ago

Cc: slafs@… added

Replying to Tim Graham <timograham@…>:

Oracle doesn't like grouping by TextField, so use CharFields instead in
models.

Actually this is true for all types of LOB columns.

I have some serious doubts about whether or not this should be marked as fixed. If I get this right, we are in a situation where on Oracle one can't do any annotations on a model that has a TextField or BinaryField.

We should probably document this here https://docs.djangoproject.com/en/dev/ref/databases/#textfield-limitations (and add a note also about BinaryField) or try to fix this, by maybe introducing some parameter on a Query class. I noticed that there is some serious work regarding refactoring aggregate support (#14030). Is this issue also applicable for that kind of refactoring?

comment:19 by Anssi Kääriäinen, 10 years ago

My understanding is that this can't be fixed in Django, this is a limitation in Oracle itself.

If documentation is needed for this issue then separate ticket seems like the way to go.

Note: See TracTickets for help on using tickets.
Back to Top