#13844 closed Bug (fixed)
Errors when using character fields for aggregation
Reported by: | Owned by: | Greg Wogan-Browne | |
---|---|---|---|
Component: | Database layer (models, ORM) | Version: | 1.2 |
Severity: | Normal | Keywords: | |
Cc: | philipe.rp@… | Triage Stage: | Ready for checkin |
Has patch: | yes | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description (last modified by )
My intension is joining tables with two different formated columns,
by using django's double underscore magic to resolve the querysets.
The values in table B are preceded by leading zeros.
table A --> id = "10"
table B --> id = "000010"
select distinct b.id from a, b where where a.id = TRIM(leading '0' FROM b.id);
The resulting code should look like this
qs_a=A.objects.all().values('id') qs_b=B.objects.annotate(id_trim=Trim('id', position='leading', char='0')).filter(id_trim__in=qs_a)
I use the following code to implement the extra functionality.
try: import psycopg2 except ImportError, e: import psycopg else: psycopg = psycopg2._psycopg class Trim(django.db.models.Aggregate): name = 'Trim_pg' django.db.models.Trim=Trim class Trim_pg(django.db.models.sql.aggregates.Aggregate): ''' position = [leading, trailing, both] char = <character to remove> ''' sql_function = 'TRIM' sql_template = '''%(function)s(%(position)s %(char)s FROM %(field)s)''' def __init__(self, col, distinct=False, **extra): assert extra.has_key('position'), u'no position' assert extra['position'] in ('leading', 'trailing', 'both'), 'position no in [leading, trailing, both]' assert extra.has_key('char'), u'no char' assert len(extra['char']) == 1, 'only one character' extra['char']=str(psycopg2._psycopg.QuotedString(extra['char'])) #Quoting super(Trim_pg, self).__init__(col, distinct=distinct, **extra) django.db.models.sql.aggregates.Trim_pg=Trim_pg
The problem is, that "convert_values" makes for a "CharField"
a cast to float. My solution is to return the value for CharFields
without the cast.
Index: db/backends/__init__.py =================================================================== --- db/backends/__init__.py (Revision 12595) +++ db/backends/__init__.py (Arbeitskopie) @@ -438,6 +438,8 @@ return int(value) elif internal_type in ('DateField', 'DateTimeField', 'TimeField'): return value + elif internal_type in ('CharField'): + return value # No field, or the field isn't known to be a decimal or integer # Default to a float return float(value)
Change History (14)
comment:1 by , 14 years ago
Component: | Uncategorized → Database layer (models, ORM) |
---|---|
Summary: | ValueError: by adding custom aggregate function TRIM → Errors when using character fields for aggregation |
Triage Stage: | Unreviewed → Accepted |
Version: | 1.1 → 1.2 |
comment:2 by , 14 years ago
Cc: | added |
---|---|
Has patch: | set |
I'm the guy at IRC. The proposed fix already works for me. I having troubles in get Max from a charfield.
Given model:
class Link_A(models.Model): link = models.CharField(max_length=500) short_link = models.CharField(max_length=50)
I get following error when try to get Max of short_link:
In [5]: Link_A.objects.all().aggregate(Max('short_link')) --------------------------------------------------------------------------- ValueError Traceback (most recent call last) /home/felipe/projects/testproject/<ipython console> in <module>() /home/felipe/.py/2.6/lib/python2.6/site-packages/django/db/models/query.pyc in aggregate(self, *args, **kwargs) 311 is_summary=True) 312 --> 313 return query.get_aggregation(using=self.db) 314 315 def count(self): /home/felipe/.py/2.6/lib/python2.6/site-packages/django/db/models/sql/query.pyc in get_aggregation(self, using) 371 (alias, self.resolve_aggregate(val, aggregate, connection=connections[using])) 372 for (alias, aggregate), val --> 373 in zip(query.aggregate_select.items(), result) 374 ]) 375 /home/felipe/.py/2.6/lib/python2.6/site-packages/django/db/models/sql/query.pyc in resolve_aggregate(self, value, aggregate, connection) 325 else: 326 # Return value depends on the type of the field being processed. --> 327 return self.convert_values(value, aggregate.field, connection) 328 329 def get_aggregation(self, using): /home/felipe/.py/2.6/lib/python2.6/site-packages/django/db/models/sql/query.pyc in convert_values(self, value, field, connection) 303 it can be overridden by Query classes for specific backends. 304 """ --> 305 return connection.ops.convert_values(value, field) 306 307 def resolve_aggregate(self, value, aggregate, connection): /home/felipe/.py/2.6/lib/python2.6/site-packages/django/db/backends/__init__.pyc in convert_values(self, value, field) 443 # No field, or the field isn't known to be a decimal or integer 444 # Default to a float --> 445 return float(value) 446 447 def check_aggregate_support(self, aggregate_func): ValueError: invalid literal for float(): c
and checking queries, I can see that querie are made:
In [6]: connection.queries Out[6]: [{'sql': 'SELECT MAX("webui_link_a"."short_link") AS "short_link__max" FROM "webui_link_a"', 'time': '0.001'}]
That's is a small fix, but I think that need to check how act with others internal types, or default to float only if float(value) don't raise nothing and just return value if raise.
comment:3 by , 14 years ago
This is a related issue: #12889 Using annotation unexpectedly returns DecimalFields as floats
comment:4 by , 14 years ago
Owner: | changed from | to
---|---|
Status: | new → assigned |
comment:5 by , 14 years ago
Severity: | → Normal |
---|---|
Type: | → Bug |
comment:6 by , 14 years ago
Description: | modified (diff) |
---|---|
Needs tests: | set |
Patch needs improvement: | set |
Patch needs improvement as per chronos' comment. Also needs tests.
comment:9 by , 12 years ago
The proposed solution does not solve the underlying design issue: this conversion should be done *in the fields*, not in the database backend! I.e.
def convert_values(self, value, field): return field.convert_value(value)
and I expect (but am not 100% certain) that field.to_python() already does exactly what should happen. The differences I can see at the moment are a) the default float() and b) CommaSeparatedIntegerField (which shouldn't be cast to int in the first place).
a) the default to_python implementation is just
{{{ def to_python(self, value):
return value
}}}
b) the CommaSeparatedIntegerField is actually just a charfield with a specific validation.
comment:10 by , 12 years ago
I have submitted a pull request with a regression test & a fix for the issue.
An even simpler option, which uses
def convert_values(self, value, field): return value
fails AggregationTests.test_more;
The pull-requested-version works correctly and passes all aggregation and aggregation_regress tests under both sqlite and postgresql.
comment:11 by , 12 years ago
Last, but not least: a work-around for postgresql_psycopg2:
from django.db.backends.postgresql_psycopg2 import base class DatabaseOperations(base.DatabaseOperations): def convert_values(self, value, field): return field.to_python(value) class DatabaseWrapper(base.DatabaseWrapper): def __init__(self, *args, **kwargs): super(DatabaseWrapper, self).__init__(*args, **kwargs) self.ops = DatabaseOperations(self)
comment:12 by , 12 years ago
Needs tests: | unset |
---|---|
Patch needs improvement: | unset |
Triage Stage: | Accepted → Ready for checkin |
There is a comment in Oracle's convert_values() which says that because to_python() is meant for validation, it should not be used for converting values. At least one possible problem is that the to_python() method raises ValidationError
on problematic input, and that error doesn't fit here.
I am proposing (and I am going to likely commit) a patch which just removes the non-necessary float(value) from convert_values. This fixes PostgreSQL and MySQL. Oracle and SQLite already override the convert_values() method, and return the bare value in the "unknown field type" case.
In addition, the suggestion made in the comments of this ticket that the conversion should be done by the field, not by the backend gets a +1 from me. It might be as straightforward as defining to_python() to do the conversion, but IMO it needs separate ticket and some more investigation.
comment:13 by , 12 years ago
Resolution: | → fixed |
---|---|
Status: | assigned → closed |
For the life of me I can't understand the logic of the original
convert_values
method, but it definitely seems wrong.Someone in IRC encountered a problem because of this using the built in
Max
aggregation class with a charfield, so it's not just custom aggregate classes which are effected here.