Opened 12 years ago

Closed 12 years ago

#18461 closed Bug (fixed)

UnicodeDecodeError in sql logger

Reported by: zw0rk
Owned by: nobody
Component: Core (Other)
Version: dev
Severity: Normal
Keywords:
Cc: charette.s@…
Triage Stage: Accepted
Has patch: yes
Needs documentation: no
Needs tests: no
Patch needs improvement: no
Easy pickings: no
UI/UX: no

Description

Filtering a queryset with non-ASCII Unicode values causes a UnicodeDecodeError in the SQL logger.

>>> User.objects.filter(last_name=u'Z')
[]
>>> User.objects.filter(last_name=u'й')
Traceback (most recent call last):
  File "<console>", line 1, in <module>
  File "/usr/lib/python2.7/site-packages/django/db/models/query.py", line 74, in __repr__
    data = list(self[:REPR_OUTPUT_SIZE + 1])
  File "/usr/lib/python2.7/site-packages/django/db/models/query.py", line 89, in __len__
    self._result_cache.extend(self._iter)
  File "/usr/lib/python2.7/site-packages/django/db/models/query.py", line 294, in iterator
    for row in compiler.results_iter():
  File "/usr/lib/python2.7/site-packages/django/db/models/sql/compiler.py", line 764, in results_iter
    for rows in self.execute_sql(MULTI):
  File "/usr/lib/python2.7/site-packages/django/db/models/sql/compiler.py", line 819, in execute_sql
    cursor.execute(sql, params)
  File "/usr/lib/python2.7/site-packages/django/db/backends/util.py", line 51, in execute
    logger.debug('(%.3f) %s; args=%s' % (duration, sql, params),
UnicodeDecodeError: 'ascii' codec can't decode byte 0xd0 in position 850: ordinal not in range(128)

This happens even right after syncdb, when Django tries to create admin permission objects and models have non-ASCII Unicode verbose_name values.

No custom loggers were installed in settings.py.
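
The failure in the traceback can be reproduced in isolation. The following is a minimal sketch in Python 3, assuming (the ticket does not state this outright) that the logged SQL was a UTF-8 encoded byte string which Python 2 implicitly decoded as ASCII when combining it with Unicode text. The query text and variable names are illustrative only.

```python
# Hypothetical stand-in for the byte string handed back by the backend.
sql_bytes = "SELECT * FROM auth_user WHERE last_name = 'й'".encode("utf-8")

failing_byte = None
reason = None
try:
    # Python 2 performed this ASCII decode implicitly when byte strings
    # were mixed with unicode objects; Python 3 needs it spelled out.
    sql_bytes.decode("ascii")
except UnicodeDecodeError as exc:
    failing_byte = sql_bytes[exc.start]  # first byte that is not ASCII
    reason = exc.reason

print(hex(failing_byte), reason)
```

The first byte of 'й' in UTF-8 is 0xd0, which matches the byte named in the traceback.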

Attachments (1)

18461-1.diff (2.2 KB) - added by Claude Paroz 12 years ago.
Decode returned value from last executed query

Change History (10)

comment:1 by Claude Paroz, 12 years ago

Has patch: set
Triage Stage: Unreviewed → Accepted

It seems that at least PostgreSQL and MySQL return their last executed query as a UTF-8 encoded byte string. This needs confirmation from the ORM folks, and a check against Oracle.
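
The idea of decoding the returned value before logging it can be sketched as follows. This is not the attached patch; `to_text` is a hypothetical helper, and UTF-8 is assumed as the encoding, as the comment above suggests.

```python
def to_text(value, encoding="utf-8"):
    """Return value as text, decoding it first if it is a byte string."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


# Hypothetical byte string, as a backend might return it.
raw_query = "SELECT * FROM auth_user WHERE last_name = 'й'".encode("utf-8")

query = to_text(raw_query)
# Same format string as in the traceback; every piece is now text.
log_line = "(%.3f) %s; args=%s" % (0.001, query, ("й",))
print(log_line)
```

Decoding once at the boundary, where the driver hands the query back, keeps the logging code free of any encoding concerns.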

by Claude Paroz, 12 years ago

Attachment: 18461-1.diff added

Decode returned value from last executed query

comment:2 by Simon Charette, 12 years ago

Cc: charette.s@… added

I got bitten by this on master while trying to load fixtures containing non-ASCII characters.

comment:3 by Claude Paroz <claude@…>, 12 years ago

Resolution: fixed
Status: new → closed

In [e9ef9776d186a3379cdbf47a73b14e89b74d0926]:

Fixed #18461 -- Ensured that last_executed_query returns Unicode

Thanks Anssi Kääriäinen for the review.

comment:4 by Anssi Kääriäinen <akaariai@…>, 12 years ago

In [86c20e39eb9ea0bbd25889403d16d857748aff9b]:

Fixed connection.queries encoding handling on Oracle

In addition, removed a possibly problematic .filter() call from
backends.test_query_encoding test. It is possible the .filter could
cause collation problems on MySQL, and as it wasn't absolutely needed
for the test it seemed better to get rid of the call.

Refs #18461.

comment:5 by Anssi Kääriäinen, 12 years ago

For future reference: the problem with the test was that it used .filter(name='й'). This caused problems on MySQL when TEST_CHARSET = 'UTF8' is not set. I didn't have that setting, and didn't spot that it is required (documented here). So the original test was correct. I don't see a need to change the test back, as removing the .filter() call doesn't make the test incorrect either. Of course, I don't object to changing the test back if that is wanted.
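
For illustration, the setting mentioned above could look like this in a settings file. This is a sketch assuming the settings layout of Django versions from that period, where TEST_CHARSET was a key of a DATABASES entry; the database name is a placeholder.

```python
# settings.py (fragment)
DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.mysql",
        "NAME": "example_db",  # hypothetical database name
        # Character set used when creating the test database.
        "TEST_CHARSET": "UTF8",
    }
}
```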

comment:6 by err, 12 years ago

This doesn't work when the connection is not in UTF-8 (for example, in MySQL with 'OPTIONS': {"charset": "cp1251"}):

'utf8' codec can't decode byte 0xcd in position 401: invalid continuation byte

comment:7 by err, 12 years ago

Resolution: fixed
Status: closed → new

comment:8 by err, 12 years ago

<         encoding = cursor.connection.character_set_name()
<         return cursor._last_executed.decode(encoding)
---
>         return cursor._last_executed.decode('utf-8')

I guess this will help.
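
The difference between the two variants in the diff can be shown in isolation. This is a minimal sketch, assuming the query bytes were encoded with the connection's cp1251 charset; `decode_query` is a hypothetical helper and not part of Django or the MySQL driver.

```python
# Two Cyrillic letters encoded the way a cp1251 connection would send them.
raw = "На".encode("cp1251")

# Decoding with a hard-coded UTF-8 fails on these bytes.
try:
    raw.decode("utf-8")
    utf8_error = None
except UnicodeDecodeError as exc:
    utf8_error = exc.reason


def decode_query(raw_query, connection_charset):
    """Decode using the charset reported by the connection (hypothetical)."""
    return raw_query.decode(connection_charset)


decoded = decode_query(raw, "cp1251")
print(utf8_error, decoded)
```

One caveat with this approach: MySQL character set names do not always coincide with Python codec names (for example, Python has no codec named utf8mb4), so a real implementation would likely need a mapping step.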

comment:9 by Claude Paroz, 12 years ago

Resolution: fixed
Status: new → closed

Please open a new ticket for your specific issue.
