
Opened 2 years ago

Closed 16 months ago

#18461 closed Bug (fixed)

UnicodeDecodeError in sql logger

Reported by: zw0rk
Owned by: nobody
Component: Core (Other)
Version: master
Severity: Normal
Keywords:
Cc: charette.s@…
Triage Stage: Accepted
Has patch: yes
Needs documentation: no
Needs tests: no
Patch needs improvement: no
Easy pickings: no
UI/UX: no

Description

Filtering a queryset with non-ASCII Unicode values raises UnicodeDecodeError in the SQL logger.

>>> User.objects.filter(last_name=u'Z')
[]
>>> User.objects.filter(last_name=u'й')
Traceback (most recent call last):
  File "<console>", line 1, in <module>
  File "/usr/lib/python2.7/site-packages/django/db/models/query.py", line 74, in __repr__
    data = list(self[:REPR_OUTPUT_SIZE + 1])
  File "/usr/lib/python2.7/site-packages/django/db/models/query.py", line 89, in __len__
    self._result_cache.extend(self._iter)
  File "/usr/lib/python2.7/site-packages/django/db/models/query.py", line 294, in iterator
    for row in compiler.results_iter():
  File "/usr/lib/python2.7/site-packages/django/db/models/sql/compiler.py", line 764, in results_iter
    for rows in self.execute_sql(MULTI):
  File "/usr/lib/python2.7/site-packages/django/db/models/sql/compiler.py", line 819, in execute_sql
    cursor.execute(sql, params)
  File "/usr/lib/python2.7/site-packages/django/db/backends/util.py", line 51, in execute
    logger.debug('(%.3f) %s; args=%s' % (duration, sql, params),
UnicodeDecodeError: 'ascii' codec can't decode byte 0xd0 in position 850: ordinal not in range(128)

This happens even right after syncdb, when Django tries to create admin_permission objects and models have Unicode verbose_name values.

No custom loggers were installed in settings.py.

Attachments (1)

18461-1.diff (2.2 KB) - added by claudep 2 years ago.
Decode returned value from last executed query


Change History (10)

comment:1 Changed 2 years ago by claudep

  • Has patch set
  • Needs documentation unset
  • Needs tests unset
  • Patch needs improvement unset
  • Triage Stage changed from Unreviewed to Accepted

It seems that at least PostgreSQL and MySQL return their last executed query as a UTF-8 encoded byte string. This needs confirmation from the ORM guys, and a check with Oracle.
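A minimal sketch of the idea behind the patch, written for Python 3 rather than the Python 2.7 shown in the traceback: if the driver returns the last executed query as bytes, decode it to text before it is formatted into the log message. The helper name `to_text` and the sample query are hypothetical and are not Django's actual API.

```python
def to_text(value, encoding="utf-8"):
    """Return `value` as text, decoding byte strings with `encoding`.

    Hypothetical helper for illustration only; not part of Django.
    """
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


# Simulate a driver that hands back the executed query as UTF-8 bytes.
raw_query = "SELECT * FROM auth_user WHERE last_name = 'й'".encode("utf-8")

# Formatting is safe once the query is text.
message = "(%.3f) %s" % (0.001, to_text(raw_query))
```

The sketch assumes the bytes really are UTF-8; a connection using a different character set would need that encoding passed in instead.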

Changed 2 years ago by claudep

  • Attachment 18461-1.diff added

Decode returned value from last executed query

comment:2 Changed 2 years ago by charettes

  • Cc charette.s@… added

I got bitten by this on master while trying to load fixtures with non-ASCII characters.

comment:3 Changed 2 years ago by Claude Paroz <claude@…>

  • Resolution set to fixed
  • Status changed from new to closed

In [e9ef9776d186a3379cdbf47a73b14e89b74d0926]:

Fixed #18461 -- Ensured that last_executed_query returns Unicode

Thanks Anssi Kääriäinen for the review.

comment:4 Changed 2 years ago by Anssi Kääriäinen <akaariai@…>

In [86c20e39eb9ea0bbd25889403d16d857748aff9b]:

Fixed connection.queries encoding handling on Oracle

In addition, removed a possibly problematic .filter() call from
backends.test_query_encoding test. It is possible the .filter could
cause collation problems on MySQL, and as it wasn't absolutely needed
for the test it seemed better to get rid of the call.

Refs #18461.

comment:5 Changed 2 years ago by akaariai

For future reference: the problem with the test was that it used .filter(name='й'). This causes problems on MySQL when TEST_CHARSET = 'UTF8' is not set. I didn't have it set, and didn't spot that it is required (it is documented). So, the original test was correct. I don't see a need to change the test back, as removing the .filter() call doesn't make the test incorrect either. Of course, I don't object to changing the test back if that is wanted...

comment:6 Changed 16 months ago by err

This doesn't work when the connection is not in UTF-8 (for example, in MySQL with 'OPTIONS': {"charset": "cp1251"}):

'utf8' codec can't decode byte 0xcd in position 401: invalid continuation byte

comment:7 Changed 16 months ago by err

  • Resolution fixed deleted
  • Status changed from closed to new

comment:8 Changed 16 months ago by err

<         encoding = cursor.connection.character_set_name()
<         return cursor._last_executed.decode(encoding)
---
>         return cursor._last_executed.decode('utf-8')

I guess this would help.
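The failure can be sketched outside Django, in Python 3 terms: bytes produced on a cp1251 connection are not valid UTF-8, so decoding has to use the connection's encoding. The helper name `decode_query`, the sample query, and the fallback to replacement characters are assumptions made for illustration; they are not what Django or the MySQL driver actually does.

```python
def decode_query(raw, encoding="utf-8"):
    """Decode a raw query using the connection's encoding.

    Hypothetical helper: falls back to UTF-8 with replacement characters
    if the encoding is unknown to Python or the bytes do not match it.
    """
    try:
        return raw.decode(encoding)
    except (UnicodeDecodeError, LookupError):
        return raw.decode("utf-8", errors="replace")


# 'Н' is the single byte 0xCD in cp1251, which is not valid UTF-8 here.
raw = "WHERE name = 'Н'".encode("cp1251")

try:
    raw.decode("utf-8")
    utf8_failed = False
except UnicodeDecodeError:
    utf8_failed = True

decoded = decode_query(raw, "cp1251")
```

Note that MySQL character set names do not always match Python codec names, so a real fix would likely need a mapping between the two.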


comment:9 Changed 16 months ago by claudep

  • Resolution set to fixed
  • Status changed from new to closed

Please open a new ticket for your specific issue.
