Django

Ticket #4021 (closed: fixed)

Opened 2 years ago

Last modified 2 years ago

[unicode][patch] initial sql with non-ascii strings not imported

Reported by: Ivan Sagalaev <Maniac@SoftwareManiacs.Org>
Assigned to: jacob
Milestone:
Component: Uncategorized
Version: other branch
Keywords: unicode
Cc: mtredinnick, Maniac@SoftwareManiacs.Org
Triage Stage: Unreviewed
Has patch: 0
Needs documentation: 0
Needs tests: 0
Patch needs improvement: 0

Description

An SQL file containing utf-8 encoded data looking like this:

set names utf8;
insert into cicero_forum (`slug`, `name`, `group`, `ordering`) values ('test', 'Тестовый форум', 'Тест', 0);

breaks the initial syncdb with the mysql backend:

'ascii' codec can't decode byte 0xd0 in position 81: ordinal not in range(128)

Evidently there's a plain str-to-unicode conversion somewhere...
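
The failure mode can be reproduced without a database. The following is a minimal sketch in current Python 3, not the code path Django actually takes: it assumes the file's bytes end up being decoded with the default 'ascii' codec somewhere, which is what the error message suggests. The variable names are illustrative only.

```python
# The insert statement from the ticket, held as UTF-8 bytes the way it
# would be read from disk.
sql = (
    "insert into cicero_forum (`slug`, `name`, `group`, `ordering`) "
    "values ('test', 'Тестовый форум', 'Тест', 0);"
)
raw = sql.encode("utf-8")

# An implicit str-to-unicode conversion amounts to decoding with 'ascii',
# which cannot handle the first byte of a Cyrillic character.
error = None
try:
    raw.decode("ascii")
except UnicodeDecodeError as exc:
    error = exc

print(error)
```

The offending byte is 0xd0, the lead byte of the UTF-8 encoding of "Т", which matches the byte named in the traceback.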

Attachments

4021.diff (3.8 kB) - added by Ivan Sagalaev <Maniac@SoftwareManiacs.Org> on 04/21/07 14:47:49.
Patch

Change History

04/12/07 09:12:10 changed by Ivan Sagalaev <Maniac@SoftwareManiacs.Org>

  • needs_better_patch changed.
  • needs_tests changed.
  • needs_docs changed.

Found it...

The problem occurs in management.py, in syncdb, where raw file contents are passed as str into cursor.execute(). Since the mysql backend now connects with {'use_unicode': True}, it apparently expects only unicode data.

The obvious fix would be to decode the contents of custom .sql files in syncdb. Here we have the same problem as with templates: we can't know for sure which encoding the file is in. Another way is to connect to MySQL during syncdb with {'use_unicode': False} and without an explicit charset. I think this is correct, since syncdb is a command-line tool and shouldn't care about unicode internals.

Thoughts?
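
To illustrate why the file's encoding has to be known rather than guessed, here is a small sketch in current Python 3. The choice of latin-1 as the "wrong guess" is an assumption made for the example; any single-byte codec would show the same thing.

```python
text = "Тестовый форум"
raw = text.encode("utf-8")          # bytes as they would sit in the .sql file

# Decoding with the right charset recovers the original text.
as_utf8 = raw.decode("utf-8")

# Decoding with a wrong single-byte charset does not raise an error;
# it silently yields one garbage character per byte.
as_latin1 = raw.decode("latin-1")

print(as_utf8 == text)    # the correct guess round-trips
print(as_latin1 == text)  # the wrong guess does not
```

A wrong guess therefore corrupts the data quietly instead of failing, which is why the decoding step needs an explicit charset from somewhere.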

04/21/07 14:47:49 changed by Ivan Sagalaev <Maniac@SoftwareManiacs.Org>

  • attachment 4021.diff added.

Patch

04/21/07 14:52:14 changed by Ivan Sagalaev <Maniac@SoftwareManiacs.Org>

  • summary changed from [unicode] initial sql with non-ascii strings not imported to [unicode][patch] initial sql with non-ascii strings not imported.

In the end I decided not to load templates with codecs.open, because it appears that codecs.open always opens files in binary mode, while we currently use a plain open that opens them in text mode. Maybe this is not an issue, though...
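
The binary-versus-text difference can be seen in how line endings are handled. This is a sketch in current Python 3, where the details differ from the Python 2 of the time; the file name and contents are made up for the example.

```python
import codecs
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "template.html")  # hypothetical file
    with open(path, "wb") as fp:
        fp.write("привет\r\nмир\r\n".encode("utf-8"))

    # codecs.open opens the underlying file in binary mode, so line
    # endings come back exactly as stored on disk.
    with codecs.open(path, "r", encoding="utf-8") as fp:
        via_codecs = fp.read()

    # The built-in open in text mode translates "\r\n" to "\n" on read.
    with open(path, "r", encoding="utf-8") as fp:
        via_open = fp.read()

print(repr(via_codecs))
print(repr(via_open))
```

Both calls decode the text correctly; only the newline handling differs.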

04/21/07 23:21:05 changed by mtredinnick

  • status changed from new to closed.
  • resolution set to fixed.

(In [5058]) unicode: Added FILE_CHARSET setting and use it to decode files read from disk. Based on a patch from Ivan Sagalaev. Fixed #4021.
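
The approach described in the commit message can be sketched as follows. This is not the code from [5058]: the `settings` object is a stand-in for django.conf.settings, `load_initial_sql` is a hypothetical helper, and the 'utf-8' value is chosen for the example.

```python
import os
import tempfile
from types import SimpleNamespace

# Stand-in for django.conf.settings; only FILE_CHARSET matters here.
settings = SimpleNamespace(FILE_CHARSET="utf-8")


def load_initial_sql(path):
    """Read a file as bytes and decode it with settings.FILE_CHARSET.

    Hypothetical helper, written for illustration only.
    """
    with open(path, "rb") as fp:
        return fp.read().decode(settings.FILE_CHARSET)


statement = (
    "insert into cicero_forum (`slug`, `name`, `group`, `ordering`) "
    "values ('test', 'Тестовый форум', 'Тест', 0);\n"
)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "forum.sql")  # hypothetical file name
    with open(path, "wb") as fp:
        fp.write(statement.encode("utf-8"))
    loaded = load_initial_sql(path)

print(loaded == statement)
```

With the charset coming from a setting, the decoded text can be handed to cursor.execute() as unicode, and projects whose files use another encoding only need to change one value.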
