#19954 closed Bug (fixed)
Storing of Binary fields leads to Exceptions
Reported by: | Owned by: | nobody | |
---|---|---|---|
Component: | Database layer (models, ORM) | Version: | 1.5 |
Severity: | Normal | Keywords: | |
Cc: | Triage Stage: | Accepted | |
Has patch: | yes | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | yes |
Easy pickings: | no | UI/UX: | no |
Description
I am using a BinaryField on a table to store a binary value
Class for this BinaryField:
class BinaryField(models.TextField): def get_prep_value(self, value): return value;
This worked with 1.4 release pretty good. With 1.5 i got this exception
File "C:\Program Files\Python27\lib\site-packages\django\core\handlers\base.py", line 115, in get_response response = callback(request, *callback_args, **callback_kwargs) File "C:\dev\workspace\myproject\Project\src\package\adminviews.py", line 16, in changevalue movie.save() File "C:\Program Files\Python27\lib\site-packages\django\db\models\base.py", line 546, in save force_update=force_update, update_fields=update_fields) File "C:\Program Files\Python27\lib\site-packages\django\db\models\base.py", line 626, in save_base rows = manager.using(using).filter(pk=pk_val)._update(values) File "C:\Program Files\Python27\lib\site-packages\django\db\models\query.py", line 603, in _update return query.get_compiler(self.db).execute_sql(None) File "C:\Program Files\Python27\lib\site-packages\django\db\models\sql\compiler.py", line 1014, in execute_sql cursor = super(SQLUpdateCompiler, self).execute_sql(result_type) File "C:\Program Files\Python27\lib\site-packages\django\db\models\sql\compiler.py", line 840, in execute_sql cursor.execute(sql, params) File "C:\Program Files\Python27\lib\site-packages\django\db\backends\util.py", line 45, in execute sql = self.db.ops.last_executed_query(self.cursor, sql, params) File "C:\Program Files\Python27\lib\site-packages\django\db\backends\mysql\base.py", line 243, in last_executed_query return cursor._last_executed.decode('utf-8') File "C:\Program Files\Python27\lib\encodings\utf_8.py", line 16, in decode return codecs.utf_8_decode(input, errors, True) UnicodeDecodeError: 'utf8' codec can't decode byte 0xff in position 708: invalid start byte
This is right, one cannot decode a binary value to utf8 characters. So I saw Ticket #6416 which describes pretty good my problem. I patched the mysql adapter with the code from #6416 and now it works. I don't know exactly on which level this should be fixed, but it is also a problem when it comes to logging of this sql query with a binary field in it, the encoding to bas64 (from code #6416) solves this problem here.
Attachments (1)
Change History (11)
comment:1 by , 12 years ago
Triage Stage: | Unreviewed → Accepted |
---|
comment:2 by , 12 years ago
I can definitely live with that, one can relatively easy convert the quoted-printable back to it's binary form. This would be more than I need in my case.
comment:3 by , 12 years ago
Has patch: | set |
---|
The downside of my initial proposal is that other valid utf-8 chars in the query will also be qp encoded, which is not acceptable.
IMHO, the valid alternatives are either:
- Split the query and try to decode the query chunk by chunk, falling back to a byte-oriented decoding on errors.
- Default to parent
last_executed_query
when decoding fails
I will attach a patch where I chose to implement 1. with 'unicode-escape' fallback encoding. The issue with 2. is that the log would contain two sorts of queries: those coming from the backend and those being reconstructed with the "QUERY = %r - PARAMS = %r"
pattern.
by , 12 years ago
Attachment: | 19954-1.diff added |
---|
follow-up: 5 comment:4 by , 12 years ago
What about binary values who could be represented as UTF-8 characters but aren't characters? Then you have a wrong representation of a binary value, which are not characters, this could lead to confusion.
Django generates the queries (prepared-statements?) by using the field datatype defined in the model, I guess. Why doesn't the logging starts there and not when it is all over? There you could easily decide which mode should be used to represent the value of a field in the log. Binary fields should really be represented as they are: binary and not converted to characters, because we don't know (on django side) what kind of binary data we are dealing with.
comment:5 by , 12 years ago
Patch needs improvement: | set |
---|
Replying to marcel.ryser.ch@…:
What about binary values who could be represented as UTF-8 characters but aren't characters? Then you have a wrong representation of a binary value, which are not characters, this could lead to confusion.
You are right, but I didn't find any better solution for now.
Django generates the queries (prepared-statements?) by using the field datatype defined in the model, I guess. Why doesn't the logging starts there and not when it is all over? There you could easily decide which mode should be used to represent the value of a field in the log. Binary fields should really be represented as they are: binary and not converted to characters, because we don't know (on django side) what kind of binary data we are dealing with.
Django is separately passing the query with placeholders and parameters to the database backend, it does not generate the full query itself. That's why Django uses the cursor.last_executed_query
trick to retrieve the real query. But feel free to experiment and propose a better patch. I'm out of ideas currently :-(
comment:6 by , 12 years ago
If we don't find any better solution, I think we'll fallback to force_text(query, errors='replace')
. It's still better than crashing.
comment:7 by , 12 years ago
Fit's for me, because I don't need to log the binary value, I just expect django to not crash when I have some binary fields in my model.
comment:8 by , 12 years ago
Resolution: | → fixed |
---|---|
Status: | new → closed |
follow-up: 10 comment:9 by , 11 years ago
Since this was introduced in django 1.5 can the fix also be included in the 1.5 branch?
comment:10 by , 11 years ago
Replying to webadmin@…:
Since this was introduced in django 1.5 can the fix also be included in the 1.5 branch?
It should.
I met this bug in the 1.5 branch so I fixed it as described here : http://stackoverflow.com/a/18923501
I'm about to commit the
BinaryField
patch from #2417. I've tested and reproduced your issue. I found another way to workaround it:This gives something like
INSERT INTO `model_fields_datamodel` (`short_data`, `data`) VALUES ('=08', =\n'\\0F=FE')
The quoted-printable representation of the binary parameters may not be the best representation, but at least it doesn't crash. Opinions?