Opened 17 years ago

Closed 17 years ago

Last modified 16 years ago

#5135 closed (wontfix)

Unicode-branch merge broke insertion of binary data

Reported by: bjorn.kempen@…
Owned by: Adrian Holovaty
Component: Database layer (models, ORM)
Version: dev
Severity:
Keywords: binary blob unicode
Cc:
Triage Stage: Unreviewed
Has patch: no
Needs documentation: no
Needs tests: no
Patch needs improvement: no
Easy pickings: no
UI/UX: no

Description

Django does not have a BinaryField, or BlobField or whatever you want to call it, which is a bit sad.
In 0.96 and earlier, however, inserting into blob fields with custom SQL worked. In the current SVN version this is broken.

An example from my code, where info_hash is a blob field (MySQL):

def create_xbt_file(info_hash, timestamp):
  from django.db import connection
  query = "INSERT INTO xbt_files (info_hash, mtime, ctime) VALUES (%s, %s, %s)"
  cursor = connection.cursor()
  cursor.execute(query, [info_hash, timestamp, timestamp])

This throws a nasty UnicodeDecodeError whenever a byte with a position between 45-50 (among others I guess) is in info_hash.
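As a rough illustration of this kind of failure, here is a minimal sketch that assumes the parameters are decoded as UTF-8 at some point (the ticket does not say which codec is involved). It is written for Python 3, and the helper name decodes_as_utf8 is made up for the example:

```python
def decodes_as_utf8(data):
    """Return True if the byte string is valid UTF-8, False otherwise."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# Plain ASCII bytes decode without trouble.
ascii_ok = decodes_as_utf8(b"abc")

# 0x80 is a continuation byte and cannot start a UTF-8 sequence,
# so arbitrary binary data such as a hash can fail to decode.
binary_ok = decodes_as_utf8(b"\x00\x2d\x80\xff")
```

Any value that passes through such a decode step has to be valid text in the chosen encoding, which raw hash bytes generally are not.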

If I bypass Django completely and use MySQLdb directly, it works fine:

def create_xbt_file(info_hash, timestamp):
  import MySQLdb
  db = MySQLdb.connect("localhost", DATABASE_USER, DATABASE_PASSWORD, DATABASE_NAME)
  cursor = db.cursor()
  query = "INSERT INTO xbt_files (info_hash, mtime, ctime) VALUES (%s, %s, %s)"
  cursor.execute(query, [info_hash, timestamp, timestamp])
  db.close()

Change History (5)

comment:1 by bjorn.kempen@…, 17 years ago

Sorry about the formatting. Here we go again:

Does not work (custom SQL)

def create_xbt_file(info_hash, timestamp):
  query = "INSERT INTO xbt_files (info_hash, mtime, ctime) VALUES (%s, %s, %s)"
  from django.db import connection
  cursor = connection.cursor()
  cursor.execute(query, [info_hash, timestamp, timestamp])

Does work (MySQLdb)

def create_xbt_file(info_hash, timestamp):
  import MySQLdb
  db = MySQLdb.connect("localhost", DATABASE_USER, DATABASE_PASSWORD, DATABASE_NAME)
  cursor = db.cursor()
  query = "INSERT INTO xbt_files (info_hash, mtime, ctime) VALUES (%s, %s, %s)"
  cursor.execute(query, [info_hash, timestamp, timestamp])
  db.close()

comment:2 by Malcolm Tredinnick, 17 years ago

Resolution: wontfix
Status: new → closed

Storing binary data was unsafe before (what if your binary data contained a zero byte?), so it was kind of lucky, and unsupported, that it worked at all. It just works even less well now.

The real fix here is something like #2417 (adding a proper binary field type). The current workaround is to use base64 encoding (or base96 or some other binary-to-ASCII encoding) on the data before storing it. There's nothing we can do at the text field level: we assume Unicode strings for text, and databases obviously use an encoding when they store data, so we have to convert between that encoding and Python Unicode objects.
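A minimal sketch of the base64 workaround, assuming Python 3 and a text column on the database side. The helper names encode_for_text_column and decode_from_text_column are made up for illustration and are not part of Django:

```python
import base64


def encode_for_text_column(raw):
    """Turn arbitrary bytes into an ASCII-only string safe for a text field."""
    return base64.b64encode(raw).decode("ascii")


def decode_from_text_column(stored):
    """Recover the original bytes from the stored base64 string."""
    return base64.b64decode(stored.encode("ascii"))


# Every possible byte value, including the zero byte, survives the round trip.
raw = bytes(range(256))
stored = encode_for_text_column(raw)
restored = decode_from_text_column(stored)
```

The stored form is about a third larger than the raw bytes (a 20-byte hash becomes 28 characters), and it only helps when you control the column format.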

comment:3 by bjorn.kempen@…, 17 years ago

So this means that Django officially can't be used to interface with legacy databases or external applications using binary fields? base64 works great when you design the database from scratch, but when integrating your app with another application that uses blob fields it simply won't work.

In my case I'm building a web front end for XBT Tracker which uses blob fields for storing hashes. I can't change that without editing a lot of XBTT's source code and since I'm no C++ genius that doesn't seem very wise.

Interfacing "the web" with legacy databases and external applications should be a rather common task by now :/ I mean... there's even a chapter on it in djangobook. Not having all the field types I can live with, but not even being able to handle it using custom SQL seems odd.

in reply to:  3 comment:4 by James Bennett, 17 years ago

Replying to bjorn.kempen@gmail.com:

Interfacing "the web" with legacy databases and external applications should be a rather common task by now :/ I mean... there's even a chapter on it in djangobook. Not having all the field types I can live with, but not even being able to handle it using custom SQL seems odd.

Malcolm pointed you to the ticket where adding a real binary field to Django is being discussed; if it's an important issue for you, why not head over there and devote some energy to helping improve the proposed patch?

;)

comment:5 by Evgeniy Ivanov, 16 years ago

Instead of implementing a binary field, why don't you just use a cursor from MySQLdb and use SETTINGS to open the connection?
