This one led me on a wild goose chase. I am trying to use SQLAlchemy alongside Django.
Here's the problem: The default encoding for a psycopg2 connection is "SQL_ASCII". And, by default, psycopg2 accepts and passes back non-Unicode strings (i.e., Python str objects, not unicode objects). SQLAlchemy works okay using this setup, as it does conversion between unicode objects and utf-8-encoded str objects as data passes to and from the database.
Django, however, seems to rely on psycopg2 to do the conversions; so, it registers psycopg2's "UNICODE" extension:
psycopg2.extensions.register_type(psycopg2.extensions.UNICODE)
This is done in django/db/backends/postgresql_psycopg2/base.py, upon loading that module. When this option is set, psycopg2 tries to convert all results to unicode objects. If the default encoding, "SQL_ASCII", is in use, this causes UnicodeDecodeErrors to be raised as soon as you try to pull any non-ASCII text out of the database.
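The failure can be reproduced without a database. This is a minimal sketch, assuming psycopg2 decodes results with Python's ascii codec when the client encoding is SQL_ASCII. The function decode_result is a hypothetical stand-in for the typecaster, not part of psycopg2, and the snippet uses bytes and text literals that also run on Python 3.

```python
# Hypothetical stand-in for what the UNICODE typecaster does with a raw
# result: decode the bytes using the codec for the client encoding.
def decode_result(raw, codec):
    return raw.decode(codec)


# Bytes as a UTF-8 writer (e.g. SQLAlchemy) would have stored them.
raw_value = u"caf\u00e9".encode("utf-8")

# With a UTF-8 client encoding, the round trip works.
decoded = decode_result(raw_value, "utf-8")

# With the ascii codec (assumed equivalent of SQL_ASCII), it fails.
try:
    decode_result(raw_value, "ascii")
    failed = False
except UnicodeDecodeError:
    failed = True
```

Pure-ASCII values decode fine either way, which is why the problem only shows up once non-ASCII text is involved.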
However, this is okay for Django's own needs, because it also sets the client encoding for its psycopg2 database connection:
self.connection.set_client_encoding('UTF8')
This change, however, only affects the given connection object, which is local to Django. Unfortunately, SQLAlchemy does not set the client encoding for its connections.
So, by registering psycopg2's UNICODE extension, Django places a restriction on all psycopg2 connections that wish to deal with Unicode: all of the connections must set_client_encoding to UTF8 (or perhaps another Unicode encoding). This doesn't sound like a big deal, but:
- it would take some serious hack-arounds to make sure SQLAlchemy's psycopg2 connections all use the right encoding (i.e., call connection.set_client_encoding('utf8')), and
- this can lead to problems that are very difficult to track down.
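For reference, here is a rough sketch of what the first workaround could look like. It assumes a SQLAlchemy version that provides the sqlalchemy.event API (0.7 or later), which may not match the version in use here. The names force_utf8 and make_engine are hypothetical helpers, not part of either library.

```python
def force_utf8(dbapi_connection, connection_record=None):
    """Set the client encoding on a freshly opened psycopg2 connection."""
    dbapi_connection.set_client_encoding('UTF8')


def make_engine(url):
    """Create an engine whose new connections all get force_utf8 applied."""
    # Imported lazily so the helper above can be used without SQLAlchemy.
    from sqlalchemy import create_engine, event

    engine = create_engine(url)
    # The 'connect' event fires for each new DBAPI connection the pool opens.
    event.listen(engine, 'connect', force_utf8)
    return engine
```

Every engine in the process would have to be created through a helper like this, which is exactly the sort of global coordination that makes the restriction awkward.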
This "bug" led to some especially odd behavior in my case. Early on in my test script, there were no problems inserting non-ASCII text into the database or selecting it back out. It took me a long time to realize that the errors only started flying after certain parts of Django had been loaded. It took a whole lot of trial and error (commenting out bits of Django, loading various modules, etc.) to get to the bottom of things.
The only foolproof way that I can think of to fix this is to program Django to behave as SQLAlchemy does: it should manually convert to/from unicode objects.
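Manual conversion at the database boundary could be sketched roughly like this. The helpers to_db and from_db are hypothetical names for illustration, not Django or SQLAlchemy functions, and UTF-8 is assumed as the storage encoding.

```python
def to_db(value, encoding='utf-8'):
    """Encode text to bytes before handing it to the driver."""
    if isinstance(value, bytes):
        return value  # already encoded; pass through untouched
    return value.encode(encoding)


def from_db(value, encoding='utf-8'):
    """Decode bytes coming back from the driver into text."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value  # already text; nothing to do
```

With conversion done explicitly like this, the result no longer depends on any process-wide psycopg2 setting, so loading another library cannot change how the strings are handled.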