#14091 closed Bug (fixed)
Fix incorrect quoting in connection.queries
Reported by: | Daniel Roseman | Owned by: | Jacob |
---|---|---|---|
Component: | Database layer (models, ORM) | Version: | 1.2 |
Severity: | Normal | Keywords: | |
Cc: | Triage Stage: | Ready for checkin | |
Has patch: | yes | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description
The default implementation of DatabaseOperations.last_executed_query
returns incorrectly quoted values for many common queries in MySQL. This means that connection.queries
does not properly reflect the query that was actually executed by the database, leading to hard-to-debug errors.
For example, given this model:
class Foo(models.Model): name = models.CharField(max_length=10) added = models.DateTimeField()
doing this query:
Foo.objects.filter(added__lte=datetime.datetime.today())
gives this result from connection.queries:
[{'sql': "SELECT `foo_foo`.`id`, `foo_foo`.`name`, `foo_foo`.`added` FROM `foo_foo` WHERE `foo_foo`.`added` <= 2010-08-11 02:58:21 LIMIT 21", 'time': '0.052'}]
which is invalid SQL because the timestamp is not quoted at all - although the query actually sent to the db is correctly quoted. More seriously, doing something like this:
values = ((u'1', '2010-08-09T15:55:28'), (u'2', '2010-08-10T15:55:28'), (u'3', '2010-08-11T15:55:28')) Foo.objects.extra(where=['(name, added) in %s'], params=[values])
wrongly reports that the values
tuple has been correctly quoted:
[{'sql': "SELECT `foo_foo`.`id`, `foo_foo`.`name`, `foo_foo`.`added` FROM `foo_foo` WHERE (name, added) in ((u'0', '2010-08-08T15:55:28'), (u'1', '2010-08-09T15:55:28'), (u'2', '2010-08-10T15:55:28'), (u'3', '2010-08-11T15:55:28')) LIMIT 21", 'time': '0.052'}]
when in fact the query sent to the database is
SELECT `foo_foo`.`id`, `foo_foo`.`name`, `foo_foo`.`added` FROM `foo_foo` WHERE (name, added) in (("\'0\'", "\'2010-08-09T15:55:28\'"), ("\'0\'", "\'2010-08-09T15:55:28\'"), ("\'0\'", "\'2010-08-09T15:55:28\'"), ("\'0\'", "\'2010-08-09T15:55:28\'"), ("\'0\'", "\'2010-08-09T15:55:28\'"), ("\'0\'", "\'2010-08-09T15:55:28\'"), ("\'0\'", "\'2010-08-09T15:55:28\'"), ("\'0\'", "\'2010-08-09T15:55:28\'"), ("\'0\'", "\'2010-08-09T15:55:28\'"))
which is obviously not what was intended.
The Postgres backend overrides the last_executed_query
method to use the cursor.query
atribute, ensuring that the query is correctly represented. MySQLdb actually provides a similar attribute, cursor._executed
, and this should be used by the MySQL backend. Attached patch uses this, but falls back to the default implementation if the attribute is not populated (the current version of MySQLdb occasionally does not populate _executed
, although this appears to be fixed in the forthcoming version 2.)
No tests attached, because I couldn't work out where to put them - happy to do so with some advice.
Attachments (4)
Change History (23)
by , 14 years ago
Attachment: | mysql_lastexecuted.diff added |
---|
comment:1 by , 14 years ago
Triage Stage: | Unreviewed → Accepted |
---|
comment:2 by , 14 years ago
Needs tests: | set |
---|
comment:3 by , 14 years ago
Owner: | changed from | to
---|
comment:4 by , 14 years ago
Needs tests: | unset |
---|
Attached patch implements query retrieval for MySQL and Oracle, and proper query rebuilding for SQLite.
With this patch, the postgresql (psycopg v1) backend still does not have proper escaping, but it is deprecated. A discussion on IRC determined that implementing a generic (SQL-92 style) parameter escaping function would be a bad idea, because it's extremely difficult to get right. For starters, mapping the Python types to database parameters is database dependent... So we will stick to database-specific implementations.
I tested the patch under SQLite and MySQL.
I do not have access to an Oracle server. My proposal is purely based on the doc. However, the change is trivial and could easily be reverted if the tests do not pass under Oracle.
by , 14 years ago
Attachment: | 14091.patch added |
---|
comment:5 by , 14 years ago
Severity: | → Normal |
---|---|
Summary: | PATCH: fix incorrect quoting in connection.queries for MySQL → Fix incorrect quoting in connection.queries |
Type: | → Uncategorized |
I updated the patch after the postgresql backend was removed and the postgresql_psycopg2 backend was refactored.
by , 14 years ago
Attachment: | 14091.2.patch added |
---|
comment:6 by , 14 years ago
Type: | Uncategorized → Bug |
---|
comment:8 by , 14 years ago
Owner: | removed |
---|
I de-assign myself since I've proposed a patch and I have nothing more to do until someone reviews it.
comment:9 by , 14 years ago
Easy pickings: | unset |
---|
connections.queries
is explicitly documented as not quoting things correctly, here http://docs.djangoproject.com/en/dev/faq/models/#how-can-i-see-the-raw-sql-queries-django-is-running, I'm not sure why we'd bother to fix this, the patch is non-trivial (it involves running queries on SQLite) and adds little value IMO.
comment:10 by , 14 years ago
Alex, the reasons for fixing this are:
- make it possible to test queries by copy-pasting from the debug bar or a debug log. That is useful to troubleshoot performance issues:
./manage dbshell
thenEXPLAIN <<Ctrl-V>>
. Re-quoting manually complex queries is just a pain. The framework should do that for us. - remove a common source of questions and frustration for users, as seen on #django and in duplicate tickets.
Also, Django is supposed be a framework for perfectionnists; IMHO the status quo yeah, it's buggy, but we say it in the small print... doesn't live up to this goal. Should we ask django-developers for other opinions?
comment:11 by , 14 years ago
I disagree with Alex about the lack of value, but like Alex, I'm not sold on the approach under SQLite - it seems rather hackish and prone to failure. I aprove (highly) of the fixes for MySQL and Oracle, and I'd like to see those go in.
I say we fix this where we can (MySQL and Oracle) and leave it as-is for SQLite. If there's a non-gross way of retrieving the last query from the sqlite3
library then let's by all means use that, but if not we'll need to leave SQLite alone. We already have a good document explaining the differences between databases, so this can be another caveat there (and another reason to use a real database).
aaugustin, if you make those changes feel free to mark this ticket RFC. Everything else looks good to me.
comment:12 by , 14 years ago
aaugustin: also, add your name to AUTHORS if you'd like credit when we check this in!
comment:13 by , 14 years ago
FWIW jacob, I spoke with aaugustin on IRC and I agree with your position, if we can do it in a non-hacky way (as psycopg2 is now), +1.
by , 14 years ago
Attachment: | 14091.3.patch added |
---|
comment:14 by , 14 years ago
Triage Stage: | Accepted → Ready for checkin |
---|
This new patch implements Alex and Jacob's recommendation.
Regarding SQLite, parameters are substituted at line 642 in _sqlite/cursor.c
:
pysqlite_statement_bind_parameters(self->statement, parameters, allow_8bit_chars);
Unfortunately there is no way to reach self->statement
from Python code. That's why I ended up quoting parameters with SELECT QUOTE(?)
.
comment:16 by , 9 years ago
UI/UX: | unset |
---|
Four and a half years later, I still believe this should be fixed on SQLite as well.
I submitted a pull request: https://github.com/django/django/pull/5282
comment:18 by , 6 years ago
Hi,
I am having this problem on PostgreSQL. I think my use case is sane:
Use the Django ORM to generate some SQL -> Use Pandas to evaluate the SQL and return a data frame.
Replying to danielr:
tests/regressiontests/backends
would be an appropriate location.