Opened 14 years ago

Closed 13 years ago

Last modified 5 years ago

#14091 closed Bug (fixed)

Fix incorrect quoting in connection.queries

Reported by: Daniel Roseman Owned by: Jacob
Component: Database layer (models, ORM) Version: 1.2
Severity: Normal Keywords:
Cc: Triage Stage: Ready for checkin
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

The default implementation of DatabaseOperations.last_executed_query returns incorrectly quoted values for many common queries in MySQL. This means that connection.queries does not properly reflect the query that was actually executed by the database, leading to hard-to-debug errors.

For example, given this model:

    class Foo(models.Model):
        name = models.CharField(max_length=10)
        added = models.DateTimeField()

doing this query:

    Foo.objects.filter(added__lte=datetime.datetime.today())

gives this result from connection.queries:

[{'sql': "SELECT `foo_foo`.`id`, `foo_foo`.`name`, `foo_foo`.`added` FROM `foo_foo` WHERE `foo_foo`.`added` <= 2010-08-11 02:58:21  LIMIT 21",
  'time': '0.052'}]

which is invalid SQL because the timestamp is not quoted at all - although the query actually sent to the db is correctly quoted. More seriously, doing something like this:

values = ((u'1', '2010-08-09T15:55:28'),
          (u'2', '2010-08-10T15:55:28'),
          (u'3', '2010-08-11T15:55:28'))
Foo.objects.extra(where=['(name, added) in %s'], params=[values])

wrongly reports that the values tuple has been correctly quoted:

[{'sql': "SELECT `foo_foo`.`id`, `foo_foo`.`name`, `foo_foo`.`added` FROM `foo_foo` WHERE (name, added) in ((u'0', '2010-08-08T15:55:28'), 
(u'1', '2010-08-09T15:55:28'), (u'2', '2010-08-10T15:55:28'), (u'3', '2010-08-11T15:55:28')) LIMIT 21",  'time': '0.052'}]

when in fact the query sent to the database is

SELECT `foo_foo`.`id`, `foo_foo`.`name`, `foo_foo`.`added` FROM `foo_foo` WHERE (name, added) in (("\'0\'", "\'2010-08-09T15:55:28\'"), 
("\'0\'", "\'2010-08-09T15:55:28\'"), ("\'0\'", "\'2010-08-09T15:55:28\'"), ("\'0\'", "\'2010-08-09T15:55:28\'"), ("\'0\'", "\'2010-08-09T15:55:28\'"), 
("\'0\'", "\'2010-08-09T15:55:28\'"), ("\'0\'", "\'2010-08-09T15:55:28\'"), ("\'0\'", "\'2010-08-09T15:55:28\'"), ("\'0\'", "\'2010-08-09T15:55:28\'")) 

which is obviously not what was intended.

The Postgres backend overrides the last_executed_query method to use the cursor.query atribute, ensuring that the query is correctly represented. MySQLdb actually provides a similar attribute, cursor._executed, and this should be used by the MySQL backend. Attached patch uses this, but falls back to the default implementation if the attribute is not populated (the current version of MySQLdb occasionally does not populate _executed, although this appears to be fixed in the forthcoming version 2.)

No tests attached, because I couldn't work out where to put them - happy to do so with some advice.

Attachments (4)

mysql_lastexecuted.diff (685 bytes ) - added by Daniel Roseman 14 years ago.
14091.patch (6.2 KB ) - added by Aymeric Augustin 13 years ago.
14091.2.patch (5.8 KB ) - added by Aymeric Augustin 13 years ago.
14091.3.patch (6.6 KB ) - added by Aymeric Augustin 13 years ago.

Download all attachments as: .zip

Change History (23)

by Daniel Roseman, 14 years ago

Attachment: mysql_lastexecuted.diff added

in reply to:  description comment:1 by Ramiro Morales, 13 years ago

Triage Stage: UnreviewedAccepted

Replying to danielr:

No tests attached, because I couldn't work out where to put them - happy to do so with some advice.

tests/regressiontests/backends would be an appropriate location.

comment:2 by Luke Plant, 13 years ago

Needs tests: set

comment:3 by Aymeric Augustin, 13 years ago

Owner: changed from nobody to Aymeric Augustin

#11209 reports the same problem; it was "fixed" in r10847 by a note in the docs.

#8877 reports the same problem plus another case sensitivity problem; it was closed after the latter was considered invalid.

comment:4 by Aymeric Augustin, 13 years ago

Needs tests: unset

Attached patch implements query retrieval for MySQL and Oracle, and proper query rebuilding for SQLite.

With this patch, the postgresql (psycopg v1) backend still does not have proper escaping, but it is deprecated. A discussion on IRC determined that implementing a generic (SQL-92 style) parameter escaping function would be a bad idea, because it's extremely difficult to get right. For starters, mapping the Python types to database parameters is database dependent... So we will stick to database-specific implementations.

I tested the patch under SQLite and MySQL.

I do not have access to an Oracle server. My proposal is purely based on the doc. However, the change is trivial and could easily be reverted if the tests do not pass under Oracle.

by Aymeric Augustin, 13 years ago

Attachment: 14091.patch added

comment:5 by Aymeric Augustin, 13 years ago

Severity: Normal
Summary: PATCH: fix incorrect quoting in connection.queries for MySQLFix incorrect quoting in connection.queries
Type: Uncategorized

I updated the patch after the postgresql backend was removed and the postgresql_psycopg2 backend was refactored.

by Aymeric Augustin, 13 years ago

Attachment: 14091.2.patch added

comment:6 by Aymeric Augustin, 13 years ago

Type: UncategorizedBug

comment:7 by Erin Kelly, 13 years ago

I've run the test under Oracle, and it passes.

comment:8 by Aymeric Augustin, 13 years ago

Owner: Aymeric Augustin removed

I de-assign myself since I've proposed a patch and I have nothing more to do until someone reviews it.

comment:9 by Alex Gaynor, 13 years ago

Easy pickings: unset

connections.queries is explicitly documented as not quoting things correctly, here http://docs.djangoproject.com/en/dev/faq/models/#how-can-i-see-the-raw-sql-queries-django-is-running, I'm not sure why we'd bother to fix this, the patch is non-trivial (it involves running queries on SQLite) and adds little value IMO.

comment:10 by Aymeric Augustin, 13 years ago

Alex, the reasons for fixing this are:

  • make it possible to test queries by copy-pasting from the debug bar or a debug log. That is useful to troubleshoot performance issues: ./manage dbshell then EXPLAIN <<Ctrl-V>>. Re-quoting manually complex queries is just a pain. The framework should do that for us.
  • remove a common source of questions and frustration for users, as seen on #django and in duplicate tickets.

Also, Django is supposed be a framework for perfectionnists; IMHO the status quo yeah, it's buggy, but we say it in the small print... doesn't live up to this goal. Should we ask django-developers for other opinions?

comment:11 by Jacob, 13 years ago

I disagree with Alex about the lack of value, but like Alex, I'm not sold on the approach under SQLite - it seems rather hackish and prone to failure. I aprove (highly) of the fixes for MySQL and Oracle, and I'd like to see those go in.

I say we fix this where we can (MySQL and Oracle) and leave it as-is for SQLite. If there's a non-gross way of retrieving the last query from the sqlite3 library then let's by all means use that, but if not we'll need to leave SQLite alone. We already have a good document explaining the differences between databases, so this can be another caveat there (and another reason to use a real database).

aaugustin, if you make those changes feel free to mark this ticket RFC. Everything else looks good to me.

comment:12 by Jacob, 13 years ago

aaugustin: also, add your name to AUTHORS if you'd like credit when we check this in!

comment:13 by Alex Gaynor, 13 years ago

FWIW jacob, I spoke with aaugustin on IRC and I agree with your position, if we can do it in a non-hacky way (as psycopg2 is now), +1.

by Aymeric Augustin, 13 years ago

Attachment: 14091.3.patch added

comment:14 by Aymeric Augustin, 13 years ago

Triage Stage: AcceptedReady for checkin

This new patch implements Alex and Jacob's recommendation.

Regarding SQLite, parameters are substituted at line 642 in _sqlite/cursor.c:

pysqlite_statement_bind_parameters(self->statement, parameters, allow_8bit_chars);

Unfortunately there is no way to reach self->statement from Python code. That's why I ended up quoting parameters with SELECT QUOTE(?).

comment:15 by Jacob, 13 years ago

Owner: set to Jacob
Resolution: fixed
Status: newclosed

In [16081]:

Fixed #14091 - be more correct about logging queries in connection.queries.

Thanks to Aymeric Augustin for figuring out how to make this work across
multiple databases.

comment:16 by Aymeric Augustin, 9 years ago

UI/UX: unset

Four and a half years later, I still believe this should be fixed on SQLite as well.

I submitted a pull request: https://github.com/django/django/pull/5282

comment:17 by Aymeric Augustin <aymeric.augustin@…>, 9 years ago

In 4f6a766:

Refs #14091 -- Fixed connection.queries on SQLite.

comment:18 by John Cant, 5 years ago

Hi,

I am having this problem on PostgreSQL. I think my use case is sane:

Use the Django ORM to generate some SQL -> Use Pandas to evaluate the SQL and return a data frame.

comment:19 by Tim Graham, 5 years ago

Please open a new issue with steps to reproduce.

Note: See TracTickets for help on using tickets.
Back to Top