Opened 5 years ago

Closed 4 years ago

#14091 closed Bug (fixed)

Fix incorrect quoting in connection.queries

Reported by: danielr Owned by: jacob
Component: Database layer (models, ORM) Version: 1.2
Severity: Normal Keywords:
Cc: Triage Stage: Ready for checkin
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX:

Description

The default implementation of DatabaseOperations.last_executed_query returns incorrectly quoted values for many common queries in MySQL. This means that connection.queries does not properly reflect the query that was actually executed by the database, leading to hard-to-debug errors.

For example, given this model:

    class Foo(models.Model):
        name = models.CharField(max_length=10)
        added = models.DateTimeField()

doing this query:

    Foo.objects.filter(added__lte=datetime.datetime.today())

gives this result from connection.queries:

[{'sql': "SELECT `foo_foo`.`id`, `foo_foo`.`name`, `foo_foo`.`added` FROM `foo_foo` WHERE `foo_foo`.`added` <= 2010-08-11 02:58:21  LIMIT 21",
  'time': '0.052'}]

which is invalid SQL because the timestamp is not quoted at all - although the query actually sent to the db is correctly quoted. More seriously, doing something like this:

values = ((u'1', '2010-08-09T15:55:28'),
          (u'2', '2010-08-10T15:55:28'),
          (u'3', '2010-08-11T15:55:28'))
Foo.objects.extra(where=['(name, added) in %s'], params=[values])

wrongly reports that the values tuple has been correctly quoted:

[{'sql': "SELECT `foo_foo`.`id`, `foo_foo`.`name`, `foo_foo`.`added` FROM `foo_foo` WHERE (name, added) in ((u'0', '2010-08-08T15:55:28'), 
(u'1', '2010-08-09T15:55:28'), (u'2', '2010-08-10T15:55:28'), (u'3', '2010-08-11T15:55:28')) LIMIT 21",  'time': '0.052'}]

when in fact the query sent to the database is

SELECT `foo_foo`.`id`, `foo_foo`.`name`, `foo_foo`.`added` FROM `foo_foo` WHERE (name, added) in (("\'0\'", "\'2010-08-09T15:55:28\'"), 
("\'0\'", "\'2010-08-09T15:55:28\'"), ("\'0\'", "\'2010-08-09T15:55:28\'"), ("\'0\'", "\'2010-08-09T15:55:28\'"), ("\'0\'", "\'2010-08-09T15:55:28\'"), 
("\'0\'", "\'2010-08-09T15:55:28\'"), ("\'0\'", "\'2010-08-09T15:55:28\'"), ("\'0\'", "\'2010-08-09T15:55:28\'"), ("\'0\'", "\'2010-08-09T15:55:28\'")) 

which is obviously not what was intended.

The Postgres backend overrides the last_executed_query method to use the cursor.query atribute, ensuring that the query is correctly represented. MySQLdb actually provides a similar attribute, cursor._executed, and this should be used by the MySQL backend. Attached patch uses this, but falls back to the default implementation if the attribute is not populated (the current version of MySQLdb occasionally does not populate _executed, although this appears to be fixed in the forthcoming version 2.)

No tests attached, because I couldn't work out where to put them - happy to do so with some advice.

Attachments (4)

mysql_lastexecuted.diff (685 bytes) - added by danielr 5 years ago.
14091.patch (6.2 KB) - added by aaugustin 4 years ago.
14091.2.patch (5.8 KB) - added by aaugustin 4 years ago.
14091.3.patch (6.6 KB) - added by aaugustin 4 years ago.

Download all attachments as: .zip

Change History (19)

Changed 5 years ago by danielr

comment:1 in reply to: ↑ description Changed 5 years ago by ramiro

  • Needs documentation unset
  • Needs tests unset
  • Patch needs improvement unset
  • Triage Stage changed from Unreviewed to Accepted

Replying to danielr:

No tests attached, because I couldn't work out where to put them - happy to do so with some advice.

tests/regressiontests/backends would be an appropriate location.

comment:2 Changed 5 years ago by lukeplant

  • Needs tests set

comment:3 Changed 4 years ago by aaugustin

  • Owner changed from nobody to aaugustin

#11209 reports the same problem; it was "fixed" in r10847 by a note in the docs.

#8877 reports the same problem plus another case sensitivity problem; it was closed after the latter was considered invalid.

comment:4 Changed 4 years ago by aaugustin

  • Needs tests unset

Attached patch implements query retrieval for MySQL and Oracle, and proper query rebuilding for SQLite.

With this patch, the postgresql (psycopg v1) backend still does not have proper escaping, but it is deprecated. A discussion on IRC determined that implementing a generic (SQL-92 style) parameter escaping function would be a bad idea, because it's extremely difficult to get right. For starters, mapping the Python types to database parameters is database dependent... So we will stick to database-specific implementations.

I tested the patch under SQLite and MySQL.

I do not have access to an Oracle server. My proposal is purely based on the doc. However, the change is trivial and could easily be reverted if the tests do not pass under Oracle.

Changed 4 years ago by aaugustin

comment:5 Changed 4 years ago by aaugustin

  • Severity set to Normal
  • Summary changed from PATCH: fix incorrect quoting in connection.queries for MySQL to Fix incorrect quoting in connection.queries
  • Type set to Uncategorized

I updated the patch after the postgresql backend was removed and the postgresql_psycopg2 backend was refactored.

Changed 4 years ago by aaugustin

comment:6 Changed 4 years ago by aaugustin

  • Type changed from Uncategorized to Bug

comment:7 Changed 4 years ago by ikelly

I've run the test under Oracle, and it passes.

comment:8 Changed 4 years ago by aaugustin

  • Owner aaugustin deleted

I de-assign myself since I've proposed a patch and I have nothing more to do until someone reviews it.

comment:9 Changed 4 years ago by Alex

  • Easy pickings unset

connections.queries is explicitly documented as not quoting things correctly, here http://docs.djangoproject.com/en/dev/faq/models/#how-can-i-see-the-raw-sql-queries-django-is-running, I'm not sure why we'd bother to fix this, the patch is non-trivial (it involves running queries on SQLite) and adds little value IMO.

comment:10 Changed 4 years ago by aaugustin

Alex, the reasons for fixing this are:

  • make it possible to test queries by copy-pasting from the debug bar or a debug log. That is useful to troubleshoot performance issues: ./manage dbshell then EXPLAIN <<Ctrl-V>>. Re-quoting manually complex queries is just a pain. The framework should do that for us.
  • remove a common source of questions and frustration for users, as seen on #django and in duplicate tickets.

Also, Django is supposed be a framework for perfectionnists; IMHO the status quo yeah, it's buggy, but we say it in the small print... doesn't live up to this goal. Should we ask django-developers for other opinions?

comment:11 Changed 4 years ago by jacob

I disagree with Alex about the lack of value, but like Alex, I'm not sold on the approach under SQLite - it seems rather hackish and prone to failure. I aprove (highly) of the fixes for MySQL and Oracle, and I'd like to see those go in.

I say we fix this where we can (MySQL and Oracle) and leave it as-is for SQLite. If there's a non-gross way of retrieving the last query from the sqlite3 library then let's by all means use that, but if not we'll need to leave SQLite alone. We already have a good document explaining the differences between databases, so this can be another caveat there (and another reason to use a real database).

aaugustin, if you make those changes feel free to mark this ticket RFC. Everything else looks good to me.

comment:12 Changed 4 years ago by jacob

aaugustin: also, add your name to AUTHORS if you'd like credit when we check this in!

comment:13 Changed 4 years ago by Alex

FWIW jacob, I spoke with aaugustin on IRC and I agree with your position, if we can do it in a non-hacky way (as psycopg2 is now), +1.

Changed 4 years ago by aaugustin

comment:14 Changed 4 years ago by aaugustin

  • Triage Stage changed from Accepted to Ready for checkin

This new patch implements Alex and Jacob's recommendation.

Regarding SQLite, parameters are substituted at line 642 in _sqlite/cursor.c:

pysqlite_statement_bind_parameters(self->statement, parameters, allow_8bit_chars);

Unfortunately there is no way to reach self->statement from Python code. That's why I ended up quoting parameters with SELECT QUOTE(?).

comment:15 Changed 4 years ago by jacob

  • Owner set to jacob
  • Resolution set to fixed
  • Status changed from new to closed

In [16081]:

Fixed #14091 - be more correct about logging queries in connection.queries.

Thanks to Aymeric Augustin for figuring out how to make this work across
multiple databases.

Note: See TracTickets for help on using tickets.
Back to Top