Opened 17 years ago

Closed 11 years ago

Last modified 11 years ago

#3485 closed Bug (fixed)

Initial SQL fails when data contains a % in it when settings.DEBUG = True

Reported by: lakin@… Owned by: nobody
Component: Database layer (models, ORM) Version: dev
Severity: Normal Keywords:
Cc: me@… Triage Stage: Accepted
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: yes
Easy pickings: no UI/UX: no

Description

We had some initial sql data that contains %'s in strings exported from a postgresql database. When a fellow developer tried to start the application, not all of the data was properly inserted as the mysql driver interpreted the % symbols as a directive for it to process and they failed.

Attachments (6)

db_do_not_format_string.diff (864 bytes ) - added by Craig Ogg <cogg@…> 17 years ago.
Patch
__init__.py (7 bytes ) - added by Craig Ogg <cogg@…> 17 years ago.
Should go in trunk/tests/modeltests/initial_sql/
models.py (932 bytes ) - added by Craig Ogg <cogg@…> 17 years ago.
Should go in trunk/tests/modeltests/initial_sql/
simple.sql (214 bytes ) - added by Craig Ogg <cogg@…> 17 years ago.
Should go in trunk/tests/modeltests/initial_sql/sql/
initial_sql_regress_models.diff (1.4 KB ) - added by Craig Ogg <cogg@…> 17 years ago.
Additional test added to trunk/tests/regressiontests/initial_sql_regress/models.py
simple2.sql (241 bytes ) - added by Craig Ogg <cogg@…> 17 years ago.
Should be in trunk/tests/regressiontests/initial_sql_regress/sql

Download all attachments as: .zip

Change History (30)

comment:1 by Gary Wilson <gary.wilson@…>, 17 years ago

Resolution: invalid
Status: newclosed

Possibly a quoting issue? What was used to export the SQL from the postgres database? Was it output by Django? Could you post an example? Until we have more info, closing this ticket as invalid.

comment:2 by lakin.wecker@…, 17 years ago

Resolution: invalid
Status: closedreopened

From my understanding of SQL, it's not a quoting issue. The initialdata is in raw sql format. The postgres statement is:

INSERT INTO cms_contentblock (id, page_id, name, body) VALUES (4, 3, 'email_format', 'Question From: %(name)s <%(email)s>

%(question)s

');

This query works fine from the MySQL command line utility or through a couple of different mysql administration tools.
In order for Django to properly put it into MySQL we had to change the query to:

INSERT INTO cms_contentblock (id, page_id, name, body) VALUES (4, 3, 'email_format', 'Question From: %%(name)s <%%(email)s>

%%(question)s

');

comment:3 by Simon G. <dev@…>, 17 years ago

Triage Stage: UnreviewedAccepted

#4801 was marked as a duplicate

by Craig Ogg <cogg@…>, 17 years ago

Patch

comment:4 by Craig Ogg <cogg@…>, 17 years ago

Component: UncategorizedDatabase wrapper
Has patch: set
Summary: Initial SQL fails with MySQLdb driver when data contains a % in it.Initial SQL fails when data contains a % in it.

Tickets #4465 and #3537 are duplicates (and I will mark as such). Modified Summary to make it clear that this applies to all backends (at least, it was reported against MySQLdb driver and I am seeing it with postgresql_psycopg2).

The patch I submitted is based on the code in #4465 but modified to accommodate the recent unicode changes.

comment:5 by Simon G. <dev@…>, 17 years ago

Triage Stage: AcceptedReady for checkin

comment:6 by Simon G. <dev@…>, 17 years ago

Looks good Craig - but we might need some tests.

in reply to:  6 comment:7 by Craig Ogg <cogg@…>, 17 years ago

Summary: Initial SQL fails when data contains a % in it.Initial SQL fails when data contains a % in it when settings.DEBUG = True

I have modified the subject to show this is a problem recording the SQL query in debug mode and not a problem with the SQL execution itself. I have written tests for this, but am not quite sure how to make them available, so I will attach them to the ticket. There was no existing test for initial SQL loading except as a regression, so I added a new app for it.

Put the attached files in the following locations on the trunk:

tests\modeltests\initial_sql (new directory)

init.py
models.py

tests\modeltests\initial_sql\sql

simple.sql

I realize in retrospect that actual problem doesn't have anything to do with the initial SQL mechanism (other than that being a convenient way to trigger the problem), it is instead a bug with the CursorDebugWrapper which has much broader applicability. I am afraid that I don't have enough understanding of Django to know if this is a complete fix (IOW, there may be other places where literal %s in SQL will cause Django to fail). I am sure the fix is safe though: the worst effect it could have is to leave a %% in a debug copy of SQL string where a % should be).

The nature of this test may mean it should be in the regression area. Not sure what the convention is.

Note: CursorDebugWrapper.executemany() seems to have the same problem from code inspection. Not sure under what conditions (if any) it would be triggered.

by Craig Ogg <cogg@…>, 17 years ago

Attachment: __init__.py added

Should go in trunk/tests/modeltests/initial_sql/

by Craig Ogg <cogg@…>, 17 years ago

Attachment: models.py added

Should go in trunk/tests/modeltests/initial_sql/

by Craig Ogg <cogg@…>, 17 years ago

Attachment: simple.sql added

Should go in trunk/tests/modeltests/initial_sql/sql/

comment:8 by Malcolm Tredinnick, 17 years ago

Triage Stage: Ready for checkinAccepted

I don't think I like this patch, whereby the quoting behaviour is different depending upon whether or not parameters are present. Consistent quoting behaviour is a good idea. I realise there's the contradiction that print '%s' and print '%s' % 'foo' behave differently in Python, but it's harder to tell if there are going to be parameters included in the SQL query or not (particularly when using, e.g., the extra() method on a QuerySet).

We should probably try and move the tests under regressionstests to where the existing initial data tests are, for performance reasons. There's no real reason it needs a new directory and every new directory adds extra time to test runs (another setup and teardown iteration).

Moving back to "accepted" whilst I think about this a bit more. I probably want to just add extra quoting for initial SQL (which is more likely to be expected to be raw SQL) and leave the cursor behaviour as it is now -- always expecting parameters and you have to escape '%' markers yourself.

in reply to:  8 comment:9 by Craig Ogg <cogg@…>, 17 years ago

Replying to mtredinnick:

I don't think I like this patch, whereby the quoting behaviour is different depending upon whether or not parameters are present. Consistent quoting behaviour is a good idea. I realise there's the contradiction that print '%s' and print '%s' % 'foo' behave differently in Python, but it's harder to tell if there are going to be parameters included in the SQL query or not (particularly when using, e.g., the extra() method on a QuerySet).

After writing the test and thinking about this more, it is obvious what the correction solution is: the placeholder behavior for the debug SQL needs to match that of the SQL driver. This isn't true for the existing CursorDebugWrapper and isn't true for the patch (though the patch is incrementally closer). I will try to take a look at the source for one of the SQL drivers to discover what the behavior is (I couldn't find a good spec, would be happy to follow one if someone can point me at one).

We should probably try and move the tests under regressionstests to where the existing initial data tests are, for performance reasons. There's no real reason it needs a new directory and every new directory adds extra time to test runs (another setup and teardown iteration).

Makes sense.

Moving back to "accepted" whilst I think about this a bit more. I probably want to just add extra quoting for initial SQL (which is more likely to be expected to be raw SQL) and leave the cursor behaviour as it is now -- always expecting parameters and you have to escape '%' markers yourself.

Extra quoting for initial SQL won't work. You'll find that if you escape the % you will end up with %% in your db. The root cause of this bug is that the database drivers' handling of placeholders and print '%s' % 'foo' are different.

comment:10 by Craig Ogg <cogg@…>, 17 years ago

On further investigation, I believe the patch is correct. From the source for psycopg2, formatting is only applied if there are parameters. See cursor_type.c:

/* here we are, and we have a sequence or a dictionary filled with
   objects to be substituted (bound variables). we try to be smart and do
   the right thing (i.e., what the user expects) */
    
if (vars && vars != Py_None)
{
   if(_mogrify(vars, operation, self->conn, &cvt) == -1) {
       Py_XDECREF(uoperation);
       return 0;
   }
}

Similar code is in psycopg2 executemany, so a similar change really should be made to the appropriate method for it as well. I haven't made that change because I am not sure on what the policy here is on fixing bugs discovered through code inspection.

I'll update the test code so it can live in the regression directory with the other initial SQL test.

by Craig Ogg <cogg@…>, 17 years ago

Additional test added to trunk/tests/regressiontests/initial_sql_regress/models.py

by Craig Ogg <cogg@…>, 17 years ago

Attachment: simple2.sql added

Should be in trunk/tests/regressiontests/initial_sql_regress/sql

comment:11 by Malcolm Tredinnick, 17 years ago

Doing what the backend cursors do is the right thing if they all behave the same way. I still need to check that behaviour, but assuming they are consistent, I'm happy with this fix.

If various wrappers behave in different ways, we need to think this through carefully, because we try to be as database-neutral as possible (although sometimes differences occur).

comment:12 by Craig Ogg <cogg@…>, 17 years ago

MySQLdb does, according to the cursors.py Python source for parameter handling:

if args is not None:
   query = query % connection.literal(args)

I don't have the energy to check the rest right now, but will point out that the unit test I wrote will fail against a backend if its parameter handling is different. It inserts a string with literal percents in it and verifies that they come back intact. Parameter handling behavior is considered back-end specific according to the spec, though it has a footnote on the behavior that says:

The term "bound" refers to the process of binding an input
value to a database execution buffer. In practical terms,
this means that the input value is directly used as a value
in the operation. The client should not be required to
"escape" the value so that it can be used -- the value
should be equal to the actual database value.

This is not clear enough to me to be sure it is speaking of the case we are talking about now.

comment:13 by Craig Ogg <cogg@…>, 17 years ago

I have verified that cx_Oracle also doesn't do translation if there are no parameters, from Cursor.c:

perform binds
if (executeArgs && Cursor_SetBindVariables(self, executeArgs, 1, 0) < 0)

return NULL;

Later in the code, Cursor_PerformBind passes through with success if there are no bind variables set. I attempted to verify pysqlite, but was confident in my understanding of the code from a brief pass through.

comment:14 by Craig Ogg, 17 years ago

Ticket summary:

  • Code in trunk will crash if settings.DEBUG is true and there is initial SQL with a '%' in it when it goes to log the query.
  • Patch only does string substitution (string % args) for query logging if there are query arguments passed in.
  • Actual data stored in DB is correct, this only affects DB logging.

mtredinnick wanted to be sure this match db backend behavior and wanted tests moved to regression tree.

  • Verified through code inspection that this is what cx_Oracle, psycopg2 and MySQLdb do as well.
  • Created tests in regressions (test files dated 8/12/2007 are correct files)

comment:15 by durdinator, 17 years ago

Cc: me@… added

comment:16 by arifamirani, 15 years ago

This still continues to affect in the trunk. Is there a workaround or resolution?

comment:17 by Adam Nelson, 14 years ago

Patch needs improvement: set

comment:18 by Łukasz Rekucki, 13 years ago

Severity: Normal
Type: Bug

comment:19 by Julien Phalip, 13 years ago

See some other issues relating to the use of the % character and escaping: #6343, #12268, #11391, #13648. Perhaps some of these could be tackled at the same time.

comment:20 by Aymeric Augustin, 12 years ago

UI/UX: unset

Change UI/UX from NULL to False.

comment:21 by Aymeric Augustin, 12 years ago

Easy pickings: unset

Change Easy pickings from NULL to False.

comment:22 by Aymeric Augustin, 11 years ago

Status: reopenednew

comment:23 by Claude Paroz, 11 years ago

Resolution: fixed
Status: newclosed

I think this issue has been fixed in the same time as #9055 [76aecfbc4b49f5a]. I will commit the proposed test anyway.

comment:24 by Claude Paroz <claude@…>, 11 years ago

In 34a50e99e80295aa0a8970a775f53c62fe220c5a:

Added regression test for custom SQL containing percents

Refs #3485.

Note: See TracTickets for help on using tickets.
Back to Top