#3485 closed Bug (fixed)
Initial SQL fails when data contains a % in it when settings.DEBUG = True
Reported by: | Owned by: | nobody | |
---|---|---|---|
Component: | Database layer (models, ORM) | Version: | dev |
Severity: | Normal | Keywords: | |
Cc: | me@… | Triage Stage: | Accepted |
Has patch: | yes | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | yes |
Easy pickings: | no | UI/UX: | no |
Description
We had some initial sql data that contains %'s in strings exported from a postgresql database. When a fellow developer tried to start the application, not all of the data was properly inserted as the mysql driver interpreted the % symbols as a directive for it to process and they failed.
Attachments (6)
Change History (30)
comment:1 by , 17 years ago
Resolution: | → invalid |
---|---|
Status: | new → closed |
comment:2 by , 17 years ago
Resolution: | invalid |
---|---|
Status: | closed → reopened |
From my understanding of SQL, it's not a quoting issue. The initialdata is in raw sql format. The postgres statement is:
INSERT INTO cms_contentblock (id, page_id, name, body) VALUES (4, 3, 'email_format', 'Question From: %(name)s <%(email)s> %(question)s ');
This query works fine from the MySQL command line utility or through a couple of different mysql administration tools.
In order for Django to properly put it into MySQL we had to change the query to:
INSERT INTO cms_contentblock (id, page_id, name, body) VALUES (4, 3, 'email_format', 'Question From: %%(name)s <%%(email)s> %%(question)s ');
comment:4 by , 17 years ago
Component: | Uncategorized → Database wrapper |
---|---|
Has patch: | set |
Summary: | Initial SQL fails with MySQLdb driver when data contains a % in it. → Initial SQL fails when data contains a % in it. |
Tickets #4465 and #3537 are duplicates (and I will mark as such). Modified Summary to make it clear that this applies to all backends (at least, it was reported against MySQLdb driver and I am seeing it with postgresql_psycopg2).
The patch I submitted is based on the code in #4465 but modified to accommodate the recent unicode changes.
comment:5 by , 17 years ago
Triage Stage: | Accepted → Ready for checkin |
---|
comment:7 by , 17 years ago
Summary: | Initial SQL fails when data contains a % in it. → Initial SQL fails when data contains a % in it when settings.DEBUG = True |
---|
I have modified the subject to show this is a problem recording the SQL query in debug mode and not a problem with the SQL execution itself. I have written tests for this, but am not quite sure how to make them available, so I will attach them to the ticket. There was no existing test for initial SQL loading except as a regression, so I added a new app for it.
Put the attached files in the following locations on the trunk:
tests\modeltests\initial_sql (new directory)
init.py
models.py
tests\modeltests\initial_sql\sql
simple.sql
I realize in retrospect that actual problem doesn't have anything to do with the initial SQL mechanism (other than that being a convenient way to trigger the problem), it is instead a bug with the CursorDebugWrapper which has much broader applicability. I am afraid that I don't have enough understanding of Django to know if this is a complete fix (IOW, there may be other places where literal %s in SQL will cause Django to fail). I am sure the fix is safe though: the worst effect it could have is to leave a %% in a debug copy of SQL string where a % should be).
The nature of this test may mean it should be in the regression area. Not sure what the convention is.
Note: CursorDebugWrapper.executemany() seems to have the same problem from code inspection. Not sure under what conditions (if any) it would be triggered.
by , 17 years ago
Attachment: | simple.sql added |
---|
Should go in trunk/tests/modeltests/initial_sql/sql/
follow-up: 9 comment:8 by , 17 years ago
Triage Stage: | Ready for checkin → Accepted |
---|
I don't think I like this patch, whereby the quoting behaviour is different depending upon whether or not parameters are present. Consistent quoting behaviour is a good idea. I realise there's the contradiction that print '%s'
and print '%s' % 'foo'
behave differently in Python, but it's harder to tell if there are going to be parameters included in the SQL query or not (particularly when using, e.g., the extra() method on a QuerySet).
We should probably try and move the tests under regressionstests to where the existing initial data tests are, for performance reasons. There's no real reason it needs a new directory and every new directory adds extra time to test runs (another setup and teardown iteration).
Moving back to "accepted" whilst I think about this a bit more. I probably want to just add extra quoting for initial SQL (which is more likely to be expected to be raw SQL) and leave the cursor behaviour as it is now -- always expecting parameters and you have to escape '%' markers yourself.
comment:9 by , 17 years ago
Replying to mtredinnick:
I don't think I like this patch, whereby the quoting behaviour is different depending upon whether or not parameters are present. Consistent quoting behaviour is a good idea. I realise there's the contradiction that
print '%s'
andprint '%s' % 'foo'
behave differently in Python, but it's harder to tell if there are going to be parameters included in the SQL query or not (particularly when using, e.g., the extra() method on a QuerySet).
After writing the test and thinking about this more, it is obvious what the correction solution is: the placeholder behavior for the debug SQL needs to match that of the SQL driver. This isn't true for the existing CursorDebugWrapper
and isn't true for the patch (though the patch is incrementally closer). I will try to take a look at the source for one of the SQL drivers to discover what the behavior is (I couldn't find a good spec, would be happy to follow one if someone can point me at one).
We should probably try and move the tests under regressionstests to where the existing initial data tests are, for performance reasons. There's no real reason it needs a new directory and every new directory adds extra time to test runs (another setup and teardown iteration).
Makes sense.
Moving back to "accepted" whilst I think about this a bit more. I probably want to just add extra quoting for initial SQL (which is more likely to be expected to be raw SQL) and leave the cursor behaviour as it is now -- always expecting parameters and you have to escape '%' markers yourself.
Extra quoting for initial SQL won't work. You'll find that if you escape the % you will end up with %% in your db. The root cause of this bug is that the database drivers' handling of placeholders and print '%s' % 'foo'
are different.
comment:10 by , 17 years ago
On further investigation, I believe the patch is correct. From the source for psycopg2, formatting is only applied if there are parameters. See cursor_type.c:
/* here we are, and we have a sequence or a dictionary filled with objects to be substituted (bound variables). we try to be smart and do the right thing (i.e., what the user expects) */ if (vars && vars != Py_None) { if(_mogrify(vars, operation, self->conn, &cvt) == -1) { Py_XDECREF(uoperation); return 0; } }
Similar code is in psycopg2 executemany, so a similar change really should be made to the appropriate method for it as well. I haven't made that change because I am not sure on what the policy here is on fixing bugs discovered through code inspection.
I'll update the test code so it can live in the regression directory with the other initial SQL test.
by , 17 years ago
Attachment: | initial_sql_regress_models.diff added |
---|
Additional test added to trunk/tests/regressiontests/initial_sql_regress/models.py
by , 17 years ago
Attachment: | simple2.sql added |
---|
Should be in trunk/tests/regressiontests/initial_sql_regress/sql
comment:11 by , 17 years ago
Doing what the backend cursors do is the right thing if they all behave the same way. I still need to check that behaviour, but assuming they are consistent, I'm happy with this fix.
If various wrappers behave in different ways, we need to think this through carefully, because we try to be as database-neutral as possible (although sometimes differences occur).
comment:12 by , 17 years ago
MySQLdb does, according to the cursors.py Python source for parameter handling:
if args is not None: query = query % connection.literal(args)
I don't have the energy to check the rest right now, but will point out that the unit test I wrote will fail against a backend if its parameter handling is different. It inserts a string with literal percents in it and verifies that they come back intact. Parameter handling behavior is considered back-end specific according to the spec, though it has a footnote on the behavior that says:
The term "bound" refers to the process of binding an input
value to a database execution buffer. In practical terms,
this means that the input value is directly used as a value
in the operation. The client should not be required to
"escape" the value so that it can be used -- the value
should be equal to the actual database value.
This is not clear enough to me to be sure it is speaking of the case we are talking about now.
comment:13 by , 17 years ago
I have verified that cx_Oracle also doesn't do translation if there are no parameters, from Cursor.c:
perform binds
if (executeArgs && Cursor_SetBindVariables(self, executeArgs, 1, 0) < 0)
return NULL;
Later in the code, Cursor_PerformBind passes through with success if there are no bind variables set. I attempted to verify pysqlite, but was confident in my understanding of the code from a brief pass through.
comment:14 by , 17 years ago
Ticket summary:
- Code in trunk will crash if settings.DEBUG is true and there is initial SQL with a '%' in it when it goes to log the query.
- Patch only does string substitution (
string % args
) for query logging if there are query arguments passed in. - Actual data stored in DB is correct, this only affects DB logging.
mtredinnick wanted to be sure this match db backend behavior and wanted tests moved to regression tree.
- Verified through code inspection that this is what cx_Oracle, psycopg2 and MySQLdb do as well.
- Created tests in regressions (test files dated 8/12/2007 are correct files)
comment:15 by , 17 years ago
Cc: | added |
---|
comment:16 by , 15 years ago
This still continues to affect in the trunk. Is there a workaround or resolution?
comment:17 by , 15 years ago
Patch needs improvement: | set |
---|
comment:18 by , 14 years ago
Severity: | → Normal |
---|---|
Type: | → Bug |
comment:19 by , 14 years ago
comment:22 by , 12 years ago
Status: | reopened → new |
---|
comment:23 by , 12 years ago
Resolution: | → fixed |
---|---|
Status: | new → closed |
I think this issue has been fixed in the same time as #9055 [76aecfbc4b49f5a]. I will commit the proposed test anyway.
Possibly a quoting issue? What was used to export the SQL from the postgres database? Was it output by Django? Could you post an example? Until we have more info, closing this ticket as invalid.