#3214 closed Bug (fixed)
[patch] raw sql file doesn't recognize quotes correctly
Reported by: | Owned by: | nobody | |
---|---|---|---|
Component: | Database layer (models, ORM) | Version: | dev |
Severity: | Release blocker | Keywords: | rawsql migrations |
Cc: | shaun@…, sam@…, andrehcampos@…, aseering@…, walter+django@…, vlastimil.zima@…, Christophe Pettus, sbarre, vlastimil@…, 4glitch@… | Triage Stage: | Ready for checkin |
Has patch: | yes | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description
Suppose I have an app foo with a model Bar.
in foo/sql/bar.postgresql_psycopg2.sql I have:
CREATE OR REPLACE FUNCTION test() RETURNS trigger LANGUAGE plpythonu AS $$ '''this will have problems if it contains an embedded ; ''' $$;
This generates the following error on syncdb:
sql /* cranedata.db.search.sql.searchmodel.postgresql_psycopg2 * * Setup for 'SearchModel': * want to have a trigger on SearchModels which sets up a trigger * on models searched, such that when rows of these models are * modified, text fields are automatically indexed. */ sql CREATE OR REPLACE FUNCTION test() RETURNS trigger LANGUAGE plpythonu AS $$ '''this will have problems if it contains an embedded ; params () Failed to install initial SQL data for search.SearchModel model: unterminated dollar-quoted string a t or near "$$ '''this will have problems if it contains an embedded ;" at character 355
Attachments (2)
Change History (54)
comment:1 Changed 17 years ago by
comment:2 Changed 17 years ago by
The problem is the following, in django.core.management:341:
# Some backends can't execute more than one SQL statement at a time, # so split into separate statements. statements = re.compile(r";[ \t]*$", re.M)
This may be a fine workaround if the backend really doesn't deal with multiple statements, but it breaks things that have ';' in quotes. Easiest fix would be for the db to support a property: "SUPPORTS_MULTIPLE_STATEMENTS_IN_EXECUTE" or whatever, and only split if necessary.
Changed 17 years ago by
Attachment: | sql-statement-iterator.diff added |
---|
patch, introduces 'sql-statement-iterator' which correctly deals with quotes
comment:3 Changed 17 years ago by
Has patch: | set |
---|
I've attached a patch for this bug: with patch management.py will use utility "sql_statement_iterator" which splits sql into statements more carefully, respecting quotes and comments.
Also, it will change freestanding '%' to double -- otherwise these cause an error in management.py, as python DB2.0 spec treats all non-doubled percent signs as substitution targets, which isn't wanted in this situation.
With this fix I'm able to use pg_dump (with appropriate flags to use 'insert' rather than 'copy') to dump old tables, then cut and paste resulting inserts (and set sequence) into initial data.
utils/sqltools.py, where "sql_statement_iterator" lives, has doctests. Ideally I would have written unit tests for both this and new management, but, ... well maybe I'll get to it after I deploy next edition of our website :).
comment:4 Changed 17 years ago by
Keywords: | rawsql added |
---|---|
Needs tests: | set |
Summary: | raw sql file doesn't recognize quotes correctly → [patch] raw sql file doesn't recognize quotes correctly |
Triage Stage: | Unreviewed → Design decision needed |
comment:5 Changed 17 years ago by
Cc: | shaun@… added |
---|
I understand this is been relegated to Tumbolia for the moment. If it ends up "accepted", please do ping me, and if I have any time at all, I'll write whatever unit tests are required. (Note: already includes doctest of sql_statement_iterator itself.)
comment:6 Changed 16 years ago by
Cc: | sam@… added |
---|
comment:7 Changed 16 years ago by
As a workaround, multi-line SQL statements have to have someting other than whitespace between their semicolons and newline characters. For example:
CREATE OR REPLACE FUNCTION create_post () RETURNS trigger AS $$ BEGIN NEW.date_created := current_timestamp; -- RETURN NEW; -- END; $$ LANGUAGE plpgsql;
comment:8 Changed 16 years ago by
Version: | → SVN |
---|
comment:9 Changed 16 years ago by
Note that the patch requires a slight reorg. to work now that management.py has been changed....
Again, if the powers that be accept this, I'll do my best to get it into shape, as I'm still using it myself.
comment:10 Changed 15 years ago by
Cc: | andrehcampos@… added |
---|
Changed 15 years ago by
Attachment: | sql-statement-iterator-updated.diff added |
---|
Updated version of shaunc's patch, that works with Django SVN as of r8255
comment:11 Changed 15 years ago by
Cc: | aseering@… added |
---|
This bug currently also causes multiline strings to fail to parse, at least for me with PostgreSQL 8.3. This means that Sam Morris's workaround (above) doesn't work for me; as far as I know, there's no raw-SQL way of inserting stored procedures into a database? I could use a Signals hook; but that doesn't feel particularly DRY to me...
I've submitted an updated patch that works for me with Django SVN. I'm not yet very familiar with the new management-module layout; let me know if I did something silly.
I'm quite interested in seeing this bug resolved; I'm also glad to do some coding if it would help.
Django devs: It seems that there's the will and expertise to fix this bug. Is there any chance someone could spare a few moments to make a decision on it? (or, does anyone know of an alternate workaround?) Thanks; I'd certainly appreciate it!
comment:12 Changed 15 years ago by
Resolution: | → wontfix |
---|---|
Status: | new → closed |
Due to variations between all the different SQL variants accepted by the backends we support (and others provide externally), as well as the complexity of SQL (and we aren't going to include a full SQL parser), it's really chasing rainbows to try and make complex stuff work here.
I think this is a situation we just aren't going to handle directly, since the hooks now exist to do it on a case-by-case basis in client code. If you want to pass in complex SQL, catch the post-syncdb signal and work directly with a django.db.backend.cursor
. You will know precisely what database you're talking to and can use the appropriate syntax and calling mechanisms.
comment:13 Changed 15 years ago by
Grumble grumble...
IMO, it is perfectly reasonable to say "its not our business to understand SQL". But then you shouldn't be trying to break the SQL down into statements. Rather, pass it through as a chunk. If this causes problems, then -- if you want to -- patch (hack) it in the backends that have problems. Or users of those back ends should get their db engines fixed.
-- Shaun Cutts
comment:14 follow-up: 16 Changed 15 years ago by
Yes, if the solution was the simple we would be doing it. Not all database wrappers permit multiple statements in one call. So the current stuff works for them and your stuff is possible. Win for everybody.
comment:15 follow-up: 17 Changed 15 years ago by
Curious: Does anyone watching this ticket, know offhand which DB backends/engines actually don't support multiple statements? I tend to agree with Shaun that the best long-term fix is to add support into the lower-level code.
Also, mtredinnick: Shaun's solution sounds pretty simple to me. Has it been tried? If not, would you consider accepting a patch that implements it?
comment:16 Changed 15 years ago by
Replying to mtredinnick:
Yes, if the solution was the simple we would be doing it. Not all database wrappers permit multiple statements in one call. So the current stuff works for them and your stuff is possible. Win for everybody.
What I meant to suggest is: do just what you're doing now, but only do it if the db wrapper doesn't support multiple statements. (seconding aseering's question) Would you consider a patch that did this?
comment:17 follow-up: 18 Changed 14 years ago by
Replying to aseering@mit.edu:
Curious: Does anyone watching this ticket, know offhand which DB backends/engines actually don't support multiple statements? I tend to agree with Shaun that the best long-term fix is to add support into the lower-level code.
I tested PostgreSQL, SQLite and MySQL on Ubuntu 9.04, and SQLite was the only one not supporting multiple statements.
Would it make sense to add a
django.backends.*.DatabaseFeatures.accepts_multiple_statements
attribute and do the statements.split()
dance only when needed?
comment:18 Changed 14 years ago by
Resolution: | wontfix |
---|---|
Status: | closed → reopened |
Triage Stage: | Design decision needed → Accepted |
Replying to akaihola:
Replying to aseering@mit.edu:
Curious: Does anyone watching this ticket, know offhand which DB backends/engines actually don't support multiple statements? I tend to agree with Shaun that the best long-term fix is to add support into the lower-level code.
I tested PostgreSQL, SQLite and MySQL on Ubuntu 9.04, and SQLite was the only one not supporting multiple statements.
Would it make sense to add a
django.backends.*.DatabaseFeatures.accepts_multiple_statementsattribute and do the
statements.split()
dance only when needed?
Reopening (and accepting) on the basis of this suggestion. This sounds workable to me, although it's the python binding version that is important, not the database version itself.
comment:19 follow-up: 20 Changed 14 years ago by
Caveat - I'm accepting the "run bulk SQL as a single block if possible" idea. I'd rather see a 'execute_bulk_sql' call on the DatabaseOperations class than a boolean flag. Making it an overridable method on the backend means that the regular expressions/splitting technique can be modified on a per-backend basis.
comment:21 Changed 14 years ago by
Cc: | walter+django@… added |
---|
As a workaround, multi-line SQL statements have to have someting other than whitespace between their semicolons and newline characters. For example:
> CREATE OR REPLACE FUNCTION create_post () RETURNS trigger AS $$ > BEGIN > NEW.date_created := current_timestamp; -- > RETURN NEW; -- > END; > $$ LANGUAGE plpgsql;
Is that a bug or a feature?
I was about to file this patch because I was getting quite puzzling "Command out of sync error"s because I had a bit of comment after a statement:
--- django/core/management/sql.py.orig 2010-02-25 12:32:04.660976707 +0100 +++ django/core/management/sql.py 2010-02-25 12:38:06.240954572 +0100 @@ -174,7 +174,7 @@ # Some backends can't execute more than one SQL statement at a time, # so split into separate statements. - statements = re.compile(r";[ \t]*$", re.M) + statements = re.compile(r";[ \t]*(?:--.*)?$", re.M) # Find custom SQL, if it's available. sql_files = [os.path.join(app_dir, "%s.%s.sql" % (opts.object_name.lower(), settings.DATABASE_ENGINE)),
But that fix will break your feature.
comment:22 Changed 13 years ago by
Cc: | vlastimil.zima@… added |
---|
comment:23 Changed 13 years ago by
Severity: | normal → Normal |
---|---|
Type: | defect → Bug |
comment:26 Changed 12 years ago by
Cc: | Christophe Pettus added |
---|
comment:27 Changed 11 years ago by
Status: | reopened → new |
---|
comment:28 Changed 10 years ago by
Cc: | sbarre added |
---|
comment:29 Changed 10 years ago by
Can we get this fixed at some point please? It seems quite straightforward now and it causes daily pain to so many people.
comment:30 Changed 10 years ago by
Sure - it will be fixed as soon as someone presents a patch. There's a call to action in Comment 18 and Comment 19, but nobody has presented a patch that implements that approach AFAICT from the rest of the ticket discussion.
If someone wants this, you're going to have to write the patch.
comment:31 Changed 10 years ago by
Needs tests: | unset |
---|
https://github.com/django/django/pull/1533
Open for review!
comment:32 follow-up: 33 Changed 10 years ago by
Question raised by Mark on the PR: With the migrations support, are .sql files even needed any more?
comment:33 Changed 10 years ago by
Replying to claudep:
Question raised by Mark on the PR:
With the migrations support, are .sql files even needed any more?
I have not check the entire migration and south documentation, but what about custom restraints, triggers and functions?
comment:34 Changed 10 years ago by
Cc: | vlastimil@… added |
---|
comment:35 Changed 10 years ago by
Cc: | 4glitch@… added |
---|
comment:36 Changed 10 years ago by
Patch needs improvement: | set |
---|
Repeating Andrew's answer on the pull request:
They should go away, there's a RunSQL operation you can use in migrations instead (that accepts either single statements or mutiple ones with its own internal splitter regex). I didn't get around to removing the support for them on syncdb-type apps, but they'll be ignored if you have migrations.
We should therefore test (sql like this) and document the new process replacing initial sql before closing this ticket (or close this ticket and open new one).
comment:37 Changed 10 years ago by
How will the migrations framework handle Database-backend-specific SQL data?
comment:38 Changed 10 years ago by
Yep, this same bug will still apply to the splitter inside RunSQL (though I believe it's marginally better than the old Django one).
timo: Database-specific SQL data will only work with apps without migrations; those with migrations are expected to achieve the same result using a migration to load in the data, which is much more portable and forwards-compatible.
comment:39 Changed 10 years ago by
At least parts of this must have been fixed in #22401, see 071c9337750b296d198cced56f3ffad0e176afb6.
comment:40 Changed 10 years ago by
Assuming sqlparse does the job, this can indeed be closed. Thanks!
comment:41 Changed 10 years ago by
Has patch: | unset |
---|---|
Keywords: | migrations added |
Patch needs improvement: | unset |
Severity: | Normal → Release blocker |
This ticket never tells which backend support multiple statements in cursor.execute()
and which don't. It would be interesting to check that on the four officially supported databases before going any further...
If indeed some backends need splitting, in order to close this ticket, we must implement a similar approach as #22401 in RunSQL._split_sql
.
Splitting occurs only when explicitly setting multiple=True
, which Django itself never does, even in the test suite. We should add a test for that (and skip it when sqlparse isn't available.
The patch will be easier if we include it in 1.7 because we won't need a deprecation path. Therefore, marking as a release blocker.
comment:42 Changed 10 years ago by
SQLite: needs splitting
PostgreSQL: does not need splitting
MySQL: does not need splitting, but has issues when some multiline SQL results are not read (cursor.nextset()
has to be used to "consume" results).
Oracle: ?
comment:43 Changed 10 years ago by
It would be interesting to avoid the dependency on sqlparse for backends that do not need it. Can we introduce a feature flag "execute_supports_multiple_statements"?
comment:44 Changed 10 years ago by
In my tentative pull request linked in comment:31, I used the approach suggested by Russell in comment:19 (a custom backend method).
comment:45 Changed 10 years ago by
The fix #22401 was reverted and it was then closed as a duplicate.
comment:46 Changed 10 years ago by
Oracle seems to choke on \n
:
DatabaseError: ORA-00911: invalid character
comment:47 Changed 10 years ago by
Triage Stage: | Accepted → Ready for checkin |
---|
I think this is good to go: https://github.com/django/django/pull/2608
I ran the "test_run_sql" test on all four core database backends with and without sqlparse. It's skipped in three cases and passes in five cases.
comment:48 Changed 10 years ago by
Has patch: | set |
---|
comment:49 Changed 10 years ago by
Thanks Aymeric, nice patch!
There are however test failures in initial_sql_regress, I had to apply this:
--- a/tests/initial_sql_regress/tests.py +++ b/tests/initial_sql_regress/tests.py @@ -27,7 +27,10 @@ class InitialSQLTests(TestCase): """ connection = connections[DEFAULT_DB_ALIAS] custom_sql = custom_sql_for_model(Simple, no_style(), connection) - self.assertEqual(len(custom_sql), 9) + if connection.features.requires_sqlparse_for_splitting: + self.assertEqual(len(custom_sql), 9) + else: + self.assertEqual(len(custom_sql), 1) with connection.cursor() as cursor: for sql in custom_sql: cursor.execute(sql)
Apart from that, I think it's RFC.
comment:50 Changed 10 years ago by
Thanks a lot for the review!
I don't think the assert on the number on statements is useful. I'll just remove it.
What matters is that each statement creates an object and that 9 objects are created.
comment:51 Changed 10 years ago by
Resolution: | → fixed |
---|---|
Status: | new → closed |
IMHO the sql file should be just fed all at once to a cursor, rather than django attempting to parse it and feeding in pieces.
The python DB API 2.0 doc is a bit hazy on this point, but:
seems to show that multiple statements count as one "operation", at least using postgresql_psycopg2