Opened 18 years ago
Closed 11 years ago
#2284 closed Bug (duplicate)
[patch] Extra /sql files with whitespace break syncdb
Reported by: | scottanderson | Owned by: | nobody |
---|---|---|---|
Component: | Core (Management commands) | Version: | dev |
Severity: | Normal | Keywords: | syncdb |
Cc: | scottanderson@… | Triage Stage: | Accepted |
Has patch: | yes | Needs documentation: | no |
Needs tests: | yes | Patch needs improvement: | yes |
Easy pickings: | no | UI/UX: | no |
Description
Extra .sql files in /sql directories break syncdb when the files have whitespace.
Example:
CREATE TABLE "forum_post_fulltext" (
etc...
) WITHOUT OIDS;
CREATE INDEX idxcontent ON forum_post_fulltext USING gist (content_idx);
CREATE INDEX idxposttree ON forum_post (thread_id, leftid, rightid);
CREATE FUNCTION f_post_insert() RETURNS trigger AS '
... etc. ...
' LANGUAGE plpgsql
;
The above will cause syncdb to display errors.
Attachments (3)
Change History (31)
comment:1 by , 18 years ago
Summary: | Extra /sql files with whitespace break syncdb → [patch] Extra /sql files with whitespace break syncdb |
---|
by , 18 years ago
Attachment: | syncdb-whitespace-fix.diff added |
---|
comment:2 by , 18 years ago
Component: | Core framework → Tools |
---|
comment:3 by , 18 years ago
Can you post an example of something that includes the whitespace you are seeing breakage for, please. It's a bit hard to work out from your original post: is the whitespace hidden inside the "etc" bits, or is it the linebreaks, or ...? To create a regression test for this, we first need a failing example. Thanks.
comment:4 by , 18 years ago
Apparently the bug report software stripped the linebreaks. Just add linebreaks between the statements.
comment:5 by , 18 years ago
Cc: | added |
---|
comment:6 by , 18 years ago
I'm still not able to repeat this. Can you please attach a text file of a failing example. If I add blank lines between statements in tests/regressiontests/inital_sql_regress/sql/simple.sql in the source, it does not cause any errors, for example.
The get_sql_initial_data_for_model()
function in management.py already strips each SQL statement and does not append blank ones, so I am wondering how these extra blank statements are getting through.
What subversion checkout version are you running and which database is this failing under (shouldn't be relevant, but just in case)?
comment:7 by , 18 years ago
To be honest, this is a patch we've been using for a while against magic removal. I've just created a new database and the patch doesn't seem to be fixing it any longer. The whitespace no longer seems to be significant; it's choking on the stored procedure syntax.
Here's a minimal testcase (in a file called Post.sql; I'll attach this as well):
CREATE FUNCTION f_post_insert() RETURNS trigger AS '
DECLARE
v_level integer;
v_rightid integer;
nowtime timestamp(3);
vrec RECORD;
BEGIN
nowtime := CURRENT_TIMESTAMP;
RETURN NEW;
END;
' LANGUAGE plpgsql
;
CREATE TRIGGER post_insert BEFORE INSERT ON forum_post FOR EACH ROW
EXECUTE PROCEDURE f_post_insert() ;
Here's the output I'm getting now:
Installing initial data for Post model
Executing 'CREATE FUNCTION f_post_insert() RETURNS trigger AS '
DECLARE
v_level integer;'
Failed to install initial SQL data for Post model: ERROR: unterminated quoted string at or near "'
DECLARE
v_level integer;" at character 52
CREATE FUNCTION f_post_insert() RETURNS trigger AS '
DECLARE
v_level integer;Installing initial data for Collection model
I apologize, but I don't know what version of m-r we were using when the fix was actually working. It's been a while since I've had to rebuild the database, and I wanted to send the patch in before I forgot about it.
From svn info: Last Changed Rev: 3275
This is PostgreSQL 8.1.
comment:8 by , 18 years ago
I should also add that this file works fine when cut and pasted into a psql session.
comment:9 by , 18 years ago
I've discovered the problem, I think.
Your change seen in 'svn diff -r 3177:3178' is what fixed the problem my patch also fixed, but yours ironically breaks my SQL file in a different fashion. :-)
Reverting to the old regular expression allows my code to go through. The problem bit is that this guy is missing now:
| (?:'[']*') # something in single quotes
Which was preserving the text of the stored procedure as a single executable statement. Now it's being broken up at the semi-colons by the new regex.
by , 18 years ago
Attachment: | new-patch.diff added |
---|
Reverts old regex but stil fixes escaped quotes problem from ticket #2119
comment:10 by , 18 years ago
This new patch seems to work for both escaped quotes and stored procedures in PostgreSQL. I can't promise it works in other databases.
comment:11 by , 18 years ago
Reverting [3178] is a bit desperate. The old reg-exp was extremely fragile and broke in a number of cases (I can think of about half a dozen bugs that were open more or less simultaneously because of problesm in that code). Basically, trying to parse SQL using a reg-exp is pretty doomed (it looks like there might be problems with properly SQL-escaped quotes -- things like -- in that reg-exp, for example). The multi-line nature of the whole thing is part of the cause of the problems.
So we'll need to think about this a bit more and try to come up with something more robust. Since speed isn't a real issue for initial SQL, we can afford to do a little more inspection and be more careful than in other places. So a solution based on splitting and inspection might work, too. I'll try some things out.
comment:12 by , 18 years ago
Oh, #$%! Bitten by wiki-formatting. Just before italics get turned on, it should have two single quotes next to each other (the spec-compliant SQL quote escape).
comment:13 by , 18 years ago
Well, it's not so much a reversion as fixing the old regex in a different fashion. Note that it takes care of escaped quotes now. But yes, I agree with you that it can get to be a bit difficult. However, since all we need to do is figure out statement endings, the problem is simpler. Actually parsing the SQL is an order of magnitude more difficult given the different engines that it needs to support. Perhaps pushing this sort of thing off into the individual database wrappers is in order, so that the idiosyncratic differences won't need a general solution.
At any rate, adding a stored procedure definition to the regression tests is probably a good idea. :-)
comment:14 by , 18 years ago
This takes care of sql-style quoting.
| (?:'(?:[^']|\\'|'')*') # something in single quotes, but not escaped | (?:"(?:[^"]|\\"|"")*") # something in double quotes, but not escaped
comment:15 by , 18 years ago
Needs tests: | set |
---|---|
Triage Stage: | Unreviewed → Accepted |
comment:16 by , 18 years ago
Version: | magic-removal → SVN |
---|
comment:17 by , 16 years ago
Component: | Tools → django-admin.py |
---|
comment:18 by , 16 years ago
Component: | django-admin.py → django.contrib.databrowse |
---|---|
milestone: | → post-1.0 |
Needs documentation: | set |
Owner: | changed from | to
Patch needs improvement: | set |
Version: | SVN → 1.0-alpha |
Replying to scottanderson:
Extra .sql files in /sql directories break syncdb when the files have whitespace.
Example:jgnjn njngjnj jnj koijiojij jiojiojibvj iojiojbio jiojioji ji jiojio joi joijoijovijfiji ijoijkioj iojoij oij oijioj iojio joijiojioj iojioj ioj
CREATE TABLE "forum_post_fulltext" (
etc...
) WITHOUT OIDS;
CREATE INDEX idxcontent ON forum_post_fulltext USING gist (content_idx);
CREATE INDEX idxposttree ON forum_post (thread_id, leftid, rightid);
CREATE FUNCTION f_post_insert() RETURNS trigger AS '
... etc. ...
' LANGUAGE plpgsql
;
The above will cause syncdb to display errors.
follow-up: 20 comment:19 by , 16 years ago
Component: | django.contrib.databrowse → django-admin.py |
---|---|
milestone: | post-1.0 |
Needs documentation: | unset |
Owner: | changed from | to
Patch needs improvement: | unset |
Version: | 1.0-alpha → SVN |
Reverting some unhelpful status changes
comment:20 by , 16 years ago
comment:21 by , 15 years ago
Patch needs improvement: | set |
---|
From Jeremy Dunck (http://groups.google.com/group/django-developers/browse_thread/thread/a60a582d1e0a562b/cfca983bfe633b80):
... the current status is captured in
comments 11 through 14 -- that the logic for executing initial SQL
breaks in cases where extended SQL syntax (such as pgsql's DECLARE)
includes quoted text, etc.
You can see the Malcolm objected to the latest patch in 11, and
ScottAnderson objected to it in 13.
Certainly, the patch needs improvement because it doesn't include any
test cases. Possibly, the patch as-is isn't good for the reasons
Malcolm listed, but it's hard for me to tell without doing my own
testing. I'd mark patch-needs-improvement. If you're feeling more
ambitious, I'd suggest coming up with a set of initial SQL files that
exercise the assumptions of the loading code.
comment:22 by , 14 years ago
Type: | defect → Bug |
---|
comment:23 by , 14 years ago
Severity: | normal → Normal |
---|
comment:26 by , 12 years ago
I think that the parsing solution leads us nowhere. It is not fair to penalize all backends because some cannot execute multiple statements. IMHO, we should have a backend-specific load_sql
that take care of either splitting the statements or to feed the database with the read file at once.
comment:28 by , 11 years ago
Resolution: | → duplicate |
---|---|
Status: | new → closed |
In any case, custom sql loading is handled in #3214.
Patch to fix whitespace problem