Code

Opened 8 years ago

Closed 4 months ago

#2284 closed Bug (duplicate)

[patch] Extra /sql files with whitespace break syncdb

Reported by: scottanderson Owned by: nobody
Component: Core (Management commands) Version: master
Severity: Normal Keywords: syncdb
Cc: scottanderson@… Triage Stage: Accepted
Has patch: yes Needs documentation: no
Needs tests: yes Patch needs improvement: yes
Easy pickings: no UI/UX: no

Description

Extra .sql files in /sql directories break syncdb when the files have whitespace.

Example:

CREATE TABLE "forum_post_fulltext" (
etc...

) WITHOUT OIDS;

CREATE INDEX idxcontent ON forum_post_fulltext USING gist (content_idx);

CREATE INDEX idxposttree ON forum_post (thread_id, leftid, rightid);

CREATE FUNCTION f_post_insert() RETURNS trigger AS '
... etc. ...
' LANGUAGE plpgsql
;

The above will cause syncdb to display errors.

Attachments (3)

syncdb-whitespace-fix.diff (785 bytes) - added by anonymous 8 years ago.
Patch to fix whitespace problem
post.sql (318 bytes) - added by scottanderson@… 8 years ago.
Test file for problem
new-patch.diff (1.9 KB) - added by scottanderson@… 8 years ago.
Reverts old regex but stil fixes escaped quotes problem from ticket #2119

Download all attachments as: .zip

Change History (31)

comment:1 Changed 8 years ago by anonymous

  • Summary changed from Extra /sql files with whitespace break syncdb to [patch] Extra /sql files with whitespace break syncdb

Changed 8 years ago by anonymous

Patch to fix whitespace problem

comment:2 Changed 8 years ago by anonymous

  • Component changed from Core framework to Tools

comment:3 Changed 8 years ago by mtredinnick

Can you post an example of something that includes the whitespace you are seeing breakage for, please. It's a bit hard to work out from your original post: is the whitespace hidden inside the "etc" bits, or is it the linebreaks, or ...? To create a regression test for this, we first need a failing example. Thanks.

comment:4 Changed 8 years ago by scottanderson

Apparently the bug report software stripped the linebreaks. Just add linebreaks between the statements.

comment:5 Changed 8 years ago by anonymous

  • Cc scottanderson@… added

comment:6 Changed 8 years ago by mtredinnick

I'm still not able to repeat this. Can you please attach a text file of a failing example. If I add blank lines between statements in tests/regressiontests/inital_sql_regress/sql/simple.sql in the source, it does not cause any errors, for example.

The get_sql_initial_data_for_model() function in management.py already strips each SQL statement and does not append blank ones, so I am wondering how these extra blank statements are getting through.

What subversion checkout version are you running and which database is this failing under (shouldn't be relevant, but just in case)?

comment:7 Changed 8 years ago by scottanderson

To be honest, this is a patch we've been using for a while against magic removal. I've just created a new database and the patch doesn't seem to be fixing it any longer. The whitespace no longer seems to be significant; it's choking on the stored procedure syntax.


Here's a minimal testcase (in a file called Post.sql; I'll attach this as well):


CREATE FUNCTION f_post_insert() RETURNS trigger AS '
DECLARE

v_level integer;
v_rightid integer;
nowtime timestamp(3);
vrec RECORD;

BEGIN

nowtime := CURRENT_TIMESTAMP;

RETURN NEW;

END;
' LANGUAGE plpgsql
;

CREATE TRIGGER post_insert BEFORE INSERT ON forum_post FOR EACH ROW
EXECUTE PROCEDURE f_post_insert() ;



Here's the output I'm getting now:


Installing initial data for Post model
Executing 'CREATE FUNCTION f_post_insert() RETURNS trigger AS '
DECLARE

v_level integer;'

Failed to install initial SQL data for Post model: ERROR: unterminated quoted string at or near "'
DECLARE

v_level integer;" at character 52

CREATE FUNCTION f_post_insert() RETURNS trigger AS '
DECLARE

v_level integer;Installing initial data for Collection model



I apologize, but I don't know what version of m-r we were using when the fix was actually working. It's been a while since I've had to rebuild the database, and I wanted to send the patch in before I forgot about it.


From svn info: Last Changed Rev: 3275

This is PostgreSQL 8.1.

Changed 8 years ago by scottanderson@…

Test file for problem

comment:8 Changed 8 years ago by scottanderson@…

I should also add that this file works fine when cut and pasted into a psql session.

comment:9 Changed 8 years ago by scottanderson@…

I've discovered the problem, I think.


Your change seen in 'svn diff -r 3177:3178' is what fixed the problem my patch also fixed, but yours ironically breaks my SQL file in a different fashion. :-)


Reverting to the old regular expression allows my code to go through. The problem bit is that this guy is missing now:


| (?:'[']*') # something in single quotes



Which was preserving the text of the stored procedure as a single executable statement. Now it's being broken up at the semi-colons by the new regex.

Changed 8 years ago by scottanderson@…

Reverts old regex but stil fixes escaped quotes problem from ticket #2119

comment:10 Changed 8 years ago by scottanderson@…

This new patch seems to work for both escaped quotes and stored procedures in PostgreSQL. I can't promise it works in other databases.

comment:11 Changed 8 years ago by mtredinnick

Reverting [3178] is a bit desperate. The old reg-exp was extremely fragile and broke in a number of cases (I can think of about half a dozen bugs that were open more or less simultaneously because of problesm in that code). Basically, trying to parse SQL using a reg-exp is pretty doomed (it looks like there might be problems with properly SQL-escaped quotes -- things like -- in that reg-exp, for example). The multi-line nature of the whole thing is part of the cause of the problems.

So we'll need to think about this a bit more and try to come up with something more robust. Since speed isn't a real issue for initial SQL, we can afford to do a little more inspection and be more careful than in other places. So a solution based on splitting and inspection might work, too. I'll try some things out.

comment:12 Changed 8 years ago by mtredinnick

Oh, #$%! Bitten by wiki-formatting. Just before italics get turned on, it should have two single quotes next to each other (the spec-compliant SQL quote escape).

comment:13 Changed 8 years ago by scottanderson@…

Well, it's not so much a reversion as fixing the old regex in a different fashion. Note that it takes care of escaped quotes now. But yes, I agree with you that it can get to be a bit difficult. However, since all we need to do is figure out statement endings, the problem is simpler. Actually parsing the SQL is an order of magnitude more difficult given the different engines that it needs to support. Perhaps pushing this sort of thing off into the individual database wrappers is in order, so that the idiosyncratic differences won't need a general solution.

At any rate, adding a stored procedure definition to the regression tests is probably a good idea. :-)

comment:14 Changed 8 years ago by anonymous

This takes care of sql-style quoting.

          | (?:'(?:[^']|\\'|'')*') # something in single quotes, but not escaped
          | (?:"(?:[^"]|\\"|"")*") # something in double quotes, but not escaped

comment:15 Changed 7 years ago by Gary Wilson <gary.wilson@…>

  • Needs tests set
  • Triage Stage changed from Unreviewed to Accepted

comment:16 Changed 7 years ago by adrian

  • Version changed from magic-removal to SVN

comment:17 Changed 6 years ago by adrian

  • Component changed from Tools to django-admin.py

comment:18 in reply to: ↑ description Changed 6 years ago by anonymous

  • Component changed from django-admin.py to django.contrib.databrowse
  • milestone set to post-1.0
  • Needs documentation set
  • Owner changed from nobody to anonymoushgjdhj vf
  • Patch needs improvement set
  • Version changed from SVN to 1.0-alpha

Replying to scottanderson:

Extra .sql files in /sql directories break syncdb when the files have whitespace.

Example:jgnjn njngjnj jnj koijiojij jiojiojibvj iojiojbio jiojioji ji jiojio joi joijoijovijfiji ijoijkioj iojoij oij oijioj iojio joijiojioj iojioj ioj

CREATE TABLE "forum_post_fulltext" (
etc...

) WITHOUT OIDS;

CREATE INDEX idxcontent ON forum_post_fulltext USING gist (content_idx);

CREATE INDEX idxposttree ON forum_post (thread_id, leftid, rightid);

CREATE FUNCTION f_post_insert() RETURNS trigger AS '
... etc. ...
' LANGUAGE plpgsql
;

The above will cause syncdb to display errors.

comment:19 follow-up: Changed 6 years ago by russellm

  • Component changed from django.contrib.databrowse to django-admin.py
  • milestone post-1.0 deleted
  • Needs documentation unset
  • Owner changed from anonymoushgjdhj vf to nobody
  • Patch needs improvement unset
  • Version changed from 1.0-alpha to SVN

Reverting some unhelpful status changes

comment:20 in reply to: ↑ 19 Changed 5 years ago by anonymous

Replying to russellm:

Reverting some unhelpful status changes

no not this time

comment:21 Changed 4 years ago by adamnelson

  • Patch needs improvement set

From Jeremy Dunck (http://groups.google.com/group/django-developers/browse_thread/thread/a60a582d1e0a562b/cfca983bfe633b80):

... the current status is captured in
comments 11 through 14 -- that the logic for executing initial SQL
breaks in cases where extended SQL syntax (such as pgsql's DECLARE)
includes quoted text, etc.
You can see the Malcolm objected to the latest patch in 11, and
ScottAnderson objected to it in 13.
Certainly, the patch needs improvement because it doesn't include any
test cases. Possibly, the patch as-is isn't good for the reasons
Malcolm listed, but it's hard for me to tell without doing my own
testing. I'd mark patch-needs-improvement. If you're feeling more
ambitious, I'd suggest coming up with a set of initial SQL files that
exercise the assumptions of the loading code.

comment:22 Changed 3 years ago by lrekucki

  • Type changed from defect to Bug

comment:23 Changed 3 years ago by lrekucki

  • Severity changed from normal to Normal

comment:24 Changed 2 years ago by aaugustin

  • UI/UX unset

Change UI/UX from NULL to False.

comment:25 Changed 2 years ago by aaugustin

  • Easy pickings unset

Change Easy pickings from NULL to False.

comment:26 Changed 18 months ago by claudep

I think that the parsing solution leads us nowhere. It is not fair to penalize all backends because some cannot execute multiple statements. IMHO, we should have a backend-specific load_sql that take care of either splitting the statements or to feed the database with the read file at once.

comment:27 Changed 4 months ago by merb

this ticket seems to be fixed.

comment:28 Changed 4 months ago by claudep

  • Resolution set to duplicate
  • Status changed from new to closed

In any case, custom sql loading is handled in #3214.

Add Comment

Modify Ticket

Change Properties
<Author field>
Action
as closed
as The resolution will be set. Next status will be 'closed'
The resolution will be deleted. Next status will be 'new'
Author


E-mail address and user name can be saved in the Preferences.

 
Note: See TracTickets for help on using tickets.