Code

Opened 6 years ago

Closed 6 years ago

Last modified 6 years ago

#5806 closed (wontfix)

Would like a callback on models.Field during SQL generation

Reported by: sam@… Owned by: nobody
Component: Uncategorized Version: master
Severity: Keywords:
Cc: Triage Stage: Unreviewed
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: UI/UX:

Description

It would be useful if Fields had the opportunity to specify SQL code to be included when generating the SQL for their model.

I am currently trying to add some stuff to the SQL generated by Django to implement full text indexes in PostgreSQL for several models. Each model requires the following to be added:

-- full text indexing on blogs_post
CREATE INDEX blogs_post_fts
    ON blogs_post
    USING gin
    (vector);
CREATE FUNCTION blogs_post_fts_update () RETURNS trigger AS $$
BEGIN
    NEW.vector = ts2.to_tsvector ('default', NEW.title || ' ' || NEW.body);
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;
CREATE TRIGGER blogs_post_fts_update BEFORE INSERT OR UPDATE ON blogs_post FOR EACH
ROW EXECUTE PROCEDURE blogs_post_fts_update ();
-- end of full text indexing on blogs_post

Entering this for every table that I want to create an index upon is really annoying!

It is possible to generate these SQL statements based on information in the model definition:

class Blog (models.Model):
   title = models.TextField ()
   body = models.TextField ()
   vector = TsvectorField (fields = ['title', 'body'])

In order to do this, I created a custom field type that adds the actual field to the CREATE TABLE statement:

class TsvectorField (models.Field):
    def __init__ (self, fields, *args, **kwargs):
        if kwargs.pop ('editable', False) == True:
            raise ValueError ('TsvectorFields can not be editable')
        super (TsvectorField, self).__init__ (*args, **kwargs)

        # do something with fields!

    def db_type (self):
        return 'ts2.tsvector'

But at this point I am stuck: as far as I can see, there is nowhere to place the code to generate the above SQL.

I propose that django.core.management.sql.sql_indexes_for_model by modified to call a method on each field, named something like get_sql_index. This method would generate & return the above SQL statements.

Thoughts?

Attachments (0)

Change History (5)

comment:1 Changed 6 years ago by anonymous

  • Needs documentation unset
  • Needs tests unset
  • Patch needs improvement unset
  • Summary changed from Would like a callback on field creation to Would like a callback on models.Field during SQL generation

comment:2 Changed 6 years ago by ubernostrum

Personally, I think that trying to support every conceivable SQL feature people might want to use during table creation is a hopeless task -- the API would need to bloat up to the level of something like SQLAlchemy, and the complexity would quickly get overwhelming. Meanwhile, you can pipe the output of manage.py sql into a file, edit it to add absolutely anything you like, and then execute it against your database.

comment:3 Changed 6 years ago by anonymous

The problem with having to manually executing the SQL statements means that they can't be run automatically when a project is installed, including when manage.py test is run. Not to mention how tedious and error-prone it is, compared with (in this case) adding a single TsvectorField to each model.

I actually think that this will help prevent the database API from becoming bloated. It will allow users to make use of entire classes of features that require additional, custom database statements such as these, without having to add APIs for each individual feature.

This is particularly desirable in cases such as ours, where the actual SQL generated is quite specific to our rqeuirements; there are many, many different strategies for implementing things such as full-text indexing, none of which is suitable for everyone.

Having done some more thinking about this, I think that rather than hijacking the sqlindexes command, a new command such as sqlextra should be added; this would call each field's get_sql_extra method. The sqlextra command would be called as part of the sqlall command.

I'm happy to work on this myself, of course, but I don't really want to do so if I'm going to have to maintain the patch out-of-tree.

comment:4 Changed 6 years ago by mtredinnick

  • Resolution set to wontfix
  • Status changed from new to closed

The post_syncdb exists for purposes like this. Catch it for the models you care about and then start talking directly to the database to add whatever extra stuff you need. It is called once per app and you are passed a list ofmodels created in that app.

comment:5 Changed 6 years ago by sam@…

Thanks for the hint--that works well, except for one problem: the function is called twice when I run the 'tests' management command.

It seems that both 'syncdb' and 'flush' emit the 'post_syncdb' signal. Since 'test' calls 'syncdb' and then 'flush', the DDL statements all fail the second time they are called (and any subsequent times, too).

Add Comment

Modify Ticket

Change Properties
<Author field>
Action
as closed
as The resolution will be set. Next status will be 'closed'
The resolution will be deleted. Next status will be 'new'
Author


E-mail address and user name can be saved in the Preferences.

 
Note: See TracTickets for help on using tickets.