Opened 10 years ago

Last modified 3 years ago

#20516 new New feature

Allow use of prepared statements

Reported by: est Owned by: nobody
Component: Database layer (models, ORM) Version: dev
Severity: Normal Keywords: db, prepared statement
Cc: mpessas@…, aron45, Jeongsoo, Park Triage Stage: Accepted
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no



Since we have persistant connection in Django 1.6 now, can we consider adding prepared statements?

So we can create statements upon the first query, and use the prepared statement for all the rest and so on.

This could be a huge speed up for heavy db IO applications. Reduce traffice size as well as RTT. Also improves security by parameterize queries.

Change History (15)

comment:1 Changed 10 years ago by Anssi Kääriäinen

Keywords: mysql removed
Summary: Prepared statement for MySQLAllow use of prepared statements
Triage Stage: UnreviewedAccepted
Version: 1.6-alpha-1master

Yes, I think addition of prepared statements is something to consider. No idea of what would be a nice API, or if this could be somehow delegated to third party apps. The latter option means some hooks are needed somewhere in the query execution path. One such point is execute_sql() where it would be fairly straightforward to change the used SQL to prepared statements if the query is set to use prepared statements.

FWIW there are use cases where prepared statements can give easily over 50% speedup. Such cases are simple to execute queries which retrieve just a few rows of data, but which have a massive amount of joins (so that planning overhead is big). Using .select_related() and complicated .filter() clauses are easy ways to produce such queries.

I don't see any reason to restrict this to MySQL only.

(This feature seems to need at least these parts:

  • API for setting given queryset to use prepared statements - using them automatically is generally a bad idea
  • Convert the query's execution to use prepared statement instead of SQL. This consists of:
    • A check if the given query is already prepared for the used connection
    • If so, use the correct prepared statement
    • If not, create a new prepared statement and use that)

comment:2 Changed 10 years ago by Apostolis Bessas

Cc: mpessas@… added

comment:3 Changed 10 years ago by anonymous

with statements seem to be the correct way to handle this from an API standpoint.

comment:4 Changed 10 years ago by FunkyBob

Surely the security only improves if the DB driver is faulty? Django already passes parameterised queries to the DB driver to handle escaping [or pass through parameterised, as some do].

But in general, I'm happy to throw my brain behind this effort :)

comment:5 Changed 10 years ago by FunkyBob

One idea that struck me the other day-

A prepared statement could yield a function with keyword arguments matching its bound parameters.

Something like:

    stmt = MyModel.objects.filter(foo__bar=Param('able'), baz=Param('baker')).exclude(foo_quux=7)

Though perhaps an explicit "prepare()" would be better, the Param arguments mark where to bind parameters, and what to call them.

The, you can execute the query with:

for my in stmt(able=1, baker="wibble"):

Last edited 9 years ago by Florian Apolloner (previous) (diff)

comment:6 Changed 10 years ago by david@…

This isn't just for MySQL. If possible, it would be really nice to have some facility for using server-side prepare statements not created by Django. is an example of such a use case

comment:7 Changed 9 years ago by Omer Katz

PostgresSQL also supports prepare statements.

@FunkyBob: That syntax looks great but the PreparedQuerySet object should support a context manager.

comment:8 Changed 9 years ago by FunkyBob

Could you elaborate on how you see context manager working with this feature?

comment:9 Changed 9 years ago by Omer Katz

The way I see it a PreparedQuerySet should support the following:

  • Preparing on execution of a query for the entire session:
qs = list(MyModel.objects.filter(foo__bar=1).prepare())
qs2 = list(MyModel.objects.filter(foo__bar=2)) # Already prepared for this session (that is the connection)

The query parameters are inferred from the query itself.
This is very convenient and doesn't require anyone to learn the new syntax.

  • Preparing for a certain scope and deallocating:
with MyModel.objects.filter(foo__bar=Param('foobar')).prepare() as qs:
     results = qs(1)
     # Some logic
     results = qs(2)
     # Some logic

When the PreparedQuerySet enters the context the query is prepared. When it exists the prepared query is deallocated.
It's not always a good idea to prepare queries for a very long time for rapidly changing tables (small tables don't need to use indexes for example).

  • Preparing a query for usage in a Manager or QuerySet:
class MyQuerySet(QuerySet):
  foobars = MyModel.objects.filter(foo__bar=Param('foobar')).prepare()

This will turn foobars into a callable with an argument named foobar.
The query will be prepared once the first query using it will be executed.
You can prepare ahead of time:

class MyQuerySet(QuerySet):
  foobars = MyModel.objects.filter(foo__bar=Param('foobar')).prepare(ahead=True) # I need a better name for the argument here

You can also prepare a query that will be deallocated after a TTL:

class MyQuerySet(QuerySet):
  foobars = MyModel.objects.filter(foo__bar=Param('foobar')).prepare(ttl=60000)

Explicitly deallocating a prepared query should also be possible:

qs = MyModel.objects.filter(foo__bar=1).prepare()
results = list(qs)
qs2 = list(MyModel.objects.filter(foo__bar=2)) # Already prepared for this session (that is the connection)
results = list(qs) # the query is prepared again

The query will be prepared again once the first query is executed.

What do you guys think?

comment:10 Changed 8 years ago by Omer Katz

I investigated about implementing it today and it turns out that the Query class has a method called sql_with_params() which returns the SQL query without concatenating the parameters.

MySQL uses the following syntax:

PREPARE usrrptplan FROM "SELECT * FROM users u, logs l WHERE u.usrid=? AND u.usrid=l.usrid
    AND = ?;"
EXECUTE usrrptplan USING 1, NOW()

PostgreSQL uses the following syntax:

PREPARE usrrptplan (int) AS
    SELECT * FROM users u, logs l WHERE u.usrid=$1 AND u.usrid=l.usrid
    AND = $2;
EXECUTE usrrptplan(1, NOW());

It seems that when you call prepare() you need to switch the query entirely with the EXECUTE syntax.
Moreover, at least in PostgreSQL you have to check if the statement is already prepared and there's no SQL syntax to do that.
You need to query the pg_prepared_statements table in order to verify that it doesn't exist.
Since a prepared statement requires a name we can either randomize one or have one provided for us.
I think that providing both options is a good idea.

comment:11 Changed 8 years ago by Omer Katz

As apollo13 mentioned on IRC, prepared statements are at the session level on PostgresSQL so we can just store them on the connection object.

comment:12 Changed 8 years ago by Omer Katz

akaariai says we can just subclass the Query object and replace the parameters with the prepared statements placeholders.

comment:13 Changed 8 years ago by Shai Berger

On Oracle, the concept of prepared statements is linked to a connection-level statement cache. Prepared statements are not handled with different SQL, but with different API calls -- specifically, cx_Oracle provides a prepare method on its cursors. This method is not currently used by Django's Oracle backend.

comment:14 Changed 8 years ago by aron45

Cc: aron45 added

comment:15 Changed 3 years ago by Jeongsoo, Park

Cc: Jeongsoo, Park added
Note: See TracTickets for help on using tickets.
Back to Top