Opened 8 years ago

Closed 8 years ago

Last modified 8 years ago

#26899 closed Cleanup/optimization (fixed)

Document why RawSQL requires parameters

Reported by: David Wolever Owned by: petedmarsh
Component: Documentation Version: dev
Severity: Normal Keywords:
Cc: Triage Stage: Ready for checkin
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: yes UI/UX: no

Description

The params argument to RawSQL isn't optional so an empty tuple must be provided if no arguments are needed:

>>> RawSQL("some_column")
...
TypeError: __init__() takes at least 3 arguments (2 given)

It seems reasonable to make params=() the default: RawSQL(sql, params=(), output_field=None), or possibly params=None and self.params = params if params is not None else ().

See also: https://github.com/django/django/blob/master/django/db/models/expressions.py#L607

Change History (13)

comment:1 by petedmarsh, 8 years ago

Owner: changed from nobody to petedmarsh
Status: newassigned

comment:3 by Simon Charette, 8 years ago

I like to think of params being an explicit argument as a feature that forces developers to think twice about not interpolating user provided data into their SQL and expose their application to possible injections.

Last edited 8 years ago by Simon Charette (previous) (diff)

comment:4 by Tim Graham, 8 years ago

Sounds reasonable to me. If we decide to keep the status quo, let's update the documentation with that rationale.

comment:5 by petedmarsh, 8 years ago

I've updated the PR to add a note to the documentation as described above, happy to re-word or bring back the code changes, just let me know!

comment:6 by Tim Graham, 8 years ago

Component: Database layer (models, ORM)Documentation
Has patch: set
Summary: RawSQL requires parameters even if they are emptyDocument why RawSQL requires parameters
Triage Stage: UnreviewedReady for checkin
Type: UncategorizedCleanup/optimization

comment:7 by Tim Graham <timograham@…>, 8 years ago

Resolution: fixed
Status: assignedclosed

In 7bf3ba0d:

Fixed #26899 -- Documented why RawSQL params is a required parameter.

comment:8 by Tim Graham <timograham@…>, 8 years ago

In 8f7008c4:

[1.9.x] Fixed #26899 -- Documented why RawSQL params is a required parameter.

Backport of 7bf3ba0d0c700670d13d7683eec7bd3eb3d4dd1f from master

comment:9 by Tim Graham <timograham@…>, 8 years ago

In 5cc6190:

[1.10.x] Fixed #26899 -- Documented why RawSQL params is a required parameter.

Backport of 7bf3ba0d0c700670d13d7683eec7bd3eb3d4dd1f from master

comment:10 by David Wolever, 8 years ago

I'm glad the rationale is documented, but I can't help but feel like keeping this parameter required is both frustrating for folk who aren't performing injectable queries and more importantly inconsistent with the other mechanisms Django offers to drop into raw SQL. Specifically, QuerySet.raw(…) doesn't require an empty params, and neither does QuerySet.extra(…).

comment:11 by Josh Smeaton, 8 years ago

I can understand your frustration. There's certainly an inconsistency here, which is unfortunate, but I think I'd prefer to make the raw and extra params a necessary argument if I was writing them from scratch given the argument made earlier. It's way too easy to slip in a % (format) and just ignore the params argument altogether. All parameters should be passed via params, whether they're potentially injectable or not.

I consider the overhead involved in adding an empty params argument a decent tradeoff to have a reminder of good security practises built into the API and not just within the docs. RawSQL will probably not be used very much within projects so it shouldn't lead to any/much API fatigue.

comment:12 by David Wolever, 8 years ago

That's absolutely fair!

I just want to make sure that the (potentially many?) use cases which don't involve any parameters are being considered (ex, adding an array_agg to a groupby, selecting a column from an intermediary many-to-many table, a subselect that doesn't require parameters, and I'm sure you can imagine a bunch more).

Anyway, if this has been considered and parameters are the final decision, the one last request I would make (and PR I could provide): provide a default value for the params argument which raises an explanatory exception (it took me 15 minutes to figure out why RawSQL("some_column") was raising a __init__ takes 3 arguments but 2 were provided error).

def __init__(self, query, params=Undefined, …):
    if params is Undefined:
        raise TypeError(
            "params must always be provided to RawSQL to reduce the risk of SQL injection. "
            "Use `params=()` if your query does not require any parameters."
        )


comment:13 by Josh Smeaton, 8 years ago

I think that change is fairly reasonable. If we want to encode good practises into the API it's only fair that correct usage is guided by it too. If you put a patch together then just "Refs #26899" to begin the commit.

Note: See TracTickets for help on using tickets.
Back to Top