Opened 8 years ago

Closed 8 years ago

Last modified 8 years ago

#27241 closed Bug (fixed)

Annotate doesn't work with PostgreSQL views anymore

Reported by: Jaap Roes Owned by:
Component: Database layer (models, ORM) Version: dev
Severity: Normal Keywords: postgresql view aggregate annotate
Cc: Shai Berger, Simon Charette, Josh Smeaton, olivier.tabone@… Triage Stage: Accepted
Has patch: no Needs documentation: no
Needs tests: yes Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

Using annotate on (unmanaged) models backed by views has stopped working and now throws a:

psycopg2.ProgrammingError: column "..." must appear in the GROUP BY clause or be used in an aggregate function...

This is caused by the fix to ticket #19259.

Queries with an aggregation now only list the PK field in the GROUP BY clause, but this doesn't work for views because they cannot have primary keys in PostgreSQL.

Now Django doesn't officially support views, or models without a primary key, but it did work before so it's somewhat of a regression.

I ran into this problem because I'm working on a legacy database and have models that are backed by a view using the db_table Meta option.

Change History (17)

comment:1 by Jaap Roes, 8 years ago

I've been able to work around this by defining my own DatabaseWrapper and DatabaseFeatures (which inherits from the ones in django.db.backends.postgresql) and setting allows_group_by_selected_pks to False. However, this disables this feature for all models not just the ones backed by a view, so it's not ideal.

comment:2 by Tim Graham, 8 years ago

Do you have a solution in mind?

comment:3 by Jaap Roes, 8 years ago

Unfortunately I don't have a solution that would just make things work.

One of the underlying issues here is that Django has never officially supported database views in the first place so I don't know how much effort should be put into making this work again. The fact that this used to work in the past was pure coincidence anyway.

Come to think of it, this problem might not just limited to database views. I believe any (unmanged) model that hasn't set a PRIMARY KEY constraint at the database level could trigger this issue.

comment:4 by Shai Berger, 8 years ago

Cc: Shai Berger added

Possible workaround (no time to test ATM): Before the annotate() call, add only() naming all the fields. Or maybe even drop the PK, if that makes sense.

comment:5 by Jaap Roes, 8 years ago

Using only seems to have no effect on the GROUP BY clause.

I am not sure what you mean by "drop the PK", Django doesn't allow models without a field marked as primary_key and will add one if you don't specify one.

comment:6 by Simon Charette, 8 years ago

Cc: Simon Charette added
Triage Stage: UnreviewedAccepted
Version: 1.10master

Before the annotate() call, add only() naming all the fields. Or maybe even drop the PK, if that makes sense.

The trick only() won't work as pk is always implicitly selected (only('field') == only('field', 'pk')).

The only solution I can think of is disabling the optimization for unmanaged models but the real issue here is really that unmanaged models are not meant to be used to deal with database views. The fact that Django enforces the presence of a primary key and views are not allowed to have one really highlights the conceptual clash.

As Django doesn't provide any way to interact with views and even suggests managed can be used for this purpose I think we should simply disable the optimization for this specific case.

comment:7 by Jaap Roes, 8 years ago

I don't known if disabling the optimisation for all unmanaged models is the right choice. It seems that this is a fairly esoteric edge case. The fact that this behaviour was changed in 1.9 and almost nobody has noticed until now speaks for itself (I can only find one other issue that looks similar: #26758).

I'm sure that most unmanaged models are normal database tables with a primary key constraint. It would be a shame to have to disable something that works fine for the vast majority of use cases just because of this one edge case.

Instead, would it be feasible to be able to opt-out of the optimisation by setting a flag on the model's Meta class? So instead of checking if managed is False, Django could check if allows_group_by_selected_pks is set to False for the model and then choose not to perform the related optimisation.

I am actively maintaining this application and would have no qualms about adding a flag to the few models affected by this issue. If anyone else runs into this the must also be actively maintaining theirs so they can do the same thing.

comment:8 by Josh Smeaton, 8 years ago

Can you try adding a field as a primary key on the view? Ideally each row would have a unique key in a view (I know this isn't always the case), but if you just tell Django that your unmanaged model has a primary key, it'll try to group on that.

For what it's worth I've always been a big user of unmanaged models over views - so I'm keen to see this resolved and not abandoned.

comment:9 by Josh Smeaton, 8 years ago

Cc: Josh Smeaton added

comment:10 by Jaap Roes, 8 years ago

As far as Django is concerned these views do have a primary key (there's a id field, it even has a unique index). The problem is that at the database level this field isn't, and cannot, have a primary key constraint. Without a real primary key Postgres cannot group by primary key only, since they don't exist.

This is what happens if you try to add a PK to a (materialized) view:

=> ALTER MATERIALIZED VIEW my_view ADD CONSTRAINT my_view_pkey PRIMARY KEY (id);
ERROR:  "my_view" is not a table
Version 0, edited 8 years ago by Jaap Roes (next)

comment:11 by Josh Smeaton, 8 years ago

Yes, you're right. Sorry for the noise! I mistakenly thought it was Django throwing an error.

comment:12 by Simon Charette, 8 years ago

I don't think a meta option for this special case is worth it but a flag to prevent Django from implicitly adding an auto-incrementing primary key when none is explicitly specified could be useful in multiple cases (migrations comes to mind here) and leveraged in this one.

comment:13 by Olivier Tabone, 8 years ago

Cc: olivier.tabone@… added
Needs tests: set
Owner: changed from nobody to Olivier Tabone
Status: newassigned

comment:14 by Olivier Tabone, 8 years ago

Owner: Olivier Tabone removed
Status: assignednew

comment:15 by Simon Charette, 8 years ago

Fixed in daf2bd3efe53cbfc1c9fd00222b8315708023792. #28107 was a duplicate of this ticket.

comment:16 by Simon Charette, 8 years ago

Resolution: fixed
Status: newclosed

comment:17 by Simon Charette, 8 years ago

I just submitted a post on the mailing list to gather feedback. It includes an alternative solution to make turning the optimization on for unmanaged models backed by tables easier.

Note: See TracTickets for help on using tickets.
Back to Top