Opened 8 years ago

Closed 10 months ago

#9368 closed Cleanup/optimization (fixed)

Clean up code for getting columns for select query

Reported by: adunar Owned by:
Component: Database layer (models, ORM) Version:
Severity: Normal Keywords:
Cc: mtredinnick, adunar Triage Stage: Accepted
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: yes
Easy pickings: no UI/UX: no


I cleaned up some code in db/models/sql/ (see mtredinnick's suggestion on

In particular:

  • The code to get a column's SQL alias was duplicated in 6 places, so I factored that out into a function
  • Depending on whether the get_default_columns function was called for the base model or a related model, it took different parameters, did different things, and returned different values. So I split it into two separate functions, get_base_columns and get_related_columns. This refactoring makes the Query class easier to understand and easier to subclass with less code duplication.
  • I added a function Query.get_base_fields, which the QuerySet class uses to determine which fields correspond to the values from the result set. Currently, get_base_fields returns all fields of the base model. However, Query subclasses could potentially override get_base_fields (as well as get_base_columns) so that only a subset of the model's fields are returned (e.g. if you want to lazy-load some large fields).

A large motivation for this refactoring is that functional decomposition of get_columns made it difficult to subclass Query to provide custom behavior. This can be seen especially well in contrib.gis.db.models.query, where it needed to override a small bit of logic in the middle of get_columns and get_default_columns, but ended up having to copy and paste those entire functions (which were both quite large) as a result. My change at least makes it so gis doesn't have to copy and paste quite as much. I would have liked to not copy and paste anything in gis, but I didn't quite figure out how to do that without breaking the regression tests. (Also, I haven't even run the code in gis because I don't have the necessary libraries installed. So somebody else might want to test that part...)

Attachments (1)

query_cleanup.diff (19.8 KB) - added by adunar 8 years ago.
clean up the code for getting a query's columns/fields v1.02

Download all attachments as: .zip

Change History (11)

Changed 8 years ago by adunar

clean up the code for getting a query's columns/fields v1.02

comment:1 Changed 8 years ago by mtredinnick

  • Needs documentation unset
  • Needs tests unset
  • Owner changed from nobody to mtredinnick
  • Patch needs improvement unset
  • Status changed from new to assigned
  • Triage Stage changed from Unreviewed to Accepted

Looks like a good set of changes, thanks. A few points.

  1. You can't use decorators in Django core code (must be Python 2.3 compatible).
  2. The extra classmethod you added to models.base.Model can't actually be used. I've already made that mistake once. The problem is that entirely avoids __init__ and there is lots of code that extends __init__ to do other stuff on initialisation. The QuerySet code should be a good citizen and use the public interface for constructing a model instance. Zipping a parallel list of field names and values into a dictionary isn't hard when it's required. It's tempting to try and shortcut __init__, but it's backwards compatible and, even when it wasn't, it turned out to be very inconvenient for other code when something tried to come in via the back door. So we'll stick to initialising a model with a list of values when we have all the fields and use the keyword argument approach when necessary (it's pretty easy to tell the two cases apart).
  3. As a stylistic issue, all the methods inside Query have the same level of visibility. It's a non-public class, so we don't need to worry about adding leading underscores to new methods.

Most of these I'll probably just fix when I commit this (I want to think about parts of the changes a bit more, so it'll be a few days before I commit it). So no need to write an updated patch; these are more notes for next time and so you understand why some of the changes won't go in as is.

comment:2 Changed 8 years ago by adunar

  • Cc adunar added

Great, thanks for the feedback.

As for 2, _new_from_select_values does actually call __init__ on the model (I realize my docstring was confusing about that). It makes sense though if you want to put that functionality inline in QuerySet, since it's only used twice and (as you mentioned) isn't very complicated.

Last edited 11 months ago by timgraham (previous) (diff)

comment:3 Changed 8 years ago by anonymous

  • milestone post-1.0 deleted

Milestone post-1.0 deleted

comment:4 Changed 5 years ago by patchhammer

  • Easy pickings unset
  • Patch needs improvement set
  • Severity set to Normal
  • Type set to Uncategorized

query_cleanup.diff fails to apply cleanly on to trunk

comment:5 Changed 5 years ago by lukeplant

  • Type changed from Uncategorized to Cleanup/optimization

comment:6 Changed 5 years ago by aaugustin

  • UI/UX unset

Change UI/UX from NULL to False.

comment:7 Changed 4 years ago by akaariai

I did some investigation some time ago for rewriting the cols output code - the code is available from

The original idea was to have just "col" syntax instead of "tbl_alias"."col" syntax unless the col is duplicated in some other table. This would result in much nicer looking SQL. However, this approach has a fundamental problem: We don't actually know all the cols in a table, just those the user has in the models. At the least we would need some way for the user to tell that this table has these extra cols, even if they are not part of the model definition. Otherwise we will break existing code with no upgrade path at all.

After the dedupe cols experiment I hacked a bit more, and the changes ended happening all-around. So, the branch isn't meant for inclusion. Still, I think the code could be useful as a reference point if somebody decides to tackle this issue.

comment:8 Changed 3 years ago by claudep

  • Owner mtredinnick deleted
  • Status changed from assigned to new

comment:9 Changed 13 months ago by timgraham

It seems that get_columns() is no longer overridden in GIS after 0c7633178fa9410f102e4708cef979b873bccb76. Is it time to close this ticket or is there more work to do?

comment:10 Changed 10 months ago by akaariai

  • Resolution set to fixed
  • Status changed from new to closed

There is likely more we can clean in select clause generation, but the main issue of this ticket (duplication of login) has been fixed.

Note: See TracTickets for help on using tickets.
Back to Top