Opened 16 years ago
Closed 9 years ago
#9368 closed Cleanup/optimization (fixed)
Clean up code for getting columns for select query
Reported by: | Jesse Young | Owned by: | |
---|---|---|---|
Component: | Database layer (models, ORM) | Version: | |
Severity: | Normal | Keywords: | |
Cc: | Malcolm Tredinnick, Jesse Young | Triage Stage: | Accepted |
Has patch: | yes | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | yes |
Easy pickings: | no | UI/UX: | no |
Description
I cleaned up some code in db/models/sql/query.py (see mtredinnick's suggestion on http://code.djangoproject.com/ticket/5420)
In particular:
- The code to get a column's SQL alias was duplicated in 6 places, so I factored that out into a function
- Depending on whether the get_default_columns function was called for the base model or a related model, it took different parameters, did different things, and returned different values. So I split it into two separate functions, get_base_columns and get_related_columns. This refactoring makes the Query class easier to understand and easier to subclass with less code duplication.
- I added a function Query.get_base_fields, which the QuerySet class uses to determine which fields correspond to the values from the result set. Currently, get_base_fields returns all fields of the base model. However, Query subclasses could potentially override get_base_fields (as well as get_base_columns) so that only a subset of the model's fields are returned (e.g. if you want to lazy-load some large fields).
A large motivation for this refactoring is that functional decomposition of get_columns made it difficult to subclass Query to provide custom behavior. This can be seen especially well in contrib.gis.db.models.query, where it needed to override a small bit of logic in the middle of get_columns and get_default_columns, but ended up having to copy and paste those entire functions (which were both quite large) as a result. My change at least makes it so gis doesn't have to copy and paste quite as much. I would have liked to not copy and paste anything in gis, but I didn't quite figure out how to do that without breaking the regression tests. (Also, I haven't even run the code in gis because I don't have the necessary libraries installed. So somebody else might want to test that part...)
Attachments (1)
Change History (11)
by , 16 years ago
Attachment: | query_cleanup.diff added |
---|
comment:1 by , 16 years ago
Owner: | changed from | to
---|---|
Status: | new → assigned |
Triage Stage: | Unreviewed → Accepted |
Looks like a good set of changes, thanks. A few points.
- You can't use decorators in Django core code (must be Python 2.3 compatible).
- The extra classmethod you added to
models.base.Model
can't actually be used. I've already made that mistake once. The problem is that entirely avoids__init__
and there is lots of code that extends__init__
to do other stuff on initialisation. TheQuerySet
code should be a good citizen and use the public interface for constructing a model instance. Zipping a parallel list of field names and values into a dictionary isn't hard when it's required. It's tempting to try and shortcut__init__
, but it's backwards compatible and, even when it wasn't, it turned out to be very inconvenient for other code when something tried to come in via the back door. So we'll stick to initialising a model with a list of values when we have all the fields and use the keyword argument approach when necessary (it's pretty easy to tell the two cases apart). - As a stylistic issue, all the methods inside
Query
have the same level of visibility. It's a non-public class, so we don't need to worry about adding leading underscores to new methods.
Most of these I'll probably just fix when I commit this (I want to think about parts of the changes a bit more, so it'll be a few days before I commit it). So no need to write an updated patch; these are more notes for next time and so you understand why some of the changes won't go in as is.
comment:2 by , 16 years ago
Cc: | added |
---|
Great, thanks for the feedback.
As for 2, _new_from_select_values
does actually call __init__
on the model (I realize my docstring was confusing about that). It makes sense though if you want to put that functionality inline in QuerySet
, since it's only used twice and (as you mentioned) isn't very complicated.
comment:4 by , 14 years ago
Easy pickings: | unset |
---|---|
Patch needs improvement: | set |
Severity: | → Normal |
Type: | → Uncategorized |
query_cleanup.diff fails to apply cleanly on to trunk
comment:5 by , 14 years ago
Type: | Uncategorized → Cleanup/optimization |
---|
comment:7 by , 12 years ago
I did some investigation some time ago for rewriting the cols output code - the code is available from https://github.com/akaariai/django/compare/dedupe_cols.
The original idea was to have just "col" syntax instead of "tbl_alias"."col" syntax unless the col is duplicated in some other table. This would result in much nicer looking SQL. However, this approach has a fundamental problem: We don't actually know all the cols in a table, just those the user has in the models. At the least we would need some way for the user to tell that this table has these extra cols, even if they are not part of the model definition. Otherwise we will break existing code with no upgrade path at all.
After the dedupe cols experiment I hacked a bit more, and the changes ended happening all-around. So, the branch isn't meant for inclusion. Still, I think the code could be useful as a reference point if somebody decides to tackle this issue.
comment:8 by , 12 years ago
Owner: | removed |
---|---|
Status: | assigned → new |
comment:9 by , 9 years ago
It seems that get_columns()
is no longer overridden in GIS after 0c7633178fa9410f102e4708cef979b873bccb76. Is it time to close this ticket or is there more work to do?
comment:10 by , 9 years ago
Resolution: | → fixed |
---|---|
Status: | new → closed |
There is likely more we can clean in select clause generation, but the main issue of this ticket (duplication of login) has been fixed.
clean up the code for getting a query's columns/fields v1.02