Opened 11 years ago
Closed 10 years ago
#22394 closed Cleanup/optimization (fixed)
Built-in datetime Lookups should actually be Transforms
Reported by: | jarshwah | Owned by: | Marc Tamlyn |
---|---|---|---|
Component: | Database layer (models, ORM) | Version: | dev |
Severity: | Normal | Keywords: | lookups |
Cc: | jon.dufresne@… | Triage Stage: | Ready for checkin |
Has patch: | yes | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description
The following built in lookups should be Transforms rather than Lookups, because we're transforming the date/time into a date/time part:
Year, Month, Day, WeekDay, Hour, Minute, Second.
Converting the above into Transforms then allows a Lookup to proceed:
qs.filter(date_field__year__gte=2001)
Change History (13)
comment:1 by , 11 years ago
comment:2 by , 11 years ago
Owner: | changed from | to
---|---|
Status: | new → assigned |
Doing something with this issue is part of my kickstarter, so it would probably be unfair for someone else to work on it.
I believe the current machinery is enough to provide "optimal" queries in all circumstances. This will work by making __year
a Transform
which is actually a no-op, and that transform having a set of lookups available to it distinct from the normal lookups, which each both extract the year *and* do the comparison. This allows us to use EXTRACT
for the exact case but not for the less than case (for example). (Note that if __year
becomes a transform all calls to __year
explicitly become __year__exact
). At present, lookups and only used in filter queries. We will need to be more intelligent here when considering their use in values calls, custom index etc as then a __year
transform cannot be a noop.
To work out the best option for each possible call here, I will be running benchmarks and checking query plans, with and without indexes, for all the possible queries. We can then work out what the best query is for each db in each case - they need not necessarily be the same.
It is also possible with custom indexes that there may be a case for the __year
transform having a __extract
transform which forces the use of EXTRACT()
at SQL level. This would be an unusual use case for year, but perhaps more likely for __date
or __week
where a user could add an index for this particular extract and then perform efficient queries using it without needing to index the entire table. This could be much more efficient with logging-like tables, but until I try it out I don't know.
comment:3 by , 11 years ago
There is a way to have year lookup as a Transform, and also have specialized Lookups for exact and friends. The way to do this is to write YearExact's as_sql() in a way that it accesses directly self.lhs.lhs, that is YearExact doesn't execute self.lhs.as_sql() at all, instead it skips directly to the column of self.lhs. This works as self.lhs is known to be a YearTransform instance, only way to access YearExact is through YearTransform.
As it happens this has already been written as a test case in custom_lookups. Check YearTransform and YearExact in custom_lookups/tests.py. The implementation is PostgreSQL specific, so there is some work to make it generic.
BTW for the kickstarter part - I think it is completely fair if somebody else has time to work on this and implements it for you for your kickstarter project. If you would be able to just merge patches others wrote I think that would be completely fair way to accomplish your kickstarter project. Of course, this is just my opinion.
comment:4 by , 11 years ago
Ah yes, I'd forgotten about accessing lhs.lhs
directly that is a good approach to this.
And that is a good point re the kickstarter - worst case is it gives me more time to work towards supporting transforms elsewhere in the code base.
comment:5 by , 11 years ago
Triage Stage: | Unreviewed → Accepted |
---|---|
Type: | Uncategorized → Cleanup/optimization |
comment:6 by , 10 years ago
Cc: | added |
---|---|
Has patch: | set |
Added an initial PR: https://github.com/django/django/pull/4393
This PR does not yet address the year BETWEEN
indexing issue outline above. I will look into it. Feedback on the initial status is welcome.
comment:7 by , 10 years ago
My PR has addressed the index concern for __year
lookups. The PR is ready for review. Thanks.
comment:8 by , 10 years ago
Triage Stage: | Accepted → Ready for checkin |
---|
Looks good to me (awaiting +1 from Josh).
comment:10 by , 10 years ago
Patch needs improvement: | set |
---|---|
Triage Stage: | Ready for checkin → Accepted |
comment:12 by , 10 years ago
Summary: | Several built in Lookups should actually be Transforms → Built-in datetime Lookups should actually be Transforms |
---|---|
Triage Stage: | Accepted → Ready for checkin |
Discussion on IRC mentions that the
Year
lookup does a range query (BETWEEN x AND y) which is able to use date based indexes. If a Transform were to be used, any field based indexes would be invalid, as a computed/function-based index would need to be created instead to preserve performance.However, all other operations based on date fields do an EXTRACT(), which can't use traditional indexes anyway. Having Year as a Lookup but the other date/time lookups as Transforms doesn't really make sense either, and would probably be more confusing than it is now.