Opened 15 years ago
Closed 13 years ago
#11670 closed Bug (fixed)
Model fields named 'year', 'month', 'gt', 'lt' etc. get mistaken for lookup types in lookups across relations
Reported by: | Owned by: | Julien Phalip | |
---|---|---|---|
Component: | Database layer (models, ORM) | Version: | dev |
Severity: | Normal | Keywords: | |
Cc: | joel@…, anssi.kaariainen@…, Anssi Kääriäinen | Triage Stage: | Accepted |
Has patch: | yes | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description
Using a field named 'year' in a related table:
Event.objects.filter(eventinfo__year=settings.YEAR)
Throws a type error:
TypeError at /events/submit_review/ Related Field has invalid lookup: year
The problem is easily worked around by specifying a type of match to use (I used the exact keyword):
Event.objects.filter(eventinfo__year__exact=settings.YEAR)
I suspect this is a clash with the date field lookup functions. I'm not sure if this is a bug or if it's just something to mention in the documentation for related searches.
Attachments (8)
Change History (35)
comment:1 by , 15 years ago
Triage Stage: | Unreviewed → Accepted |
---|
comment:2 by , 15 years ago
Cc: | added |
---|
comment:3 by , 15 years ago
Has patch: | set |
---|---|
Owner: | changed from | to
Version: | 1.1 → SVN |
by , 15 years ago
Attachment: | 11670.diff added |
---|
comment:4 by , 14 years ago
Owner: | removed |
---|
comment:5 by , 14 years ago
Severity: | → Normal |
---|---|
Type: | → Bug |
comment:6 by , 14 years ago
Patch needs improvement: | set |
---|
The tests would need to be rewritten using unittests since this is now Django's preferred way.
comment:7 by , 13 years ago
Easy pickings: | unset |
---|---|
UI/UX: | unset |
This was again reported in #16714. Note that the same issue occurs with fields named 'month' and 'day' as well as 'year'.
comment:8 by , 13 years ago
See also ticket:8424#comment:13 for a similar problem and #16187 (possibly a long term solution). There is also another ticket or thread I can't find right now that proposed changing lookup names to all uppercase.
by , 13 years ago
Attachment: | 11670.field-lookup-collisions.diff added |
---|
by , 13 years ago
Attachment: | 11670.field-lookup-collisions.2.diff added |
---|
comment:9 by , 13 years ago
OK so I've updated the patch simplifying the code's algorithm and the tests.
comment:10 by , 13 years ago
Cc: | added |
---|
I read somewhat quickly through the patch. This looks simple and efficient.
I think this could be extended in the future if need be (referring for example to ticket #16187).
However, I wonder if it would be better to make this a separate method. This could be usable in checking the validitity of other calls, order_by for example. I would actually go so far as to make this model._meta method, returning 3-tuple field_path, final_field (which can be in field_path or not), and lookup_parts. This could be useful outside of sql.query, and this would make for proper separation of concerns.
I don't have time for more thorough review at the moment. I am not changing patch needs improvement. I wonder if it is set for the latest patch or just forgotten to checked?
comment:11 by , 13 years ago
Patch needs improvement: | unset |
---|
Thanks for your review and ideas! Splitting that code out into a separate method sounds interesting -- could you elaborate on where else and how it would be useful?
I'm removing the Patch needs improvement flag, since I think this patch fixes the original bug, which appears to be the highest priority at this stage -- Larger refactors could be tracked elsewhere like in #16187. The patch would still need a close review before it can proceed, though.
comment:12 by , 13 years ago
Well, the idea would be that when for example order by is called, we could check if the order is valid at that point. It would be an trivial check to do:
for field in fields: if self.resolve_lookup_path(field)[LOOKUP_PART]: raise ValueError('Invalid order by!')
and that is pretty much it. We should first remove the leading '-' though...
But I also think it really belongs to model._meta. Why does sql/query.py need to know how to travel the fields? IMHO it doesn't need to know that. Abstract that away in a model._meta method and just call it when needed. There are also other parts in Django which would benefit from this. Take a look at django/contrib/admin/options.py:L236.
comment:13 by , 13 years ago
Thanks, I'll like this idea. I've made good progress writing a new patch for that and I'll try to post it here tomorrow.
by , 13 years ago
Attachment: | 11670.field-lookup-collisions.3.diff added |
---|
comment:14 by , 13 years ago
Actually, I can post my patch right now (see above).
As suggested by akaariai I've created a new Options.resolve_lookup_path()
method and used it in multiple places, which allows to simplify the code quite a bit, in particular in the admin. I've tried to make the algorithm as efficient as possible and used some caching mechanism too.
Feedback welcome. Thank you!
by , 13 years ago
Attachment: | 11670.field-lookup-collisions.4.diff added |
---|
comment:15 by , 13 years ago
Patch needs improvement: | set |
---|
I've pushed the refactoring further, notably by modifying setup_joins. Now a given lookup wouldn't be parsed multiple times -- only once. I'm still getting 6 test errors that I haven't got the time to get to the bottom of yet, but I think we're getting pretty close now.
by , 13 years ago
Attachment: | 11670.field-lookup-collisions.5.diff added |
---|
comment:16 by , 13 years ago
Patch needs improvement: | unset |
---|
All tests are now passing. I'm reasonably happy of where the patch is at now. I'm just wondering whether the values fetched from the resolve_lookup_path cache should systematically be copied (as currently in the the patch) or if it should be the calling code's responsibility to make copies before modifying the values -- if the later, then there would probably be a slight speed increase.
At this point I'd really like to get the patch reviewed. Any feedback is welcome. Thanks!
comment:17 by , 13 years ago
Why not cache and return tuples? That way there is no way the caller can modify them. If caller needs to do it, the caller can turn them into lists. However, I doubt you can show any real speed difference, as clone will take all the time in the interesting cases.
I would like to review this, but I don't know when I will have the time to do it.
comment:18 by , 13 years ago
Yeah using tuples does the trick (updated in latest patch). That removes my only concern with the patch. I'm pretty sure this refactoring would increase speed in the Python code processing. I'll try to do some benchmarking.
by , 13 years ago
Attachment: | 11670.field-lookup-collisions.6.diff added |
---|
comment:19 by , 13 years ago
Summary: | Minor limitation using fields named 'year' in related searches. → Model fields named 'year', 'month', 'gt', 'lt' etc. get mistaken for lookup types in lookups across relations |
---|
comment:20 by , 13 years ago
To make it easier to track changes, I've pushed everything to a github branch: https://github.com/jphalip/django/compare/master...lookup_refactor
The benchmarks I've run with djangobench seem to show a very slight speed increase but nothing significant. Therefore this patch is mostly about simplifying and DRY-ing up the lookup system's codebase.
comment:21 by , 13 years ago
Patch needs improvement: | set |
---|
Some notes:
- TYPO: Lookups on non-existants fields are ok, since they're ignored
- Shouldn't the cache be cleared in add_field like the other field caches?
- Is it absolutely necessary to pass in the query into resolve_lookup_path, it is only used in:
field_name not in query.query_terms and field_name not in query.aggregate_select.keys() and field_name not in query.aggregates.keys() and LOOKUP_SEP.join(parts) not in query.aggregates.keys()
It would be IMHO better to pass in two sets, one containing the first three lines unioned, and the other the query.aggregates.keys(). Even better would be to move some of this logic into query.py. One idea is just to ignore the whole FieldDoesNotExist
and let the caller work out what the result should be. So if you resolve 'foo__
bar_id', you would only resolve the foo part and let the caller figure out what to do with the bar_id part. There is also a related bug here: the cache key is the passed in path, but the result which is cached depends also on the query and allow_explicit_fk, so if first called with 'foo__
bar_id', True, we would cache the wrong result for the call 'foo__
bar_id', False.
Otherwise the patch looks good to me. This is a big cleanup to lookup path handling in my opinion.
comment:22 by , 13 years ago
@akaariai: Thanks for your review. I've updated the branch: https://github.com/jphalip/django/compare/master...lookup_refactor#files_bucket
The cache is now flushed in add_field()
. As for your comment re: not passing the query to resolve_lookup_path()
, I've tried but couldn't come up with a clean implementation. Have you got any specific ideas for it?
comment:23 by , 13 years ago
Cc: | added |
---|
I have an idea, but maybe it is not workable. So, my first idea is to get the patch in as is, and revisit this issue later on.
The maybe-unworkable idea is that the code for traveling the lookups translates to this logic (at least I think it does):
while lookup_leads_to_field: fields.append(found_field) if found_field is related_field: related_field = found_field.related_field else: goto marker1 if lookups_left: if not allow_explicit_fk: raise FieldError marker1: ...
Now, if you get a lookup like rel_field__non_existing_field
you are going to raise FieldError
. But if you instead do nonrel_field__non_existing_field
you are going to say that nonrel_field was found and non_existing_field part wasn't looked. This is because nonrel_field will "goto marker1" (via NotRelatedField exception) and skip the if lookups_left part.
So, the only idea I have is to return the rel_field in the rel_field__non_existing_field
lookup case. So, something like this:
# pseudoish... for counter, field_name in enumerate(parts): try: if field_name == 'pk': field_name = opts.pk.name field_info = opts.get_field_by_name(field_name) fields += (field_info,) if (counter + 1) < num_parts: # If we haven't reached the end of the list of # lookups yet, then let's attempt to continue # traversing relations. related_model = get_model_from_relation(field_info[0]) opts = related_model._meta except FieldDoesNotExist, NotRelatedField: break if fields: # We found at least one field, lets return that last_field = fields[-1] else: # The lookup was totally bogus - it didn't resolve any field from # the base model raise FieldError return parts, fields, last_field
The caller is then responsible to raising FieldError
if that is the appropriate action, doing the explicit_fk logic and checking for existing aggregates, etc. I haven't tried this at all, so it is possible this idea is bogus, or will make the caller's job more complicate than it needs to be.
As said before: getting the patch in as is could be the thing to do. Polish it later on when we know how custom lookups are going to be handled. I have a tendency to over-complicate things, and I have a feeling I am in over-complicating mode right now... :)
by , 13 years ago
Attachment: | 11670.field-lookup-collisions.7.diff added |
---|
comment:24 by , 13 years ago
The patch above is a slight improvement on 11670.field-lookup-collisions.2.diff, as the full lookup resolution now only occurs if the last part isn't a lookup type. That means there wouldn't be any extra computation for the most common lookups.
Since it's a bit late before 1.4 gets released to do a full refactor as discussed earlier, I suggest we start by fixing the bug using that latest patch, and then look into a proper refactor later for 1.5.
comment:25 by , 13 years ago
Patch needs improvement: | unset |
---|
If the patch is good to go, it would be nice to get this into 1.4.
In that case opening a new ticket about the larger refactor would be nice, so that the patches about the refactoring do not get lost.
I unchecked the "patch needs improvement" box, as I think that is per the triaging guide: patch author removes that box when he feels he has something ready, reviewer rechecks it, or marks as ready for checkin as needed. Unfortunately, I don't have time for review just now.
comment:26 by , 13 years ago
I'm going to commit the fix for the originally reported issue. A deeper refactoring would still be welcome but it should be tackled as part of a separate ticket. #16187 could be a good candidate.
I'm attaching a patch that fixes this problem by restricting which kinds of lookups will be attempted on related fields. I've included a regression test and verified that the rest of Django's test suite passes (with the exception of the test that isn't passing on trunk either).