#19401 closed Bug (fixed)
'swapped' checking does not account for case insensitivity
Reported by: | Owned by: | nobody | |
---|---|---|---|
Component: | Database layer (models, ORM) | Version: | 1.5-beta-1 |
Severity: | Release blocker | Keywords: | |
Cc: | Triage Stage: | Accepted | |
Has patch: | yes | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description
According to the code at django/db/models/options.py:230 the test for "has this class been swapped?" is essentially: "if there's a swapped meta option, and the option is not None, and the option isn't simply the name of the current model, in "appname.model" format"
However, the test doesn't take into account that the model part of appname.model is case insensitive.
While we could simply normalise app_name and that coming back from the setting, I think a better test would be:
"if there's a swapped meta option, and the option is not none, and the "get_model()" using that option is not the same as the current class instance." Ie, rather than assume model names are case insensitive, just use the get_model and ensure it doesn't return us.
Change History (19)
comment:1 by , 12 years ago
Resolution: | → invalid |
---|---|
Status: | new → closed |
comment:2 by , 12 years ago
Resolution: | invalid |
---|---|
Status: | closed → reopened |
Actually, this was something I only recently discovered myself, in that while app names are case sensitive, model names are not. Check out these lines from django/db/models/loading.py
def get_model(self, app_label, model_name, seed_cache=True, only_installed=True): """ Returns the model matching the given app_label and case-insensitive model_name. Returns None if no model is found. """ if seed_cache: self._populate() if only_installed and app_label not in self.app_labels: return None return self.app_models.get(app_label, SortedDict()).get(model_name.lower())
So, while this is unlikely going to cause many problems for just AUTH_USER_MODEL, if the usage of swappable models grows, the rules for case-sensitive/insensitive inconsistency are going to progressively be more of a problem.
comment:3 by , 12 years ago
Resolution: | → needsinfo |
---|---|
Status: | reopened → closed |
Because the swap check involves the app label, I don't see how you are ever going to have a swappable collision between myapp.Mymodel and myapp.MyMODEL because you will have plenty of other problems with having models in the same app be the name name like that.
Do you have a concrete failure example, or is this just supposition from looking at the code?
comment:4 by , 12 years ago
Resolution: | needsinfo |
---|---|
Status: | closed → reopened |
Sorry sorry sorry. I understand that this appears to be a trivial issue, perhaps even leaning towards "argument over programming preference", but please hear me out.
There are two issues with this current implementation. 1. inconsistent behaviour with the rest of django. 2. DRY not being maintained
- Consistency
The code as stands will operate correctly if AUTH_USER_MODEL='auth.User' ... the swapping logic will detect that the model is being directed to be swapped out with itself, and will act like a no-op. No worries!
However, if you set AUTH_USER_MODEL='auth.user' very strange things happen. During syncdb, it sets up the auth tables as expected, but does not prompt to create a superuser. The "createsuperuser" is available in manager, but errors-out if run, and a whole bunch of other stuff comes back with AttributeError: Manager isn't available; User has been swapped for 'auth.user'
So, a quick response to setting 'auth.user' is, of course, "dont do that".
However, django developers have already been trained to use lowercase names when using the 'appname.modelname' nomenclature This is not just taking it from the code snippet in a previous comment above, but in things like this documentation: https://docs.djangoproject.com/en/1.4/ref/contrib/contenttypes/#methods-on-contenttype-instances
Notice that they use "site" and "user" rather than "Site" and "User"
So, we're asking developers to "not care about" case in one circumstance, to "very much caring about case" in another. Not consistent, and the odd behaviour above will lead to (admittedly rare, at least initially) problems.
And, I'm not even including the case where there's weird interactions between swapped models and content types.
Note, I'm just using Content type as one example, there are others.. plenty of places where "django.db.models.loading.get_model" is called from developer input. Like it or not, at this point Django needs to treat model names as case insensitive when being referred to from strings.
- DRY
In 1.4, the only place where strings are turned into model names is in django.db.models.loading.get_model().
With the introduction of 1.5 model swapping, we've put that into another location: models.db.options.Options .. which imho is not tightly-bound enough to the above.
At the VERY least, I propose we ensure such translations stay in the same location. However, that would just highlight that object-comparions and string-comparisons are not orthogonal.
So, a very easy fix to both of these is to replace the string based comparison with a object based. Ie turn this:
if self.swappable: model_label = '%s.%s' % (self.app_label, self.object_name) swapped_for = getattr(settings, self.swappable, None) if swapped_for not in (None, model_label): return swapped_for
into this:
if self.swappable: swapped_for = getattr(settings, self.swappable, None) if swapped_for is None: return None swapped_for_model = get_model(*swapped_for.split('.',1)) this_model = get_model(self.app_label,self.module_name) if swapped_for_model is not this_model: return swapped_for
Now, this appears a little "messy" because I cannot see any attribute on the Options object that has a reference to the model. Ie, it always needs to go back through get_model to get that. (that a cycle prevention technique?). However since we're calling get_model (which is the model cache), this has "cacheability" written all over it.
Also note that I've used module_name rather than object_name... module_name is always lowercased. (Not relevant here, since get_model lowercases the model names anyway)
Tested the above code, and it works for both variations of AUTH_USER_MODEL: syncdb does what it's supposed to, and the admin screens are behaving themselves. I'm going to run the regression test on it in a moment, and perhaps even do that "git pull request" thingy.
(note, I'm not really happy about the get_model(*somestr.split('.',1))
pattern... but that's used _everywhere_ all over Django right now. Very non-DRY)
Summary:
I think this, or something to the same effect is a worthwhile change as it increases consistency and DRY. However seeing that _swapped gets called frequently, this looks like something that should really be part of the model cache or a "just in time" setting on the Options object itself. (ie, not set till its called, then subsequent queries return from a cache). I would not object to this not being included in 1.5 but if swapped models gain traction, we should look at improving the design there.
comment:5 by , 12 years ago
Ran regression tests and it failed... in a very strange way. Got
Traceback (most recent call last): File "./runtests.py", line 330, in <module> options.failfast, args) File "./runtests.py", line 156, in django_tests state = setup(verbosity, test_labels) File "./runtests.py", line 135, in setup mod = load_app(module_label) File "/Users/chris/dev/dj15/Django-1.5b2/django/db/models/loading.py", line 96, in load_app models = import_module('.models', app_name) File "/Users/chris/dev/dj15/Django-1.5b2/django/utils/importlib.py", line 35, in import_module __import__(name) File "/Users/chris/dev/dj15/Django-1.5b2/tests/modeltests/fixtures/models.py", line 12, in <module> from django.contrib.contenttypes import generic File "/Users/chris/dev/dj15/Django-1.5b2/django/contrib/contenttypes/generic.py", line 17, in <module> from django.contrib.admin.options import InlineModelAdmin, flatten_fieldsets File "/Users/chris/dev/dj15/Django-1.5b2/django/contrib/admin/__init__.py", line 6, in <module> from django.contrib.admin.sites import AdminSite, site File "/Users/chris/dev/dj15/Django-1.5b2/django/contrib/admin/sites.py", line 3, in <module> from django.contrib.admin import ModelAdmin, actions ImportError: cannot import name actions
when it tries to import modeltests.fixtures.models as part of the run_tests loop. This _only_ happens if I call get_model inside the _swapped method, even if I just call it with fixed params and trap every exception, and the call has no bearing on the execution of the function. Those last few entries in the stack trace have come up before in other issues, too. Viz: https://groups.google.com/forum/?fromgroups=#!topic/south-users/Gykm9cyitpo
Very odd... this did not happen just on the first call to get_model either... it was called 7 times in _swapped(), and then _after_ 20 or so test modules were loaded, the fixtures module failed.
comment:6 by , 12 years ago
Severity: | Normal → Release blocker |
---|---|
Triage Stage: | Unreviewed → Accepted |
OK - so setting AUTH_USER_MODEL='auth.user' is something that should be handled better - auth.User should not think it is swapped with 'auth.user'
The appname.model is established as a convention for Django - swapped models furthers that (for better or worse), and has the flaw in being easily confused with a python path.
Because get_model has always used lowercase, it seems that we should just say that the Django specific "appname.modelname" convention is case insensitive.
Which could make the solution as simple as:
https://github.com/django/django/blob/7eba5fbc027face8dd33475148d442655a92cecd/django/db/models/options.py#L220
if swapped_for.lower() not in (None, model_label.lower()):
supporting case sensitive module names is against PEP-8, and unless you get tricky about where modules are on your path, not even possible for systems that don't have case-sensitive FS.
comment:7 by , 12 years ago
Thanks for responding and accepting. Might not seem significant, but I've had too many recent bad experiences with getting R&D to see the issues (not Django related at all), and the acknowledgement of the contribution is highly appreciated.
I've been doing more tests, and I cant get adding get_model to work... too many circular module dependencies going on. I think that's a non-starter for now.
Your proposed solution may not be universal enough either; while model names are treated case-insensitive, app_labels (by default derived from the module names), _are_ case sensitive: there is nothing in the model loading or module loading mechanism that case-normalises app names... they're imported with the case specified, and stored in the app cache with the case specified.
So, my suggested change becomes
try: swapped_for = getattr(settings, self.swappable) except AttributeError: return None swapped_for_app, swapped_for_model = swapped_for.split('.') if swapped_for_app == self.app_label and swapped_for_model.lower() == self.module_name: return swapped_for
- I have committed a cardinal sin and not tested this yet. (At work!)
- I'm using self.module_name, which is precalculated as self.object_name.lower().
- Does not handle cases where the app_label is blank (do they exist?)
comment:8 by , 12 years ago
Update: What you have _will_ work _if_ you assume that PEP-8 is being followed. While I can't come up with a real-world example, though, I would still suggest my more-awkward-looking change as it avoids breaking applications that "happen to be working in violation of PEP-8". If we're going to enforce "no case-only distinctions", we should do it in a much earlier place, such as the app loader, and with a clear error message.
comment:9 by , 12 years ago
so yeah - case sensitive app_labels have been around too long to change now probably - I'd hope it would be rare that someone would have two apps, one Foo and the other foo, or bar.foo - but I'm sure it is out there.
We could still just lowercase the object_name for the swapped check
comment:10 by , 12 years ago
Yeah... I got the logic of that test wrong. Hooray for testing! Here's a working, and passing the unit tests, version:
def _swapped(self): """ Has this model been swapped out for another? If so, return the model name of the replacement; otherwise, return None. """ if self.swappable: try: swapped_for = getattr(settings, self.swappable) except AttributeError: return None try: swapped_for_app, swapped_for_model = swapped_for.split('.') except ValueError: # This is not ideal, but we want the model validation to catch the error return swapped_for if swapped_for_app != self.app_label or swapped_for_model.lower() != self.module_name: return swapped_for return None swapped = property(_swapped)
- Returning swapped_for even if we know it is wrong was done because the unit tests expect the error to come from the model validation code, rather than here.
- That this test code is run each time anything needs to determine if the model has been swapped feels really ugly. Is it too late in the release cycle to refactor this? Perhaps set up this very test inside Options.contribute_to_class() and have is_swapped, swapped_app_label and swapped_object_name as attributes on Object. The app_label.model_name format test can then be removed from the model validation (and we'll need to adjust the invalid_models.badswappablevalue test. I'll happily work on this if its worth it. Github fork appropriate? We can use "swapped" instead of "is_swapped" to be compatible with all the code that simply uses swapped for swappededness.
follow-up: 13 comment:11 by , 12 years ago
I've refactored the swapped check so that its checked once when the options are applied to the class, and all tests for swapped-ed-ness are done against the "is_swapped", "swapped_app_label" and "swapped_object_name" attributes.
These changes cause three regression tests to fail, but these appear to be "non issues", or perhaps even fixing a limitation of the older method. All three tests were looking for a failure condition that appears to be perfectly fine code. There's insufficient documentation to know if the test should have failed by design, or was failing for a known or unknown reasons.
I've put the changes into a github fork and sent a pull request: https://github.com/django/django/pull/588
That appropriate?
I'd like someone with more knowledge about those tests to determine if they really are "tests that used to fail, but now work as it should be."
comment:12 by , 12 years ago
Has patch: | set |
---|
comment:13 by , 12 years ago
Replying to chriscog:
I've put the changes into a github fork and sent a pull request: https://github.com/django/django/pull/588
That appropriate?
So using a pull request is completely appropriate - but I have to let you know that changing the value of the flag from swapped
to is_swapped
was completely unnecessary and will cause many reviewers give your patch less consideration. Likewise the change from property to attribute is not related to this ticket. These are design stylistic choices one gets to make when writing the original feature, not to rehash while working on a bugfix patch. I hope you take that as constructive feedback - your efforts to give this issue some significant thought are appreciated.
The main issue at hand is to remove a confusing result with a new feature when, what is likely to be a relatively common honest mistake, is made.
That means raising a clearer error in cases where the setting is incorrect - or allowing a looser (case insensitive object name) test of the swapped model.
My own choice would be to allow for the looser test, but I'm going to defer to Russ on the determination of the severity of this issue, and its best resolution.
I have worked up a more lightweight fix using the case insensitive approach (with a test) here:
comment:14 by , 12 years ago
Your concerns are understood. Soon after I made those changes, I realised that a far simpler one would be to keep just the one attribute, "swapped".
However, I'm adamant that turning swapped from a function into an attribute that is calculated _once_ during Options.contribute_to_class() is the better (and much more efficient) way of handling it. I'll create another pull request with that variation and I'll take my chances :)
comment:17 by , 12 years ago
Resolution: | → fixed |
---|---|
Status: | reopened → closed |
comment:19 by , 12 years ago
@chriscog Thanks for your work (and pull request); however, I opted to go for a simpler approach -- your patch didn't contain tests, but it did break a number of existing tests. From a quick analysis, this was because doing the swapped check at the time of class instantiation messes with the test suite, where swapped models are changed.
Ultimately, the swapped property isn't an expensive check, so it shouldn't be a huge performance difference.
Thanks again for your report and persistence :-)
I must be missing something here, because there's nothing case insensitive about the model name in AUTH_USER_MODEL. If your user class is called MyUser in an app call myapp, the value of the AUTH_USER_MODEL should be myapp.MyUser - myapp.myuser is an error, and should be reported as such, and as far as I can make out, it is.