Code

Opened 17 months ago

Closed 16 months ago

Last modified 16 months ago

#19401 closed Bug (fixed)

'swapped' checking does not account for case insensitivity

Reported by: chris@… Owned by: nobody
Component: Database layer (models, ORM) Version: 1.5-beta-1
Severity: Release blocker Keywords:
Cc: Triage Stage: Accepted
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

According to the code at django/db/models/options.py:230 the test for "has this class been swapped?" is essentially: "if there's a swapped meta option, and the option is not None, and the option isn't simply the name of the current model, in "appname.model" format"

However, the test doesn't take into account that the model part of appname.model is case insensitive.

While we could simply normalise app_name and that coming back from the setting, I think a better test would be:

"if there's a swapped meta option, and the option is not none, and the "get_model()" using that option is not the same as the current class instance." Ie, rather than assume model names are case insensitive, just use the get_model and ensure it doesn't return us.

Attachments (0)

Change History (19)

comment:1 Changed 17 months ago by russellm

  • Needs documentation unset
  • Needs tests unset
  • Patch needs improvement unset
  • Resolution set to invalid
  • Status changed from new to closed

I must be missing something here, because there's nothing case insensitive about the model name in AUTH_USER_MODEL. If your user class is called MyUser in an app call myapp, the value of the AUTH_USER_MODEL should be myapp.MyUser - myapp.myuser is an error, and should be reported as such, and as far as I can make out, it is.

comment:2 Changed 17 months ago by chriscog

  • Resolution invalid deleted
  • Status changed from closed to reopened

Actually, this was something I only recently discovered myself, in that while app names are case sensitive, model names are not. Check out these lines from django/db/models/loading.py

    def get_model(self, app_label, model_name,
                  seed_cache=True, only_installed=True):
        """
        Returns the model matching the given app_label and case-insensitive
        model_name.

        Returns None if no model is found.
        """
        if seed_cache:
            self._populate()
        if only_installed and app_label not in self.app_labels:
            return None
        return self.app_models.get(app_label, SortedDict()).get(model_name.lower())

So, while this is unlikely going to cause many problems for just AUTH_USER_MODEL, if the usage of swappable models grows, the rules for case-sensitive/insensitive inconsistency are going to progressively be more of a problem.

Last edited 17 months ago by chriscog (previous) (diff)

comment:3 Changed 17 months ago by ptone

  • Resolution set to needsinfo
  • Status changed from reopened to closed

Because the swap check involves the app label, I don't see how you are ever going to have a swappable collision between myapp.Mymodel and myapp.MyMODEL because you will have plenty of other problems with having models in the same app be the name name like that.

Do you have a concrete failure example, or is this just supposition from looking at the code?

comment:4 Changed 16 months ago by chriscog

  • Resolution needsinfo deleted
  • Status changed from closed to reopened

Sorry sorry sorry. I understand that this appears to be a trivial issue, perhaps even leaning towards "argument over programming preference", but please hear me out.

There are two issues with this current implementation. 1. inconsistent behaviour with the rest of django. 2. DRY not being maintained

  1. Consistency

The code as stands will operate correctly if AUTH_USER_MODEL='auth.User' ... the swapping logic will detect that the model is being directed to be swapped out with itself, and will act like a no-op. No worries!

However, if you set AUTH_USER_MODEL='auth.user' very strange things happen. During syncdb, it sets up the auth tables as expected, but does not prompt to create a superuser. The "createsuperuser" is available in manager, but errors-out if run, and a whole bunch of other stuff comes back with AttributeError: Manager isn't available; User has been swapped for 'auth.user'

So, a quick response to setting 'auth.user' is, of course, "dont do that".

However, django developers have already been trained to use lowercase names when using the 'appname.modelname' nomenclature This is not just taking it from the code snippet in a previous comment above, but in things like this documentation: https://docs.djangoproject.com/en/1.4/ref/contrib/contenttypes/#methods-on-contenttype-instances

Notice that they use "site" and "user" rather than "Site" and "User"

So, we're asking developers to "not care about" case in one circumstance, to "very much caring about case" in another. Not consistent, and the odd behaviour above will lead to (admittedly rare, at least initially) problems.

And, I'm not even including the case where there's weird interactions between swapped models and content types.

Note, I'm just using Content type as one example, there are others.. plenty of places where "django.db.models.loading.get_model" is called from developer input. Like it or not, at this point Django needs to treat model names as case insensitive when being referred to from strings.

  1. DRY

In 1.4, the only place where strings are turned into model names is in django.db.models.loading.get_model().

With the introduction of 1.5 model swapping, we've put that into another location: models.db.options.Options .. which imho is not tightly-bound enough to the above.

At the VERY least, I propose we ensure such translations stay in the same location. However, that would just highlight that object-comparions and string-comparisons are not orthogonal.

So, a very easy fix to both of these is to replace the string based comparison with a object based. Ie turn this:

        if self.swappable:
            model_label = '%s.%s' % (self.app_label, self.object_name)
            swapped_for = getattr(settings, self.swappable, None)
            if swapped_for not in (None, model_label):
                return swapped_for

into this:

        if self.swappable:
            swapped_for = getattr(settings, self.swappable, None)
            if swapped_for is None:
                return None
            swapped_for_model = get_model(*swapped_for.split('.',1))
            this_model = get_model(self.app_label,self.module_name)
            if swapped_for_model is not this_model:
                return swapped_for

Now, this appears a little "messy" because I cannot see any attribute on the Options object that has a reference to the model. Ie, it always needs to go back through get_model to get that. (that a cycle prevention technique?). However since we're calling get_model (which is the model cache), this has "cacheability" written all over it.

Also note that I've used module_name rather than object_name... module_name is always lowercased. (Not relevant here, since get_model lowercases the model names anyway)

Tested the above code, and it works for both variations of AUTH_USER_MODEL: syncdb does what it's supposed to, and the admin screens are behaving themselves. I'm going to run the regression test on it in a moment, and perhaps even do that "git pull request" thingy.

(note, I'm not really happy about the get_model(*somestr.split('.',1)) pattern... but that's used _everywhere_ all over Django right now. Very non-DRY)

Summary:

I think this, or something to the same effect is a worthwhile change as it increases consistency and DRY. However seeing that _swapped gets called frequently, this looks like something that should really be part of the model cache or a "just in time" setting on the Options object itself. (ie, not set till its called, then subsequent queries return from a cache). I would not object to this not being included in 1.5 but if swapped models gain traction, we should look at improving the design there.

comment:5 Changed 16 months ago by anonymous

Ran regression tests and it failed... in a very strange way. Got

Traceback (most recent call last):
  File "./runtests.py", line 330, in <module>
    options.failfast, args)
  File "./runtests.py", line 156, in django_tests
    state = setup(verbosity, test_labels)
  File "./runtests.py", line 135, in setup
    mod = load_app(module_label)
  File "/Users/chris/dev/dj15/Django-1.5b2/django/db/models/loading.py", line 96, in load_app
    models = import_module('.models', app_name)
  File "/Users/chris/dev/dj15/Django-1.5b2/django/utils/importlib.py", line 35, in import_module
    __import__(name)
  File "/Users/chris/dev/dj15/Django-1.5b2/tests/modeltests/fixtures/models.py", line 12, in <module>
    from django.contrib.contenttypes import generic
  File "/Users/chris/dev/dj15/Django-1.5b2/django/contrib/contenttypes/generic.py", line 17, in <module>
    from django.contrib.admin.options import InlineModelAdmin, flatten_fieldsets
  File "/Users/chris/dev/dj15/Django-1.5b2/django/contrib/admin/__init__.py", line 6, in <module>
    from django.contrib.admin.sites import AdminSite, site
  File "/Users/chris/dev/dj15/Django-1.5b2/django/contrib/admin/sites.py", line 3, in <module>
    from django.contrib.admin import ModelAdmin, actions
ImportError: cannot import name actions

when it tries to import modeltests.fixtures.models as part of the run_tests loop. This _only_ happens if I call get_model inside the _swapped method, even if I just call it with fixed params and trap every exception, and the call has no bearing on the execution of the function. Those last few entries in the stack trace have come up before in other issues, too. Viz: https://groups.google.com/forum/?fromgroups=#!topic/south-users/Gykm9cyitpo

Very odd... this did not happen just on the first call to get_model either... it was called 7 times in _swapped(), and then _after_ 20 or so test modules were loaded, the fixtures module failed.

comment:6 Changed 16 months ago by ptone

  • Severity changed from Normal to Release blocker
  • Triage Stage changed from Unreviewed to Accepted

OK - so setting AUTH_USER_MODEL='auth.user' is something that should be handled better - auth.User should not think it is swapped with 'auth.user'

The appname.model is established as a convention for Django - swapped models furthers that (for better or worse), and has the flaw in being easily confused with a python path.

Because get_model has always used lowercase, it seems that we should just say that the Django specific "appname.modelname" convention is case insensitive.

Which could make the solution as simple as:

https://github.com/django/django/blob/7eba5fbc027face8dd33475148d442655a92cecd/django/db/models/options.py#L220
if swapped_for.lower() not in (None, model_label.lower()):

supporting case sensitive module names is against PEP-8, and unless you get tricky about where modules are on your path, not even possible for systems that don't have case-sensitive FS.

comment:7 Changed 16 months ago by chriscog

Thanks for responding and accepting. Might not seem significant, but I've had too many recent bad experiences with getting R&D to see the issues (not Django related at all), and the acknowledgement of the contribution is highly appreciated.

I've been doing more tests, and I cant get adding get_model to work... too many circular module dependencies going on. I think that's a non-starter for now.

Your proposed solution may not be universal enough either; while model names are treated case-insensitive, app_labels (by default derived from the module names), _are_ case sensitive: there is nothing in the model loading or module loading mechanism that case-normalises app names... they're imported with the case specified, and stored in the app cache with the case specified.

So, my suggested change becomes

            try:
                swapped_for = getattr(settings, self.swappable)
            except AttributeError:
                return None
            swapped_for_app, swapped_for_model = swapped_for.split('.')
            if swapped_for_app == self.app_label and swapped_for_model.lower() == self.module_name:
                return swapped_for
  • I have committed a cardinal sin and not tested this yet. (At work!)
  • I'm using self.module_name, which is precalculated as self.object_name.lower().
  • Does not handle cases where the app_label is blank (do they exist?)

comment:8 Changed 16 months ago by chriscog

Update: What you have _will_ work _if_ you assume that PEP-8 is being followed. While I can't come up with a real-world example, though, I would still suggest my more-awkward-looking change as it avoids breaking applications that "happen to be working in violation of PEP-8". If we're going to enforce "no case-only distinctions", we should do it in a much earlier place, such as the app loader, and with a clear error message.

comment:9 Changed 16 months ago by ptone

so yeah - case sensitive app_labels have been around too long to change now probably - I'd hope it would be rare that someone would have two apps, one Foo and the other foo, or bar.foo - but I'm sure it is out there.

We could still just lowercase the object_name for the swapped check

comment:10 Changed 16 months ago by chriscog

Yeah... I got the logic of that test wrong. Hooray for testing! Here's a working, and passing the unit tests, version:

    def _swapped(self):
        """
        Has this model been swapped out for another? If so, return the model
        name of the replacement; otherwise, return None.
        """
        if self.swappable:
            try:
                swapped_for = getattr(settings, self.swappable)
            except AttributeError:
                return None
            try:
                swapped_for_app, swapped_for_model = swapped_for.split('.')
            except ValueError:
                # This is not ideal, but we want the model validation to catch the error
                return swapped_for
            if swapped_for_app != self.app_label or swapped_for_model.lower() != self.module_name:
                return swapped_for
        return None
    swapped = property(_swapped)
  • Returning swapped_for even if we know it is wrong was done because the unit tests expect the error to come from the model validation code, rather than here.
  • That this test code is run each time anything needs to determine if the model has been swapped feels really ugly. Is it too late in the release cycle to refactor this? Perhaps set up this very test inside Options.contribute_to_class() and have is_swapped, swapped_app_label and swapped_object_name as attributes on Object. The app_label.model_name format test can then be removed from the model validation (and we'll need to adjust the invalid_models.badswappablevalue test. I'll happily work on this if its worth it. Github fork appropriate? We can use "swapped" instead of "is_swapped" to be compatible with all the code that simply uses swapped for swappededness.

comment:11 follow-up: Changed 16 months ago by chriscog

I've refactored the swapped check so that its checked once when the options are applied to the class, and all tests for swapped-ed-ness are done against the "is_swapped", "swapped_app_label" and "swapped_object_name" attributes.

These changes cause three regression tests to fail, but these appear to be "non issues", or perhaps even fixing a limitation of the older method. All three tests were looking for a failure condition that appears to be perfectly fine code. There's insufficient documentation to know if the test should have failed by design, or was failing for a known or unknown reasons.

I've put the changes into a github fork and sent a pull request: https://github.com/django/django/pull/588

That appropriate?

I'd like someone with more knowledge about those tests to determine if they really are "tests that used to fail, but now work as it should be."

comment:12 Changed 16 months ago by chriscog

  • Has patch set

comment:13 in reply to: ↑ 11 Changed 16 months ago by ptone

Replying to chriscog:

I've put the changes into a github fork and sent a pull request: https://github.com/django/django/pull/588

That appropriate?

So using a pull request is completely appropriate - but I have to let you know that changing the value of the flag from swapped to is_swapped was completely unnecessary and will cause many reviewers give your patch less consideration. Likewise the change from property to attribute is not related to this ticket. These are design stylistic choices one gets to make when writing the original feature, not to rehash while working on a bugfix patch. I hope you take that as constructive feedback - your efforts to give this issue some significant thought are appreciated.

The main issue at hand is to remove a confusing result with a new feature when, what is likely to be a relatively common honest mistake, is made.

That means raising a clearer error in cases where the setting is incorrect - or allowing a looser (case insensitive object name) test of the swapped model.

My own choice would be to allow for the looser test, but I'm going to defer to Russ on the determination of the severity of this issue, and its best resolution.

I have worked up a more lightweight fix using the case insensitive approach (with a test) here:

https://github.com/ptone/django/compare/swapped-case

comment:14 Changed 16 months ago by chriscog

Your concerns are understood. Soon after I made those changes, I realised that a far simpler one would be to keep just the one attribute, "swapped".

However, I'm adamant that turning swapped from a function into an attribute that is calculated _once_ during Options.contribute_to_class() is the better (and much more efficient) way of handling it. I'll create another pull request with that variation and I'll take my chances :)

comment:16 Changed 16 months ago by charettes

As suggested on github we could just use a cached_property instead.

comment:17 Changed 16 months ago by Russell Keith-Magee <russell@…>

  • Resolution set to fixed
  • Status changed from reopened to closed

In c04c03daa3620dc5106740a976738ada35203ab5:

Fixed #19401 -- Ensure that swappable model references are case insensitive.

This is necessary because get_model() checks are case insensitive, and if the swapable check isn't, the
swappable logic gets tied up in knots with models that are partially swapped out.

Thanks to chris@… for the report and extensive analysis, and Preston for his work on the draft patch.

comment:18 Changed 16 months ago by Russell Keith-Magee <russell@…>

In b7607003a543b11c62ad4d2ca1f8e12bd4b97b89:

[1.5.x] Fixed #19401 -- Ensure that swappable model references are case insensitive.

This is necessary because get_model() checks are case insensitive, and if the swapable check isn't, the
swappable logic gets tied up in knots with models that are partially swapped out.

Thanks to chris@… for the report and extensive analysis, and Preston for his work on the draft patch.

Backport of c04c03d from trunk.

comment:19 Changed 16 months ago by russellm

@chriscog Thanks for your work (and pull request); however, I opted to go for a simpler approach -- your patch didn't contain tests, but it did break a number of existing tests. From a quick analysis, this was because doing the swapped check at the time of class instantiation messes with the test suite, where swapped models are changed.

Ultimately, the swapped property isn't an expensive check, so it shouldn't be a huge performance difference.

Thanks again for your report and persistence :-)

Add Comment

Modify Ticket

Change Properties
<Author field>
Action
as closed
as The resolution will be set. Next status will be 'closed'
The resolution will be deleted. Next status will be 'new'
Author


E-mail address and user name can be saved in the Preferences.

 
Note: See TracTickets for help on using tickets.