Version 3 (modified by simon, 7 years ago)

Replacing get_absolute_url

Summary: get_absolute_url() is poorly defined and poorly named. It's too late to fix it for Django 1.0, but we should completely re-think it for Django 1.1.

This page is a work in progress - I'm still figuring out the extent of the problem before I start working out the solution.

The problem

It's often useful for a model to "know" its URL. This is especially true for sites that follow RESTful principles, where any entity within the site should have one and only one canonical URL.

It's also useful to keep URL logic in the same place as much as possible. Django's {% url %} template tag and reverse() function solve a slightly different problem - they resolve URLs for view functions, not for individual model objects, and treat the URLconf as the single point of truth for URLs. {% url "profile-view" user.id %} isn't as pragmatic as {{ user.get_absolute_url }}, since if we change the profile-view to take a username instead of a user ID in the URL we'll have to go back and update all of our templates.

Being able to get the URL for a model is also useful outside of the template system. Django's admin, syndication and sitemaps modules all attempt to derive a URL for a model at various points, currently using the get_absolute_url method.

The current mechanism for making models aware of their URL is the semi-standardised get_absolute_url method. If you provide this method on your model class, a number of different places in Django will use it to create URLs. You can also over-ride this using settings.ABSOLUTE_URL_OVERRIDES.
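
As a rough illustration of the convention and its override, here is a minimal sketch in plain Python. It does not import Django: the ABSOLUTE_URL_OVERRIDES dictionary below is a stand-in for the setting, and Entry, its label attribute and url_for are hypothetical names invented for this example.

```python
# Stand-in for settings.ABSOLUTE_URL_OVERRIDES: maps "app_label.model_name"
# to a callable that takes the object and returns its URL.
ABSOLUTE_URL_OVERRIDES = {
    "blog.entry": lambda obj: "/archive/%s/" % obj.slug,
}


class Entry:
    # Hypothetical model-like class; a real model would subclass
    # django.db.models.Model and get its label from the framework.
    label = "blog.entry"

    def __init__(self, slug):
        self.slug = slug

    def get_absolute_url(self):
        return "/blog/%s/" % self.slug


def url_for(obj, overrides=ABSOLUTE_URL_OVERRIDES):
    """Return the override URL if one is configured, else obj.get_absolute_url()."""
    override = overrides.get(obj.label)
    if override is not None:
        return override(obj)
    return obj.get_absolute_url()
```

With the override in place, url_for(Entry("hello")) gives the /archive/ path; with an empty overrides mapping it falls back to the method on the class.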

Unfortunately, get_absolute_url is mis-named. An "absolute" URL should be expected to include the protocol and domain, but in most cases get_absolute_url just returns the path. It was proposed to rename get_absolute_url to get_url_path, but this doesn't make sense either as some objects DO return a full URL from get_absolute_url (and in fact some places in Django check to see if the return value starts with http:// and behave differently as a result).
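
The ambiguity means callers have to inspect the return value before using it. A minimal sketch of that kind of check follows. The function name, the protocol parameter and the extra https:// test are my own assumptions for illustration, not a copy of what Django does.

```python
def ensure_full_url(url, domain, protocol="http"):
    """Prefix a bare path with protocol and domain; pass full URLs through.

    Assumption: anything starting with http:// or https:// is already a
    full URL, and anything else is a path beginning with "/".
    """
    if url.startswith(("http://", "https://")):
        return url
    return "%s://%s%s" % (protocol, domain, url)
```

Every caller that wants a full URL needs some variant of this branch, which is part of why a single method with two possible kinds of return value is awkward.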

From this, we can derive that there are actually two important URLs for a given model:

  1. The full URL, including protocol and domain. This is needed for the following cases:
    • links in e-mails, e.g. a "click here to activate your account" link
    • URLs included in syndication feeds
    • links used for things like "share this page on del.icio.us" widgets
    • links from the admin to "this object live on the site" where the admin is hosted on a separate domain or subdomain from the live site
  2. The path component of the URL. This is needed for internal links - it's a waste of bytes to jam the full URL in a regular link when a path could be used instead.

A third type of URL - URLs relative to the current page - is not being considered here because of the complexity involved in getting it right. That said, it would be possible to automatically derive a relative URL using the full path and a request-aware template tag.
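
For the curious, here is a rough sketch of how a relative URL might be derived from a target path and the path of the current request. It is only an illustration of the idea: relative_url is a hypothetical helper, it uses the standard library's posixpath module, and it ignores query strings and fragments entirely.

```python
import posixpath


def relative_url(target_path, current_path):
    """Return target_path expressed relative to the page at current_path.

    Both arguments are assumed to be absolute paths starting with "/".
    """
    # Relative URLs resolve against the "directory" of the current page.
    base_dir = posixpath.dirname(current_path) or "/"
    rel = posixpath.relpath(target_path, base_dir)
    # relpath normalises away a trailing slash, so put it back if needed.
    if target_path.endswith("/") and not rel.endswith("/"):
        rel += "/"
    return rel
```

For example, relative_url("/blog/2009/entry/", "/blog/2008/") returns "../2009/entry/".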

So, for a given model we need a reliable way of determining its path on the site AND its full URL including domain. The path can be derived from the full URL, and sometimes vice versa depending on how the site's domain relates to the model objects in question.
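
Deriving the path from a full URL is the easy direction. A minimal sketch using the standard library (path_from_full_url is a hypothetical name, not an existing Django function):

```python
from urllib.parse import urlsplit, urlunsplit


def path_from_full_url(full_url):
    """Drop the protocol and domain, keeping path, query string and fragment."""
    parts = urlsplit(full_url)
    return urlunsplit(("", "", parts.path, parts.query, parts.fragment))
```

Going the other way needs a domain from somewhere else, which is exactly the piece of information a bare path does not carry.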

Django currently uses django.contrib.sites in a number of places to attempt to derive a complete URL from just a path, but this has its own problems. The sites framework assumes the presence of a number of things: a django_site table, a SITE_ID in the settings and a record corresponding to that SITE_ID. This arrangement does not always make sense - consider the case of a site which provides a unique subdomain for every one of the site's users (simonwillison.myopenid.com for example). Additionally, making users add a record to the sites table when they start their project is Yet Another Step, and one that many people ignore. Finally, the site system doesn't really take development / staging / production environments into account. Handling these properly requires additional custom code, which often ends up working around the sites system entirely.

Finally, it's important that places that use get_absolute_url (such as the admin, sitemaps, syndication etc) always provide an over-ridable alternative. Syndication feeds may wish to include extra hit-tracking material on URLs, admin sites may wish to link to staging or production depending on other criteria etc. At the moment some but not all of these tools provide over-riding mechanisms, but without any consistency as to what they are called or how they work.
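
One consistent shape for such an over-riding mechanism would be a single method that defaults to get_absolute_url and can be replaced in a subclass. The sketch below is plain Python with invented names (Article, LinkSource, TrackedFeed, StagingAdmin and the staging.example.com domain are all hypothetical); it is not how any of the existing tools are currently written.

```python
class Article:
    """Hypothetical model-like object with the conventional method."""

    def __init__(self, slug):
        self.slug = slug

    def get_absolute_url(self):
        return "/articles/%s/" % self.slug


class LinkSource:
    """Hypothetical base for any tool that needs URLs for objects."""

    def item_link(self, item):
        # Default: defer to the model's own idea of its URL.
        return item.get_absolute_url()


class TrackedFeed(LinkSource):
    """A feed that appends a hit-tracking parameter to every link."""

    def item_link(self, item):
        url = super().item_link(item)
        separator = "&" if "?" in url else "?"
        return url + separator + "source=feed"


class StagingAdmin(LinkSource):
    """An admin that links to a staging domain instead of the live site."""

    def item_link(self, item):
        return "http://staging.example.com" + super().item_link(item)
```

Each tool gets the same hook with the same name, so anyone over-riding URLs only has to learn one pattern.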

Current uses of get_absolute_url()

By grepping the Django source code, I've identified the following places where get_absolute_url is used:

grep -r get_absolute_url django | grep -v ".svn" | grep -v '.pyc'
  • contrib/admin/options.py: Uses hasattr(obj, 'get_absolute_url') to populate 'has_absolute_url' and 'show_url' properties which are passed through to templates and used to show links to that object on the actual site.
  • contrib/auth/models.py: Defines get_absolute_url on the User class to be /users/{{ username }}/ - this may be a bug since that URL is not defined by default anywhere in Django.
  • contrib/comments/models.py: Defines get_absolute_url on the Comment and FreeComment classes, to be the get_absolute_url of the comment's content object + '#c' + the comment's ID.
  • contrib/flatpages/models.py: Defined on the FlatPage model, returns self.url (which is managed in the admin)
  • contrib/sitemaps/__init__.py: Sitemap.location(self, obj) uses obj.get_absolute_url() by default to figure out the URL to include in the sitemap - designed to be over-ridden
  • contrib/syndication/feeds.py: The default Feed.item_link(self, item) method (which is designed to be over-ridden) uses get_absolute_url, and raises an informative exception if it's not available. It also uses its own add_domain() function along with current_site.domain, which in turn uses Site.objects.get_current() and falls back on RequestSite(self.request) to figure out the full URL (both Site and RequestSite come from the django.contrib.sites package).
  • db/models/base.py: Takes get_absolute_url into account when constructing the model class - this is where the settings.ABSOLUTE_URL_OVERRIDES setting takes effect.
  • views/defaults.py: The thoroughly magic shortcut(request, content_type_id, object_id) view, which attempts to figure out a full URL to something based on a content_type and an object_id, makes extensive use of get_absolute_url - including behaving differently if the return value starts with http://.
  • views/generic/create_update.py: Both create and update views default to redirecting the user to get_absolute_url() if and only if post_save_redirect has not been configured for that view.
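
The redirect behaviour described for the generic create and update views boils down to a simple fallback. A minimal sketch, assuming nothing about the real implementation beyond the description above (redirect_target and Article are hypothetical names, and the error branch is my own guess at a sensible failure mode):

```python
class Article:
    """Hypothetical model-like object."""

    def __init__(self, slug):
        self.slug = slug

    def get_absolute_url(self):
        return "/articles/%s/" % self.slug


def redirect_target(obj, post_save_redirect=None):
    """Pick the URL to redirect to after saving obj."""
    # An explicitly configured redirect always wins.
    if post_save_redirect is not None:
        return post_save_redirect
    # Otherwise fall back to the model's own URL, if it has one.
    if hasattr(obj, "get_absolute_url"):
        return obj.get_absolute_url()
    # Assumption: with neither available there is nothing sensible to do.
    raise ValueError("No post_save_redirect given and no get_absolute_url defined.")
```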

Finally, in the documentation:

  • docs/contributing.txt - mentioned in coding standards, model ordering section
  • docs/generic_views.txt
  • docs/model-api.txt - lots of places, including "It's good practice to use get_absolute_url() in templates..."
  • docs/settings.txt - in docs for ABSOLUTE_URL_OVERRIDES
  • docs/sitemaps.txt
  • docs/sites.txt - referred to as a "convention"
  • docs/syndication_feeds.txt
  • docs/templates.txt - in an example
  • docs/unicode.txt - "Taking care in get_absolute_url..."
  • docs/url_dispatch.txt

And in the tests:

ABSOLUTE_URL_OVERRIDES is not tested.

get_absolute_url is referenced in:

  • tests/regressiontests/views/models.py
  • tests/regressiontests/views/tests/defaults.py
  • tests/regressiontests/views/tests/generic/create_update.py
  • tests/regressiontests/views/urls.py