=====================
The sitemap framework
=====================

Django comes with a high-level sitemap-generating framework that makes
creating `Google Sitemap`_ XML files easy.

.. _Google Sitemap: http://www.google.com/webmasters/sitemaps/docs/en/protocol.html

The "sitemap" framework
=======================

Overview
--------

The sitemap-generating framework is modeled largely on Django's
`syndication framework`_. You tell the framework what to include
in your sitemap by creating ``Sitemap`` classes and pointing to them
in your URLconf_.

.. _syndication framework: http://www.djangoproject.com/documentation/syndication/
.. _URLconf: http://www.djangoproject.com/documentation/url_dispatch/

Initialization
--------------

To activate sitemap generation on your Django site, add this line to your
URLconf_::

    (r'^sitemap\.xml$', 'django.contrib.sitemap.views.sitemap', {'sitemaps': sitemaps})

This will tell Django to build a sitemap when a client accesses ``/sitemap.xml``.
The name of the sitemap is not important, but the location is. Google will only
index links in your sitemap for the current URL level and below. For instance, if
``sitemap.xml`` lives in your root directory, it may reference any URL in your
site. However, if your sitemap lives at ``/content/sitemap.xml``, it may only
reference URLs under ``/content/``.
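
The location rule can be checked mechanically. The following is a minimal
sketch and not part of the framework: the helper name ``may_reference`` is
hypothetical, and it assumes both arguments are absolute URL paths on the same
host::

    import posixpath

    def may_reference(sitemap_path, url_path):
        """Return True if a sitemap served at sitemap_path may list url_path."""
        # The sitemap's scope is the directory that contains it.
        scope = posixpath.dirname(sitemap_path)
        if not scope.endswith('/'):
            scope += '/'
        return url_path.startswith(scope)

    print(may_reference('/sitemap.xml', '/blog/2006/first-post/'))  # True
    print(may_reference('/content/sitemap.xml', '/content/news/'))  # True
    print(may_reference('/content/sitemap.xml', '/about/'))         # False

A check like this could be run over your URLs before deciding where to mount
the sitemap view.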

The sitemap view takes an extra argument: ``{'sitemaps': sitemaps}``. ``sitemaps``
should be a dictionary that maps a short section label (e.g. ``blog`` or ``news``)
to its ``Sitemap`` class (e.g. ``BlogSitemap`` or ``NewsSitemap``). It may also map
to an instance of a ``Sitemap`` class (e.g. ``BlogSitemap(some_var)``).
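
Because the dictionary may hold either classes or instances, whatever consumes
it has to accept both. Here is a minimal sketch of that idea, using stand-in
classes rather than the real ``Sitemap`` base class. ``NewsSitemap``'s
``category`` argument and the ``resolve`` helper are hypothetical names chosen
for illustration, and this is not necessarily how the sitemap view does it::

    class BlogSitemap(object):
        """Stand-in for a Sitemap subclass that needs no arguments."""

    class NewsSitemap(object):
        """Stand-in for a Sitemap subclass configured per instance."""
        def __init__(self, category):
            self.category = category

    sitemaps = {
        'blog': BlogSitemap,           # a class
        'news': NewsSitemap('world'),  # an instance
    }

    def resolve(site):
        """Return an instance, instantiating site first if it is a class."""
        if isinstance(site, type):
            return site()
        return site

    # Every value is now an instance, whichever form was registered.
    resolved = dict((label, resolve(site)) for label, site in sitemaps.items())

Registering an instance is useful when the same ``Sitemap`` class serves
several sections that differ only in a constructor argument.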

Sitemap classes
---------------

A ``Sitemap`` class is a simple Python class that represents a "section" of
entries in your sitemap. In the simplest case, all these sections are combined
into a single ``sitemap.xml``. It is also possible to use the framework to
generate a sitemap index that references individual sitemap files, one per
section.

``Sitemap`` classes must subclass ``django.contrib.sitemap.Sitemap``. They can
live anywhere in your codebase.

A simple example
----------------

Let's assume you have an ``Entry`` model in your blog, and you want to include
all the individual links to your blog entries in your sitemap::

    from django.contrib.sitemap import Sitemap
    from myproject.blog.models import Entry

    class BlogSitemap(Sitemap):
        changefreq = "never"
        priority = 0.5

        def items(self):
            return Entry.objects.filter(is_draft=False)

        def lastmod(self, obj):
            return obj.pub_date

Note:

    * ``changefreq`` and ``priority`` are class attributes corresponding to the
      ``<changefreq>`` and ``<priority>`` elements, respectively. They can also
      be defined as methods, as ``lastmod`` is in the example.
    * ``items()`` is simply a method that returns a list of objects. Each object
      returned is passed to any method that corresponds to a sitemap
      property (``location``, ``lastmod``, ``changefreq``, and ``priority``).
    * ``lastmod`` should return a ``datetime`` object.
    * There is no ``location`` method. ``Sitemap`` provides a default implementation
      for you that calls ``get_absolute_url()`` on each object and returns the result.
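
Taken together, these points mean that each sitemap property can be either a
plain attribute or a method that takes the object. The sketch below shows one
way such a lookup could work. It is a simplified stand-in, not the framework's
code, and ``get_property``, ``FixedSitemap`` and ``PerItemSitemap`` are
hypothetical names::

    def get_property(sitemap, name, obj, default=None):
        """Look up a sitemap property for obj, calling it if it is a method."""
        value = getattr(sitemap, name, default)
        if callable(value):
            return value(obj)
        return value

    class FixedSitemap(object):
        priority = 0.5                # same value for every item

    class PerItemSitemap(object):
        def priority(self, obj):      # computed per item
            if obj.get('featured'):
                return 0.9
            return 0.5

    item = {'featured': True}
    print(get_property(FixedSitemap(), 'priority', item))    # 0.5
    print(get_property(PerItemSitemap(), 'priority', item))  # 0.9

Use the attribute form when one value fits the whole section, and the method
form when the value depends on the individual object.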

Shortcuts
---------

The sitemap framework provides a couple of convenience classes for common cases:

    * ``FlatpageSitemap``
    * ``GenericSitemap``

The ``FlatpageSitemap`` class looks at all flatpages_ defined for the current
``SITE_ID`` (see the sites_ documentation) and creates a sitemap entry for each
one. These entries include only the ``location`` attribute.

The ``GenericSitemap`` class works with any `generic views`_ you already have. To use
it, create an instance, passing in the same ``info_dict`` you pass to the
generic views. The only requirement is that the dict have a ``queryset`` entry.
It may also have a ``date_field`` entry that specifies a date field for objects
retrieved from the ``queryset``. This will be used for the ``lastmod`` attribute in
the generated sitemap.
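
To make the role of the two ``info_dict`` keys concrete, here is a simplified
stand-in for such a class. It is a sketch under stated assumptions rather than
the framework's implementation: ``GenericSitemapSketch`` and ``FakeEntry`` are
hypothetical names, and a plain list takes the place of a real queryset::

    import datetime

    class GenericSitemapSketch(object):
        """Simplified stand-in showing how info_dict could drive a sitemap."""

        def __init__(self, info_dict):
            self.queryset = info_dict['queryset']          # required
            self.date_field = info_dict.get('date_field')  # optional

        def items(self):
            return list(self.queryset)

        def lastmod(self, obj):
            # Without a date_field there is nothing to report.
            if self.date_field is None:
                return None
            return getattr(obj, self.date_field)

    class FakeEntry(object):
        """Hypothetical object standing in for a model instance."""
        def __init__(self, pub_date):
            self.pub_date = pub_date

    entries = [FakeEntry(datetime.datetime(2006, 8, 1, 12, 0))]
    sitemap = GenericSitemapSketch({'queryset': entries, 'date_field': 'pub_date'})
    print(sitemap.lastmod(sitemap.items()[0]))  # 2006-08-01 12:00:00
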

Here's an example of a URLconf_ using both::

    from django.conf.urls.defaults import *
    from django.contrib.sitemap import FlatpageSitemap, GenericSitemap
    from myproject.blog.models import Entry

    info_dict = {
        'queryset': Entry.objects.all(),
        'date_field': 'pub_date',
    }

    sitemaps = {
        'flatpages': FlatpageSitemap,
        'blog': GenericSitemap(info_dict),
    }

    urlpatterns = patterns('',
        # ... some generic view using info_dict
        (r'^sitemap\.xml$', 'django.contrib.sitemap.views.sitemap', {'sitemaps': sitemaps}),
    )

.. _flatpages: http://www.djangoproject.com/documentation/flatpages/
.. _sites: http://www.djangoproject.com/documentation/sites/
.. _generic views: http://www.djangoproject.com/documentation/generic_views/

Creating a sitemap index
------------------------

The sitemap framework can also create a sitemap index that references
individual sitemap files, one per section defined in your ``sitemaps``
dict. The only differences in usage are:

    * You use two views in your URLconf: ``django.contrib.sitemap.views.index``
      and ``django.contrib.sitemap.views.sitemap``.
    * The ``django.contrib.sitemap.views.sitemap`` view should take a
      ``section`` keyword argument.

Here is what the relevant URLconf lines would look like for the example above::

    (r'^sitemap\.xml$', 'django.contrib.sitemap.views.index', {'sitemaps': sitemaps}),
    (r'^sitemap-(?P<section>.+)\.xml$', 'django.contrib.sitemap.views.sitemap', {'sitemaps': sitemaps}),

This will automatically generate a ``sitemap.xml`` file that references
both ``sitemap-flatpages.xml`` and ``sitemap-blog.xml``. The ``Sitemap``
classes and the ``sitemaps`` dict don't change at all.
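
To see how the ``section`` argument is extracted, the per-section pattern can
be tried directly with Python's ``re`` module. This sketch assumes that the
leading slash has already been removed from the requested path, and it escapes
the dot before ``xml`` so that it matches only a literal period. The
``section_for`` helper is a hypothetical name::

    import re

    section_pattern = re.compile(r'^sitemap-(?P<section>.+)\.xml$')

    def section_for(path):
        """Return the captured section name, or None if path does not match."""
        match = section_pattern.match(path)
        if match is None:
            return None
        return match.group('section')

    print(section_for('sitemap-blog.xml'))       # blog
    print(section_for('sitemap-flatpages.xml'))  # flatpages
    print(section_for('sitemap.xml'))            # None

The captured name is what gets looked up in the ``sitemaps`` dict, so the
labels you choose there become part of the section URLs.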

Pinging Google
--------------

For sites with dynamic content, it may be desirable to "ping" Google when
your sitemap changes, so that it knows to re-index your site. The framework
provides a function to do just that: ``ping_google()``. It automatically
determines your sitemap URL and sends a ping to Google.

One useful way to call ``ping_google()`` is from a model's ``save()`` method::

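    from django.db import models
    from django.contrib.sitemap import ping_google

    class Entry(models.Model):
        # ... field definitions ...

        def save(self):
            # Save first so the new content exists before Google is notified.
            super(Entry, self).save()
            ping_google()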
