root/django/branches/sqlalchemy/docs/sitemaps.txt

Revision 4455, 11.2 kB (checked in by rmunn, 2 years ago)

Merged revisions 4186 to 4454 from trunk.

=====================
The sitemap framework
=====================

**New in Django development version**.

Django comes with a high-level sitemap-generating framework that makes
creating sitemap_ XML files easy.

.. _sitemap: http://www.sitemaps.org/

Overview
========

A sitemap is an XML file on your Web site that tells search-engine indexers how
frequently your pages change and how "important" certain pages are in relation
to other pages on your site. This information helps search engines index your
site.

The Django sitemap framework automates the creation of this XML file by letting
you express this information in Python code.
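
For orientation, the XML file in question looks roughly like the output of the
sketch below. This is a standard-library illustration only, not the framework's
own code: the ``build_sitemap`` helper and the ``example.com`` URLs are made up
for the example, and the namespace is the one published at sitemaps.org.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(entries):
    """Return sitemap XML (as a str) for a list of entry dicts.

    Each entry needs a 'loc' key; 'lastmod', 'changefreq' and
    'priority' are optional, mirroring the sitemap protocol.
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for entry in entries:
        url = ET.SubElement(urlset, "url")
        for tag in ("loc", "lastmod", "changefreq", "priority"):
            if tag in entry:
                ET.SubElement(url, tag).text = str(entry[tag])
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical entries, purely for illustration.
sitemap_xml = build_sitemap([
    {"loc": "http://example.com/blog/1/", "lastmod": "2007-01-15",
     "changefreq": "never", "priority": 0.5},
    {"loc": "http://example.com/about/"},
])
print(sitemap_xml)
```

Only ``loc`` is mandatory for each URL; the other three elements are the hints
about freshness and importance described above.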

The sitemap framework works much like Django's `syndication framework`_. To
create a sitemap, just write a ``Sitemap`` class and point to it in your
URLconf_.

.. _syndication framework: ../syndication/
.. _URLconf: ../url_dispatch/

Installation
============

To install the sitemap app, follow these steps:

    1. Add ``'django.contrib.sitemaps'`` to your INSTALLED_APPS_ setting.
    2. Make sure ``'django.template.loaders.app_directories.load_template_source'``
       is in your TEMPLATE_LOADERS_ setting. It's in there by default, so
       you'll only need to change this if you've changed that setting.
    3. Make sure you've installed the `sites framework`_.

(Note: The sitemap application doesn't install any database tables. The only
reason it needs to go into ``INSTALLED_APPS`` is so that the
``load_template_source`` template loader can find the default templates.)

.. _INSTALLED_APPS: ../settings/#installed-apps
.. _TEMPLATE_LOADERS: ../settings/#template-loaders
.. _sites framework: ../sites/

Initialization
==============

To activate sitemap generation on your Django site, add this line to your
URLconf_::

    (r'^sitemap\.xml$', 'django.contrib.sitemaps.views.sitemap', {'sitemaps': sitemaps})

This tells Django to build a sitemap when a client accesses ``/sitemap.xml``.

The name of the sitemap file is not important, but the location is. Search
engines will only index links in your sitemap for the current URL level and
below. For instance, if ``sitemap.xml`` lives in your root directory, it may
reference any URL in your site. However, if your sitemap lives at
``/content/sitemap.xml``, it may only reference URLs that begin with
``/content/``.
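
The location rule can be expressed as a small check. The helper below is a
hypothetical illustration (``in_sitemap_scope`` is not part of Django); it
assumes both arguments are URL paths without protocol or domain.

```python
import posixpath


def in_sitemap_scope(sitemap_path, url_path):
    """Return True if url_path is at or below the sitemap's directory."""
    base = posixpath.dirname(sitemap_path)
    # Normalize so that "/content" does not also match "/contents/...".
    if not base.endswith("/"):
        base += "/"
    return url_path.startswith(base)


# A root-level sitemap may reference any URL on the site.
print(in_sitemap_scope("/sitemap.xml", "/blog/1/"))
# A sitemap under /content/ may only reference URLs under /content/.
print(in_sitemap_scope("/content/sitemap.xml", "/content/a/"))
print(in_sitemap_scope("/content/sitemap.xml", "/blog/1/"))
```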

The sitemap view takes an extra, required argument: ``{'sitemaps': sitemaps}``.
``sitemaps`` should be a dictionary that maps a short section label (e.g.,
``blog`` or ``news``) to its ``Sitemap`` class (e.g., ``BlogSitemap`` or
``NewsSitemap``). It may also map to an *instance* of a ``Sitemap`` class
(e.g., ``BlogSitemap(some_var)``).
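
To make the "class or instance" distinction concrete, here is one way a
consumer of such a dictionary could treat both forms uniformly. The classes are
plain stand-ins rather than real ``Sitemap`` subclasses, and ``resolve`` is a
hypothetical helper; the built-in view is not necessarily written this way.

```python
class BlogSitemap:
    """Stand-in for a Sitemap subclass that needs no constructor arguments."""

    def items(self):
        return ["entry-1", "entry-2"]


class NewsSitemap:
    """Stand-in for a Sitemap subclass that is configured per instance."""

    def __init__(self, section):
        self.section = section

    def items(self):
        return ["%s-1" % self.section]


sitemaps = {
    "blog": BlogSitemap,           # a class
    "news": NewsSitemap("world"),  # an instance
}


def resolve(site):
    """Instantiate classes; pass instances through unchanged."""
    return site() if isinstance(site, type) else site


all_items = []
for label in sorted(sitemaps):
    all_items.extend(resolve(sitemaps[label]).items())
print(all_items)
```

Passing an instance is useful when the sitemap needs arguments at construction
time; passing the class is enough otherwise.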

.. _URLconf: ../url_dispatch/

Sitemap classes
===============

A ``Sitemap`` class is a simple Python class that represents a "section" of
entries in your sitemap. For example, one ``Sitemap`` class could represent all
the entries of your weblog, while another could represent all of the events in
your events calendar.

In the simplest case, all these sections get lumped together into one
``sitemap.xml``, but it's also possible to use the framework to generate a
sitemap index that references individual sitemap files, one per section. (See
`Creating a sitemap index`_ below.)

``Sitemap`` classes must subclass ``django.contrib.sitemaps.Sitemap``. They can
live anywhere in your codebase.

A simple example
================

Let's assume you have a blog system, with an ``Entry`` model, and you want your
sitemap to include all the links to your individual blog entries. Here's how
your sitemap class might look::

    from django.contrib.sitemaps import Sitemap
    from mysite.blog.models import Entry

    class BlogSitemap(Sitemap):
        changefreq = "never"
        priority = 0.5

        def items(self):
            return Entry.objects.filter(is_draft=False)

        def lastmod(self, obj):
            return obj.pub_date

Note:

    * ``changefreq`` and ``priority`` are class attributes corresponding to
      ``<changefreq>`` and ``<priority>`` elements, respectively. They can be
      made callable as functions, as ``lastmod`` was in the example.
    * ``items()`` is simply a method that returns a list of objects. The objects
      returned will get passed to any callable methods corresponding to a
      sitemap property (``location``, ``lastmod``, ``changefreq``, and
      ``priority``).
    * ``lastmod`` should return a Python ``datetime`` object.
    * There is no ``location`` method in this example, but you can provide it
      in order to specify the URL for your object. By default, ``location()``
      calls ``get_absolute_url()`` on each object and returns the result.
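
The "attribute or method" behaviour described in these notes can be sketched
without Django. ``MiniSitemap`` and its ``get`` helper are hypothetical names
used only for illustration; the real ``Sitemap`` base class may resolve
properties differently.

```python
import datetime


class MiniSitemap:
    """Toy class showing properties as plain attributes or as methods."""

    changefreq = "never"   # plain attribute: same value for every object
    priority = 0.5         # plain attribute

    def lastmod(self, obj):
        # Method: computed per object.
        return obj["pub_date"]

    def get(self, name, obj, default=None):
        """Look up a property; call it with obj if it is callable."""
        attr = getattr(self, name, default)
        if callable(attr):
            return attr(obj)
        return attr


entry = {"pub_date": datetime.datetime(2007, 1, 15, 12, 0)}
sm = MiniSitemap()
print(sm.get("changefreq", entry))
print(sm.get("lastmod", entry))
print(sm.get("location", entry, "/fallback/"))
```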

Sitemap class reference
=======================

A ``Sitemap`` class can define the following methods/attributes:

``items``
---------

**Required.** A method that returns a list of objects. The framework doesn't
care what *type* of objects they are; all that matters is that these objects
get passed to the ``location()``, ``lastmod()``, ``changefreq()`` and
``priority()`` methods.

``location``
------------

**Optional.** Either a method or attribute.

If it's a method, it should return the absolute URL for a given object as
returned by ``items()``.

If it's an attribute, its value should be a string representing an absolute URL
to use for *every* object returned by ``items()``.

In both cases, "absolute URL" means a URL that doesn't include the protocol or
domain. Examples:

    * Good: ``'/foo/bar/'``
    * Bad: ``'example.com/foo/bar/'``
    * Bad: ``'http://example.com/foo/bar/'``

If ``location`` isn't provided, the framework will call the
``get_absolute_url()`` method on each object as returned by ``items()``.
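
The good and bad examples above can be told apart mechanically. The sketch
below uses only the standard library; ``is_valid_location`` and ``full_url``
are hypothetical helpers, and the way the domain is joined on is an assumption
about what a sitemap generator has to do, not a copy of the framework's code.

```python
from urllib.parse import urlsplit


def is_valid_location(location):
    """True for a path such as '/foo/bar/' with no protocol or domain."""
    parts = urlsplit(location)
    return (not parts.scheme and not parts.netloc
            and location.startswith("/"))


def full_url(domain, location, protocol="http"):
    """Join a domain and a valid location into the URL written to <loc>."""
    if not is_valid_location(location):
        raise ValueError("location must be a path like '/foo/bar/'")
    return "%s://%s%s" % (protocol, domain, location)


for candidate in ("/foo/bar/", "example.com/foo/bar/",
                  "http://example.com/foo/bar/"):
    print(candidate, is_valid_location(candidate))
print(full_url("example.com", "/foo/bar/"))
```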

``lastmod``
-----------

**Optional.** Either a method or attribute.

If it's a method, it should take one argument -- an object as returned by
``items()`` -- and return that object's last-modified date/time, as a Python
``datetime.datetime`` object.

If it's an attribute, its value should be a Python ``datetime.datetime`` object
representing the last-modified date/time for *every* object returned by
``items()``.

``changefreq``
--------------

**Optional.** Either a method or attribute.

If it's a method, it should take one argument -- an object as returned by
``items()`` -- and return that object's change frequency, as a Python string.

If it's an attribute, its value should be a string representing the change
frequency of *every* object returned by ``items()``.

Possible values for ``changefreq``, whether you use a method or attribute, are:

    * ``'always'``
    * ``'hourly'``
    * ``'daily'``
    * ``'weekly'``
    * ``'monthly'``
    * ``'yearly'``
    * ``'never'``
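
This document does not say whether the framework validates these strings, so a
typo such as ``'Daily'`` may only show up in the generated XML. A small guard
like the following (a hypothetical helper, not part of Django) can catch that
early:

```python
# The seven values listed above.
CHANGEFREQ_VALUES = (
    "always", "hourly", "daily", "weekly", "monthly", "yearly", "never",
)


def check_changefreq(value):
    """Return value unchanged if it is allowed, else raise ValueError."""
    if value not in CHANGEFREQ_VALUES:
        raise ValueError("invalid changefreq: %r" % (value,))
    return value


print(check_changefreq("daily"))
```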

``priority``
------------

**Optional.** Either a method or attribute.

If it's a method, it should take one argument -- an object as returned by
``items()`` -- and return that object's priority, as either a string or float.

If it's an attribute, its value should be either a string or float representing
the priority of *every* object returned by ``items()``.

Example values for ``priority``: ``0.4``, ``1.0``. The default priority of a
page is ``0.5``. See the `sitemaps.org documentation`_ for more.
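
Because ``priority`` may be given as a string or a float, it can help to
normalize it before use. The helper below is a hypothetical sketch; the
0.0 to 1.0 range comes from the sitemaps.org protocol rather than from this
document.

```python
def normalize_priority(value):
    """Accept a str or float priority and return it as a string.

    Raises ValueError if the value is not a number between 0.0 and 1.0.
    """
    number = float(value)
    if not 0.0 <= number <= 1.0:
        raise ValueError("priority must be between 0.0 and 1.0")
    return str(number)


print(normalize_priority(0.4))
print(normalize_priority("1.0"))
```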

.. _sitemaps.org documentation: http://www.sitemaps.org/protocol.html#prioritydef

Shortcuts
=========

The sitemap framework provides a couple of convenience classes for common
cases:

``FlatPageSitemap``
-------------------

The ``django.contrib.sitemaps.FlatPageSitemap`` class looks at all flatpages_
defined for the current ``SITE_ID`` (see the `sites documentation`_) and
creates an entry in the sitemap for each one. These entries include only the
``location`` attribute -- not ``lastmod``, ``changefreq`` or ``priority``.

.. _flatpages: ../flatpages/
.. _sites documentation: ../sites/

``GenericSitemap``
------------------

The ``GenericSitemap`` class works with any `generic views`_ you already have.
To use it, create an instance, passing in the same ``info_dict`` you pass to
the generic views. The only requirement is that the dictionary have a
``queryset`` entry. It may also have a ``date_field`` entry that specifies a
date field for objects retrieved from the ``queryset``. This will be used for
the ``lastmod`` attribute in the generated sitemap. You may also pass
``priority`` and ``changefreq`` keyword arguments to the ``GenericSitemap``
constructor to specify these attributes for all URLs.
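
The behaviour described here can be modelled in a few lines. The class below is
a simplified stand-in written for this explanation, not the actual
``GenericSitemap`` source: a plain list plays the role of the ``queryset``, and
the ``Entry`` tuple is hypothetical.

```python
import datetime
from collections import namedtuple

# Hypothetical record type standing in for a model instance.
Entry = namedtuple("Entry", ["title", "pub_date"])


class MiniGenericSitemap:
    """Toy model of a sitemap driven by a generic-view info_dict."""

    def __init__(self, info_dict, priority=None, changefreq=None):
        self.queryset = info_dict["queryset"]          # required entry
        self.date_field = info_dict.get("date_field")  # optional entry
        self.priority = priority
        self.changefreq = changefreq

    def items(self):
        return list(self.queryset)

    def lastmod(self, item):
        # Without a date_field there is nothing to report.
        if self.date_field is None:
            return None
        return getattr(item, self.date_field)


info_dict = {
    "queryset": [Entry("Hello", datetime.datetime(2007, 1, 15))],
    "date_field": "pub_date",
}
blog = MiniGenericSitemap(info_dict, priority=0.6)
first = blog.items()[0]
print(blog.lastmod(first), blog.priority)
```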

.. _generic views: ../generic_views/

Example
-------

Here's an example of a URLconf_ using both::

    from django.conf.urls.defaults import *
    from django.contrib.sitemaps import FlatPageSitemap, GenericSitemap
    from mysite.blog.models import Entry

    info_dict = {
        'queryset': Entry.objects.all(),
        'date_field': 'pub_date',
    }

    sitemaps = {
        'flatpages': FlatPageSitemap,
        'blog': GenericSitemap(info_dict, priority=0.6),
    }

    urlpatterns = patterns('',
        # some generic view using info_dict
        # ...

        # the sitemap
        (r'^sitemap\.xml$', 'django.contrib.sitemaps.views.sitemap', {'sitemaps': sitemaps}),
    )

.. _URLconf: ../url_dispatch/

Creating a sitemap index
========================

The sitemap framework also has the ability to create a sitemap index that
references individual sitemap files, one per section defined in your
``sitemaps`` dictionary. The only differences in usage are:

    * You use two views in your URLconf: ``django.contrib.sitemaps.views.index``
      and ``django.contrib.sitemaps.views.sitemap``.
    * The ``django.contrib.sitemaps.views.sitemap`` view should take a
      ``section`` keyword argument.

Here is what the relevant URLconf lines would look like for the example above::

    (r'^sitemap\.xml$', 'django.contrib.sitemaps.views.index', {'sitemaps': sitemaps}),
    (r'^sitemap-(?P<section>.+)\.xml$', 'django.contrib.sitemaps.views.sitemap', {'sitemaps': sitemaps}),

This will automatically generate a ``sitemap.xml`` file that references
both ``sitemap-flatpages.xml`` and ``sitemap-blog.xml``. The ``Sitemap``
classes and the ``sitemaps`` dict don't change at all.
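
As a rough picture of what such an index contains, the sketch below builds one
with the standard library. ``build_sitemap_index`` and the ``example.com`` base
URL are assumptions for the example; the element names follow the sitemaps.org
protocol, and the file names follow the URL pattern shown above.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap_index(base_url, sections):
    """Return index XML with one <sitemap> entry per section label."""
    index = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for label in sections:
        sitemap = ET.SubElement(index, "sitemap")
        loc = ET.SubElement(sitemap, "loc")
        loc.text = "%s/sitemap-%s.xml" % (base_url, label)
    return ET.tostring(index, encoding="unicode")


# The section labels are the keys of the sitemaps dict in the example.
index_xml = build_sitemap_index("http://example.com", ["flatpages", "blog"])
print(index_xml)
```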

Pinging Google
==============

You may want to "ping" Google when your sitemap changes, to let it know to
reindex your site. The framework provides a function to do just that:
``django.contrib.sitemaps.ping_google()``.

``ping_google()`` takes an optional argument, ``sitemap_url``, which should be
the absolute URL of your site's sitemap (e.g., ``'/sitemap.xml'``). If this
argument isn't provided, ``ping_google()`` will attempt to figure out your
sitemap by performing a reverse lookup in your URLconf.

``ping_google()`` raises the exception
``django.contrib.sitemaps.SitemapNotFound`` if it cannot determine your sitemap
URL.

One useful way to call ``ping_google()`` is from a model's ``save()`` method::

    from django.contrib.sitemaps import ping_google
    from django.db import models

    class Entry(models.Model):
        # ...
        def save(self):
            super(Entry, self).save()
            try:
                ping_google()
            except Exception:
                # Catch broadly because we could get a variety
                # of HTTP-related exceptions.
                pass

A more efficient solution, however, would be to call ``ping_google()`` from a
cron script, or some other scheduled task. The function makes an HTTP request
to Google's servers, so you may not want to introduce that network overhead
each time you call ``save()``.
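
One way to structure the scheduled approach is to record that something changed
and let the scheduled job send at most one ping per run. The class below is a
hypothetical, in-memory sketch of that idea; ``DeferredPinger`` is not part of
Django, and a real deployment would need to persist the flag (for example in
the database) so that a separate cron process can see it.

```python
class DeferredPinger:
    """Collapse many change notifications into a single ping."""

    def __init__(self, ping):
        self._ping = ping      # callable that performs the actual ping
        self._dirty = False

    def mark_changed(self):
        """Cheap call to make from save(); does no network I/O."""
        self._dirty = True

    def flush(self):
        """Call from the scheduled task; pings only if something changed."""
        if not self._dirty:
            return False
        self._ping()
        # Cleared only after a successful ping, so a failure is retried.
        self._dirty = False
        return True


# A list-appending lambda stands in for ping_google() in this demo.
calls = []
pinger = DeferredPinger(lambda: calls.append("ping"))
for _ in range(3):
    pinger.mark_changed()   # three saves...
first = pinger.flush()      # ...one ping
second = pinger.flush()     # nothing new, so no ping
print(calls, first, second)
```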