root/django/trunk/docs/custom_model_fields.txt

Revision 7301, 24.2 kB (checked in by ubernostrum, 4 months ago)

Fixed #6759: Corrected example of get_db_prep_save() in docs/custom_model_fields.txt

===================
Custom model fields
===================

**New in Django development version**

Introduction
============

The `model reference`_ documentation explains how to use Django's standard
field classes -- ``CharField``, ``DateField``, etc. For many purposes, those
classes are all you'll need. Sometimes, though, the Django version won't meet
your precise requirements, or you'll want to use a field that is entirely
different from those shipped with Django.

Django's built-in field types don't cover every possible database column type --
only the common types, such as ``VARCHAR`` and ``INTEGER``. For more obscure
column types, such as geographic polygons or even user-created types such as
`PostgreSQL custom types`_, you can define your own Django ``Field`` subclasses.

Alternatively, you may have a complex Python object that can somehow be
serialized to fit into a standard database column type. This is another case
where a ``Field`` subclass will help you use your object with your models.

Our example object
------------------

Creating custom fields requires a bit of attention to detail. To make things
easier to follow, we'll use a consistent example throughout this document.
Suppose you have a Python object representing the deal of cards in a hand of
Bridge_. (Don't worry, you don't need to know how to play Bridge to follow this
example. You only need to know that 52 cards are dealt out equally to four
players, who are traditionally called *north*, *east*, *south* and *west*.)
Our class looks something like this::

    class Hand(object):
        def __init__(self, north, east, south, west):
            # Input parameters are lists of cards ('Ah', '9s', etc)
            self.north = north
            self.east = east
            self.south = south
            self.west = west

        # ... (other possibly useful methods omitted) ...

This is just an ordinary Python class, with nothing Django-specific about it.
We'd like to be able to do things like this in our models (we assume the
``hand`` attribute on the model is an instance of ``Hand``)::

    example = MyModel.objects.get(pk=1)
    print example.hand.north

    new_hand = Hand(north, east, south, west)
    example.hand = new_hand
    example.save()

We assign to and retrieve from the ``hand`` attribute in our model just like
any other Python class. The trick is to tell Django how to handle saving and
loading such an object.

In order to use the ``Hand`` class in our models, we **do not** have to change
this class at all. This is ideal, because it means you can easily write
model support for existing classes where you cannot change the source code.

.. note::
    You might only want to take advantage of custom database column types
    and deal with the data as standard Python types in your models;
    strings, or floats, for example. This case is similar to our ``Hand``
    example and we'll note any differences as we go along.

.. _model reference: ../model-api/
.. _PostgreSQL custom types: http://www.postgresql.org/docs/8.2/interactive/sql-createtype.html
.. _Bridge: http://en.wikipedia.org/wiki/Contract_bridge

Background theory
=================

Database storage
----------------

The simplest way to think of a model field is that it provides a way to take a
normal Python object -- string, boolean, ``datetime``, or something more
complex like ``Hand`` -- and convert it to and from a format that is useful
when dealing with the database (and serialization, but, as we'll see later,
that falls out fairly naturally once you have the database side under control).

Fields in a model must somehow be converted to fit into an existing database
column type. Different databases provide different sets of valid column types,
but the rule is still the same: those are the only types you have to work
with. Anything you want to store in the database must fit into one of
those types.

Normally, you're either writing a Django field to match a particular database
column type, or there's a fairly straightforward way to convert your data to,
say, a string.

For our ``Hand`` example, we could convert the card data to a string of 104
characters by concatenating all the cards together in a pre-determined order --
say, all the *north* cards first, then the *east*, *south* and *west* cards. So
``Hand`` objects can be saved to text or character columns in the database.

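As a rough sketch of that conversion, the snippet below flattens a full deal
into one string. It assumes every card is written with exactly two characters
(so a ten would be ``Ts``, a convention the text above doesn't spell out), and
``hand_to_string()`` is a hypothetical helper name, not part of Django.

```python
class Hand(object):
    """Stand-in for the Hand class shown earlier."""
    def __init__(self, north, east, south, west):
        self.north = north
        self.east = east
        self.south = south
        self.west = west


def hand_to_string(hand):
    # Hypothetical helper: north first, then east, south and west,
    # with every card contributing exactly two characters.
    return ''.join(''.join(cards) for cards in
                   (hand.north, hand.east, hand.south, hand.west))


# Build a 52-card deck and deal 13 cards to each player.
ranks = 'A23456789TJQK'
deck = [rank + suit for suit in 'shdc' for rank in ranks]
hand = Hand(deck[0:13], deck[13:26], deck[26:39], deck[39:52])

encoded = hand_to_string(hand)
print(len(encoded))   # 104
```

Because the order and the card width are fixed, the string can later be cut
back into four lists of 13 cards without storing any separators.
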
What does a field class do?
---------------------------

All of Django's fields (and when we say *fields* in this document, we always
mean model fields and not `form fields`_) are subclasses of
``django.db.models.Field``. Most of the information that Django records about a
field is common to all fields -- name, help text, validator lists, uniqueness
and so forth. Storing all that information is handled by ``Field``. We'll get
into the precise details of what ``Field`` can do later on; for now, suffice it
to say that everything descends from ``Field`` and then customizes key pieces
of the class behavior.

.. _form fields: ../newforms/#fields

It's important to realize that a Django field class is not what is stored in
your model attributes. The model attributes contain normal Python objects. The
field classes you define in a model are actually stored in the ``Meta`` class
when the model class is created (the precise details of how this is done are
unimportant here). This is because the field classes aren't necessary when
you're just creating and modifying attributes. Instead, they provide the
machinery for converting between the attribute value and what is stored in the
database or sent to the serializer.

Keep this in mind when creating your own custom fields. The Django ``Field``
subclass you write provides the machinery for converting between your Python
instances and the database/serializer values in various ways (there are
differences between storing a value and using a value for lookups, for
example). If this sounds a bit tricky, don't worry -- it will become clearer in
the examples below. Just remember that you will often end up creating two
classes when you want a custom field:

    * The first class is the Python object that your users will manipulate.
      They will assign it to the model attribute, they will read from it for
      displaying purposes, things like that. This is the ``Hand`` class in our
      example.

    * The second class is the ``Field`` subclass. This is the class that knows
      how to convert your first class back and forth between its permanent
      storage form and the Python form.

Writing a ``Field`` subclass
=============================

When planning your ``Field`` subclass, first give some thought to which
existing ``Field`` class your new field is most similar to. Can you subclass an
existing Django field and save yourself some work? If not, you should subclass
the ``Field`` class, from which everything is descended.

Initializing your new field is a matter of separating out any arguments that
are specific to your case from the common arguments and passing the latter to
the ``__init__()`` method of ``Field`` (or your parent class).

In our example, we'll call our field ``HandField``. (It's a good idea to call
your ``Field`` subclass ``(Something)Field``, so it's easily identifiable as a
``Field`` subclass.) It doesn't behave like any existing field, so we'll
subclass directly from ``Field``::

    from django.db import models

    class HandField(models.Field):
        def __init__(self, *args, **kwargs):
            kwargs['max_length'] = 104
            super(HandField, self).__init__(*args, **kwargs)

Our ``HandField`` accepts most of the standard field options (see the list
below), but we ensure it has a fixed length, since it only needs to hold 52
card values plus their suits; 104 characters in total.

.. note::
    Many of Django's model fields accept options that they don't do anything
    with. For example, you can pass both ``editable`` and ``auto_now`` to a
    ``DateField`` and it will simply ignore the ``editable`` parameter
    (``auto_now`` being set implies ``editable=False``). No error is raised in
    this case.

    This behavior simplifies the field classes, because they don't need to
    check for options that aren't necessary. They just pass all the options to
    the parent class and then don't use them later on. It's up to you whether
    you want your fields to be more strict about the options they select, or
    to use the simpler, more permissive behavior of the current fields.

The ``Field.__init__()`` method takes the following parameters, in this
order:

    * ``verbose_name``
    * ``name``
    * ``primary_key``
    * ``max_length``
    * ``unique``
    * ``blank``
    * ``null``
    * ``db_index``
    * ``core``
    * ``rel``: Used for related fields (like ``ForeignKey``). For advanced use
      only.
    * ``default``
    * ``editable``
    * ``serialize``: If ``False``, the field will not be serialized when the
      model is passed to Django's serializers_. Defaults to ``True``.
    * ``prepopulate_from``
    * ``unique_for_date``
    * ``unique_for_month``
    * ``unique_for_year``
    * ``validator_list``
    * ``choices``
    * ``radio_admin``
    * ``help_text``
    * ``db_column``
    * ``db_tablespace``: Currently only used with the Oracle backend and only
      for index creation. You can usually ignore this option.

All of the options without an explanation in the above list have the same
meaning they do for normal Django fields. See the `model documentation`_ for
examples and details.

.. _serializers: ../serialization/
.. _model documentation: ../model-api/

The ``SubfieldBase`` metaclass
------------------------------

As we indicated in the introduction_, field subclasses are often needed for
two reasons: either to take advantage of a custom database column type, or to
handle complex Python types. Obviously, a combination of the two is also
possible. If you're only working with custom database column types and your
model fields appear in Python as standard Python types direct from the
database backend, you don't need to worry about this section.

If you're handling custom Python types, such as our ``Hand`` class, we need
to make sure that when Django initializes an instance of our model and assigns
a database value to our custom field attribute, we convert that value into the
appropriate Python object. The details of how this happens internally are a
little complex, but the code you need to write in your ``Field`` class is
simple: make sure your field subclass uses ``django.db.models.SubfieldBase`` as
its metaclass::

    class HandField(models.Field):
        __metaclass__ = models.SubfieldBase

        def __init__(self, *args, **kwargs):
            # ...

This ensures that the ``to_python()`` method, documented below_, will always be
called when the attribute is initialized.

.. _below: #to-python-self-value

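To picture the effect of converting on assignment, here is a plain Python
descriptor that runs a conversion function every time the attribute is set.
This is only an illustration of the idea under our own assumptions; it does
not show how ``SubfieldBase`` is implemented, and ``ConvertingAttribute``,
``parse_pair`` and ``Record`` are hypothetical names.

```python
class ConvertingAttribute(object):
    """Hypothetical descriptor that converts every assigned value."""
    def __init__(self, name, convert):
        self.name = name
        self.convert = convert

    def __get__(self, instance, owner=None):
        if instance is None:
            return self
        return instance.__dict__[self.name]

    def __set__(self, instance, value):
        # Run the conversion on every assignment, including the one
        # made while the object is being initialized.
        instance.__dict__[self.name] = self.convert(value)


def parse_pair(value):
    # Toy conversion standing in for to_python(): "3,4" -> (3, 4).
    if isinstance(value, tuple):
        return value
    x, y = value.split(',')
    return (int(x), int(y))


class Record(object):
    point = ConvertingAttribute('point', parse_pair)

    def __init__(self, point):
        self.point = point


record = Record('3,4')   # a raw string, as a database might return it
print(record.point)      # (3, 4)
```

Code that reads ``record.point`` never sees the raw string, which is the
behavior we want for the ``hand`` attribute on a model.
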
Useful methods
--------------

Once you've created your ``Field`` subclass and set up the
``__metaclass__``, you might consider overriding a few standard methods,
depending on your field's behavior. The list of methods below is in
approximately decreasing order of importance, so start from the top.

``db_type(self)``
~~~~~~~~~~~~~~~~~

Returns the database column data type for the ``Field``, taking into account
the current ``DATABASE_ENGINE`` setting.

Say you've created a PostgreSQL custom type called ``mytype``. You can use this
field with Django by subclassing ``Field`` and implementing the ``db_type()``
method, like so::

    from django.db import models

    class MytypeField(models.Field):
        def db_type(self):
            return 'mytype'

Once you have ``MytypeField``, you can use it in any model, just like any other
``Field`` type::

    class Person(models.Model):
        name = models.CharField(max_length=80)
        gender = models.CharField(max_length=1)
        something_else = MytypeField()

If you aim to build a database-agnostic application, you should account for
differences in database column types. For example, the date/time column type
in PostgreSQL is called ``timestamp``, while the same column in MySQL is called
``datetime``. The simplest way to handle this in a ``db_type()`` method is to
import the Django settings module and check the ``DATABASE_ENGINE`` setting.
For example::

    class MyDateField(models.Field):
        def db_type(self):
            from django.conf import settings
            if settings.DATABASE_ENGINE == 'mysql':
                return 'datetime'
            else:
                return 'timestamp'

The ``db_type()`` method is only called by Django when the framework constructs
the ``CREATE TABLE`` statements for your application -- that is, when you first
create your tables. It's not called at any other time, so it can afford to
execute slightly complex code, such as the ``DATABASE_ENGINE`` check in the
above example.

Some database column types accept parameters, such as ``CHAR(25)``, where the
parameter ``25`` represents the maximum column length. In cases like these,
it's more flexible if the parameter is specified in the model rather than being
hard-coded in the ``db_type()`` method. For example, it wouldn't make much
sense to have a ``CharMaxlength25Field``, shown here::

    # This is a silly example of hard-coded parameters.
    class CharMaxlength25Field(models.Field):
        def db_type(self):
            return 'char(25)'

    # In the model:
    class MyModel(models.Model):
        # ...
        my_field = CharMaxlength25Field()

The better way of doing this would be to make the parameter specifiable at run
time -- i.e., when the class is instantiated. To do that, just implement
``__init__()``, like so::

    # This is a much more flexible example.
    class BetterCharField(models.Field):
        def __init__(self, max_length, *args, **kwargs):
            self.max_length = max_length
            super(BetterCharField, self).__init__(*args, **kwargs)

        def db_type(self):
            return 'char(%s)' % self.max_length

    # In the model:
    class MyModel(models.Model):
        # ...
        my_field = BetterCharField(25)

Finally, if your column requires truly complex SQL setup, return ``None`` from
``db_type()``. This will cause Django's SQL creation code to skip over this
field. You are then responsible for creating the column in the right table in
some other way, of course, but this gives you a way to tell Django to get out
of the way.

``to_python(self, value)``
~~~~~~~~~~~~~~~~~~~~~~~~~~

Converts a value as returned by your database (or a serializer) to a Python
object.

The default implementation simply returns ``value``, for the common case in
which the database backend already returns data in the correct format (as a
Python string, for example).

If your custom ``Field`` class deals with data structures that are more complex
than strings, dates, integers or floats, then you'll need to override this
method. As a general rule, the method should deal gracefully with any of the
following arguments:

    * An instance of the correct type (e.g., ``Hand`` in our ongoing example).

    * A string (e.g., from a deserializer).

    * Whatever the database returns for the column type you're using.

In our ``HandField`` class, we're storing the data as a VARCHAR field in the
database, so we need to be able to process strings and ``Hand`` instances in
``to_python()``::

    import re

    class HandField(models.Field):
        # ...

        def to_python(self, value):
            if isinstance(value, Hand):
                return value

            # The string case.
            p1 = re.compile('.{26}')
            p2 = re.compile('..')
            args = [p2.findall(x) for x in p1.findall(value)]
            return Hand(*args)

Notice that we always return a ``Hand`` instance from this method. That's the
Python object type we want to store in the model's attribute.

**Remember:** If your custom field needs the ``to_python()`` method to be
called when it is created, you should be using `The SubfieldBase metaclass`_
mentioned earlier. Otherwise ``to_python()`` won't be called automatically.

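The two regular expressions in the string case of ``to_python()`` can be
tried on their own, without Django. The sketch below builds a sample
104-character value (assuming two characters per card, with ``T`` for ten,
which is our own convention) and applies the same two-stage split.

```python
import re

# Build a sample 104-character value: 13 two-character cards per player.
ranks = 'A23456789TJQK'
value = ''.join(rank + suit for suit in 'shdc' for rank in ranks)

# Stage 1: cut the string into four 26-character chunks, one per player.
per_player = re.compile('.{26}').findall(value)

# Stage 2: cut each chunk into 13 two-character cards.
hands = [re.compile('..').findall(chunk) for chunk in per_player]

print(len(per_player))   # 4
print(hands[0][:3])      # ['As', '2s', '3s']
```

The four lists in ``hands`` are in north, east, south, west order, so they
can be passed straight to ``Hand(*hands)``.
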
``get_db_prep_save(self, value)``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

This is the reverse of ``to_python()`` when working with the database backends
(as opposed to serialization). The ``value`` parameter is the current value of
the model's attribute (a field has no reference to its containing model, so it
cannot retrieve the value itself), and the method should return data in a
format that can be used as a parameter in a query for the database backend.

For example::

    class HandField(models.Field):
        # ...

        def get_db_prep_save(self, value):
            return ''.join([''.join(l) for l in (value.north,
                    value.east, value.south, value.west)])

``pre_save(self, model_instance, add)``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

This method is called just prior to ``get_db_prep_save()`` and should return
the value of the appropriate attribute from ``model_instance`` for this field.
The attribute name is in ``self.attname`` (this is set up by ``Field``). If
the model is being saved to the database for the first time, the ``add``
parameter will be ``True``, otherwise it will be ``False``.

You only need to override this method if you want to preprocess the value
somehow, just before saving. For example, Django's ``DateTimeField`` uses this
method to set the attribute correctly in the case of ``auto_now`` or
``auto_now_add``.

If you do override this method, you must return the value of the attribute at
the end. You should also update the model's attribute if you make any changes
to the value so that code holding references to the model will always see the
correct value.

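The pattern described above (compute a value, store it back on the instance,
then return it) might look like the following. To keep the sketch runnable
without Django, ``FakeInstance`` and ``CreatedStampField`` are hypothetical
stand-ins; a real field would subclass ``models.Field`` and get ``attname``
from it.

```python
import datetime


class FakeInstance(object):
    """Stand-in for a model instance; not a real Django model."""
    created = None


class CreatedStampField(object):
    """Hypothetical field-like class showing only the pre_save() pattern."""
    attname = 'created'   # Django's Field sets this up for real fields

    def pre_save(self, model_instance, add):
        if add:
            # First save: compute the value and store it on the instance,
            # so other code holding the instance sees it too.
            value = datetime.datetime.now()
            setattr(model_instance, self.attname, value)
            return value
        # Later saves: return the attribute unchanged.
        return getattr(model_instance, self.attname)


field = CreatedStampField()
obj = FakeInstance()
first = field.pre_save(obj, add=True)
second = field.pre_save(obj, add=False)
print(first is obj.created and second is first)   # True
```
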
``get_db_prep_lookup(self, lookup_type, value)``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Prepares the ``value`` for passing to the database when used in a lookup (a
``WHERE`` constraint in SQL). The ``lookup_type`` will be one of the valid
Django filter lookups: ``exact``, ``iexact``, ``contains``, ``icontains``,
``gt``, ``gte``, ``lt``, ``lte``, ``in``, ``startswith``, ``istartswith``,
``endswith``, ``iendswith``, ``range``, ``year``, ``month``, ``day``,
``isnull``, ``search``, ``regex``, and ``iregex``.

Your method must be prepared to handle all of these ``lookup_type`` values and
should raise either a ``ValueError`` if the ``value`` is of the wrong sort (a
list when you were expecting an object, for example) or a ``TypeError`` if
your field does not support that type of lookup. For many fields, you can get
by with handling the lookup types that need special handling for your field
and pass the rest to the ``get_db_prep_lookup()`` method of the parent class.

If you needed to implement ``get_db_prep_save()``, you will usually need to
implement ``get_db_prep_lookup()``. The usual reason is the ``range`` and
``in`` lookups. In these cases, you will be passed a list of objects
(presumably of the right type) and will need to convert them to a list
of things of the right type for passing to the database. Sometimes you can
reuse ``get_db_prep_save()``, or at least factor out some common pieces from
both methods into a helper function.

For example::

    class HandField(models.Field):
        # ...

        def get_db_prep_lookup(self, lookup_type, value):
            # We only handle 'exact' and 'in'. All others are errors.
            if lookup_type == 'exact':
                return self.get_db_prep_save(value)
            elif lookup_type == 'in':
                return [self.get_db_prep_save(v) for v in value]
            else:
                raise TypeError('Lookup type %r not supported.' % lookup_type)

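The distinction between the two exceptions, and the idea of a shared helper,
can be sketched outside Django as plain functions. ``hand_to_db()`` and
``prep_lookup()`` are hypothetical names used for illustration only; in a
real field this logic would live in the methods shown above.

```python
class Hand(object):
    """Stand-in for the Hand class used throughout this document."""
    def __init__(self, north, east, south, west):
        self.north, self.east = north, east
        self.south, self.west = south, west


def hand_to_db(hand):
    # Hypothetical shared helper: the conversion both methods need.
    return ''.join(''.join(cards) for cards in
                   (hand.north, hand.east, hand.south, hand.west))


def prep_lookup(lookup_type, value):
    if lookup_type == 'exact':
        return hand_to_db(value)
    elif lookup_type == 'in':
        # Wrong sort of value: 'in' needs a list of hands.
        if not isinstance(value, (list, tuple)):
            raise ValueError('%r lookups need a list of values.' % lookup_type)
        return [hand_to_db(v) for v in value]
    # Unsupported kind of lookup for this field.
    raise TypeError('Lookup type %r not supported.' % lookup_type)


# Shortened hands, just to keep the output readable.
a = Hand(['As'], ['Kh'], ['Qd'], ['Jc'])
b = Hand(['2s'], ['3h'], ['4d'], ['5c'])
print(prep_lookup('exact', a))     # AsKhQdJc
print(prep_lookup('in', [a, b]))   # ['AsKhQdJc', '2s3h4d5c']
```
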
``formfield(self, form_class=forms.CharField, **kwargs)``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Returns the default form field to use when this field is displayed
in a model. This method is called by the `helper functions`_
``form_for_model()`` and ``form_for_instance()``.

All of the ``kwargs`` dictionary is passed directly to the form field's
``__init__()`` method. Normally, all you need to do is set up a good default
for the ``form_class`` argument and then delegate further handling to the
parent class. This might require you to write a custom form field (and even a
form widget). See the `forms documentation`_ for information about this, and
take a look at the code in ``django.contrib.localflavor`` for some examples of
custom widgets.

Continuing our ongoing example, we can write the ``formfield()`` method as::

    class HandField(models.Field):
        # ...

        def formfield(self, **kwargs):
            # This is a fairly standard way to set up some defaults
            # while letting the caller override them.
            defaults = {'form_class': MyFormField}
            defaults.update(kwargs)
            return super(HandField, self).formfield(**defaults)

This assumes we've imported a ``MyFormField`` field class (which has its own
default widget). This document doesn't cover the details of writing custom form
fields.

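The defaults-then-update idiom used in ``formfield()`` is ordinary dictionary
handling, and can be seen in isolation below. ``build_options()`` is a
hypothetical stand-in, and the strings take the place of real form field
classes so that the sketch runs without Django.

```python
def build_options(**kwargs):
    # Start from our defaults, then let the caller's keyword
    # arguments replace any of them.
    defaults = {'form_class': 'MyFormField', 'required': True}
    defaults.update(kwargs)
    return defaults


print(build_options()['form_class'])                          # MyFormField
print(build_options(form_class='OtherField')['form_class'])   # OtherField
```

Keys the caller doesn't mention keep their default values, which is why the
pattern works well for passing options up to a parent class.
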
.. _helper functions: ../newforms/#generating-forms-for-models
.. _forms documentation: ../newforms/

``get_internal_type(self)``
~~~~~~~~~~~~~~~~~~~~~~~~~~~

Returns a string giving the name of the ``Field`` subclass we are emulating at
the database level. This is used to determine the type of database column for
simple cases.

If you have created a ``db_type()`` method, you don't need to worry about
``get_internal_type()`` -- it won't be used much. Sometimes, though, your
database storage is similar in type to some other field, so you can use that
other field's logic to create the right column.

For example::

    class HandField(models.Field):
        # ...

        def get_internal_type(self):
            return 'CharField'

No matter which database backend we are using, this will mean that ``syncdb``
and other SQL commands create the right column type for storing a string.

If ``get_internal_type()`` returns a string that is not known to Django for
the database backend you are using -- that is, it doesn't appear in
``django.db.backends.<db_name>.creation.DATA_TYPES`` -- the string will still
be used by the serializer, but the default ``db_type()`` method will return
``None``. See the documentation of ``db_type()`` above_ for reasons why this
might be useful. Putting a descriptive string in as the type of the field for
the serializer is a useful idea if you're ever going to be using the
serializer output in some other place, outside of Django.

.. _above: #db-type-self

``flatten_data(self, follow, obj=None)``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. admonition:: Subject to change

    Although implementing this method is necessary to allow field
    serialization, the API might change in the future.

Returns a dictionary, mapping the field's attribute name to a flattened string
version of the data. This method has some internal uses that aren't of
interest to us here (mostly having to do with manipulators). For our
purposes, it's sufficient to return a one-item dictionary that maps the
attribute name to a string.

This method is used by the serializers to convert the field into a string for
output. You can ignore the input parameters for serialization purposes,
although calling ``Field._get_val_from_obj(obj)`` is the best way to get the
value to serialize.

For example, since our ``HandField`` uses strings for its data storage anyway,
we can reuse some existing conversion code::

    class HandField(models.Field):
        # ...

        def flatten_data(self, follow, obj=None):
            value = self._get_val_from_obj(obj)
            return {self.attname: self.get_db_prep_save(value)}

Some general advice
--------------------

Writing a custom field can be a tricky process, particularly if you're doing
complex conversions between your Python types and your database and
serialization formats. Here are a couple of tips to make things go more
smoothly:

    1. Look at the existing Django fields (in
       ``django/db/models/fields/__init__.py``) for inspiration. Try to find a
       field that's similar to what you want and extend it a little bit,
       instead of creating an entirely new field from scratch.

    2. Put a ``__str__()`` or ``__unicode__()`` method on the class you're
       wrapping up as a field. There are a lot of places where the default
       behavior of the field code is to call ``force_unicode()`` on the value.
       (In our examples in this document, ``value`` would be a ``Hand``
       instance, not a ``HandField``). So if your ``__unicode__()`` method
       automatically converts to the string form of your Python object, you can
       save yourself a lot of work.
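
For the second tip, a string method on ``Hand`` could simply produce the same
flattened form the field stores. This is a minimal sketch that uses
``__str__()`` so it runs on any Python version; whether your field code ends
up calling ``__str__()`` or ``__unicode__()`` depends on your environment, so
treat the method name as an assumption.

```python
class Hand(object):
    def __init__(self, north, east, south, west):
        self.north = north
        self.east = east
        self.south = south
        self.west = west

    def __str__(self):
        # Same layout the field stores: north, east, south, west.
        return ''.join(''.join(cards) for cards in
                       (self.north, self.east, self.south, self.west))


# A shortened hand, just to keep the output readable.
hand = Hand(['As', 'Kh'], ['Qd'], ['Jc'], ['Ts'])
print(str(hand))   # AsKhQdJcTs
```
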