Code

Changes between Version 18 and Version 19 of MultipleColumnPrimaryKeys


Ignore:
Timestamp:
09/01/12 21:50:06 (20 months ago)
Author:
bignose
Comment:

note that a set is the obvious data type

Legend:

Unmodified
Added
Removed
Modified
  • MultipleColumnPrimaryKeys

    v18 v19  
    33Relational database designs use a set of columns as the primary key for a table. When this set includes more than one column, it is known as a “composite” or “compound” primary key. (For more on the terminology, here is an [http://www.agiledata.org/essays/keys.html article discussing database keys]). 
    44 
    5 Currently Django models only support single column primary keys, denying many designs where the natural primary key of a table is multiple columns. Django currently can't work with these schemas; they must instead introduce a redundant single-column key (a “surrogate” key), forcing applications to make arbitrary and otherwise-unnecessary choices about which key to use for the table in any given instance. 
     5Currently Django models only support a single column in this set, denying many designs where the natural primary key of a table is multiple columns. Django currently can't work with these schemas; they must instead introduce a redundant single-column key (a “surrogate” key), forcing applications to make arbitrary and otherwise-unnecessary choices about which key to use for the table in any given instance. 
    66 
    77This page discusses how to have Django support these composite primary keys. There are a lot of details to get right here, but done right, it would allow for more flexibility and potential simplicity in data modeling. 
     
    2020There are three basic problems in dealing with composite primary keys in Django.  
    2121 
    22  1. A number of APIs use "obj._meta.pk" to access the primary key field (for example, to do "pk=whatever" lookups). A composite PK implementation would need to emulate this in some way to avoid breaking everything.  
    23  2. A number of things use (content_type_id, object_pk) tuples to refer to some object -- look at the comment framework, or the admin log API. Again, a composite PK system would need to somehow not break this.  
    24  3. Admin URLs; they're of the form "/app_label/module_name/pk/"; there would need to be a way to map URLs to objects in the absence of a primary key. 
     22 1. A number of APIs use "obj._meta.pk" to access the primary key, on the assumption it is a single field (for example, to do "pk=whatever" lookups). A composite PK implementation would need to emulate this in some way to avoid breaking everything. 
     23 2. A number of things use (content_type_id, object_pk) tuples to refer to some object -- look at the comment framework, or the admin log API. Again, a composite PK system would need to somehow not break this. 
     24 3. Admin URLs; they're of the form "/app_label/module_name/pk/"; there would need to be a way to map URLs to objects with a set of columns for the primary key. 
    2525 
    26 mjm: I'm not very familiar with Django's internals, but I suspect the first point's answer lies somewhere near the observation that a CK needs a compound data structure.  In Python the obvious choices are a tuple (or list, probably interchangeably) or a dictionary.  I've actually used sequences; the obvious drawback is that they require an agreed-upon convention as to the order of the components.  If I were doing it over again I would give very serious thought to using mappings, at least for things that hold key values. 
     26mjm: I'm not very familiar with Django's internals, but I suspect the first point's answer lies somewhere near the observation that a CK needs a compound data structure. 
    2727 
    28 Hmmm, that handwaving seems to much of what I have to say about the second point as well.  Not sure just how that can not break some code, though.  In my work using sequences I chose - not without some pain - to make all keys be sequences (and all keys' names were sequences, too).  This is at least consistent, and the changes were straightforward.  The fact that strings are themselves sequences actually made it more difficult, since overlooked cases would (often) misbehave rather than tossing an exception immediately - a point in favor of using mappings, maybe. 
     28bignose: In Python, the obvious choice is to make the primary key a ''set'' (a set of columns, in the model definition; a set of values, for a given instance). A compound key does not have an implied order, and a Python set is the obvious type to represent this. 
    2929 
    30 As for the last item, obviously a choice has to be made, and some existing cases will change.  I wouldn't call the result "broken", since only hand-generated admin URLs would have a problem (well, bookmarked ones, it occurs to me, if anyone actually does that).  Is anyone really concerned about making admin URLs slightly less easily hand-generatable? 
     30mjm: In my work using sequences I chose - not without some pain - to make all keys be sequences (and all keys' names were sequences, too).  This is at least consistent, and the changes were straightforward.  The fact that strings are themselves sequences actually made it more difficult, since overlooked cases would (often) misbehave rather than tossing an exception immediately - a point in favor of using mappings, maybe. 
     31 
     32mjm: As for admin URLs, obviously a choice has to be made, and some existing cases will change.  I wouldn't call the result "broken", since only hand-generated admin URLs would have a problem (well, bookmarked ones, it occurs to me, if anyone actually does that).  Is anyone really concerned about making admin URLs slightly less easily hand-generatable? 
     33 
     34bignose: The solution should at least retain the RESTful property that the URL is transparently correlated to what is being referenced. That's true of admin URLs now, and it is valuable that it be retained when changed to accommodate compound keys. 
    3135 
    3236= Proposed Solutions =  
    3337 
    3438Proposed solutions for the admin URL issue: 
    35 "/app_label/module/pk1,pk2/" -- ~~does not support text primary keys which include a comma.~~ URL encoding to the rescue! 
     39"/app_label/module/pk1,pk2/" -- ~~does not support text columns which include a comma.~~ URL encoding to the rescue! 
    3640 
    3741"/app_label/module/pk1/pk2/" -- ~~With support for a generic view on just "/pk1/" (like the date based generic view)~~