Opened 12 years ago

Closed 10 years ago

Last modified 10 years ago

#19463 closed New feature (fixed)

Add UUID Field to core

Reported by: Thomas Güttler
Owned by: Marc Tamlyn
Component: Database layer (models, ORM)
Version: dev
Severity: Normal
Keywords:
Cc: trbs@…, matt@…, mike@…, Marc Aymerich, cyphase@…, jonathan+django@…, tomek@…, saxix.rome@…, loic@…, galuszkak@…, ashwoods, anubhav9042@…, lukas-hetzenecker
Triage Stage: Ready for checkin
Has patch: yes
Needs documentation: no
Needs tests: no
Patch needs improvement: no
Easy pickings: no
UI/UX: no

Description

on django-dev Dec 2012

If someone can come up with a good patch I'd be fine considering it for core.

Jacob (Kaplan-Moss)

Related: #4682 was closed five years ago.

I (Thomas Güttler) want to moderate this ticket, but won't create a patch.

Change History (26)

comment:2 by Claude Paroz, 12 years ago

Note that in databases other than PostgreSQL, it might be desirable to store the UUID value internally as binary rather than as a char, both for performance reasons and for compatibility with Postgres' uuid type (stored as a 128-bit binary value). So we might need to solve #2417 beforehand...
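
To make the size difference concrete, here is a minimal sketch using only Python's standard uuid module (it is not Django code, and the example UUID is arbitrary). It shows the three representations a backend could store and that each round-trips to the same value.

```python
import uuid

u = uuid.UUID("f47ac10b-58cc-4372-a567-0e02b2c3d479")

as_binary = u.bytes   # 16 raw bytes, i.e. 128 bits
as_hex = u.hex        # 32 hex characters, no hyphens
as_text = str(u)      # 36 characters, with hyphens

print(len(as_binary), len(as_hex), len(as_text))  # 16 32 36

# Every representation converts back to the same UUID.
assert uuid.UUID(bytes=as_binary) == u
assert uuid.UUID(hex=as_hex) == u
assert uuid.UUID(as_text) == u
```

A binary column needs half the space of the hex form, which is the performance argument made above.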

comment:3 by Matthew Schinckel, 12 years ago

Cc: matt@… added

One thing I found with my UUIDField is that I needed to supply code to enable South to (a) handle this field type on migrations, and (b) prevent it trying to create a default at the time a migration is run.

Specifically, https://bitbucket.org/schinckel/django-uuidfield/commits/69f7c0cdf91d28da2cceaff6f46ece34f733b560 shows how to do this.

I would assume we wouldn't want any code related to providing data to South in Django core, so perhaps we would need to ensure that South releases a version around the same time this patch is included.

comment:4 by Mike Fogel, 12 years ago

Cc: mike@… added

comment:5 by Anssi Kääriäinen, 12 years ago

I've been thinking that we would likely want to have a new field type: GeneratedField. This is like AutoField - the field gets a value on save() if it doesn't already have a value, and this field type is always a primary key (I am not 100% sure of the PK requirement, but it could simplify things). GeneratedField would have a backing field (the db storage type) and some generator, where the generator could fetch the value from DB using RETURNING, could generate the value in Python (like default, but with access to connection), or it could fetch the value after save from the DB (AutoField does this using select currval(someseq) on some backends).

I think such a field type would cover a lot of requests we have currently - unsigned serial fields, tiny/big/...integer serial fields, UUID fields (no matter what the UUID generator function is), and likely some more.

I don't know how hard such a field will be to write, or what the exact API should be - so this is mostly hand waving at the moment. Still, it seems there are only two public API places where this would affect current code - model.save() and bulk_create(), so it seems this should not be totally out of reach as a feature.
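
A rough, Django-free sketch of the idea, assuming the simplest case where the value is generated in Python. The names GeneratedValue, Record, SerialRecord and UUIDRecord are hypothetical and exist only for illustration; the RETURNING and fetch-after-save generators mentioned above are not modelled.

```python
import itertools
import uuid


class GeneratedValue:
    """Hypothetical stand-in for the GeneratedField idea; not a Django API."""

    def __init__(self, generator):
        self.generator = generator  # any zero-argument callable

    def fill(self, instance, attname):
        # Generate only when the instance has no value yet.
        if getattr(instance, attname) is None:
            setattr(instance, attname, self.generator())


class Record:
    generated = None  # subclasses pick a generator

    def __init__(self, pk=None):
        self.pk = pk

    def save(self):
        self.generated.fill(self, "pk")
        # A real save() would write the row to the database here.


_serial = itertools.count(1)


class SerialRecord(Record):
    generated = GeneratedValue(lambda: next(_serial))


class UUIDRecord(Record):
    generated = GeneratedValue(uuid.uuid4)
```

Calling save() on a SerialRecord without a pk assigns the next integer, while a UUIDRecord gets a fresh uuid4; an instance created with an explicit pk keeps it.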

comment:6 by Anssi Kääriäinen, 12 years ago

Triage Stage: Unreviewed → Accepted

Quoting Jacob from the recent django-developers discussion: "If someone can come up with a good patch I'd be fine considering it for core.".

So, marking as accepted based on that.

comment:7 by Marc Aymerich, 12 years ago

Cc: Marc Aymerich added

comment:8 by Sharif Naas, 12 years ago

Cc: cyphase@… added

comment:9 by Jonathan Leroy, 12 years ago

Cc: jonathan+django@… added

comment:10 by Tomek Paczkowski, 11 years ago

Cc: tomek@… added

comment:11 by Stefano Apostolico, 11 years ago

Cc: saxix.rome@… added

comment:12 by loic84, 11 years ago

Cc: loic@… added

Big +1 on @akaariai's GeneratedField idea.

For example, I make extensive use of what I call a "readable unique ID", similar to YouTube video IDs (e.g. "sc5vraPpTcA"), for which I made a custom Field. It functions like a UUID but trades the creation convenience (guaranteed uniqueness) for usage convenience (being able to read it out loud, shorter URLs, etc.). A GeneratedField would allow me to implement that cleanly.

That said, some databases have native support for UUIDs and it's pretty much the standard for sharding, so we could have the generic GeneratedField and a UUIDField subclass.

I'd work on a patch with some guidance from @akaariai.
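
One possible shape for the "readable unique ID" described in this comment, as a sketch only: it assumes 8 random bytes encoded as URL-safe base64 (which gives 11 characters) and a retry loop, because uniqueness is not guaranteed. The function name allocate_id and the in-memory set standing in for a database uniqueness check are hypothetical.

```python
import secrets


def allocate_id(existing, nbytes=8, max_attempts=5):
    """Return a short random ID that is not already in `existing`."""
    for _ in range(max_attempts):
        # 8 random bytes encode to 11 URL-safe base64 characters.
        candidate = secrets.token_urlsafe(nbytes)
        if candidate not in existing:
            existing.add(candidate)
            return candidate
    raise RuntimeError("could not find a free ID; consider a longer ID")


taken = set()
video_id = allocate_id(taken)
print(video_id, len(video_id))
```

The retry loop is the price of the shorter ID: unlike a UUID, a collision is plausible enough that it has to be handled.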


comment:13 by Kamil Gałuszka, 11 years ago

Cc: galuszkak@… added
Version: 1.4 → master

comment:14 by ashwoods, 11 years ago

Cc: ashwoods added

comment:15 by Marc Tamlyn, 11 years ago

Owner: changed from nobody to Marc Tamlyn
Status: new → assigned

For postgres at least, this will form part of my upcoming work on django.contrib.postgres. Support for bigserial is also likely to come in with that, so a more general base class for AutoField might be useful. That said, a UUIDField does not always want to be autogenerated (unlike an autoincrementing field, which probably should be). It is a reasonable use case for an API client to generate a UUID (using the uuid4 approach, which has a very high probability of avoiding clashes) and expect that to be saved by a Django-backed API.

Supporting a simple UUIDField(default=uuid.uuid4) should be a good start.
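
The behaviour of a callable default such as default=uuid.uuid4 can be illustrated without Django. The sketch below uses a standard-library dataclass as an analogy (the Item class is hypothetical): the default is called once per new instance, and a value supplied by the caller, such as an API client, takes precedence.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class Item:
    name: str
    # Called once per new instance, like a callable field default.
    id: uuid.UUID = field(default_factory=uuid.uuid4)


# No ID supplied: one is generated.
server_made = Item(name="created by the server")

# ID generated by an API client and passed in: it is kept as-is.
client_id = uuid.uuid4()
client_made = Item(name="created by an API client", id=client_id)

print(server_made.id, client_made.id == client_id)
```

Passing the function itself rather than uuid.uuid4() matters: calling it at class definition time would give every instance the same ID.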

comment:16 by japrogramer@…, 11 years ago

I have written a UUID field for Django that supports 1.7 and its features (migrations, serialization, etc.).
The field can be set with a UUID instance or with a str, hyphenated or not. It can also be created from bytes if that is needed. It can auto-generate the UUID (i.e. uuid4) and supports the other variants that Python's uuid module offers (1, 3, 4, 5). Queries work with either str or UUID instances but not with bytes, because who is ever going to query by the bytes, am I right? https://github.com/japrogramer/django-uuid-contour
P.S.
Many tests are included, and it supports Python 3.4 ;)
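
For reference, the input forms and variants mentioned in this comment map onto Python's standard uuid module as sketched below. This shows the stdlib calls only, not the API of the linked package; the namespace and name passed to uuid3 and uuid5 are arbitrary examples.

```python
import uuid

# The same UUID built from a hyphenated str, a plain hex str, and raw bytes.
hyphenated = uuid.UUID("f47ac10b-58cc-4372-a567-0e02b2c3d479")
plain = uuid.UUID("f47ac10b58cc4372a5670e02b2c3d479")
from_bytes = uuid.UUID(bytes=hyphenated.bytes)
assert hyphenated == plain == from_bytes

# The generator functions offered by the uuid module.
v1 = uuid.uuid1()                                    # host and timestamp based
v3 = uuid.uuid3(uuid.NAMESPACE_DNS, "example.org")   # MD5 of namespace + name
v4 = uuid.uuid4()                                    # random
v5 = uuid.uuid5(uuid.NAMESPACE_DNS, "example.org")   # SHA-1 of namespace + name

print([u.version for u in (v1, v3, v4, v5)])  # [1, 3, 4, 5]
```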

comment:17 by Marc Tamlyn, 10 years ago

Has patch: set

comment:18 by Marc Tamlyn, 10 years ago

PR updated to be in core rather than contrib.postgres.

comment:19 by Anssi Kääriäinen, 10 years ago

Patch needs improvement: set

There seems to be one issue that needs solving: should we use SubfieldBase or not? SubfieldBase is used so that the field's to_python method is called any time a value is assigned to a model instance. In particular this happens when setting a value in model.__init__. So, if a database value is just bytes or a string, then when the model is initialized from the database we correctly get a UUID instance in the uuid field because to_python is called.

There isn't any field in core that uses SubfieldBase. There are some disadvantages when using SubfieldBase:

  1. It doesn't work when using .values('uuid_field')
  2. There is a small performance penalty when setting the field value; in particular, model.__init__ will be 10-20% slower for each field that uses SubfieldBase.
  3. Fields with SubfieldBase work a bit differently from other core fields. SubfieldBase fields do value conversion on assignment, so:
     >>> s = SomeModel()
     >>> s.uuid_field = "f47ac10b-58cc-4372-a567-0e02b2c3d479"
     >>> s.uuid_field
     UUID('f47ac10b-58cc-4372-a567-0e02b2c3d479')  # when using SubfieldBase
     'f47ac10b-58cc-4372-a567-0e02b2c3d479'        # when not using SubfieldBase

Now, one could consider this to be a feature. But, no other field in core or contrib does this kind of conversion on assignment, so we should avoid this if possible.
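
The conversion on assignment described above can be imitated with a plain Python descriptor. This is a sketch of the mechanism only, not SubfieldBase itself; ConvertOnAssign, to_uuid and the two classes are hypothetical names.

```python
import uuid


class ConvertOnAssign:
    """Hypothetical descriptor: runs a to_python-style conversion on every assignment."""

    def __init__(self, to_python):
        self.to_python = to_python

    def __set_name__(self, owner, name):
        self.name = name

    def __get__(self, instance, owner=None):
        if instance is None:
            return self
        return instance.__dict__[self.name]

    def __set__(self, instance, value):
        # The conversion (and its cost) happens on each assignment.
        instance.__dict__[self.name] = self.to_python(value)


def to_uuid(value):
    return value if isinstance(value, uuid.UUID) else uuid.UUID(value)


class WithConversion:
    uuid_field = ConvertOnAssign(to_uuid)


class WithoutConversion:
    pass  # plain attribute: keeps whatever was assigned


a = WithConversion()
a.uuid_field = "f47ac10b-58cc-4372-a567-0e02b2c3d479"
b = WithoutConversion()
b.uuid_field = "f47ac10b-58cc-4372-a567-0e02b2c3d479"

print(type(a.uuid_field).__name__, type(b.uuid_field).__name__)  # UUID str
```

The extra call on every assignment is also where the performance penalty in point 2 comes from.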

Other ways forward are:

  1. Add a more generic field value conversion framework: add field.from_db_value(value, connection). This is a larger amount of work, but is needed in any case. This solution would work in .values(), and it would also be considerably faster than the current SubfieldBase way of doing things. Unfortunately this means that we can't merge this ticket before we have added the from_db_value method.
  2. Use backend specific converters. Unfortunately it seems one needs to create custom compilers for each backend (see django/db/backends/oracle/compiler.py for example)

So, in the end there seem to be just two choices: wait for field.from_db_value() or use SubfieldBase (with the possibility of removing use of SubfieldBase when field.from_db_value is introduced).

I'll mark this as "patch needs improvement", for lack of a better marker, to signal that it isn't ready for merge until we agree on a solution to the SubfieldBase issue.
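
A sketch of how the from_db_value(value, connection) hook proposed in this comment might behave, assuming the database hands back a 32-character hex string. SketchUUIDField and load_values are hypothetical and stand in for a field and a row-loading path; this is not Django's implementation.

```python
import uuid


class SketchUUIDField:
    """Hypothetical field-like object illustrating the proposed hook."""

    def from_db_value(self, value, connection):
        # Convert once, when the value is read from the database.
        if value is None:
            return value
        return uuid.UUID(value)


def load_values(rows, field, connection=None):
    # Stand-in for any path that reads raw rows, including a values()-style query.
    return [field.from_db_value(raw, connection) for (raw,) in rows]


rows = [("f47ac10b58cc4372a5670e02b2c3d479",), (None,)]
converted = load_values(rows, SketchUUIDField())
print(converted)
```

Because the conversion is tied to reading from the database rather than to attribute assignment, it applies to raw row access as well, and plain assignment stays untouched.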

comment:20 by Anubhav Joshi, 10 years ago

Cc: @… added

comment:21 by Anubhav Joshi, 10 years ago

Cc: anubhav9042@… added; @… removed

comment:22 by Tim Graham, 10 years ago

Patch needs improvement: unset
Triage Stage: Accepted → Ready for checkin

comment:23 by lukas-hetzenecker, 10 years ago

Cc: lukas-hetzenecker added

comment:24 by Marc Tamlyn <marc.tamlyn@…>, 10 years ago

Resolution: fixed
Status: assigned → closed

In ed7821231b7dbf34a6c8ca65be3b9bcbda4a0703:

Fixed #19463 -- Added UUIDField

Uses native support in postgres, and char(32) on other backends.
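
What the char(32) storage amounts to can be sketched with the standard sqlite3 module. The table and column names are made up for the example, and the conversion is done by hand here rather than by a field class.

```python
import sqlite3
import uuid

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE item (id char(32) PRIMARY KEY, name text)")

original = uuid.uuid4()
# Store the 32-character hex form, without hyphens.
conn.execute("INSERT INTO item (id, name) VALUES (?, ?)", (original.hex, "example"))

# Read it back and rebuild the UUID object.
(stored,) = conn.execute("SELECT id FROM item").fetchone()
restored = uuid.UUID(stored)

print(len(stored), restored == original)  # 32 True
conn.close()
```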

comment:25 by Mathieu Dupuy, 10 years ago

What about MariaDB, which now supports UUID?

comment:26 by Simon Charette, 10 years ago

@deronnax please open a new feature request instead of commenting on a closed ticket.
