Opened 9 years ago

Closed 9 years ago

#24012 closed Uncategorized (wontfix)

models.UUIDfield fails on Oracle backend

Reported by: JorisBenschop Owned by: nobody
Component: Database layer (models, ORM) Version: dev
Severity: Normal Keywords: oracle, models.UUIDfield, 1.8-alpha
Cc: Shai Berger Triage Stage: Accepted
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

Hi, my employers uses RAW(16) as a pk for all tables (using the SYS_GUID() function). This gives blocking issues in Django as Django keeps on trying to decode() this field. However, using models.UUIDfield does not resovle the issue:

class Population(models.Model):
    population_id = models.UUIDField(primary_key=True)
    population_name = models.CharField(max_length=400)

x=Population.objects.all()[0]
Traceback (most recent call last):
File "/home/gfjom/wingide4/lib/src/debug/tserver/_sandbox.py", line 1, in <module>
    # Used internally for debug sandbox under external interpreter
  File "/home/gfjom/django-virtualenv/lib/python2.7/site-packages/Django-1.8-py2.7.egg/django/db/models/query.py", line 203, in __getitem__
    return list(qs)[0]
  File "/home/gfjom/django-virtualenv/lib/python2.7/site-packages/Django-1.8-py2.7.egg/django/db/models/query.py", line 164, in __iter__
    self._fetch_all()
  File "/home/gfjom/django-virtualenv/lib/python2.7/site-packages/Django-1.8-py2.7.egg/django/db/models/query.py", line 999, in _fetch_all
    self._result_cache = list(self.iterator())
  File "/home/gfjom/django-virtualenv/lib/python2.7/site-packages/Django-1.8-py2.7.egg/django/db/models/query.py", line 292, in iterator
    for row in compiler.results_iter():
  File "/home/gfjom/django-virtualenv/lib/python2.7/site-packages/Django-1.8-py2.7.egg/django/db/models/sql/compiler.py", line 753, in results_iter
    row = self.apply_converters(row, converters)
  File "/home/gfjom/django-virtualenv/lib/python2.7/site-packages/Django-1.8-py2.7.egg/django/db/models/sql/compiler.py", line 696, in apply_converters
    value = converter(value, field)
  File "/home/gfjom/django-virtualenv/lib/python2.7/site-packages/Django-1.8-py2.7.egg/django/db/backends/oracle/base.py", line 326, in convert_uuidfield_value
    value = uuid.UUID(value)
  File "/tools/general/app/python-2.7.5-rhel6/lib/python2.7/uuid.py", line 134, in __init__
    raise ValueError('badly formed hexadecimal UUID string')
ValueError: badly formed hexadecimal UUID string

Change History (13)

comment:1 by Tim Graham, 9 years ago

Django's UUIDField is documented as using char(32) for the database representation. I suspect if your own field's database representation differs, you'll need to subclass the field.

comment:2 by JorisBenschop, 9 years ago

This is not true. Uuidfield forces itself to uuid by checking isnstance( x, uuid.uuid). It does not accept any char(32) input. Why do you claim thís?

Sys_guid() is a standard recommendation for oracle primary keys for large implementatipn. This is not something that is unique for my case. I'd be happy to work on an extension if anyone can point me to the interoperability that is required, as i assume it needs mor work than just a change in type checking.

comment:3 by Tim Graham, 9 years ago

Here's the code to support the documentation I pointed out.

What is the value of the value that can't be converted in your traceback?

comment:4 by JorisBenschop, 9 years ago

If you look here: here you see that models.UUIDfield forces data to fit into uuid.UUID even when the DB allows other formats also. In my naive view, this check, and the requirement to store the content of the UUIDfield in a uuid.UUID object is the main issue.

the field DDL is a RAW(16) or RAW(32). However the data is generated by Oracle SYS_GUID(), which generates a raw(16) which in turn translates into a 32-character hex string.

SELECT SYS_GUID() FROM DUAL;
0a6ce74693a906b6e0535799030a228e 

The check by UUID is:

if len(hex) != 32:
   raise ValueError('badly formed hexadecimal UUID string')

if I do

>>>hex
'\nl\xe7F\x93\xa9\x06\xb6\xe0SW\x99\x03\n"\x8e'
>>>len(hex)
16
>>>binascii.b2a_hex(hex)
'0a6ce74693a906b6e0535799030a228e'
>>>len(binascii.b2a_hex(hex))
32

I'm probably making a mistake somewhere, but the reality remains that using SYS_GUID() is a very common generator for PK in Oracle.

Last edited 9 years ago by JorisBenschop (previous) (diff)

comment:5 by Tim Graham, 9 years ago

It seems like you expect your field to work fundamentally differently from how Django's UUIDField works which is why I suggested writing a custom field. I am not sure we can make our own field interoperable with the two different database and Python representations.

comment:6 by JorisBenschop, 9 years ago

Sorry for editing my reply (4) during your reply (5).
I think the difference are not that extreme with the exception of the size of the binary field.
I'm looking into writing a custom field. However, my preference would be to write it in a way that it can be used to include in the Django master, so that Django is again fully compatible with Oracle.
I am now using this documentation: https://docs.djangoproject.com/en/dev/howto/custom-model-fields/. If you have any other documentation, code directives or other modules that I will need to make changes to, please advice me.

thanks for your patience.

comment:7 by JorisBenschop, 9 years ago

THe thing is: I think that if I would change this code from

def convert_uuidfield_value(self, value, field):
    if value is not None:
        value = uuid.UUID(value)
    return value
def convert_uuidfield_value(self, value, field):
    if value is not None:
        value = uuid.UUID(bytes=value)
    return value

the problem would be solved, except that UUID decides to stick "-" inbetween the end value, breaking the lookup..

What would the collateral damage of such an approach be?

Last edited 9 years ago by JorisBenschop (previous) (diff)

comment:8 by Tim Graham, 9 years ago

I'd suggest trying to run Django's test suite on Oracle. You'll see errors in the test like this with that change:

======================================================================
ERROR: test_uuid_instance (model_fields.test_uuid.TestSaveLoad)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/tim/code/django/tests/model_fields/test_uuid.py", line 14, in test_uuid_instance
    loaded = UUIDModel.objects.get()
  File "/home/tim/code/django/django/db/models/manager.py", line 82, in manager_method
    return getattr(self.get_queryset(), name)(*args, **kwargs)
  File "/home/tim/code/django/django/db/models/query.py", line 367, in get
    num = len(clone)
  File "/home/tim/code/django/django/db/models/query.py", line 144, in __len__
    self._fetch_all()
  File "/home/tim/code/django/django/db/models/query.py", line 996, in _fetch_all
    self._result_cache = list(self.iterator())
  File "/home/tim/code/django/django/db/models/query.py", line 289, in iterator
    for row in compiler.results_iter():
  File "/home/tim/code/django/django/db/models/sql/compiler.py", line 757, in results_iter
    row = self.apply_converters(row, converters)
  File "/home/tim/code/django/django/db/models/sql/compiler.py", line 700, in apply_converters
    value = converter(value, field)
  File "/home/tim/code/django/django/db/backends/oracle/base.py", line 333, in convert_uuidfield_value
    value = uuid.UUID(bytes=value)
  File "/usr/lib/python2.7/uuid.py", line 144, in __init__
    raise ValueError('bytes is not a 16-char string')
ValueError: bytes is not a 16-char string

Django creates UUIDField as a VARCHAR2(32) in Oracle as shown by the code I linked in comment 3.

Also Django's field generates UUIDs in Python whereas your field uses the database to generate them. See ticket:19463#comment:15 for the background.

comment:9 by Tim Graham, 9 years ago

Resolution: invalid
Status: newclosed

Please reopen if you find some way to make this work, thanks!

comment:10 by Marc Tamlyn, 9 years ago

Resolution: invalid
Status: closednew

comment:11 by Marc Tamlyn, 9 years ago

Keywords: 1.8-alpha added
Triage Stage: UnreviewedAccepted

Given that oracle has a uuid generation function, and it seems possibly an accepted way of storing that (in a RAW(16)) it would be nice to store it in a binary field on Oracle.

This is somewhere between the situation for SQLite/Mysql (no good binary field or uuid support) and the situation for PostgreSQL (complete uuid support in the database driver), so may well need a new database feature.

If this change is to be implemented, it must be done before Django 1.8 as once we have existing fields out there we cannot change the field type. Unfortunately I do not have an Oracle test environment and have little impetus to work on this.

That said, I expect a full fix to look something like:

  • Change the database type for Oracle (backends/oracle/creation.py)
  • Change the converter function above
  • Add a new database feature (has_binary_uuid)
  • Check that feature in UUIDField.get_db_prep_value to convert the lookup value into bytes instead of a hex string

You should make sure all the tests in tests.model_fields.test_uuid pass.

comment:12 by JorisBenschop, 9 years ago

I am not allowed to create databases or even tables on this oracle backend, so my test are very limited. However, I created a custom Field that does all I need. It's far from complete though. Hope this helps. Let me knwo if I can assist anywhere. The new_sys_guid function breaks if the default DB is not oracle.

import binascii
def new_sys_guid(as_binary=False):
    '''
    this retreives a new SYS_GUID value from the database
    '''
    cursor = connection.cursor()
    cursor.execute("SELECT SYS_GUID() FROM DUAL")
    sysguidval= cursor.fetchone()[0]
    if as_binary:
        return(sysguidval)
    else:
        return rawtohex(sysguidval)
            
def hextoraw(hexval):
    ''' mimic the internal oracle hextoraw function'''
    return a2b_hex(hexval)

def rawtohex(rawval):
    ''' mimic the internal oracle rawtohex function'''
    return str(b2a_hex(rawval)).upper()   


class SYSGUID16Field(models.Field):
    default_error_messages = {
        'invalid': "'%(value)s' is not a valid SYS_GUID."
    }

    description = "A connector to the SYS_GUID() fields for Oracle Backends"

    def __init__(self, *args, **kwargs):
        kwargs['max_length'] = 16
        super(SYSGUID16Field, self).__init__(*args,**kwargs)

    def from_db_value(self, value, connection):
        #print 'call from_db_value %s' % value
        if value is None:
            return value
        return str(b2a_hex(value)).upper()

    def get_internal_type(self):
        #print 'call get_internal_value'
        return "RAWField"
    
    def db_type(self, connection):
        #print 'call db_type'
        return 'RAW(16)'    
    
    def to_python(self, value):
        print 'call to_python: %s' % value
        return value

    #def get_prep_value(self, value):
        #print 'call get_prep_value %s' % value
    #    return value
        
    def get_db_prep_value(self,value,connection,prepared):
        #print 'call get_db_prep_value'
        super(SYSGUID16Field,self).get_db_prep_value(value,connection,prepared)
        '''
        sometimes (specifically with foreign keys) the field content is not translated to hex.
        As there is no easy way to check we make the (dangerous) assumption that 
        all the binary fields will have at least one non-hex character.
        '''
        try:
            return connection.Database.Binary(a2b_hex(value))
        except TypeError:
            return connection.Database.Binary(value)
Last edited 9 years ago by JorisBenschop (previous) (diff)

comment:13 by Shai Berger, 9 years ago

Cc: Shai Berger added
Resolution: wontfix
Status: newclosed

Oracle's SYS_GUID generates globally-unique identifiers, but not RFC4122-compliant UUIDs. Added to Tim's comment:8 note that our UUIDField generates UUIDs in Python (where SYS_GUID is a database function), I am quite reluctant to make UUIDField compatible with SYS_GUID on Oracle.

A SYSGUIDField along the lines suggested sounds like a good idea -- except that I see no reason for it to be added to core. It can live happily as a 3rd-party component, at least until proven otherwise. Thus, I am closing this as "won't fix", but (anyone) feel free to reopen if you have new arguments for it.

With respect to the specific code you've written, I have two comments:

  1. I think if you inherited from BinaryField, rather than Field, you could drop all the type-conversion code. At a first glance, it seems the only feature of BinaryField you need to change is the underlying field type (it uses BLOB).
  1. I presume that you need the new_sys_guid() function in order to set the default on fields you use. I think that is suboptimal -- it requires a separate database round-trip. You should be able to change the field type to something like RAW(16) DEFAULT SYS_GUID(), and set a flag on it to make your field behave like AutoField -- have the value filled in only when inserted to the DB, and returned to the object by the insertion operation. If you find you need any changes in core to make that work, they would be considered favorably.

HTH, Shai.

Note: See TracTickets for help on using tickets.
Back to Top