Opened 2 years ago

Last modified 10 days ago

#22669 new Bug

bulk_create with empty model fields fails on oracle

Reported by: sns1081@… Owned by:
Component: Database layer (models, ORM) Version: master
Severity: Normal Keywords: QuerySet.bulk_create
Cc: Triage Stage: Accepted
Has patch: yes Needs documentation: no
Needs tests: yes Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

I did the minimal example for reproducing this bug. Model looks very simple:

class ProbaModel(models.Model):
    num = models.IntegerField(null=True, blank=True)

When i tried to run the following test code:

class ProbaTestCase(TestCase):
    def test_bulk_create(self):
        ProbaModel.objects.bulk_create(
            [
                ProbaModel(num=1),
                ProbaModel()
            ]
        )

I got an

Traceback (most recent call last):
  File "/home/sns/devel/proba/proba/app1/tests.py", line 10, in test_bulk_create
    ProbaModel()
  File "/home/sns/devel/proba/local/lib/python2.7/site-packages/django/db/models/manager.py", line 160, in bulk_create
    return self.get_queryset().bulk_create(*args, **kwargs)
  File "/home/sns/devel/proba/local/lib/python2.7/site-packages/django/db/models/query.py", line 359, in bulk_create
    self._batched_insert(objs_without_pk, fields, batch_size)
  File "/home/sns/devel/proba/local/lib/python2.7/site-packages/django/db/models/query.py", line 838, in _batched_insert
    using=self.db)
  File "/home/sns/devel/proba/local/lib/python2.7/site-packages/django/db/models/manager.py", line 232, in _insert
    return insert_query(self.model, objs, fields, **kwargs)
  File "/home/sns/devel/proba/local/lib/python2.7/site-packages/django/db/models/query.py", line 1514, in insert_query
    return query.get_compiler(using=using).execute_sql(return_id)
  File "/home/sns/devel/proba/local/lib/python2.7/site-packages/django/db/models/sql/compiler.py", line 903, in execute_sql
    cursor.execute(sql, params)
  File "/home/sns/devel/proba/local/lib/python2.7/site-packages/django/db/backends/util.py", line 53, in execute
    return self.cursor.execute(sql, params)
  File "/home/sns/devel/proba/local/lib/python2.7/site-packages/django/db/utils.py", line 99, in __exit__
    six.reraise(dj_exc_type, dj_exc_value, traceback)
  File "/home/sns/devel/proba/local/lib/python2.7/site-packages/django/db/backends/util.py", line 53, in execute
    return self.cursor.execute(sql, params)
  File "/home/sns/devel/proba/local/lib/python2.7/site-packages/django/db/backends/oracle/base.py", line 816, in execute
    return self.cursor.execute(query, self._param_generator(params))
DatabaseError: ORA-01790: expression must have same datatype as corresponding expression

This bug confirmed on Oracle 11.2.0.4 and 12. Complete project located on my GitHub: https://github.com/nsadovskiy/bulk_create_fail

Solution that works fo me is patch on django/db/backends/oracle/base.py:

*** base.py	2014-05-20 18:37:07.000000000 +0700
--- base.py	2014-05-20 18:41:45.000000000 +0700
***************
*** 467,509 ****
          return '%s_TR' % util.truncate_name(table, name_length).upper()
  
      def bulk_insert_sql(self, fields, num_values):
!         MAPPER = {
!             'CharField': 'to_nchar(%s)',
!             'CommaSeparatedIntegerField': 'to_nchar(%s)',
!             'EmailField': 'to_nchar(%s)',
!             'FileField': 'to_nchar(%s)',
!             'FilePathField': 'to_nchar(%s)',
!             'ImageField': 'to_nchar(%s)',
!             'SlugField': 'to_nchar(%s)',
!             'URLField': 'to_nchar(%s)',
! 
!             'IPAddressField': 'to_char(%s)',
!             'GenericIPAddressField': 'to_char(%s)',
! 
!             'AutoField': 'to_number(%s)',
!             'BigIntegerField': 'to_number(%s)',
!             'BooleanField': 'to_number(%s)',
!             'DecimalField': 'to_number(%s)',
!             'FloatField': 'to_number(%s)',
!             'IntegerField': 'to_number(%s)',
!             'NullBooleanField': 'to_number(%s)',
!             'PositiveIntegerField': 'to_number(%s)',
!             'PositiveSmallIntegerField': 'to_number(%s)',
!             'SmallIntegerField': 'to_number(%s)',
!             'ForeignKey': 'to_number(%s)',
!             'ManyToManyField': 'to_number(%s)',
!             'OneToOneField': 'to_number(%s)',
! 
!             'DateField': 'to_date(%s)',
! 
!             'DateTimeField': 'to_timestamp(%s)',
!             'TimeField': 'to_timestamp(%s)',
! 
!             'BinaryField': 'to_blob(%s)',
! 
!             'TextField': 'to_nclob(%s)'
!         }
!         items_sql = "SELECT %s FROM DUAL" % ", ".join([MAPPER.get(field.get_internal_type(), '%s') for field in fields])
          return " UNION ALL ".join([items_sql] * num_values)
  
  
--- 467,473 ----
          return '%s_TR' % util.truncate_name(table, name_length).upper()
  
      def bulk_insert_sql(self, fields, num_values):
!         items_sql = "SELECT %s FROM DUAL" % ", ".join(["%s"] * len(fields))
          return " UNION ALL ".join([items_sql] * num_values)
  

Attachments (2)

Screenshot from 2016-04-15 18:15:04.png (10.6 KB) - added by mnach 3 months ago.
Current "Execution plan"
Screenshot from 2016-04-15 18:13:22.png (9.7 KB) - added by mnach 3 months ago.
INSET ALL execution plan

Download all attachments as: .zip

Change History (11)

comment:1 Changed 2 years ago by timo

  • Needs documentation unset
  • Needs tests set
  • Patch needs improvement unset
  • Triage Stage changed from Unreviewed to Accepted

I'm not sure about the patch, but it at least needs tests.

comment:2 Changed 2 years ago by mnach

Hi folks!

I faced with this issue too. As I understand this error happen because django used empty string(") to represent NULL value. It's OK when we work with single row. However in current bulk_create implementation oracle tries to do UNION ALL first and if there is no-string values in the same column there will be an error as described above.
I can see two approaches here:

  • We can change representation of NULL value in django-oracle backend to "NULL" from empty string
  • We can change insertion method from
    INSERT INTO mytable (column1, column2, column3) 
    SELECT 'val1.1', 'val1.2', 'val1.3' FROM dual
    UNION ALL
    SELECT 'val2.1', 'val2.2', 'val2.3' FROM dual
    UNION ALL
    SELECT 'val3.1', 'val3.2', 'val3.3' FROM dual
    
    to
    INSERT ALL
      INTO mytable (column1, column2, column3) VALUES ('val1.1', 'val1.2', 'val1.3')
      INTO mytable (column1, column2, column3) VALUES ('val2.1', 'val2.2', 'val2.3')
      INTO mytable (column1, column2, column3) VALUES ('val3.1', 'val3.2', 'val3.3')
    SELECT * FROM dual;
    
    http://www.techonthenet.com/oracle/questions/insert_rows.php

What do you think?

comment:3 Changed 11 months ago by timgraham

  • Keywords QuerySet.bulk_create added

comment:4 Changed 5 months ago by mbande

@mnach is there any way one can patch your suggested approaches, as a workaround for now?
@devs any progress on resolving this issue? really needed

Last edited 5 months ago by mbande (previous) (diff)

comment:5 Changed 5 months ago by timgraham

As indicated by the flags on the ticket, we are waiting for a patch that includes tests.

comment:6 Changed 3 months ago by mnach

  • Owner changed from nobody to mnach
  • Status changed from new to assigned
  • Version changed from 1.6 to master

Changed 3 months ago by mnach

Current "Execution plan"

Changed 3 months ago by mnach

INSET ALL execution plan

comment:7 Changed 3 months ago by mnach

Hi all!

I need a review! It's first time I did a pull request to django, sorry if I missed something..

Tested in Oracle 12c Enterprise Edition Release 12.1.0.2.0. Thanks to OracleTestSetup!


I replace single_table_insert clause to multi_table_insert in terms used in oracle documentation because it is more efficient acording to execution plan

single_table_insert
Current "Execution plan"

multi_table_insert
INSET ALL execution plan

but solution suggested by @sns1081 also works, and I am ready to do another PR if this will be better for future purposes

comment:8 Changed 2 months ago by mnach

My investigation of this subject came to this approach:

cursor.setinputsizes(arg0 = cx_Oracle.NUMBER)
cursor.execute("select 1 from dual union all select :arg0 from dual", arg0
= None)

by Anthony Tuininga at cx_Oracle mailing list

method setinputsizes is a part of Python Database API Specification v2.0 and Python Database API Specification v1.0. Backends which support it:

So, I guess that SQLCompiller classes can contain method like "prepare_cursor" which gets arguments from, for example, a database wrapper and pass them to setinputsizes method (and maybe to setoutputsize if it will be needed)

I need to ask @devs: Is this approach better than using "cast(:var as number)" inside query ?

comment:9 Changed 10 days ago by mnach

  • Owner mnach deleted
  • Status changed from assigned to new
Note: See TracTickets for help on using tickets.
Back to Top