Opened 4 years ago

Closed 21 months ago

Last modified 7 months ago

#22669 closed Bug (fixed)

bulk_create with empty model fields fails on oracle

Reported by: sns1081@… Owned by: Michael Nacharov
Component: Database layer (models, ORM) Version: master
Severity: Normal Keywords: QuerySet.bulk_create
Cc: mnach@… Triage Stage: Ready for checkin
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

I did the minimal example for reproducing this bug. Model looks very simple:

class ProbaModel(models.Model):
    num = models.IntegerField(null=True, blank=True)

When i tried to run the following test code:

class ProbaTestCase(TestCase):
    def test_bulk_create(self):
        ProbaModel.objects.bulk_create(
            [
                ProbaModel(num=1),
                ProbaModel()
            ]
        )

I got an

Traceback (most recent call last):
  File "/home/sns/devel/proba/proba/app1/tests.py", line 10, in test_bulk_create
    ProbaModel()
  File "/home/sns/devel/proba/local/lib/python2.7/site-packages/django/db/models/manager.py", line 160, in bulk_create
    return self.get_queryset().bulk_create(*args, **kwargs)
  File "/home/sns/devel/proba/local/lib/python2.7/site-packages/django/db/models/query.py", line 359, in bulk_create
    self._batched_insert(objs_without_pk, fields, batch_size)
  File "/home/sns/devel/proba/local/lib/python2.7/site-packages/django/db/models/query.py", line 838, in _batched_insert
    using=self.db)
  File "/home/sns/devel/proba/local/lib/python2.7/site-packages/django/db/models/manager.py", line 232, in _insert
    return insert_query(self.model, objs, fields, **kwargs)
  File "/home/sns/devel/proba/local/lib/python2.7/site-packages/django/db/models/query.py", line 1514, in insert_query
    return query.get_compiler(using=using).execute_sql(return_id)
  File "/home/sns/devel/proba/local/lib/python2.7/site-packages/django/db/models/sql/compiler.py", line 903, in execute_sql
    cursor.execute(sql, params)
  File "/home/sns/devel/proba/local/lib/python2.7/site-packages/django/db/backends/util.py", line 53, in execute
    return self.cursor.execute(sql, params)
  File "/home/sns/devel/proba/local/lib/python2.7/site-packages/django/db/utils.py", line 99, in __exit__
    six.reraise(dj_exc_type, dj_exc_value, traceback)
  File "/home/sns/devel/proba/local/lib/python2.7/site-packages/django/db/backends/util.py", line 53, in execute
    return self.cursor.execute(sql, params)
  File "/home/sns/devel/proba/local/lib/python2.7/site-packages/django/db/backends/oracle/base.py", line 816, in execute
    return self.cursor.execute(query, self._param_generator(params))
DatabaseError: ORA-01790: expression must have same datatype as corresponding expression

This bug confirmed on Oracle 11.2.0.4 and 12. Complete project located on my GitHub: https://github.com/nsadovskiy/bulk_create_fail

Solution that works fo me is patch on django/db/backends/oracle/base.py:

*** base.py	2014-05-20 18:37:07.000000000 +0700
--- base.py	2014-05-20 18:41:45.000000000 +0700
***************
*** 467,509 ****
          return '%s_TR' % util.truncate_name(table, name_length).upper()
  
      def bulk_insert_sql(self, fields, num_values):
!         MAPPER = {
!             'CharField': 'to_nchar(%s)',
!             'CommaSeparatedIntegerField': 'to_nchar(%s)',
!             'EmailField': 'to_nchar(%s)',
!             'FileField': 'to_nchar(%s)',
!             'FilePathField': 'to_nchar(%s)',
!             'ImageField': 'to_nchar(%s)',
!             'SlugField': 'to_nchar(%s)',
!             'URLField': 'to_nchar(%s)',
! 
!             'IPAddressField': 'to_char(%s)',
!             'GenericIPAddressField': 'to_char(%s)',
! 
!             'AutoField': 'to_number(%s)',
!             'BigIntegerField': 'to_number(%s)',
!             'BooleanField': 'to_number(%s)',
!             'DecimalField': 'to_number(%s)',
!             'FloatField': 'to_number(%s)',
!             'IntegerField': 'to_number(%s)',
!             'NullBooleanField': 'to_number(%s)',
!             'PositiveIntegerField': 'to_number(%s)',
!             'PositiveSmallIntegerField': 'to_number(%s)',
!             'SmallIntegerField': 'to_number(%s)',
!             'ForeignKey': 'to_number(%s)',
!             'ManyToManyField': 'to_number(%s)',
!             'OneToOneField': 'to_number(%s)',
! 
!             'DateField': 'to_date(%s)',
! 
!             'DateTimeField': 'to_timestamp(%s)',
!             'TimeField': 'to_timestamp(%s)',
! 
!             'BinaryField': 'to_blob(%s)',
! 
!             'TextField': 'to_nclob(%s)'
!         }
!         items_sql = "SELECT %s FROM DUAL" % ", ".join([MAPPER.get(field.get_internal_type(), '%s') for field in fields])
          return " UNION ALL ".join([items_sql] * num_values)
  
  
--- 467,473 ----
          return '%s_TR' % util.truncate_name(table, name_length).upper()
  
      def bulk_insert_sql(self, fields, num_values):
!         items_sql = "SELECT %s FROM DUAL" % ", ".join(["%s"] * len(fields))
          return " UNION ALL ".join([items_sql] * num_values)
  

Attachments (2)

Screenshot from 2016-04-15 18:15:04.png (10.6 KB) - added by Michael Nacharov 3 years ago.
Current "Execution plan"
Screenshot from 2016-04-15 18:13:22.png (9.7 KB) - added by Michael Nacharov 3 years ago.
INSET ALL execution plan

Download all attachments as: .zip

Change History (22)

comment:1 Changed 4 years ago by Tim Graham

Needs tests: set
Triage Stage: UnreviewedAccepted

I'm not sure about the patch, but it at least needs tests.

comment:2 Changed 4 years ago by Michael Nacharov

Hi folks!

I faced with this issue too. As I understand this error happen because django used empty string(") to represent NULL value. It's OK when we work with single row. However in current bulk_create implementation oracle tries to do UNION ALL first and if there is no-string values in the same column there will be an error as described above.
I can see two approaches here:

  • We can change representation of NULL value in django-oracle backend to "NULL" from empty string
  • We can change insertion method from
    INSERT INTO mytable (column1, column2, column3) 
    SELECT 'val1.1', 'val1.2', 'val1.3' FROM dual
    UNION ALL
    SELECT 'val2.1', 'val2.2', 'val2.3' FROM dual
    UNION ALL
    SELECT 'val3.1', 'val3.2', 'val3.3' FROM dual
    
    to
    INSERT ALL
      INTO mytable (column1, column2, column3) VALUES ('val1.1', 'val1.2', 'val1.3')
      INTO mytable (column1, column2, column3) VALUES ('val2.1', 'val2.2', 'val2.3')
      INTO mytable (column1, column2, column3) VALUES ('val3.1', 'val3.2', 'val3.3')
    SELECT * FROM dual;
    
    http://www.techonthenet.com/oracle/questions/insert_rows.php

What do you think?

comment:3 Changed 3 years ago by Tim Graham

Keywords: QuerySet.bulk_create added

comment:4 Changed 3 years ago by Mohsen Bande

@mnach is there any way one can patch your suggested approaches, as a workaround for now?
@devs any progress on resolving this issue? really needed

Last edited 3 years ago by Mohsen Bande (previous) (diff)

comment:5 Changed 3 years ago by Tim Graham

As indicated by the flags on the ticket, we are waiting for a patch that includes tests.

comment:6 Changed 3 years ago by Michael Nacharov

Owner: changed from nobody to Michael Nacharov
Status: newassigned
Version: 1.6master

Changed 3 years ago by Michael Nacharov

Current "Execution plan"

Changed 3 years ago by Michael Nacharov

INSET ALL execution plan

comment:7 Changed 3 years ago by Michael Nacharov

Hi all!

I need a review! It's first time I did a pull request to django, sorry if I missed something..

Tested in Oracle 12c Enterprise Edition Release 12.1.0.2.0. Thanks to OracleTestSetup!


I replace single_table_insert clause to multi_table_insert in terms used in oracle documentation because it is more efficient acording to execution plan

single_table_insert
Current "Execution plan"

multi_table_insert
INSET ALL execution plan

but solution suggested by @sns1081 also works, and I am ready to do another PR if this will be better for future purposes

comment:8 Changed 3 years ago by Michael Nacharov

My investigation of this subject came to this approach:

cursor.setinputsizes(arg0 = cx_Oracle.NUMBER)
cursor.execute("select 1 from dual union all select :arg0 from dual", arg0
= None)

by Anthony Tuininga at cx_Oracle mailing list

method setinputsizes is a part of Python Database API Specification v2.0 and Python Database API Specification v1.0. Backends which support it:

So, I guess that SQLCompiller classes can contain method like "prepare_cursor" which gets arguments from, for example, a database wrapper and pass them to setinputsizes method (and maybe to setoutputsize if it will be needed)

I need to ask @devs: Is this approach better than using "cast(:var as number)" inside query ?

comment:9 Changed 2 years ago by Michael Nacharov

Owner: Michael Nacharov deleted
Status: assignednew

comment:10 Changed 2 years ago by Tim Court

This bug just bit me. I am working around it by replacing the bulk_create with ~20,000 individual inserts. It'll be slow but it's just a migration so I can be patient.

Still, +1 for getting this fixed! :)

comment:11 Changed 2 years ago by Michael Nacharov

Cc: mnach@… added
Owner: set to Michael Nacharov
Status: newassigned

comment:12 Changed 2 years ago by Tim Graham

Needs tests: unset
Patch needs improvement: set

A PR includes tests but some failures on Oracle remain.

comment:13 Changed 2 years ago by Tim Graham

Patch needs improvement: unset

Tests are passing now and the patch is ready for some feedback from Oracle users.

comment:14 Changed 2 years ago by Tim Graham

Patch needs improvement: set

I left some comments for improvement.

comment:15 Changed 21 months ago by Michael Nacharov

Patch needs improvement: unset

comment:16 Changed 21 months ago by Tim Graham

Triage Stage: AcceptedReady for checkin

comment:17 Changed 21 months ago by Tim Graham <timograham@…>

Resolution: fixed
Status: assignedclosed

In c4e2fc5d:

Fixed #22669 -- Fixed QuerySet.bulk_create() with empty model fields on Oracle.

comment:18 Changed 20 months ago by Tim Graham <timograham@…>

In 3effe3a9:

Refs #22669 -- Fixed bulk_create test if Pillow isn't installed.

comment:19 Changed 7 months ago by Robert Meyers

Any chance of including this fix in Django 1.11? bulk_create worked fine in our project in Django 1.10, but is failing in 1.11 with 'DatabaseError: ORA-00918: column ambiguously defined' for certain models. I'm still trying to determine what distinguishes the models where bulk_create fails from where it succeeds, but manually applying the fixes in this ticket (https://github.com/django/django/commit/c4e2fc5d9872c9a0c9c052a2e124f8a9b87de9b4) resolves the issue.

comment:20 Changed 7 months ago by Tim Graham

It's doubtful. It looks like this bug precedes Django 1.11 and is not a regression. If you have a slightly different issue, you might open a ticket and indicate where the regression was introduced, however, generally Django 1.11 is only receiving security and data loss fixes at this point.

Note: See TracTickets for help on using tickets.
Back to Top