Opened 11 years ago

Closed 8 years ago

Last modified 7 years ago

#22669 closed Bug (fixed)

bulk_create with empty model fields fails on oracle

Reported by: sns1081@… Owned by: Michael Nacharov
Component: Database layer (models, ORM) Version: dev
Severity: Normal Keywords: QuerySet.bulk_create
Cc: mnach@… Triage Stage: Ready for checkin
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

I did the minimal example for reproducing this bug. Model looks very simple:

class ProbaModel(models.Model):
    num = models.IntegerField(null=True, blank=True)

When i tried to run the following test code:

class ProbaTestCase(TestCase):
    def test_bulk_create(self):
        ProbaModel.objects.bulk_create(
            [
                ProbaModel(num=1),
                ProbaModel()
            ]
        )

I got an

Traceback (most recent call last):
  File "/home/sns/devel/proba/proba/app1/tests.py", line 10, in test_bulk_create
    ProbaModel()
  File "/home/sns/devel/proba/local/lib/python2.7/site-packages/django/db/models/manager.py", line 160, in bulk_create
    return self.get_queryset().bulk_create(*args, **kwargs)
  File "/home/sns/devel/proba/local/lib/python2.7/site-packages/django/db/models/query.py", line 359, in bulk_create
    self._batched_insert(objs_without_pk, fields, batch_size)
  File "/home/sns/devel/proba/local/lib/python2.7/site-packages/django/db/models/query.py", line 838, in _batched_insert
    using=self.db)
  File "/home/sns/devel/proba/local/lib/python2.7/site-packages/django/db/models/manager.py", line 232, in _insert
    return insert_query(self.model, objs, fields, **kwargs)
  File "/home/sns/devel/proba/local/lib/python2.7/site-packages/django/db/models/query.py", line 1514, in insert_query
    return query.get_compiler(using=using).execute_sql(return_id)
  File "/home/sns/devel/proba/local/lib/python2.7/site-packages/django/db/models/sql/compiler.py", line 903, in execute_sql
    cursor.execute(sql, params)
  File "/home/sns/devel/proba/local/lib/python2.7/site-packages/django/db/backends/util.py", line 53, in execute
    return self.cursor.execute(sql, params)
  File "/home/sns/devel/proba/local/lib/python2.7/site-packages/django/db/utils.py", line 99, in __exit__
    six.reraise(dj_exc_type, dj_exc_value, traceback)
  File "/home/sns/devel/proba/local/lib/python2.7/site-packages/django/db/backends/util.py", line 53, in execute
    return self.cursor.execute(sql, params)
  File "/home/sns/devel/proba/local/lib/python2.7/site-packages/django/db/backends/oracle/base.py", line 816, in execute
    return self.cursor.execute(query, self._param_generator(params))
DatabaseError: ORA-01790: expression must have same datatype as corresponding expression

This bug confirmed on Oracle 11.2.0.4 and 12. Complete project located on my GitHub: https://github.com/nsadovskiy/bulk_create_fail

Solution that works fo me is patch on django/db/backends/oracle/base.py:

*** base.py	2014-05-20 18:37:07.000000000 +0700
--- base.py	2014-05-20 18:41:45.000000000 +0700
***************
*** 467,509 ****
          return '%s_TR' % util.truncate_name(table, name_length).upper()
  
      def bulk_insert_sql(self, fields, num_values):
!         MAPPER = {
!             'CharField': 'to_nchar(%s)',
!             'CommaSeparatedIntegerField': 'to_nchar(%s)',
!             'EmailField': 'to_nchar(%s)',
!             'FileField': 'to_nchar(%s)',
!             'FilePathField': 'to_nchar(%s)',
!             'ImageField': 'to_nchar(%s)',
!             'SlugField': 'to_nchar(%s)',
!             'URLField': 'to_nchar(%s)',
! 
!             'IPAddressField': 'to_char(%s)',
!             'GenericIPAddressField': 'to_char(%s)',
! 
!             'AutoField': 'to_number(%s)',
!             'BigIntegerField': 'to_number(%s)',
!             'BooleanField': 'to_number(%s)',
!             'DecimalField': 'to_number(%s)',
!             'FloatField': 'to_number(%s)',
!             'IntegerField': 'to_number(%s)',
!             'NullBooleanField': 'to_number(%s)',
!             'PositiveIntegerField': 'to_number(%s)',
!             'PositiveSmallIntegerField': 'to_number(%s)',
!             'SmallIntegerField': 'to_number(%s)',
!             'ForeignKey': 'to_number(%s)',
!             'ManyToManyField': 'to_number(%s)',
!             'OneToOneField': 'to_number(%s)',
! 
!             'DateField': 'to_date(%s)',
! 
!             'DateTimeField': 'to_timestamp(%s)',
!             'TimeField': 'to_timestamp(%s)',
! 
!             'BinaryField': 'to_blob(%s)',
! 
!             'TextField': 'to_nclob(%s)'
!         }
!         items_sql = "SELECT %s FROM DUAL" % ", ".join([MAPPER.get(field.get_internal_type(), '%s') for field in fields])
          return " UNION ALL ".join([items_sql] * num_values)
  
  
--- 467,473 ----
          return '%s_TR' % util.truncate_name(table, name_length).upper()
  
      def bulk_insert_sql(self, fields, num_values):
!         items_sql = "SELECT %s FROM DUAL" % ", ".join(["%s"] * len(fields))
          return " UNION ALL ".join([items_sql] * num_values)
  

Attachments (2)

Screenshot from 2016-04-15 18:15:04.png (10.6 KB ) - added by Michael Nacharov 9 years ago.
Current "Execution plan"
Screenshot from 2016-04-15 18:13:22.png (9.7 KB ) - added by Michael Nacharov 9 years ago.
INSET ALL execution plan

Download all attachments as: .zip

Change History (22)

comment:1 by Tim Graham, 10 years ago

Needs tests: set
Triage Stage: UnreviewedAccepted

I'm not sure about the patch, but it at least needs tests.

comment:2 by Michael Nacharov, 10 years ago

Hi folks!

I faced with this issue too. As I understand this error happen because django used empty string(") to represent NULL value. It's OK when we work with single row. However in current bulk_create implementation oracle tries to do UNION ALL first and if there is no-string values in the same column there will be an error as described above.
I can see two approaches here:

  • We can change representation of NULL value in django-oracle backend to "NULL" from empty string
  • We can change insertion method from
    INSERT INTO mytable (column1, column2, column3) 
    SELECT 'val1.1', 'val1.2', 'val1.3' FROM dual
    UNION ALL
    SELECT 'val2.1', 'val2.2', 'val2.3' FROM dual
    UNION ALL
    SELECT 'val3.1', 'val3.2', 'val3.3' FROM dual
    
    to
    INSERT ALL
      INTO mytable (column1, column2, column3) VALUES ('val1.1', 'val1.2', 'val1.3')
      INTO mytable (column1, column2, column3) VALUES ('val2.1', 'val2.2', 'val2.3')
      INTO mytable (column1, column2, column3) VALUES ('val3.1', 'val3.2', 'val3.3')
    SELECT * FROM dual;
    
    http://www.techonthenet.com/oracle/questions/insert_rows.php

What do you think?

comment:3 by Tim Graham, 9 years ago

Keywords: QuerySet.bulk_create added

comment:4 by Mohsen Bande, 9 years ago

@mnach is there any way one can patch your suggested approaches, as a workaround for now
@devs any progress on resolving this issue? really needed

Version 0, edited 9 years ago by Mohsen Bande (next)

comment:5 by Tim Graham, 9 years ago

As indicated by the flags on the ticket, we are waiting for a patch that includes tests.

comment:6 by Michael Nacharov, 9 years ago

Owner: changed from nobody to Michael Nacharov
Status: newassigned
Version: 1.6master

by Michael Nacharov, 9 years ago

Current "Execution plan"

by Michael Nacharov, 9 years ago

INSET ALL execution plan

comment:7 by Michael Nacharov, 9 years ago

Hi all!

I need a review! It's first time I did a pull request to django, sorry if I missed something..

Tested in Oracle 12c Enterprise Edition Release 12.1.0.2.0. Thanks to OracleTestSetup!


I replace single_table_insert clause to multi_table_insert in terms used in oracle documentation because it is more efficient acording to execution plan

single_table_insert
Current "Execution plan"

multi_table_insert
INSET ALL execution plan

but solution suggested by @sns1081 also works, and I am ready to do another PR if this will be better for future purposes

comment:8 by Michael Nacharov, 9 years ago

My investigation of this subject came to this approach:

cursor.setinputsizes(arg0 = cx_Oracle.NUMBER)
cursor.execute("select 1 from dual union all select :arg0 from dual", arg0
= None)

by Anthony Tuininga at cx_Oracle mailing list

method setinputsizes is a part of Python Database API Specification v2.0 and Python Database API Specification v1.0. Backends which support it:

So, I guess that SQLCompiller classes can contain method like "prepare_cursor" which gets arguments from, for example, a database wrapper and pass them to setinputsizes method (and maybe to setoutputsize if it will be needed)

I need to ask @devs: Is this approach better than using "cast(:var as number)" inside query ?

comment:9 by Michael Nacharov, 8 years ago

Owner: Michael Nacharov removed
Status: assignednew

comment:10 by Tim Court, 8 years ago

This bug just bit me. I am working around it by replacing the bulk_create with ~20,000 individual inserts. It'll be slow but it's just a migration so I can be patient.

Still, +1 for getting this fixed! :)

comment:11 by Michael Nacharov, 8 years ago

Cc: mnach@… added
Owner: set to Michael Nacharov
Status: newassigned

comment:12 by Tim Graham, 8 years ago

Needs tests: unset
Patch needs improvement: set

A PR includes tests but some failures on Oracle remain.

comment:13 by Tim Graham, 8 years ago

Patch needs improvement: unset

Tests are passing now and the patch is ready for some feedback from Oracle users.

comment:14 by Tim Graham, 8 years ago

Patch needs improvement: set

I left some comments for improvement.

comment:15 by Michael Nacharov, 8 years ago

Patch needs improvement: unset

comment:16 by Tim Graham, 8 years ago

Triage Stage: AcceptedReady for checkin

comment:17 by Tim Graham <timograham@…>, 8 years ago

Resolution: fixed
Status: assignedclosed

In c4e2fc5d:

Fixed #22669 -- Fixed QuerySet.bulk_create() with empty model fields on Oracle.

comment:18 by Tim Graham <timograham@…>, 8 years ago

In 3effe3a9:

Refs #22669 -- Fixed bulk_create test if Pillow isn't installed.

comment:19 by Robert Meyers, 7 years ago

Any chance of including this fix in Django 1.11? bulk_create worked fine in our project in Django 1.10, but is failing in 1.11 with 'DatabaseError: ORA-00918: column ambiguously defined' for certain models. I'm still trying to determine what distinguishes the models where bulk_create fails from where it succeeds, but manually applying the fixes in this ticket (https://github.com/django/django/commit/c4e2fc5d9872c9a0c9c052a2e124f8a9b87de9b4) resolves the issue.

comment:20 by Tim Graham, 7 years ago

It's doubtful. It looks like this bug precedes Django 1.11 and is not a regression. If you have a slightly different issue, you might open a ticket and indicate where the regression was introduced, however, generally Django 1.11 is only receiving security and data loss fixes at this point.

Note: See TracTickets for help on using tickets.
Back to Top