#22669 closed Bug (fixed)
bulk_create with empty model fields fails on oracle
Reported by: | Owned by: | Michael Nacharov | |
---|---|---|---|
Component: | Database layer (models, ORM) | Version: | dev |
Severity: | Normal | Keywords: | QuerySet.bulk_create |
Cc: | mnach@… | Triage Stage: | Ready for checkin |
Has patch: | yes | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description
I did the minimal example for reproducing this bug. Model looks very simple:
class ProbaModel(models.Model): num = models.IntegerField(null=True, blank=True)
When i tried to run the following test code:
class ProbaTestCase(TestCase): def test_bulk_create(self): ProbaModel.objects.bulk_create( [ ProbaModel(num=1), ProbaModel() ] )
I got an
Traceback (most recent call last): File "/home/sns/devel/proba/proba/app1/tests.py", line 10, in test_bulk_create ProbaModel() File "/home/sns/devel/proba/local/lib/python2.7/site-packages/django/db/models/manager.py", line 160, in bulk_create return self.get_queryset().bulk_create(*args, **kwargs) File "/home/sns/devel/proba/local/lib/python2.7/site-packages/django/db/models/query.py", line 359, in bulk_create self._batched_insert(objs_without_pk, fields, batch_size) File "/home/sns/devel/proba/local/lib/python2.7/site-packages/django/db/models/query.py", line 838, in _batched_insert using=self.db) File "/home/sns/devel/proba/local/lib/python2.7/site-packages/django/db/models/manager.py", line 232, in _insert return insert_query(self.model, objs, fields, **kwargs) File "/home/sns/devel/proba/local/lib/python2.7/site-packages/django/db/models/query.py", line 1514, in insert_query return query.get_compiler(using=using).execute_sql(return_id) File "/home/sns/devel/proba/local/lib/python2.7/site-packages/django/db/models/sql/compiler.py", line 903, in execute_sql cursor.execute(sql, params) File "/home/sns/devel/proba/local/lib/python2.7/site-packages/django/db/backends/util.py", line 53, in execute return self.cursor.execute(sql, params) File "/home/sns/devel/proba/local/lib/python2.7/site-packages/django/db/utils.py", line 99, in __exit__ six.reraise(dj_exc_type, dj_exc_value, traceback) File "/home/sns/devel/proba/local/lib/python2.7/site-packages/django/db/backends/util.py", line 53, in execute return self.cursor.execute(sql, params) File "/home/sns/devel/proba/local/lib/python2.7/site-packages/django/db/backends/oracle/base.py", line 816, in execute return self.cursor.execute(query, self._param_generator(params)) DatabaseError: ORA-01790: expression must have same datatype as corresponding expression
This bug confirmed on Oracle 11.2.0.4 and 12. Complete project located on my GitHub: https://github.com/nsadovskiy/bulk_create_fail
Solution that works fo me is patch on django/db/backends/oracle/base.py:
*** base.py 2014-05-20 18:37:07.000000000 +0700 --- base.py 2014-05-20 18:41:45.000000000 +0700 *************** *** 467,509 **** return '%s_TR' % util.truncate_name(table, name_length).upper() def bulk_insert_sql(self, fields, num_values): ! MAPPER = { ! 'CharField': 'to_nchar(%s)', ! 'CommaSeparatedIntegerField': 'to_nchar(%s)', ! 'EmailField': 'to_nchar(%s)', ! 'FileField': 'to_nchar(%s)', ! 'FilePathField': 'to_nchar(%s)', ! 'ImageField': 'to_nchar(%s)', ! 'SlugField': 'to_nchar(%s)', ! 'URLField': 'to_nchar(%s)', ! ! 'IPAddressField': 'to_char(%s)', ! 'GenericIPAddressField': 'to_char(%s)', ! ! 'AutoField': 'to_number(%s)', ! 'BigIntegerField': 'to_number(%s)', ! 'BooleanField': 'to_number(%s)', ! 'DecimalField': 'to_number(%s)', ! 'FloatField': 'to_number(%s)', ! 'IntegerField': 'to_number(%s)', ! 'NullBooleanField': 'to_number(%s)', ! 'PositiveIntegerField': 'to_number(%s)', ! 'PositiveSmallIntegerField': 'to_number(%s)', ! 'SmallIntegerField': 'to_number(%s)', ! 'ForeignKey': 'to_number(%s)', ! 'ManyToManyField': 'to_number(%s)', ! 'OneToOneField': 'to_number(%s)', ! ! 'DateField': 'to_date(%s)', ! ! 'DateTimeField': 'to_timestamp(%s)', ! 'TimeField': 'to_timestamp(%s)', ! ! 'BinaryField': 'to_blob(%s)', ! ! 'TextField': 'to_nclob(%s)' ! } ! items_sql = "SELECT %s FROM DUAL" % ", ".join([MAPPER.get(field.get_internal_type(), '%s') for field in fields]) return " UNION ALL ".join([items_sql] * num_values) --- 467,473 ---- return '%s_TR' % util.truncate_name(table, name_length).upper() def bulk_insert_sql(self, fields, num_values): ! items_sql = "SELECT %s FROM DUAL" % ", ".join(["%s"] * len(fields)) return " UNION ALL ".join([items_sql] * num_values)
Attachments (2)
Change History (22)
comment:1 by , 10 years ago
Needs tests: | set |
---|---|
Triage Stage: | Unreviewed → Accepted |
comment:2 by , 10 years ago
Hi folks!
I faced with this issue too. As I understand this error happen because django used empty string(") to represent NULL value. It's OK when we work with single row. However in current bulk_create implementation oracle tries to do UNION ALL first and if there is no-string values in the same column there will be an error as described above.
I can see two approaches here:
- We can change representation of NULL value in django-oracle backend to "NULL" from empty string
- We can change insertion method from
INSERT INTO mytable (column1, column2, column3) SELECT 'val1.1', 'val1.2', 'val1.3' FROM dual UNION ALL SELECT 'val2.1', 'val2.2', 'val2.3' FROM dual UNION ALL SELECT 'val3.1', 'val3.2', 'val3.3' FROM dual
toINSERT ALL INTO mytable (column1, column2, column3) VALUES ('val1.1', 'val1.2', 'val1.3') INTO mytable (column1, column2, column3) VALUES ('val2.1', 'val2.2', 'val2.3') INTO mytable (column1, column2, column3) VALUES ('val3.1', 'val3.2', 'val3.3') SELECT * FROM dual;
http://www.techonthenet.com/oracle/questions/insert_rows.php
What do you think?
comment:3 by , 9 years ago
Keywords: | QuerySet.bulk_create added |
---|
comment:4 by , 9 years ago
@mnach is there any way one can patch your suggested approaches, as a workaround for now
@devs any progress on resolving this issue? really needed
comment:5 by , 9 years ago
As indicated by the flags on the ticket, we are waiting for a patch that includes tests.
comment:6 by , 9 years ago
Owner: | changed from | to
---|---|
Status: | new → assigned |
Version: | 1.6 → master |
by , 9 years ago
Attachment: | Screenshot from 2016-04-15 18:15:04.png added |
---|
Current "Execution plan"
by , 9 years ago
Attachment: | Screenshot from 2016-04-15 18:13:22.png added |
---|
INSET ALL execution plan
comment:7 by , 9 years ago
Hi all!
I need a review! It's first time I did a pull request to django, sorry if I missed something..
Tested in Oracle 12c Enterprise Edition Release 12.1.0.2.0. Thanks to OracleTestSetup!
I replace single_table_insert clause to multi_table_insert in terms used in oracle documentation because it is more efficient acording to execution plan
but solution suggested by @sns1081 also works, and I am ready to do another PR if this will be better for future purposes
comment:8 by , 9 years ago
My investigation of this subject came to this approach:
cursor.setinputsizes(arg0 = cx_Oracle.NUMBER) cursor.execute("select 1 from dual union all select :arg0 from dual", arg0 = None)
by Anthony Tuininga at cx_Oracle mailing list
method setinputsizes is a part of Python Database API Specification v2.0 and Python Database API Specification v1.0. Backends which support it:
- cx_Oracle > 2.4 (release notes)
- pysycopg > 1.99.12 (github tag 1_99_12)
- mysqlclient > 0.9.2 (github "blame" on this method)
- sqlite - don't actually know, but sqlite3 in cpython 2.6.8 supports this method
So, I guess that SQLCompiller classes can contain method like "prepare_cursor" which gets arguments from, for example, a database wrapper and pass them to setinputsizes method (and maybe to setoutputsize if it will be needed)
I need to ask @devs: Is this approach better than using "cast(:var as number)" inside query ?
comment:9 by , 8 years ago
Owner: | removed |
---|---|
Status: | assigned → new |
comment:10 by , 8 years ago
This bug just bit me. I am working around it by replacing the bulk_create with ~20,000 individual inserts. It'll be slow but it's just a migration so I can be patient.
Still, +1 for getting this fixed! :)
comment:11 by , 8 years ago
Cc: | added |
---|---|
Owner: | set to |
Status: | new → assigned |
comment:12 by , 8 years ago
Needs tests: | unset |
---|---|
Patch needs improvement: | set |
A PR includes tests but some failures on Oracle remain.
comment:13 by , 8 years ago
Patch needs improvement: | unset |
---|
Tests are passing now and the patch is ready for some feedback from Oracle users.
comment:15 by , 8 years ago
Patch needs improvement: | unset |
---|
comment:16 by , 8 years ago
Triage Stage: | Accepted → Ready for checkin |
---|
comment:19 by , 7 years ago
Any chance of including this fix in Django 1.11? bulk_create worked fine in our project in Django 1.10, but is failing in 1.11 with 'DatabaseError: ORA-00918: column ambiguously defined' for certain models. I'm still trying to determine what distinguishes the models where bulk_create fails from where it succeeds, but manually applying the fixes in this ticket (https://github.com/django/django/commit/c4e2fc5d9872c9a0c9c052a2e124f8a9b87de9b4) resolves the issue.
comment:20 by , 7 years ago
It's doubtful. It looks like this bug precedes Django 1.11 and is not a regression. If you have a slightly different issue, you might open a ticket and indicate where the regression was introduced, however, generally Django 1.11 is only receiving security and data loss fixes at this point.
I'm not sure about the patch, but it at least needs tests.