Opened 9 years ago

Last modified 14 months ago

#23964 new Bug

Support for Meta.constraints validation in BaseModelFormSet.validate_unique().

Reported by: Jon Dufresne Owned by: nobody
Component: Forms Version: 1.7
Severity: Normal Keywords:
Cc: Adam Johnson, Hannes Ljungberg Triage Stage: Accepted
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

Due to MySQL collation rules, two strings differing only by case are considered equal ("foo" is equal to "FOO"). If a database field is unique, upon inserting two rows where the unique field differs only by case, there will be a unique constraint violation.

If a modelformset_factory() is used to create a form set for a model with a unique field, the user can enter two strings differing only by case. The BaseModelFormSet.validate_unique() does not consider MySQL's collation rules and considers these two strings different. Upon insertion in the database, there is a constraint violation. The following new unit test demonstrates how this could happen. Code: <https://github.com/jdufresne/django/tree/mysql-formset-duplicates>.

Not sure the correct approach to fix this as other database backends will happily accept strings differing only by case. So the formset needs to have some sense of the database backend or whether or not unique fields should be case sensitive.

diff --git a/tests/forms_tests/models.py b/tests/forms_tests/models.py
index 82fa005..45d9680 100644
--- a/tests/forms_tests/models.py
+++ b/tests/forms_tests/models.py
@@ -110,3 +110,7 @@ class Cheese(models.Model):
 
 class Article(models.Model):
     content = models.TextField()
+
+
+class Tag(models.Model):
+    name = models.CharField(max_length=100, unique=True)
diff --git a/tests/forms_tests/tests/test_formsets.py b/tests/forms_tests/tests/test_formsets.py
index 94e2704..2e77afd 100644
--- a/tests/forms_tests/tests/test_formsets.py
+++ b/tests/forms_tests/tests/test_formsets.py
@@ -6,8 +6,10 @@ import datetime
 from django.forms import (CharField, DateField, FileField, Form, IntegerField,
     SplitDateTimeField, ValidationError, formsets)
 from django.forms.formsets import BaseFormSet, formset_factory
+from django.forms.models import modelformset_factory
 from django.forms.utils import ErrorList
 from django.test import TestCase
+from ..models import Tag
 
 
 class Choice(Form):
@@ -1213,3 +1215,22 @@ class TestEmptyFormSet(TestCase):
         class FileForm(Form):
             file = FileField()
         self.assertTrue(formset_factory(FileForm, extra=0)().is_multipart())
+
+
+TagFormSet = modelformset_factory(Tag, fields=['name'])
+
+
+class ModelFormSetTestCase(TestCase):
+    def test_duplicates_mixed_case(self):
+        formset = TagFormSet({
+            'form-TOTAL_FORMS': 2,
+            'form-INITIAL_FORMS': 0,
+            'form-0-name': 'TAG',
+            'form-1-name': 'tag',
+        })
+        self.assertTrue(formset.is_valid())
+        formset.save()
+        self.assertQuerysetEqual(
+            Tag.objects.all(),
+            ['TAG'],
+            lambda o: o.name)

When running this test with a MySQL database the test fails with:

======================================================================
ERROR: test_duplicates_mixed_case (forms_tests.tests.test_formsets.ModelFormSetTestCase)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/jon/devel/django/tests/forms_tests/tests/test_formsets.py", line 1232, in test_duplicates_mixed_case
    formset.save()
  File "/home/jon/devel/django/django/forms/models.py", line 638, in save
    return self.save_existing_objects(commit) + self.save_new_objects(commit)
  File "/home/jon/devel/django/django/forms/models.py", line 769, in save_new_objects
    self.new_objects.append(self.save_new(form, commit=commit))
  File "/home/jon/devel/django/django/forms/models.py", line 621, in save_new
    return form.save(commit=commit)
  File "/home/jon/devel/django/django/forms/models.py", line 461, in save
    construct=False)
  File "/home/jon/devel/django/django/forms/models.py", line 103, in save_instance
    instance.save()
  File "/home/jon/devel/django/django/db/models/base.py", line 694, in save
    force_update=force_update, update_fields=update_fields)
  File "/home/jon/devel/django/django/db/models/base.py", line 722, in save_base
    updated = self._save_table(raw, cls, force_insert, force_update, using, update_fields)
  File "/home/jon/devel/django/django/db/models/base.py", line 803, in _save_table
    result = self._do_insert(cls._base_manager, using, fields, update_pk, raw)
  File "/home/jon/devel/django/django/db/models/base.py", line 842, in _do_insert
    using=using, raw=raw)
  File "/home/jon/devel/django/django/db/models/manager.py", line 86, in manager_method
    return getattr(self.get_queryset(), name)(*args, **kwargs)
  File "/home/jon/devel/django/django/db/models/query.py", line 952, in _insert
    return query.get_compiler(using=using).execute_sql(return_id)
  File "/home/jon/devel/django/django/db/models/sql/compiler.py", line 930, in execute_sql
    cursor.execute(sql, params)
  File "/home/jon/devel/django/django/db/backends/utils.py", line 65, in execute
    return self.cursor.execute(sql, params)
  File "/home/jon/devel/django/django/db/utils.py", line 95, in __exit__
    six.reraise(dj_exc_type, dj_exc_value, traceback)
  File "/home/jon/devel/django/django/db/backends/utils.py", line 65, in execute
    return self.cursor.execute(sql, params)
  File "/home/jon/devel/django/django/db/backends/mysql/base.py", line 126, in execute
    return self.cursor.execute(query, args)
  File "/usr/lib64/python2.7/site-packages/MySQLdb/cursors.py", line 174, in execute
    self.errorhandler(self, exc, value)
  File "/usr/lib64/python2.7/site-packages/MySQLdb/connections.py", line 36, in defaulterrorhandler
    raise errorclass, errorvalue
IntegrityError: (1062, "Duplicate entry 'tag' for key 'name'")

----------------------------------------------------------------------

Change History (11)

comment:1 by Jon Dufresne, 9 years ago

I am able to workaround this in my project by changing the line in BaseModelFormSet.validate_unique() from:

row_data = tuple([form.cleaned_data[field] for field in unique_check if field in form.cleaned_data])

To (adding .lower()):

row_data = tuple([form.cleaned_data[field].lower() for field in unique_check if field in form.cleaned_data])

I feel this is a hack and wrongly assumes that the database is case-insensitive. This is OK for my project, but not as a proper fix.

I'd appreciate some guidance on how this should be fixed in the general case to handle case-insensitive databases such as MySQL as well as case-sensitive databases such as PostgreSQL.

comment:2 by Tim Graham, 9 years ago

Do you really want that collation? Have you read the Django docs on the topic? I am not sure a specific mention of this problem is needed, but if you care to draft something, I'll happily review it.

It seems like a lot of complexity for Django to try to inspect MySQL's collation and alter its behavior based on that, and I'm not sure it's worth it.

comment:3 by Jon Dufresne, 9 years ago

Cc: jon.dufresne@… added

Do you really want that collation? Have you read the ​Django docs on the topic?

Thank you for pointing me to that documentation. I have read through it.

By "that collation" do you mean the default of utf8_general_ci?

To be honest, I find it a bit strange that Django is saying it can't work properly with MySQL--a supported database bacekend--in the default MySQL configuration. It seems to be suggesting I change my database collation to utf8_bin (which I can do if I must) but changing to that collation creates different set of issues (bytestrings vs Unicode strings). So I guess I have to choose which issues I can tolerate most.

I am not sure a specific mention of this problem is needed, but if you care to draft something, I'll happily review it.

Sure, I'll take a stab at it.

It seems like a lot of complexity for Django to try to inspect MySQL's collation and alter its behavior based on that, and I'm not sure it's worth it.

I agree we want to avoid inspecting MySQL's collation.

This is why I was wondering if there was any guidance to fix this in a general way that didn't look like a complete hack around MySQL specific behavior. Basically, I'm asking for some ideas. Once those ideas are presented, I don't mind doing the work to get them in place.

If it isn't worth it, that means this default database backend will knowingly break when used with useful Django features. Those useful features: model formsets and unique constraints.

comment:4 by Tim Graham <timograham@…>, 9 years ago

In 4d27d72d149b714431b77f2f15bad1591a9602b7:

Refs #23964 -- Added warning about case-insensitive, unique fields used with formsets

comment:5 by Tim Graham <timograham@…>, 9 years ago

In 4862eb6e5b6dae63e5c156fa1095c5b87503620b:

[1.7.x] Refs #23964 -- Added warning about case-insensitive, unique fields used with formsets

Backport of 4d27d72d149b714431b77f2f15bad1591a9602b7 from master

comment:6 by Tim Graham <timograham@…>, 9 years ago

In 30a12d6ca66ae2558740ef05f4a0b44566cfefa3:

[1.6.x] Refs #23964 -- Added warning about case-insensitive, unique fields used with formsets

Backport of 4d27d72d149b714431b77f2f15bad1591a9602b7 from master

comment:7 by Tim Graham, 9 years ago

Triage Stage: UnreviewedAccepted

Thanks for the documentation patch. I don't see an easy way to solve the problem generally. An option on the formset won't work if multiple databases are in use and third-party apps cannot know what database they are running on. A model method like get_normalized_unique_value() could inspect the database indicated by the write router and handle database/table specific behavior. The formset could call that then. Perhaps try raising the idea on the DevelopersMailingList to get other opinions. I'll mark the ticket accepted for now, and we can close it later if we decide it isn't worthwhile.

comment:8 by Adam Johnson, 4 years ago

Cc: Adam Johnson added

comment:9 by Mariusz Felisiak, 19 months ago

Summary: MySQL: BaseModelFormSet.validate_unique() fails for mixed case;Support for Meta.constraints validation in BaseModelFormSet.validate_unique().

As far as I'm aware adding support for Meta.constraints validation in BaseModelFormSet.validate_unique() should solve this and related issues (#28405). It should be feasible to pass all values from the formset via a constraint definition and check for duplicated rows on the database level.

comment:10 by Mariusz Felisiak, 14 months ago

Cc: Hannes Ljungberg added

#34296 was a duplicate for UniqueConstraint that only use F() expressions.

comment:11 by Jon Dufresne, 14 months ago

Cc: jon.dufresne@… removed
Note: See TracTickets for help on using tickets.
Back to Top