Opened 4 years ago

Closed 2 years ago

Last modified 35 hours ago

#33507 closed New feature (fixed)

Use native UUID data type on MariaDB 10.7+

Reported by: Mariusz Felisiak Owned by: raydeal
Component: Database layer (models, ORM) Version: 4.0
Severity: Normal Keywords: mariadb
Cc: Adam Johnson Triage Stage: Ready for checkin
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description (last modified by Carlton Gibson)

MariaDB 10.7 introduces UUID data type, see docs and blog post. We should correct the has_native_uuid_field feature flag and use uuid instead of char(32).

Change History (25)

comment:1 by Mariusz Felisiak, 4 years ago

Description: modified (diff)

comment:2 by Carlton Gibson, 4 years ago

Description: modified (diff)
Triage Stage: UnreviewedAccepted

comment:3 by Simon Charette, 4 years ago

Small note that we should make sure that changing this flag won't break installs with UUIDField backend by char(32) columns as it might be impractical for large installs to rebuild their whole table primary keys otherwise.

If we don't document a clear upgrade path I foresee a few reports when users attempt to add a ForeignKey referencing a model with a UUIDField primary key that is backed by a char(32) and hit a MariaDB error telling them that a foreign constraint cannot be created from a uuid to a char(32) column.

Last edited 4 years ago by Simon Charette (previous) (diff)

in reply to:  3 comment:4 by Mariusz Felisiak, 4 years ago

Replying to Simon Charette:

Small note that we should make sure that changing this flag won't break installs with UUIDField backend by char(32) columns as it might be impractical for large installs to rebuild their whole table primary keys otherwise.

If we don't document a clear upgrade path I foresee a few reports when users attempt to add a ForeignKey referencing a model with a UUIDField primary key that is backed by a char(32) and hit a MariaDB error telling them that a foreign constraint cannot be created from a uuid to a char(32) column.

According to the blog post, it should be possible to change a data type without manual data conversion:

With these basic conversion rules, you can migrate from your existing CHAR/VARCHAR/TEXT hexadecimal text or BINARY/VARBINARY/BLOB encoded to UUID using:

CREATE TABLE t1 (id BINARY(16));
ALTER TABLE t1 MODIFY COLUMN id UUID;

comment:5 by Adam Johnson, 4 years ago

That ALTER TABLE still requires a table rebuild, which is impractical or impossible for large installations’ PK's. But maybe the "upgrade path" there would be "create a field subclass always backed by char(32)".

comment:6 by Simon Charette, 4 years ago

With these basic conversion rules, you can migrate from your existing CHAR/VARCHAR/TEXT hexadecimal text or BINARY/VARBINARY/BLOB encoded to UUID using:

Unfortunately, this statement doesn't account for the fact that UUIDs are often used as primary keys and thus referenced by other tables where this approach simply won't work as you get in a chicken-egg problem with foreign references. To perform this properly a manual dropping of references, followed by type alterations of all tables involved, and final step to recreate constraints is required which possibly will require table rebuilds as Adam pointed out.

But maybe the "upgrade path" there would be "create a field subclass always backed by char(32)".

That seems like the most suitable upgrade path. It will likely require documenting that all the generated migration operations should be wrapped in SeparateDatabaseAndState unless we adapt the schema editor to avoid ALTER when the generated SQL column of from_field and to_field remains the same.

comment:7 by Adam Johnson, 4 years ago

unless we adapt the schema editor to avoid ALTER when the generated SQL column of from_field and to_field remains the same.

This sounds like generally a good idea any way!

comment:8 by Mariusz Felisiak, 4 years ago

Agreed, this can be tricky. I'd really like to come up with a doable pattern that could be used in the future, because other emulated fields may be implemented in the databases.

comment:9 by raydeal, 4 years ago

Owner: changed from nobody to raydeal
Status: newassigned

comment:10 by raydeal, 4 years ago

I changed type to UUID but it has broken prefetch_related tests. I have tried in myslq/operations.py add UUIDField to cast_data_types and override unification_cast_sql but it didn't help.
The issue is because value of related field is still visible as str type not as UUID type. I suspect that there is lack of cast type to uuid in query, but I can't find place where to fix it.
Work in progress :)

Last edited 3 years ago by raydeal (previous) (diff)

comment:11 by raydeal, 3 years ago

Has patch: set

comment:12 by Mariusz Felisiak, 3 years ago

Patch needs improvement: set

comment:13 by raydeal, 3 years ago

Patch needs improvement: unset

in reply to:  7 comment:14 by Mariusz Felisiak, 3 years ago

Replying to Adam Johnson:

unless we adapt the schema editor to avoid ALTER when the generated SQL column of from_field and to_field remains the same.

This sounds like generally a good idea any way!

Looks like it already works. For example, I've changed models.CharField(max_length=50, unique=True) to models.SlugField(unique=True) on PostgreSQL and Django didn't alter the db type:

--
-- Alter field field_2 on mytestmodel
--
-- (no-op)

comment:15 by Mariusz Felisiak, 3 years ago

Needs documentation: set
Patch needs improvement: set

comment:16 by raydeal, 2 years ago

Needs documentation: unset
Patch needs improvement: unset

comment:17 by Mariusz Felisiak, 2 years ago

Patch needs improvement: set

comment:18 by raydeal, 2 years ago

Patch needs improvement: unset

comment:19 by Mariusz Felisiak, 2 years ago

Triage Stage: AcceptedReady for checkin

comment:20 by Mariusz Felisiak <felisiak.mariusz@…>, 2 years ago

Resolution: fixed
Status: assignedclosed

In 7cd187a:

Fixed #33507 -- Used UUID data type on MariaDB 10.7+.

Co-Authored-By: Mariusz Felisiak <felisiak.mariusz@…>

comment:21 by GitHub <noreply@…>, 2 years ago

In 9c37103:

Refs #33507 -- Doc'd using UUID data type on MariaDB 10.7+ in UUIDField docs.

Follow up to 7cd187a5ba58d7769039f487faeb9a5a2ff05540.

comment:22 by Klaas van Schelven, 5 weeks ago

I just ran into this here.

Caused by not carefully reading the release notes, but quite hard to debug.

Having actually read those notes now, I'm left to wonder:

  1. what will happen with existing databases that are from a post-5.0 world? They will have been created with uuid columns. Will the suggested migration path accidentally brake _those_?
  2. the suggested path is "varchars everywhere". But my project is deployed on many DB backends, also postgres. So the suggested solution is a downgrade for those and or create the risk of breakage on those...

I realise these questions may be "more on me than on Django" but OTOH the present issue seems like exactly the spot where people with such problems would gather.

comment:23 by Klaas van Schelven, 5 weeks ago

I also think there's another problem with this change: someone upgrading mariadb _itself_ (to 10.7 from a lower version) would have to go through the exact same migration instructions. These are currently undocumented, and even if they were it is unlikely that someone would actually find them (because one would be looking at the maria instructions, not the django ones)

comment:24 by Christoph Bülter, 3 days ago

I would like to add some thoughts about my recent experience using MariaDB 10.11 and upgrading from Django 3 to 5:

  • This however means developers will have to remember using that custom field, so if someone forgets and accidentally creates a new models.UUIDField later on, there will be a mix of CHAR(32) and UUID columns in the database, which is probably not ideal. I would like to reiterate on an alternative upgrade path:
  • In previous comments, valid concerns have been raised about potential issues with large tables, table rebuilds and UUIDFields used as primary-keys or in foreign-key relations or constraints. Our migration scenario is much simpler, though: We have lots of UUIDFields, but the tables are relatively small and they are not used as primary-keys/relations/constraints.
  • Instead of using that aforementioned Char32UUIDField, it made more sense for us to just convert all existing models.UUIDField from CHAR(32) to UUID in a Django data migration, and then use the builtin models.UUIDField moving forward, so everything is based on the UUID type. The data migration could look like:
class Migration(migrations.Migration):
    ...
    operations = [
        # Update UUIDFields from CHAR(32) to UUID
        migrations.RunSQL(
            sql="""
            ALTER TABLE myapp_table1 MODIFY field1 UUID NOT NULL;
            ALTER TABLE myapp_table2 MODIFY field2 UUID NOT NULL;
            ALTER TABLE myapp_table3 MODIFY field3 UUID NOT NULL;
            """,
        )
    ]

I believe for our scenario this is easier to maintain longterm, since there's nothing special to remember about and no danger of introducing mixed types.

I think this is worth mentioning in the docs as an alternative option: https://github.com/django/django/pull/20091

Last edited 35 hours ago by Christoph Bülter (previous) (diff)

comment:25 by Mariusz Felisiak, 2 days ago

Personally, I'd try to avoid multiple migrations paths in our release notes, it can be confusing. Your comment in this ticket should be enough to help users who struggle.

Note: See TracTickets for help on using tickets.
Back to Top