Opened 2 months ago

Closed 8 weeks ago

#35777 closed New feature (wontfix)

MySQL: cannot add key limited index on TEXT columns

Reported by: Tobias Krönke Owned by: ishaan sangwan
Component: Database layer (models, ORM) Version:
Severity: Normal Keywords:
Cc: Triage Stage: Unreviewed
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

Heyo!

I have a model with a text field url = models.TextField() and I would like to add an index to my MySQL 8.0.36 DB to speed up queries like url__startswith='https://google.com%' -> WHERE url LIKE 'https://google.com%'. I can actually do it by creating an index manually like so:

CREATE INDEX `my_limited_prefix_index` ON `my_model_table` (url(100));

However, I would like to be able to do it with the ORM. Having it only in the migrations as a data migration has many downsides:

  • cannot run tests without running the migrations if i need the index in tests
  • makes squashing migrations more painful

But it seems, this is impossible to achieve with the Meta.indexes:

Index(fields=['url(100)'], name='my_limited_prefix_index')

-> denied by django with (models.E012) 'indexes' refers to the nonexistent field 'url(100)'

Index(expressions.RawSQL('url(100)', ()), name='my_limited_prefix_index')

-> denied by MySQL with syntax error (see https://github.com/sqlalchemy/sqlalchemy/issues/5462, django adds double parantheses which is only valid syntax for expressions, but here I need a column definition)

Index(expressions.RawSQL('LEFT(url,100)', ()), name='my_limited_prefix_index')

-> creates a useless index that cannot be used by MySQL for startswith (aka not sargable anymore).

I guess my favourite solution would be to allow the 2nd way with being able to turn off the surrounding expression parantheses.

Thx!

Change History (8)

comment:1 by Sarah Boyce, 2 months ago

Resolution: invalid
Status: newclosed

I feel this functionality exists and the way to go about this might be using Index.expressions and Left

in reply to:  1 comment:2 by Tobias Krönke, 2 months ago

Replying to Sarah Boyce:

I feel this functionality exists and the way to go about this might be using Index.expressions and Left

Thx. Unfortunately, your feeling is not accurate. Your suggestion results in the same index created as my 3rd approach in the OP. That index however is useless. It cannot be used by MySQL for speeding up startswith queries.

comment:3 by Tobias Krönke, 2 months ago

Resolution: invalid
Status: closednew

This is the kind of index I would like to be able to create: https://dev.mysql.com/doc/refman/8.4/en/create-index.html#create-index-column-prefixes. So called Column Prefix Key Parts.

comment:4 by Sarah Boyce, 2 months ago

Index.condition with Q(startswith='https://google.com') is probably also an option.

Version 0, edited 2 months ago by Sarah Boyce (next)

comment:5 by Tobias Krönke, 2 months ago

The url__startswith='https://google.com%' query was just an example. I would like to optimize all startswith queries on that field. I've created this crude workaround and implemented my own Index:

class ColumnPrefixIndex:
    max_name_length = 60
    contains_expressions = False
    fields_orders = []
    include = []
    expressions = []
    fields = []
    condition = None

    def __init__(self, column, prefix, name):
        self.prefix = prefix
        self.column = column
        self.name = name

    def create_sql(self, model, schema_editor, using="", **kwargs):
        table = model._meta.db_table
        sql = f'CREATE INDEX {self.name} ON {table} ({self.column}({self.prefix}));'
        kwargs['sql'] = sql

        return schema_editor._create_index_sql(
            model,
            name=self.name,
            using=using,
            **kwargs,
        )

    def deconstruct(self):
        kwargs = {'column': self.column, 'prefix': self.prefix, 'name': self.name}
        path = "%s.%s" % (self.__class__.__module__, self.__class__.__name__)
        return path, self.expressions, kwargs

    def clone(self):
        return self.__class__(self.column, self.prefix, self.name)
Last edited 2 months ago by Tobias Krönke (previous) (diff)

comment:6 by ishaan sangwan, 2 months ago

Owner: set to ishaan sangwan
Status: newassigned

comment:7 by ishaan sangwan, 2 months ago

Triage Stage: UnreviewedAccepted

in reply to:  7 comment:8 by Natalia Bidart, 8 weeks ago

Component: MigrationsDatabase layer (models, ORM)
Resolution: wontfix
Status: assignedclosed
Triage Stage: AcceptedUnreviewed
Type: UncategorizedNew feature
Version: 5.0

Hello ishaan-sangwan and Tobias!

Thank you for your interest in making Django better. I'm afraid we can't accept this ticket yet, we may need to gather some consensus from the community before going ahead and adding this new index. To me, this feels a niche use case and thus I don't think this applies to the broader ecosystem, and Django is a framework designed to offer robust and accurate solutions for common scenarios.

If we would add a new index class, we should consider supporting all the DB backends that Django supports, not just MySQL. If we can't support all backends, I'm not sure it' a good idea to add this to Django. From a quick search, I'm not sure if this is possible?

To fully evaluate this, the recommended path forward is to first propose and discuss the idea with the community and gain consensus. To do that, please consider starting a new conversation on the Django Forum, where you'll reach a broader audience and receive additional feedback. Please be sure to add details about how this new index would look like for the other DB backends.

I'll close the ticket for now, but if the community agrees with the proposal, please return to this ticket and reference the forum discussion so we can re-open it. For more information, please refer to the documented guidelines for requesting features.

Note: See TracTickets for help on using tickets.
Back to Top