Opened 3 years ago

Closed 3 years ago

#26610 closed New feature (fixed)

Add a citext field for contrib.postgres

Reported by: Shadow Owned by: nobody
Component: contrib.postgres Version: 1.11
Severity: Release blocker Keywords: index charfield textfield case insensitive optimisation db-indexes
Cc: aksheshdoshi@… Triage Stage: Ready for checkin
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

I was analysing some of our queries - and one of the things I noticed was that if you use iexact, then the database needs to do a full table scan because it is unable to use the index I had on the field.

class Example(models.Model):
    text = models.CharField(index=True)

Example.objects.get(text="meow")  # This will hit the index
Example.objects.get(text__iexact="meow")  # This does not.

I am aware that this kind of index can be added manually by RunSQL migrations, but I think this is probably a common enough scenario that the ORM should attempt to handle it natively.

# PROPOSED CHANGES - THIS DOES NOT ACTUALLY WORK
class Example(models.Model):
    text = models.CharField(insensitive_index=True)

Example.objects.get(text="meow")  # This probably won't hit the index (might depend on the db)
Example.objects.get(text__iexact="meow")  # This would.

What are your thoughts?

Change History (18)

comment:1 Changed 3 years ago by Akshesh Doshi

Cc: aksheshdoshi@… added
Triage Stage: UnreviewedAccepted
Version: 1.9master

comment:2 Changed 3 years ago by Aymeric Augustin

If we're talking about Postgres, I suggest adding a case-insensitive, case-preserving char/text field backed by the citext type to django.contrib.postgres. This should cover most use cases.

comment:3 Changed 3 years ago by Shadow

My current backend is indeed postgres. In my specific usage case is important. It's just when users are searching for models in my search box the filter needs to be case insensitive because... well... users are very sensitive about case insensitivity.

With the citext idea - would programmers have control enough to have both a sensitive and insensitive index if required? Or would that make all filters implicitly case insensitive?

Last edited 3 years ago by Shadow (previous) (diff)

comment:4 Changed 3 years ago by Aymeric Augustin

citext would make all queries and unicity checks case-insensitive. It's still case-preserving so I assume it would work for you even if "case is important".

comment:5 in reply to:  4 Changed 3 years ago by Shadow

I've done some reading on citext - and it would indeed do the job I want it to do.

My only concern is that iexact effectively becomes implicit, which may surprise some developers.
I guess the question is: would people value being able to do both insensitive and sensitive searches on the same field?

The only other way I could think to do this would be creating a functional index, which would give the ability to have the different index kwargs as I had in the original report, but I'm not sure which wins from a performance point of view.
And, again, if people would find having both a useful feature.

comment:6 Changed 3 years ago by Tim Graham

Keywords: db-indexes added

comment:7 Changed 3 years ago by Tim Graham

Has patch: set
Patch needs improvement: set

PR (tests aren't passing)

comment:8 Changed 3 years ago by Tim Graham

Component: Database layer (models, ORM)contrib.postgres
Summary: Case insensitive indexesAdd a citext field for contrib.postgres

comment:9 Changed 3 years ago by Mads Jensen

Patch needs improvement: unset

comment:10 Changed 3 years ago by Tim Graham <timograham@…>

Resolution: fixed
Status: newclosed

In 094d630:

Fixed #26610 -- Added CITextField to contrib.postgres.

comment:11 Changed 3 years ago by Tim Graham

As suggested on django-developers, I've created a PR to change the base class to TextField since max_length isn't used and shouldn't be required.

comment:12 Changed 3 years ago by Sean Brant

Resolution: fixed
Status: closednew

Tim Graham open a PR [1] to address the issue [2] I raised on the mailing list of this field being a subclass of CharField instead of TextField.

Thanks Tim!

[1] https://github.com/django/django/pull/8034
[2] https://groups.google.com/forum/?utm_medium=email&utm_source=footer#!msg/django-developers/jud-n1cBzdg/ToRMj42pBAAJ

comment:13 Changed 3 years ago by Sean Brant

Severity: NormalRelease blocker
Triage Stage: AcceptedReady for checkin

comment:14 Changed 3 years ago by Tim Graham

Has patch: unset
Triage Stage: Ready for checkinAccepted
Version: master1.11

Further discussion on the mailing list suggests to add a separate case-insensitive field for CharField, TextField, EmailField, etc.

comment:15 Changed 3 years ago by Tim Graham

Has patch: set
Triage Stage: AcceptedReady for checkin

PR from Mads.

comment:16 Changed 3 years ago by Tim Graham <timograham@…>

In fb5bd38:

Refs #26610 -- Added CIText mixin and CIChar/Email/TextField.

comment:17 Changed 3 years ago by Tim Graham <timograham@…>

In ded0632:

[1.11.x] Refs #26610 -- Added CIText mixin and CIChar/Email/TextField.

Backport of fb5bd38e3b83c7f0d1011de80f922fc34faf740b from master

comment:18 Changed 3 years ago by Tim Graham

Resolution: fixed
Status: newclosed
Note: See TracTickets for help on using tickets.
Back to Top