Opened 9 years ago
Closed 9 years ago
#26610 closed New feature (fixed)
Add a citext field for contrib.postgres
| Reported by: | Shadow | Owned by: | nobody | 
|---|---|---|---|
| Component: | contrib.postgres | Version: | 1.11 | 
| Severity: | Release blocker | Keywords: | index charfield textfield case insensitive optimisation db-indexes | 
| Cc: | aksheshdoshi@… | Triage Stage: | Ready for checkin | 
| Has patch: | yes | Needs documentation: | no | 
| Needs tests: | no | Patch needs improvement: | no | 
| Easy pickings: | no | UI/UX: | no | 
Description
I was analysing some of our queries - and one of the things I noticed was that if you use iexact, then the database needs to do a full table scan because it is unable to use the index I had on the field.
class Example(models.Model):
    text = models.CharField(index=True)
Example.objects.get(text="meow")  # This will hit the index
Example.objects.get(text__iexact="meow")  # This does not.
I am aware that this kind of index can be added manually by RunSQL migrations, but I think this is probably a common enough scenario that the ORM should attempt to handle it natively.
# PROPOSED CHANGES - THIS DOES NOT ACTUALLY WORK
class Example(models.Model):
    text = models.CharField(insensitive_index=True)
Example.objects.get(text="meow")  # This probably won't hit the index (might depend on the db)
Example.objects.get(text__iexact="meow")  # This would.
What are your thoughts?
Change History (18)
comment:1 by , 9 years ago
| Cc: | added | 
|---|---|
| Triage Stage: | Unreviewed → Accepted | 
| Version: | 1.9 → master | 
comment:2 by , 9 years ago
comment:3 by , 9 years ago
My current backend is indeed postgres. In my specific usage case is important. It's just when users are searching for models in my search box the filter needs to be case insensitive because... well... users are very sensitive about case insensitivity.
With the citext idea - would programmers have control enough to have both a sensitive and insensitive index if required? Or would that make all filters implicitly case insensitive?
follow-up: 5 comment:4 by , 9 years ago
citext would make all queries and unicity checks case-insensitive. It's still case-preserving so I assume it would work for you even if "case is important".
comment:5 by , 9 years ago
I've done some reading on citext - and it would indeed do the job I want it to do.
My only concern is that iexact effectively becomes implicit, which may surprise some developers.
I guess the question is: would people value being able to do both insensitive and sensitive searches on the same field?
The only other way I could think to do this would be creating a functional index, which would give the ability to have the different index kwargs as I had in the original report, but I'm not sure which wins from a performance point of view.
And, again, if people would find having both a useful feature.
comment:6 by , 9 years ago
| Keywords: | db-indexes added | 
|---|
comment:8 by , 9 years ago
| Component: | Database layer (models, ORM) → contrib.postgres | 
|---|---|
| Summary: | Case insensitive indexes → Add a citext field for contrib.postgres | 
comment:9 by , 9 years ago
| Patch needs improvement: | unset | 
|---|
comment:11 by , 9 years ago
As suggested on django-developers, I've created a PR to change the base class to TextField since max_length isn't used and shouldn't be required.
comment:12 by , 9 years ago
| Resolution: | fixed | 
|---|---|
| Status: | closed → new | 
Tim Graham open a PR [1] to address the issue [2] I raised on the mailing list of this field being a subclass of CharField instead of TextField.
Thanks Tim!
[1] https://github.com/django/django/pull/8034
[2] https://groups.google.com/forum/?utm_medium=email&utm_source=footer#!msg/django-developers/jud-n1cBzdg/ToRMj42pBAAJ
comment:13 by , 9 years ago
| Severity: | Normal → Release blocker | 
|---|---|
| Triage Stage: | Accepted → Ready for checkin | 
comment:14 by , 9 years ago
| Has patch: | unset | 
|---|---|
| Triage Stage: | Ready for checkin → Accepted | 
| Version: | master → 1.11 | 
Further discussion on the mailing list suggests to add a separate case-insensitive field for CharField, TextField, EmailField, etc.
comment:15 by , 9 years ago
| Has patch: | set | 
|---|---|
| Triage Stage: | Accepted → Ready for checkin | 
PR from Mads.
comment:18 by , 9 years ago
| Resolution: | → fixed | 
|---|---|
| Status: | new → closed | 
If we're talking about Postgres, I suggest adding a case-insensitive, case-preserving char/text field backed by the
citexttype todjango.contrib.postgres. This should cover most use cases.