Opened 11 years ago
Last modified 9 years ago
#22666 new Bug
GenericIPAddressField index never used on PostgreSQL
Reported by: | Marti Raudsepp | Owned by: | nobody |
---|---|---|---|
Component: | Database layer (models, ORM) | Version: | dev |
Severity: | Normal | Keywords: | db-indexes |
Cc: | Triage Stage: | Accepted | |
Has patch: | no | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description
This isn't a problem for us currently, but something I noticed in the queries generated by Django, which I find to be an antifeature: fields that have GenericIPAddressField(db_index=True) are currently incapable of actually using the index in filters on PostgreSQL. The indexing does work on other databases because it's just a char/varchar field.
For example, given model:
class Ip(models.Model): ip = models.GenericIPAddressField(db_index=True)
The index is created as:
CREATE INDEX "asd_ip_ip" ON "asd_ip" ("ip");
And query filter clauses are generated using the host()
database function. Since the index was created without the function, it cannot be used:
Ip.objects.filter(ip='1.2.3.4') # SELECT ... FROM "asd_ip" WHERE HOST("asd_ip"."ip") = '1.2.3.4' Ip.objects.filter(ip__gte='1.2.3.4') # SELECT ... FROM "asd_ip" WHERE HOST("asd_ip"."ip") >= '1.2.3.4'
Worse, since host() converts the IP address to text, the __gte
filter stops making much sense, it will consider the IP address '255.0.0.0' to be less than '3.0.0.0'
The index can be used for natural ordering, but the ordering will be inconsistent with the above __gte
example and other greater/less than operators.
Ip.objects.order_by('ip') # SELECT ... FROM "asd_ip" ORDER BY "asd_ip"."ip" ASC
I understand that this was done to fix #708 -- the ability to use __contains=
for IP addresses, but I think the cure is worse than the disease. In order to support an operation that doesn't make much sense for IP addresses, the change sacrifices the advantages that PostgreSQL's native inet
type provides and makes ordering inconsistent with filter comparsisons.
I think the "correct" way to address this is to revert that fix (a9b4efc82b23383038fed6da6ba97242aece27c1) and implement it the same way how __contains
works for integers. The ::text
cast is universal and works with all data types in PostgreSQL:
Ip.objects.filter(pk__contains=123) # SELECT ... FROM "asd_ip" WHERE "asd_ip"."id"::text LIKE '%123%'
But this will break for users who are depending on the current __gte
etc behavior in PostgreSQL, or expecting it to behave the same way in all databases.
Another approach is to simply use char
/varchar
type (like other databases) instead of the PostgreSQL-specific inet
, so it always uses text-ordering behavior and behaves the same in all databases without any hackery.
Change History (8)
comment:1 by , 11 years ago
Component: | Uncategorized → Database layer (models, ORM) |
---|---|
Triage Stage: | Unreviewed → Accepted |
follow-up: 3 comment:2 by , 11 years ago
Maybe this illustrates better the behavior I am talking about:
Current PG behavior | Proposed PG behavior | Other DBs | Comment | |
---|---|---|---|---|
__gte , __lt , etc | Textual | Natural | Textual | The only place where behavior changes, to be able to use indexes. |
order_by | Natural | Natural | Textual | Behavior was always different between Postgres and others. |
__contains | Textual | Textual | Textual | I don't find this useful, but since we already have textual __contains for integers, it makes sense to fo it the same way for inet. |
If we implement
__contains
to work the same as for integers, the only difference between databases would be that sorting is different, right?
Another ticket of interest may be #11442 - which in its final resolution allows defining custom fields using INET that are not forced throughHOST()
Uh, so subclasses of GenericIPAddressField with a different name don't inherit its behavior? :( It seems we're adding hacks on top of hacks.
Actually the test introduced in commit f7467181aa56e30c946ff55190c49792d6e9a72f points out a problem, it can't be done exactly like integers. It seems the ::text
cast appends always the netmask, e.g. /32
. So I guess we still need host()
for LIKE searches.
comment:3 by , 10 years ago
Replying to intgr:
Uh, so subclasses of GenericIPAddressField with a different name don't inherit its behavior? :( It seems we're adding hacks on top of hacks.
No, it depends on the return value of get_internal_type()
of the field.
Maybe this illustrates better the behavior I am talking about: ...
And along with __gte
and __lt
, this would also change for exact matches, right? The way forward you suggest makes sense to me then.
There is a minor backwards compatibility issue, as of course the behaviour of __gte
and friends will have changed. However, its current behaviour with textual sorting doesn't make much sense, so I doubt this is an issue for anyone. It does warrant a mention in the release notes.
comment:4 by , 10 years ago
No, it depends on the return value of get_internal_type() of the field.
Ah sorry, bad assumptions on my part. That makes sense.
And along with
__gte
and__lt
, this would also change for exact matches, right?
Yeah, I think we're on the same page. So it's a simple matter of coding now?
comment:6 by , 10 years ago
Type: | Uncategorized → Bug |
---|
comment:7 by , 9 years ago
Summary: | (Generic)IPAddressField index never used on PostgreSQL, inconsistent behavior → GenericIPAddressField index never used on PostgreSQL |
---|---|
Version: | 1.6 → master |
comment:8 by , 9 years ago
Keywords: | db-indexes added |
---|
I can definitely see the issue here. A few minor points:
HOST()
If we implement
__contains
to work the same as for integers, the only difference between databases would be that sorting is different, right? But filtering would still work the same? In the case of IP addresses, it seems to me that string sorting is a bit silly in any case. Balancing that against the inability to index even exact match queries, makes this solution seem very reasonable to me.