Opened 3 years ago

Last modified 7 months ago

#22666 new Bug

GenericIPAddressField index never used on PostgreSQL

Reported by: Marti Owned by: nobody
Component: Database layer (models, ORM) Version: master
Severity: Normal Keywords: db-indexes
Cc: Triage Stage: Accepted
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

This isn't a problem for us currently, but something I noticed in the queries generated by Django, which I find to be an antifeature: fields that have GenericIPAddressField(db_index=True) are currently incapable of actually using the index in filters on PostgreSQL. The indexing does work on other databases because it's just a char/varchar field.

For example, given model:

class Ip(models.Model):
    ip = models.GenericIPAddressField(db_index=True)

The index is created as:

CREATE INDEX "asd_ip_ip" ON "asd_ip" ("ip");

And query filter clauses are generated using the host() database function. Since the index was created without the function, it cannot be used:

Ip.objects.filter(ip='1.2.3.4')
# SELECT ... FROM "asd_ip" WHERE HOST("asd_ip"."ip") = '1.2.3.4'
Ip.objects.filter(ip__gte='1.2.3.4')
# SELECT ... FROM "asd_ip" WHERE HOST("asd_ip"."ip") >= '1.2.3.4'

Worse, since host() converts the IP address to text, the __gte filter stops making much sense, it will consider the IP address '255.0.0.0' to be less than '3.0.0.0'

The index can be used for natural ordering, but the ordering will be inconsistent with the above __gte example and other greater/less than operators.

Ip.objects.order_by('ip')
# SELECT ... FROM "asd_ip" ORDER BY "asd_ip"."ip" ASC

I understand that this was done to fix #708 -- the ability to use __contains= for IP addresses, but I think the cure is worse than the disease. In order to support an operation that doesn't make much sense for IP addresses, the change sacrifices the advantages that PostgreSQL's native inet type provides and makes ordering inconsistent with filter comparsisons.


I think the "correct" way to address this is to revert that fix (a9b4efc82b23383038fed6da6ba97242aece27c1) and implement it the same way how __contains works for integers. The ::text cast is universal and works with all data types in PostgreSQL:

Ip.objects.filter(pk__contains=123)
# SELECT ... FROM "asd_ip" WHERE "asd_ip"."id"::text LIKE '%123%'

But this will break for users who are depending on the current __gte etc behavior in PostgreSQL, or expecting it to behave the same way in all databases.

Another approach is to simply use char/varchar type (like other databases) instead of the PostgreSQL-specific inet, so it always uses text-ordering behavior and behaves the same in all databases without any hackery.

Change History (8)

comment:1 Changed 3 years ago by Erik Romijn

Component: UncategorizedDatabase layer (models, ORM)
Triage Stage: UnreviewedAccepted

I can definitely see the issue here. A few minor points:

  • IPAddressField is deprecated and will be removed in 1.9. I don't think we should fix anything in it: anyone experiencing an issue can migrate to GenericIPAddressField. That doesn't solve this ticket though, as you correctly describe, GenericIPAddressField has the same issue.
  • Another ticket of interest may be #11442 - which in its final resolution allows defining custom fields using INET that are not forced through HOST()

If we implement __contains to work the same as for integers, the only difference between databases would be that sorting is different, right? But filtering would still work the same? In the case of IP addresses, it seems to me that string sorting is a bit silly in any case. Balancing that against the inability to index even exact match queries, makes this solution seem very reasonable to me.

comment:2 Changed 3 years ago by Marti

Maybe this illustrates better the behavior I am talking about:

Current PG behaviorProposed PG behaviorOther DBsComment
__gte, __lt, etcTextualNaturalTextualThe only place where behavior changes, to be able to use indexes.
order_byNaturalNaturalTextualBehavior was always different between Postgres and others.
__containsTextualTextualTextualI don't find this useful, but since we already have textual __contains
for integers, it makes sense to fo it the same way for inet.

If we implement __contains to work the same as for integers, the only difference between databases would be that sorting is different, right?
Another ticket of interest may be #11442 - which in its final resolution allows defining custom fields using INET that are not forced through HOST()

Uh, so subclasses of GenericIPAddressField with a different name don't inherit its behavior? :( It seems we're adding hacks on top of hacks.

Actually the test introduced in commit f7467181aa56e30c946ff55190c49792d6e9a72f points out a problem, it can't be done exactly like integers. It seems the ::text cast always appends the netmask, e.g. /32. So I guess we still need host() for LIKE searches.

Last edited 3 years ago by Marti (previous) (diff)

comment:3 in reply to:  2 Changed 3 years ago by Erik Romijn

Replying to intgr:

Uh, so subclasses of GenericIPAddressField with a different name don't inherit its behavior? :( It seems we're adding hacks on top of hacks.

No, it depends on the return value of get_internal_type() of the field.

Maybe this illustrates better the behavior I am talking about: ...

And along with __gte and __lt, this would also change for exact matches, right? The way forward you suggest makes sense to me then.

There is a minor backwards compatibility issue, as of course the behaviour of __gte and friends will have changed. However, its current behaviour with textual sorting doesn't make much sense, so I doubt this is an issue for anyone. It does warrant a mention in the release notes.

comment:4 Changed 3 years ago by Marti

No, it depends on the return value of get_internal_type() of the field.

Ah sorry, bad assumptions on my part. That makes sense.

And along with __gte and __lt, this would also change for exact matches, right?

Yeah, I think we're on the same page. So it's a simple matter of coding now?

comment:5 Changed 2 years ago by Erik Romijn

Yes, I believe so. And testing and documentation, of course :)

comment:6 Changed 2 years ago by Tim Graham

Type: UncategorizedBug

comment:7 Changed 16 months ago by Tim Graham

Summary: (Generic)IPAddressField index never used on PostgreSQL, inconsistent behaviorGenericIPAddressField index never used on PostgreSQL
Version: 1.6master

comment:8 Changed 7 months ago by Tim Graham

Keywords: db-indexes added
Note: See TracTickets for help on using tickets.
Back to Top