Opened 15 months ago

Last modified 27 hours ago

#22666 new Bug

GenericIPAddressField index never used on PostgreSQL

Reported by: intgr Owned by: nobody
Component: Database layer (models, ORM) Version: master
Severity: Normal Keywords:
Cc: Triage Stage: Accepted
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

This isn't a problem for us currently, but something I noticed in the queries generated by Django, which I find to be an antifeature: fields that have GenericIPAddressField(db_index=True) are currently incapable of actually using the index in filters on PostgreSQL. The indexing does work on other databases because it's just a char/varchar field.

For example, given model:

class Ip(models.Model):
    ip = models.GenericIPAddressField(db_index=True)

The index is created as:

CREATE INDEX "asd_ip_ip" ON "asd_ip" ("ip");

And query filter clauses are generated using the host() database function. Since the index was created without the function, it cannot be used:

Ip.objects.filter(ip='1.2.3.4')
# SELECT ... FROM "asd_ip" WHERE HOST("asd_ip"."ip") = '1.2.3.4'
Ip.objects.filter(ip__gte='1.2.3.4')
# SELECT ... FROM "asd_ip" WHERE HOST("asd_ip"."ip") >= '1.2.3.4'

Worse, since host() converts the IP address to text, the __gte filter stops making much sense, it will consider the IP address '255.0.0.0' to be less than '3.0.0.0'

The index can be used for natural ordering, but the ordering will be inconsistent with the above __gte example and other greater/less than operators.

Ip.objects.order_by('ip')
# SELECT ... FROM "asd_ip" ORDER BY "asd_ip"."ip" ASC

I understand that this was done to fix #708 -- the ability to use __contains= for IP addresses, but I think the cure is worse than the disease. In order to support an operation that doesn't make much sense for IP addresses, the change sacrifices the advantages that PostgreSQL's native inet type provides and makes ordering inconsistent with filter comparsisons.


I think the "correct" way to address this is to revert that fix (a9b4efc82b23383038fed6da6ba97242aece27c1) and implement it the same way how __contains works for integers. The ::text cast is universal and works with all data types in PostgreSQL:

Ip.objects.filter(pk__contains=123)
# SELECT ... FROM "asd_ip" WHERE "asd_ip"."id"::text LIKE '%123%'

But this will break for users who are depending on the current __gte etc behavior in PostgreSQL, or expecting it to behave the same way in all databases.

Another approach is to simply use char/varchar type (like other databases) instead of the PostgreSQL-specific inet, so it always uses text-ordering behavior and behaves the same in all databases without any hackery.

Change History (7)

comment:1 Changed 15 months ago by erikr

  • Component changed from Uncategorized to Database layer (models, ORM)
  • Needs documentation unset
  • Needs tests unset
  • Patch needs improvement unset
  • Triage Stage changed from Unreviewed to Accepted

I can definitely see the issue here. A few minor points:

  • IPAddressField is deprecated and will be removed in 1.9. I don't think we should fix anything in it: anyone experiencing an issue can migrate to GenericIPAddressField. That doesn't solve this ticket though, as you correctly describe, GenericIPAddressField has the same issue.
  • Another ticket of interest may be #11442 - which in its final resolution allows defining custom fields using INET that are not forced through HOST()

If we implement __contains to work the same as for integers, the only difference between databases would be that sorting is different, right? But filtering would still work the same? In the case of IP addresses, it seems to me that string sorting is a bit silly in any case. Balancing that against the inability to index even exact match queries, makes this solution seem very reasonable to me.

comment:2 follow-up: Changed 15 months ago by intgr

Maybe this illustrates better the behavior I am talking about:

Current PG behaviorProposed PG behaviorOther DBsComment
__gte, __lt, etcTextualNaturalTextualThe only place where behavior changes, to be able to use indexes.
order_byNaturalNaturalTextualBehavior was always different between Postgres and others.
__containsTextualTextualTextualI don't find this useful, but since we already have textual __contains
for integers, it makes sense to fo it the same way for inet.

If we implement __contains to work the same as for integers, the only difference between databases would be that sorting is different, right?
Another ticket of interest may be #11442 - which in its final resolution allows defining custom fields using INET that are not forced through HOST()

Uh, so subclasses of GenericIPAddressField with a different name don't inherit its behavior? :( It seems we're adding hacks on top of hacks.

Actually the test introduced in commit f7467181aa56e30c946ff55190c49792d6e9a72f points out a problem, it can't be done exactly like integers. It seems the ::text cast always appends the netmask, e.g. /32. So I guess we still need host() for LIKE searches.

Last edited 15 months ago by intgr (previous) (diff)

comment:3 in reply to: ↑ 2 Changed 14 months ago by erikr

Replying to intgr:

Uh, so subclasses of GenericIPAddressField with a different name don't inherit its behavior? :( It seems we're adding hacks on top of hacks.

No, it depends on the return value of get_internal_type() of the field.

Maybe this illustrates better the behavior I am talking about: ...

And along with __gte and __lt, this would also change for exact matches, right? The way forward you suggest makes sense to me then.

There is a minor backwards compatibility issue, as of course the behaviour of __gte and friends will have changed. However, its current behaviour with textual sorting doesn't make much sense, so I doubt this is an issue for anyone. It does warrant a mention in the release notes.

comment:4 Changed 14 months ago by intgr

No, it depends on the return value of get_internal_type() of the field.

Ah sorry, bad assumptions on my part. That makes sense.

And along with __gte and __lt, this would also change for exact matches, right?

Yeah, I think we're on the same page. So it's a simple matter of coding now?

comment:5 Changed 14 months ago by erikr

Yes, I believe so. And testing and documentation, of course :)

comment:6 Changed 14 months ago by timo

  • Type changed from Uncategorized to Bug

comment:7 Changed 27 hours ago by timgraham

  • Summary changed from (Generic)IPAddressField index never used on PostgreSQL, inconsistent behavior to GenericIPAddressField index never used on PostgreSQL
  • Version changed from 1.6 to master
Note: See TracTickets for help on using tickets.
Back to Top