Django

Code

Ticket #11442 (reopened)

Opened 8 months ago

Last modified 4 weeks ago

Postgresql backend casts inet types to text, breaks IP operations and IPv6 lookups.

Reported by: eide Assigned to: nobody
Milestone: Component: Database layer (models, ORM)
Version: 1.1 Keywords: ipv6 postgres inet
Cc: Triage Stage: Accepted
Has patch: 0 Needs documentation: 0
Needs tests: 0 Patch needs improvement: 0

Description

Ticket #708 describes a problem with LIKE operations on inet types in postgresql. The solution was to cast inet to text using the HOST() function.

But by casting inet to text none of the network operations in postgresql will work, and IPv6 lookups are pretty much broken. In the database I'm currently using, doing a HOST() on a IPv6 address will always produce a compressed URL. So if I'm checking against a fullsize address in django the lookup will fail, even though they are the same address.

Here's an example of what I'm talking about:

my_db=# CREATE TABLE my_ips (ip inet);
CREATE TABLE
                            ^
my_db=# INSERT INTO my_ips VALUES ('2001:0db8:0000:0000:0000:0000:0000:0001');
INSERT 0 1

my_db=# SELECT * FROM my_ips WHERE ip = '2001:0db8:0000:0000:0000:0000:0000:0001';
     ip      
-------------
 2001:db8::1
(1 row)

my_db=# SELECT * FROM my_ips WHERE ip = '2001:db8::1';
     ip      
-------------
 2001:db8::1
(1 row)

So far so good, but when you throw HOST() into the picture, this happens:

my_db=# SELECT * FROM my_ips WHERE HOST(ip) = '2001:db8::1';
     ip      
-------------
 2001:db8::1
(1 row)

my_db=# SELECT * FROM my_ips WHERE HOST(ip) = '2001:0db8:0000:0000:0000:0000:0000:0001';
 ip 
----
(0 rows)

2001:db8::1 and 2001:0db8:0000:0000:0000:0000:0000:0001 are the same address, just displayed on different forms.

Currently I always make sure that I pass a compressed IP to the models with IPAddressFields. That does however assume that all postgresql databases will always return IPv6 addresses on the compressed form, and I do not know if that's correct.

The correct solution would be to not cast inet to text.

Also, the postgresql documentation on Network Address Functions and Operators states that:

The host, text, and abbrev functions are primarily intended to offer alternative display formats.

So using HOST() for lookups is acctually kind of wrong in the first place.

Attachments

Change History

07/09/09 03:54:04 changed by morten.brekkevold@uninett.no

  • needs_better_patch changed.
  • needs_tests changed.
  • needs_docs changed.

Not only will using HOST yield wrong results, there are also severe performance implications to using the HOST function call in lookups, as it fails to utilize indexes on INET type fields. See the following example:

nav=# select count(*) from arp;   
  count  
---------
 6391765
(1 row)

nav=# explain analyze select * from arp where ip = '2001:700::1';
                                                      QUERY PLAN                                                       
-----------------------------------------------------------------------------------------------------------------------
 Index Scan using arp_ip_btree on arp  (cost=0.00..905.80 rows=232 width=67) (actual time=0.021..0.021 rows=0 loops=1)
   Index Cond: (ip = '2001:700::1'::inet)
 Total runtime: 0.051 ms
(3 rows)

nav=# explain analyze select * from arp where HOST(ip) = '2001:700::1';
                                                 QUERY PLAN                                                  
-------------------------------------------------------------------------------------------------------------
 Seq Scan on arp  (cost=0.00..200239.38 rows=32911 width=67) (actual time=9410.175..9410.175 rows=0 loops=1)
   Filter: (host(ip) = '2001:700::1'::text)
 Total runtime: 9410.196 ms
(3 rows)

nav=# 

01/11/10 03:06:51 changed by kristian.klette@uninett.no

Morten: The perfomance issue can be "solved" by adding a host(ip)-index on the table;

CREATE INDEX arp_host_ip ON arp (host(ip));
klette=# SELECT ip from ips where host(ip) = '2001:700:300:1800::b';
         host         
----------------------
 2001:700:300:1800::b
(1 row)

Time: 1781.635 ms
klette=# CREATE INDEX ips_host_ip_index ON ips ( host(ip));
CREATE INDEX
Time: 31937.661 ms
klette=# SELECT ip from ips where host(ip)::text = '2001:700:300:1800::b';
         host         
----------------------
 2001:700:300:1800::b
(1 row)

Time: 0.805 ms

Doesn't really solve the bug though, but boost performance at least.

02/02/10 08:25:59 changed by russellm

  • stage changed from Unreviewed to Accepted.

02/04/10 07:17:45 changed by russellm

  • status changed from new to closed.
  • resolution set to duplicate.

On second thought - closing as a dupe of #811. IPv6 support is spotty across the board - if we're going to fix it, it isn't just a Postgres issue.

(follow-up: ↓ 6 ) 02/20/10 12:08:39 changed by bobrobertson

  • status changed from closed to reopened.
  • version changed from 1.0 to 1.1.
  • resolution deleted.

The resolution assumes this is just an IPv6 problem, and completely ignores the enormous performance problem introduced by casting every inet record in the database to a string. This is understandable for LIKE queries, but it even uses HOST() on exact match queries.

These two queries return the same results. The first is how Django currently runs this query, and is roughly 2000x slower than the second. (Yes, I restarted Postgres between tests and flushed the OS buffers, so it is a fair comparison.)

The difference is performing n inet->string casts vs. performing 1 string->inet cast.
This also fixes the original IPv6 problem in this ticket.

Takes ~30.0 sec:

SELECT ip from ips where host(ip) = '10.0.0.1'

Takes ~0.15 sec:

SELECT ip from ips where ip = inet '10.0.0.1'

(in reply to: ↑ 5 ) 02/20/10 14:38:23 changed by bobrobertson

Replying to bobrobertson:

Excuse my typo.

Takes ~0.15 sec:

should have been:

Takes ~0.015 sec:


Add/Change #11442 (Postgresql backend casts inet types to text, breaks IP operations and IPv6 lookups.)




Change Properties
Action