Opened 11 years ago

Closed 11 years ago

Last modified 9 years ago

#12010 closed (fixed)

Add `ewkb` and `hexewkb` properties and document the GEOS IO Classes

Reported by: James
Owned by: jbronn
Component: GIS
Version: 1.1
Severity:
Keywords:
Cc:
Triage Stage: Accepted
Has patch: no
Needs documentation: yes
Needs tests: no
Patch needs improvement: no
Easy pickings: no
UI/UX: no

Description (last modified by Alex Gaynor)

The documentation states that the hex member returns the HEXEWKB of a geometry. This isn't true: it returns only the hex value, without the SRID embedded in it as HEXEWKB should have.
Example of issue:

> p = django.contrib.gis.geos.GEOSGeometry('POINT(2 3)',4326)
> p.ewkt
 'SRID=4326;POINT (2.0000000000000000 3.0000000000000000)'
> p.hex
 (FROM PostGres with PostGIS)
 postgres=# SELECT GeometryFromText('POINT(2 3)', 0);
 (1 row)
 postgres=# SELECT GeometryFromText('POINT(2 3)', 4326);
 (1 row)

The output above verifies that the SRID information isn't being returned as part of the hex result of the geometry. Either the documentation is incorrect and a method should be provided that does return the HEXEWKB, or there is a bug where the SRID isn't being considered when calculating the HEXEWKB value.

The documentation I am looking at:

Change History (9)

comment:1 Changed 11 years ago by Alex Gaynor

Description: modified (diff)

Please use preview.

comment:2 Changed 11 years ago by James

Alex, thanks for the formatting fix, I did use preview but wasn't sure how to fix it until now.

Getting the correct HEXEWKB value is important to my application, so I researched a bit further. In the source for django.contrib.gis.geos, the documentation of the hex method states:

        Returns the HEX of the Geometry -- please note that the SRID is not
        included in this representation, because the GEOS C library uses
        -1 by default, even if the SRID is set.

So it appears this is intentional behavior and the external documentation should be changed.

However, it would be more desirable if a workaround could be found to embed the SRID value. Is there a workaround in which I can force the GEOS C library to cooperate?

comment:3 Changed 11 years ago by James

Well, I have come up with a way to return a packed string which is the actual EWKB. I am very new to Python, so I am sure this code needs to be optimized and cleaned up, and I haven't turned it into a patch, but hopefully it will be useful to someone:

from django.contrib.gis.geos import GEOSGeometry
import struct

def getEWKB(val):
    # Point geometries only: repack val.wkb with the SRID inserted.
    byteOrder, = struct.unpack_from('b', val.wkb)
    ind = '>'
    if byteOrder == 1:
        ind = '<'
    #WKB=byteOrder + wkbType + point
    wkbType, = struct.unpack_from(ind + 'L', val.wkb, 1)
    # Set the EWKB "SRID present" flag on the type word.
    wkbType = wkbType | 0x20000000
    srid = val.srid
    if srid is None:
        srid = 0
    #EWKB=byteOrder + wkbType + srid + point
    if val.hasz:
        dataformat = ind + 'ddd'
        x, y, z = struct.unpack_from(dataformat, val.wkb, 5)
        dataformat = ind + 'bLLddd'
        ret = struct.pack(dataformat, byteOrder, wkbType, srid, x, y, z)
    else:
        dataformat = ind + 'dd'
        x, y = struct.unpack_from(dataformat, val.wkb, 5)
        dataformat = ind + 'bLLdd'
        ret = struct.pack(dataformat, byteOrder, wkbType, srid, x, y)
    return ret

The restriction is that you must pass it an object derived from GEOSGeometry. Now one could do this:

>>> from import getEWKBHEX
>>> from django.contrib.gis.geos import GEOSGeometry
>>> g = GEOSGeometry('POINT(1 2)', 4326)
>>> ret = getEWKBHEX(g)
>>> print "OLD METHOD: ", g.hex
OLD METHOD:  0101000000000000000000F03F0000000000000040
>>> print "NEW: ", ret
NEW:  0101000020e6100000000000000000f03f0000000000000040

This is great because now we really do have a hex string in extended WKB format, as confirmed using psql:

postgres=# SELECT GeometryFromText('POINT(1 2)', 4326);
(1 row)
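
For anyone who only has the plain WKB bytes, the same result can be sketched with the standard library alone, for any top-level geometry rather than just points. This is a minimal sketch, not GeoDjango code: the name `wkb_to_hexewkb` is made up for illustration, and it assumes the input is plain WKB without an SRID. It sets the EWKB SRID flag (0x20000000) on the type word and splices the SRID in right after it.

```python
import binascii
import struct

SRID_FLAG = 0x20000000  # EWKB marks an embedded SRID with this bit in the type word


def wkb_to_hexewkb(wkb, srid):
    """Return upper-case HEXEWKB for plain WKB bytes by splicing in the SRID.

    Hypothetical helper for illustration; not part of django.contrib.gis.
    """
    # Byte 0 is the byte-order marker: 1 = little endian, 0 = big endian.
    endian = '<' if wkb[0] == 1 else '>'
    (geom_type,) = struct.unpack_from(endian + 'I', wkb, 1)
    if geom_type & SRID_FLAG:
        raise ValueError('input already carries an SRID')
    header = wkb[:1] + struct.pack(endian + 'II', geom_type | SRID_FLAG, srid)
    # Everything after the 5-byte WKB header is copied unchanged.
    return binascii.hexlify(header + wkb[5:]).decode('ascii').upper()


plain = binascii.unhexlify('0101000000000000000000F03F0000000000000040')
print(wkb_to_hexewkb(plain, 4326))
# prints 0101000020E6100000000000000000F03F0000000000000040
```

Because only the header is rewritten, no coordinates need to be unpacked and repacked.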

comment:4 Changed 11 years ago by jbronn

Needs documentation: set
Owner: changed from nobody to jbronn
Status: changed from new to assigned
Summary: changed from "GEOSGemoentry hex method does not return HEXEWKB" to "Document the GEOS IO Classes"
Triage Stage: changed from Unreviewed to Accepted

GeoDjango wraps the GEOS library. Included in the 1.1 release were wrappers for the GEOS IO classes: WKBReader, WKBWriter, WKTReader, and WKTWriter -- unfortunately, this is not in the documentation. Regardless, the IO classes allow for finer-grained serialization of (HEX)EWKB. For example:

from django.contrib.gis.geos import Point, WKBWriter

# Creating WKBWriter instance and setting SRID flag to True
wkb_w = WKBWriter()
wkb_w.srid = True

# Also, for 3D support
#wkb_w.outdim = 3

pnt = Point(1, 2, srid=4326)

# This is '0101000020E6100000000000000000F03F0000000000000040'
hexewkb = wkb_w.write_hex(pnt)

Thus, HEXEWKB support already exists; it's just poorly documented. I'll keep this ticket open until I complete the documentation for the GEOS IO class interfaces.
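
To check by hand that a string like the one in the example above really carries the SRID, it can be decoded with the standard library. The following is only a sketch for 2D points: `decode_point_hexewkb` is a hypothetical helper name, not something GeoDjango provides, and it assumes the EWKB convention that the SRID, when flagged with 0x20000000, follows the type word.

```python
import binascii
import struct

SRID_FLAG = 0x20000000  # EWKB "SRID present" bit in the type word


def decode_point_hexewkb(hex_string):
    """Decode a 2D point given as HEX(E)WKB into (srid, x, y).

    Hypothetical helper for illustration; srid is None when no SRID is embedded.
    """
    raw = binascii.unhexlify(hex_string)
    # Byte 0 is the byte-order marker: 1 = little endian, 0 = big endian.
    endian = '<' if raw[0] == 1 else '>'
    (type_word,) = struct.unpack_from(endian + 'I', raw, 1)
    offset = 5
    srid = None
    if type_word & SRID_FLAG:
        (srid,) = struct.unpack_from(endian + 'I', raw, offset)
        offset += 4
    x, y = struct.unpack_from(endian + 'dd', raw, offset)
    return srid, x, y


print(decode_point_hexewkb('0101000020E6100000000000000000F03F0000000000000040'))
# prints (4326, 1.0, 2.0)
print(decode_point_hexewkb('0101000000000000000000F03F0000000000000040'))
# prints (None, 1.0, 2.0)
```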

comment:5 Changed 11 years ago by James

Thanks for the response. It would be great if hex did as advertised and simply did this for us, rather than our having to know about and use the C API wrappers.

I have some additional timing info to give, since I did a bunch of profiling on these different approaches to retrieving the hex of the extended WKB.

Here are the timing results for several methods:

getEWKBHEX 0.59
hex 0.23
write_hex 0.22
str 0.01
encode 0.02

Each timing test profiles the time it takes to run ~10000 calls of each use case.

'getEWKBHEX' is my optimized version of the original snippet I provided. It is almost 20% faster than the original snippet, but much slower than the other methods, which doesn't surprise me since it must use Python's pack and unpack functions.

'hex' is the current implementation of GEOSGeometry.hex, which doesn't really do HEXEWKB.

'write_hex' is the C API wrapper call jbronn mentioned. In all my trials it is slightly faster than the hex implementation, which is odd, but it probably has to do with data alignment in the native C library or simply with the overhead each function incurs crossing the Python-to-native-C boundary.

'str' + 'encode' (0.03 combined) is the total time needed to simply encode the GEOSGeometry.wkb field to hex. This is what the comments in the codebase suggest as a speedup, and the timings bear it out, as long as hex should NOT provide HEXEWKB.

In summary, if GEOSGeometry.hex is to return HEXEWKB as the docs advertise, embedding the C API calls shown by jbronn would not adversely affect performance. However, if the behavior must stay the same, changing hex to encode the wkb field directly, instead of calling through the underlying API, would provide a large speed increase.

If anyone is interested here is the profiling code I used:

import time
def timing(f, n, a):
    if f.__name__:
        print f.__name__,
    r = range(n)
    t1 = time.clock()
    for i in r:
        f(*a); f(*a); f(*a); f(*a); f(*a); f(*a); f(*a); f(*a); f(*a); f(*a)
    t2 = time.clock()
    print round(t2-t1, 3)

from django.contrib.gis.geos import GEOSGeometry, WKBWriter
if __name__ == '__main__':
    wkb_w = WKBWriter()
    wkb_w.srid = True
    g=GEOSGeometry("POINT(1 2)", srid=4326)
    timing(getEWKBHEX, 1000, (g,))
    timing(GEOSGeometry.hex.fget, 1000, (g,))
    timing(wkb_w.write_hex, 1000, (g,))
    timing(str, 1000, (g.wkb,))
    timing(str.encode, 1000, (str(g.wkb),'HEX' ))
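
On a current Python 3, the same kind of measurement can be sketched with the standard timeit module, since time.clock and the 'HEX' string codec used above no longer exist there. This sketch only times the direct hex encoding of raw WKB bytes (the 'str' + 'encode' case); it does not touch GEOS, and `time_call` is a name chosen here for illustration. Absolute numbers will differ from the table above.

```python
import binascii
import timeit

# Plain WKB for POINT(1 2), taken from the hex output shown earlier in the ticket.
wkb = binascii.unhexlify('0101000000000000000000F03F0000000000000040')


def time_call(func, *args, number=10000):
    """Return the total seconds taken by `number` calls of func(*args)."""
    return timeit.timeit(lambda: func(*args), number=number)


# Two standard-library ways of hex-encoding the same bytes.
for label, func in (('hexlify', binascii.hexlify), ('bytes.hex', bytes.hex)):
    print(label, round(time_call(func, wkb), 4))
```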

comment:6 Changed 11 years ago by jbronn

Summary: changed from "Document the GEOS IO Classes" to "Add `ewkb` and `hexewkb` properties and document the GEOS IO Classes"

Because it's necessary for 3D support anyway, I think adding ewkb and hexewkb properties is also appropriate. However, hex and wkb will keep the same functionality because including 3D/SRID information is *not* part of the OGC spec, and they should remain 'pure'.
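
To illustrate what separates the 'pure' and extended forms: EWKB as written by PostGIS and GEOS signals the extra information through high bits of the geometry type word, 0x80000000 for a Z coordinate and 0x20000000 for an embedded SRID. A small standard-library sketch of reading those flags follows; `describe_type_word` is a hypothetical name used only here.

```python
Z_FLAG = 0x80000000     # EWKB: geometry has a Z coordinate
SRID_FLAG = 0x20000000  # EWKB: an SRID follows the type word
BASE_MASK = 0x1FFFFFFF  # clears the top three bits, which EWKB uses for Z, M and SRID


def describe_type_word(type_word):
    """Split an EWKB type word into its base geometry type and extension flags."""
    return {
        'base_type': type_word & BASE_MASK,
        'has_z': bool(type_word & Z_FLAG),
        'has_srid': bool(type_word & SRID_FLAG),
    }


print(describe_type_word(0x00000001))  # plain 2D point, no flags set
print(describe_type_word(0x20000001))  # 2D point with SRID
print(describe_type_word(0xA0000001))  # 3D point with SRID
```

A type word with neither flag set, as in the first call, is what hex and wkb are meant to keep producing.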

comment:7 Changed 11 years ago by jbronn

(In [11728]) Added ewkb and hexewkb properties to GEOSGeometry. Refs #11433, #12010.

comment:8 Changed 11 years ago by jbronn

milestone: 1.2
Resolution: fixed
Status: changed from assigned to closed

Now that the GEOS I/O objects are documented, I'm closing this ticket.

comment:9 Changed 9 years ago by Jacob

milestone: 1.2

Milestone 1.2 deleted
