Opened 3 years ago

Closed 3 years ago

#17736 closed Bug (fixed)

django.contrib.gis.geos.polygon from_bbox loss of floating-point accuracy

Reported by: tdihp@… Owned by: DavidEklund
Component: GIS Version: 1.3
Severity: Normal Keywords:
Cc: daek@… Triage Stage: Accepted
Has patch: yes Needs documentation: no
Needs tests: yes Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

from_bbox use a string formatting to generate a geom instance, this will loose accuracy, when it matters

for example:

>>> import math
>>> from django.contrib.gis.geos import Polygon
>>> l,t,r,b = 0.0,0.0,math.pi,math.pi
>>> Polygon.from_bbox((l,t,r,b)).coords
(((0.0, 0.0), (0.0, 3.14159265359), (3.14159265359, 3.14159265359), (3.14159265359, 0.0), (0.0, 0.0)),)
>>> Polygon(((l,t),(r,t),(r,b),(l,b),(l,t))).coords
(((0.0, 0.0), (3.141592653589793, 0.0), (3.141592653589793,

Attachments (1)

17736_gis_precision.diff (1.6 KB) - added by DavidEklund 3 years ago.
Uses Polygon if the input is numerical. Test added.

Download all attachments as: .zip

Change History (10)

comment:1 Changed 3 years ago by claudep

  • Component changed from Uncategorized to GIS
  • Needs documentation unset
  • Needs tests unset
  • Patch needs improvement unset

In any case, whenever you are using floating-point values, rounding will occur at some point. Currently, from_bbox uses the Python built-in str which defaults to 12 significant digits. Of course, you can always do the rounding yourself. For example, to keep 25 significant numbers:

>>> l,t,r,b = 0.0,0.0,format(math.pi, '.25f'), format(math.pi, '.25f')
>>> Polygon.from_bbox((l,t,r,b)).coords
(((0.0, 0.0),
  (0.0, 3.1415926535897931),
  (3.1415926535897931, 3.1415926535897931),
  (3.1415926535897931, 0.0),
  (0.0, 0.0)),)

So even with 25 significant numbers, it seems GEOS rounds to 16. Should from_bbox also default to 16 significant digits?

comment:2 follow-up: Changed 3 years ago by anonymous

I think

return Polygon(((l,t),(r,t),(r,b),(l,b),(l,t)))

would be good enough for the from_bbox function.
Django's geos documention didn't mention that bbox can be strings, yet (am I right?).
So that we can forget the digits, here.

comment:3 Changed 3 years ago by aaugustin

  • Triage Stage changed from Unreviewed to Accepted

comment:4 in reply to: ↑ 2 Changed 3 years ago by DavidEklund

I think this makes sense, I have attached a patch making this change.

Replying to anonymous:

I think

return Polygon(((l,t),(r,t),(r,b),(l,b),(l,t)))

would be good enough for the from_bbox function.
Django's geos documention didn't mention that bbox can be strings, yet (am I right?).
So that we can forget the digits, here.

comment:5 Changed 3 years ago by DavidEklund

  • Cc daek@… added
  • Has patch set

As noted above the str() method of floats rounds the input to 12 digits. I guess python floats are double precision (depends on the machine one is using?), which means roughly 16 digits of accuracy. In contrast to str(), the repr() method of floats preserves double precision: it truncates the input to 17 digits. So for floats it might make more sense to use repr() than to use str().

Perhaps one wants to keep the option of passing strings to from_bbox() and also create the polygon using GEOSGeometry directly. In that case, one might consider using repr() if the coordinate is a float and str() otherwise:

    def from_bbox(cls, bbox):
        "Constructs a Polygon from a bounding box (4-tuple)."
        box_string = []
        for z in bbox:
            if isinstance(z, float):
                box_string.append(repr(z))
            else: 
                box_string.append(str(z))
        x0, y0, x1, y1 = box_string
        return GEOSGeometry( 'POLYGON((%s %s, %s %s, %s %s, %s %s, %s %s))' %  (
                x0, y0, x0, y1, x1, y1, x1, y0, x0, y0) )

That is, this would be an alternative to the approach followed in the attached patch.

Last edited 3 years ago by DavidEklund (previous) (diff)

comment:6 follow-up: Changed 3 years ago by claudep

  • Needs tests set

I like the idea of converting float to str as late as possible. So my suggestion would be to test if the bounding box is made of numeric types, and depending on this, choose either the Polygon() or the GEOSGeometry() creation method. We can probably assume that all bbox elements are of the same type, so checking the first one could be sufficient.

Tests wanted!

comment:7 in reply to: ↑ 6 Changed 3 years ago by DavidEklund

  • Owner changed from nobody to DavidEklund
  • Status changed from new to assigned

Sounds good. I will work on an implementation along these lines. I'm not sure, but I think Polygon() ends up creating a GEOSGeometry object using a string created from the numerical input, so that going via Polygon() is essentially just a detour (albeit perhaps a convenient one for numerical input). I will look closer at this issue.

Replying to claudep:

I like the idea of converting float to str as late as possible. So my suggestion would be to test if the bounding box is made of numeric types, and depending on this, choose either the Polygon() or the GEOSGeometry() creation method. We can probably assume that all bbox elements are of the same type, so checking the first one could be sufficient.

Tests wanted!

Changed 3 years ago by DavidEklund

Uses Polygon if the input is numerical. Test added.

comment:8 Changed 3 years ago by anonymous

The attached patch checks first whether the bounding box consists of objects of numeric type (real numbers). If that is the case for all elements of the bounding box, then the polygon is created with Polygon. Otherwise, that is if some element of the bounding box is not a real number, then the bounding box is converted to a string and the polygon is created using GEOSGeometry.

A test has been added. The test runs from_bbox on a particular input and checks the accuracy of one of the coordinates of the output to 14 digits.

comment:9 Changed 3 years ago by Claude Paroz <claude@…>

  • Resolution set to fixed
  • Status changed from assigned to closed

In [17824e2b74ca02b9914e853c818fa512a0f9ef34]:

Fixed #17736 -- Kept maximal floating-point accuracy in from_bbox

When constructing a polygon with Polygon.from_bbox, do not convert
parameters to strings at this stage (str defaults to 12 significant
digits).
Thanks tdihp@… for the report and David Eklung for the patch.

Note: See TracTickets for help on using tickets.
Back to Top