Opened 13 years ago
Closed 12 years ago
#17736 closed Bug (fixed)
django.contrib.gis.geos.polygon from_bbox loss of floating-point accuracy
Reported by: | Owned by: | David Eklund | |
---|---|---|---|
Component: | GIS | Version: | 1.3 |
Severity: | Normal | Keywords: | |
Cc: | daek@… | Triage Stage: | Accepted |
Has patch: | yes | Needs documentation: | no |
Needs tests: | yes | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description
from_bbox use a string formatting to generate a geom instance, this will loose accuracy, when it matters
for example:
>>> import math >>> from django.contrib.gis.geos import Polygon >>> l,t,r,b = 0.0,0.0,math.pi,math.pi >>> Polygon.from_bbox((l,t,r,b)).coords (((0.0, 0.0), (0.0, 3.14159265359), (3.14159265359, 3.14159265359), (3.14159265359, 0.0), (0.0, 0.0)),) >>> Polygon(((l,t),(r,t),(r,b),(l,b),(l,t))).coords (((0.0, 0.0), (3.141592653589793, 0.0), (3.141592653589793,
Attachments (1)
Change History (10)
comment:1 by , 13 years ago
Component: | Uncategorized → GIS |
---|
follow-up: 4 comment:2 by , 13 years ago
I think
return Polygon(((l,t),(r,t),(r,b),(l,b),(l,t)))
would be good enough for the from_bbox function.
Django's geos documention didn't mention that bbox can be strings, yet (am I right?).
So that we can forget the digits, here.
comment:3 by , 13 years ago
Triage Stage: | Unreviewed → Accepted |
---|
comment:4 by , 12 years ago
I think this makes sense, I have attached a patch making this change.
Replying to anonymous:
I think
return Polygon(((l,t),(r,t),(r,b),(l,b),(l,t)))would be good enough for the from_bbox function.
Django's geos documention didn't mention that bbox can be strings, yet (am I right?).
So that we can forget the digits, here.
comment:5 by , 12 years ago
Cc: | added |
---|---|
Has patch: | set |
As noted above the str() method of floats rounds the input to 12 digits. I guess python floats are double precision (depends on the machine one is using?), which means roughly 16 digits of accuracy. In contrast to str(), the repr() method of floats preserves double precision: it truncates the input to 17 digits. So for floats it might make more sense to use repr() than to use str().
Perhaps one wants to keep the option of passing strings to from_bbox() and also create the polygon using GEOSGeometry directly. In that case, one might consider using repr() if the coordinate is a float and str() otherwise:
def from_bbox(cls, bbox): "Constructs a Polygon from a bounding box (4-tuple)." box_string = [] for z in bbox: if isinstance(z, float): box_string.append(repr(z)) else: box_string.append(str(z)) x0, y0, x1, y1 = box_string return GEOSGeometry( 'POLYGON((%s %s, %s %s, %s %s, %s %s, %s %s))' % ( x0, y0, x0, y1, x1, y1, x1, y0, x0, y0) )
That is, this would be an alternative to the approach followed in the attached patch.
follow-up: 7 comment:6 by , 12 years ago
Needs tests: | set |
---|
I like the idea of converting float to str as late as possible. So my suggestion would be to test if the bounding box is made of numeric types, and depending on this, choose either the Polygon() or the GEOSGeometry() creation method. We can probably assume that all bbox elements are of the same type, so checking the first one could be sufficient.
Tests wanted!
comment:7 by , 12 years ago
Owner: | changed from | to
---|---|
Status: | new → assigned |
Sounds good. I will work on an implementation along these lines. I'm not sure, but I think Polygon() ends up creating a GEOSGeometry object using a string created from the numerical input, so that going via Polygon() is essentially just a detour (albeit perhaps a convenient one for numerical input). I will look closer at this issue.
Replying to claudep:
I like the idea of converting float to str as late as possible. So my suggestion would be to test if the bounding box is made of numeric types, and depending on this, choose either the Polygon() or the GEOSGeometry() creation method. We can probably assume that all bbox elements are of the same type, so checking the first one could be sufficient.
Tests wanted!
by , 12 years ago
Attachment: | 17736_gis_precision.diff added |
---|
Uses Polygon if the input is numerical. Test added.
comment:8 by , 12 years ago
The attached patch checks first whether the bounding box consists of objects of numeric type (real numbers). If that is the case for all elements of the bounding box, then the polygon is created with Polygon. Otherwise, that is if some element of the bounding box is not a real number, then the bounding box is converted to a string and the polygon is created using GEOSGeometry.
A test has been added. The test runs from_bbox on a particular input and checks the accuracy of one of the coordinates of the output to 14 digits.
comment:9 by , 12 years ago
Resolution: | → fixed |
---|---|
Status: | assigned → closed |
In any case, whenever you are using floating-point values, rounding will occur at some point. Currently, from_bbox uses the Python built-in str which defaults to 12 significant digits. Of course, you can always do the rounding yourself. For example, to keep 25 significant numbers:
So even with 25 significant numbers, it seems GEOS rounds to 16. Should from_bbox also default to 16 significant digits?