Opened 3 years ago

Closed 3 years ago

Last modified 3 years ago

#19366 closed Bug (fixed)

GEOSIndexError when comparing geometries

Reported by: cdestigter Owned by: claudep
Component: GIS Version: master
Severity: Release blocker Keywords:
Cc: craig.destigter@… Triage Stage: Ready for checkin
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

In 1.5 (latest master) I get the following when comparing geometries of different lengths:

from django.contrib.gis.geos import *
a = Polygon(((0, 0), (0, 1), (1, 1), (1, 0), (0, 0)))
b = Polygon(((0, 0), (0, 1), (1, 0), (0, 0)))

a < b

Traceback (most recent call last):
  File "<string>", line 6, in <module>
  File "/home/cdestigter/checkout/django/django/contrib/gis/geos/mutable_list.py", line 164, in __lt__
    c = self[i] < other[i]
  File "/home/cdestigter/checkout/django/django/contrib/gis/geos/mutable_list.py", line 164, in __lt__
    c = self[i] < other[i]
  File "/home/cdestigter/checkout/django/django/contrib/gis/geos/mutable_list.py", line 80, in __getitem__
    index = self._checkindex(index)
  File "/home/cdestigter/checkout/django/django/contrib/gis/geos/mutable_list.py", line 245, in _checkindex
    raise self._IndexError('invalid index: %s' % str(index))
django.contrib.gis.geos.error.GEOSIndexError: 'invalid index: 4'

This does not occur on django 1.4.x.

git bisect tracks it down to this commit: https://github.com/django/django/commit/d04f72fb31bab43386e12963c1c6dec23ae16143 ("Got rid of old cmpmethods replaced by rich comparison.")

python 2.6.5, GEOS 3.3.3-CAPI-1.7.4

marking as release blocker since it's a regression in 1.5.

Attachments (4)

19366-1.diff (2.9 KB) - added by claudep 3 years ago.
19366-2.diff (3.5 KB) - added by cdestigter 3 years ago.
19366-3.diff (4.1 KB) - added by cdestigter 3 years ago.
19366-2.diff plus fix lt when geometries are different lengths
19366-4.diff (4.4 KB) - added by claudep 3 years ago.

Download all attachments as: .zip

Change History (16)

comment:1 Changed 3 years ago by cdestigter

  • Has patch set
  • Needs documentation unset
  • Needs tests unset
  • Patch needs improvement unset

comment:2 Changed 3 years ago by claudep

  • Needs tests set
  • Owner changed from nobody to claudep
  • Triage Stage changed from Unreviewed to Accepted

Changed 3 years ago by claudep

comment:3 Changed 3 years ago by claudep

  • Needs tests unset

Craig, can you check the patch?
I kept catching IndexError, because it is possible that other is a simple list or any other object which may raise a plain IndexError.
I have also identified and fixed another regression in the equality method.

Changed 3 years ago by cdestigter

comment:4 Changed 3 years ago by cdestigter

My patch iterates over range(len(other)) to avoid getting an other._IndexError (which may not be either IndexError or self._IndexError). I dropped the catching of IndexError as it seems unnecessary in this case.

I also found another bug in __lt__ causing it to report incorrect results sometimes. I've added a test case to my patch to show the issue but haven't fixed the bug as I'm not quite sure the most efficient way to fix it.

comment:5 Changed 3 years ago by cdestigter

  • Cc craig.destigter@… added
  • Patch needs improvement set

Changed 3 years ago by cdestigter

19366-2.diff plus fix lt when geometries are different lengths

comment:6 Changed 3 years ago by claudep

After some thoughts, I think that comparing the linestrings for the equality makes sense, but not for > or <, because this will depend on the first differing point on the line string, which has absolutely nothing to do with being greater or smaller.
Shouldn't we override the __lt__ operator for polygons and compare the area instead?

comment:7 Changed 3 years ago by cdestigter

Not necessarily. Calculating a polygon's area is pretty expensive, and if all you want is a *consistent* ordering for a list of geometries (eg for hashing them consistently), simply ordering based on the coordinates is probably a lot faster.
At the very least, changing it implies a backwards compatibility break for anyone using the old ordering.

Changed 3 years ago by claudep

comment:8 Changed 3 years ago by claudep

So I've added a documentation paragraph about Polygon comparison.
To solve the __lt__ issue, I don't think we can prevent using a second reverse comparison. Also done in that patch.
I also removed the import * you added in previous patches, was this line intended?

comment:9 Changed 3 years ago by cdestigter

  • Patch needs improvement unset
  • Triage Stage changed from Accepted to Ready for checkin

Cool, that looks pretty good now. No, the import * wasn't meant to be in the patch. I had trouble figuring out how to run the test_geos.py tests, so that was my hackfix. Presumably there's a legit way to do it. They don't seem to run as part of the contrib.gis tests :/

comment:10 Changed 3 years ago by claudep

Thanks for the help and review.
Tests are run with gis tests, however it is not possible to run individual tests or TestCases because of #6712.

comment:11 Changed 3 years ago by Claude Paroz <claude@…>

  • Resolution set to fixed
  • Status changed from new to closed

In cc3c4a9d55e765cfe664e3dbd58201de99232096:

Fixed #19366 -- Prevented GEOSIndexError when comparing geometries

Thanks Craig de Stigter for the report and collaboration on the
patch.

comment:12 Changed 3 years ago by Claude Paroz <claude@…>

In 89593048e2eb5ef64a51da43728307e2dbef5783:

[1.5.x] Fixed #19366 -- Prevented GEOSIndexError when comparing geometries

Thanks Craig de Stigter for the report and collaboration on the
patch.
Backport of cc3c4a9d5 from master.

Note: See TracTickets for help on using tickets.
Back to Top