Opened 3 years ago

Closed 3 years ago

Last modified 3 years ago

#16553 closed Bug (fixed)

GeoIP unicode problem

Reported by: anonymous
Owned by: jbronn
Component: GIS
Version: 1.2
Severity: Normal
Keywords: geoip unicode
Cc:
Triage Stage: Accepted
Has patch: no
Needs documentation: no
Needs tests: no
Patch needs improvement: yes
Easy pickings: no
UI/UX: no

Description

Here is the full traceback raised when GeoIP returns unicode characters in the city name and the result is passed to a template:

Traceback (most recent call last):

  File "/usr/lib/python2.5/site-packages/django/core/handlers/base.py", line 100, in get_response
    response = callback(request, *callback_args, **callback_kwargs)

  File "/home/test/main/index/views.py", line 18, in index
    return render_to_response("index.html", {"request": request, "geoip": gp})

  File "/usr/lib/python2.5/site-packages/django/shortcuts/__init__.py", line 20, in render_to_response
    return HttpResponse(loader.render_to_string(*args, **kwargs), **httpresponse_kwargs)

  File "/usr/lib/python2.5/site-packages/django/template/loader.py", line 186, in render_to_string
    return t.render(context_instance)

  File "/usr/lib/python2.5/site-packages/django/template/__init__.py", line 173, in render
    return self._render(context)

  File "/usr/lib/python2.5/site-packages/django/template/__init__.py", line 167, in _render
    return self.nodelist.render(context)

  File "/usr/lib/python2.5/site-packages/django/template/__init__.py", line 796, in render
    bits.append(self.render_node(node, context))

  File "/usr/lib/python2.5/site-packages/django/template/__init__.py", line 809, in render_node
    return node.render(context)

  File "/usr/lib/python2.5/site-packages/django/template/loader_tags.py", line 125, in render
    return compiled_parent._render(context)

  File "/usr/lib/python2.5/site-packages/django/template/__init__.py", line 167, in _render
    return self.nodelist.render(context)

  File "/usr/lib/python2.5/site-packages/django/template/__init__.py", line 796, in render
    bits.append(self.render_node(node, context))

  File "/usr/lib/python2.5/site-packages/django/template/__init__.py", line 809, in render_node
    return node.render(context)

  File "/usr/lib/python2.5/site-packages/django/template/loader_tags.py", line 62, in render
    result = block.nodelist.render(context)

  File "/usr/lib/python2.5/site-packages/django/template/__init__.py", line 796, in render
    bits.append(self.render_node(node, context))

  File "/usr/lib/python2.5/site-packages/django/template/__init__.py", line 809, in render_node
    return node.render(context)

  File "/usr/lib/python2.5/site-packages/django/template/defaulttags.py", line 258, in render
    return self.nodelist_true.render(context)

  File "/usr/lib/python2.5/site-packages/django/template/__init__.py", line 796, in render
    bits.append(self.render_node(node, context))

  File "/usr/lib/python2.5/site-packages/django/template/__init__.py", line 809, in render_node
    return node.render(context)

  File "/usr/lib/python2.5/site-packages/django/template/defaulttags.py", line 258, in render
    return self.nodelist_true.render(context)

  File "/usr/lib/python2.5/site-packages/django/template/__init__.py", line 796, in render
    bits.append(self.render_node(node, context))

  File "/usr/lib/python2.5/site-packages/django/template/__init__.py", line 809, in render_node
    return node.render(context)

  File "/usr/lib/python2.5/site-packages/django/template/__init__.py", line 849, in render
    return _render_value_in_context(output, context)

  File "/usr/lib/python2.5/site-packages/django/template/__init__.py", line 829, in _render_value_in_context
    value = force_unicode(value)

  File "/usr/lib/python2.5/site-packages/django/utils/encoding.py", line 88, in force_unicode
    raise DjangoUnicodeDecodeError(s, *e.args)

DjangoUnicodeDecodeError: 'utf8' codec can't decode bytes in position 6-8: invalid data. You passed in 'Hlubok\xe1 Nad Vltavou' (<type 'str'>)

Complete code:

gp = None
g = GeoIP()
ip = request.META.get("REMOTE_ADDR", None)
if ip:
    gp = g.city(ip)
return render_to_response("index.html", {"request": request, "geoip": gp})

Django 1.2.4

Attachments (2)

16553.1.diff (43.9 KB) - added by jbronn 3 years ago.
Refactor of GeoIP module.
16553.2.diff (46.5 KB) - added by jbronn 3 years ago.

Change History (8)

comment:1 Changed 3 years ago by aaugustin

  • Needs documentation unset
  • Needs tests unset
  • Patch needs improvement unset
  • Resolution set to invalid
  • Status changed from new to closed

Django always uses unicode internally. However, if you happen to pass a non-unicode string, it will attempt to interpret it as utf-8. Unfortunately, here, you're passing a string encoded in latin1:

>>> 'Hlubok\xe1 Nad Vltavou'.decode('iso-8859-1')
u'Hlubok\xe1 Nad Vltavou'
>>> 'Hlubok\xe1 Nad Vltavou'.decode('utf-8')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/encodings/utf_8.py", line 16, in decode
    return codecs.utf_8_decode(input, errors, True)
UnicodeDecodeError: 'utf8' codec can't decode byte 0xe1 in position 6: invalid continuation byte

(Note that this is plain Python; Django isn't involved at all.)

GeoIP's documentation probably says that their data is encoded in latin1 (I didn't check). It's up to you to decode it (= convert to unicode) appropriately.
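
As a minimal sketch of that decoding step, assuming the values really are latin1 (ISO-8859-1) byte strings: `decode_record` is a hypothetical helper, not part of Django or GeoIP, and the sample record is illustrative, built from the byte string in the traceback.

```python
def decode_record(record, encoding="iso-8859-1"):
    """Return a copy of `record` with byte-string values decoded to text.

    Hypothetical helper. The default encoding is an assumption and has not
    been checked against GeoIP's documentation.
    """
    decoded = {}
    for key, value in record.items():
        # Only byte strings need decoding; other values pass through as-is.
        if isinstance(value, bytes):
            value = value.decode(encoding)
        decoded[key] = value
    return decoded


# Illustrative record; only the city value comes from the traceback.
raw = {"city": b"Hlubok\xe1 Nad Vltavou", "country_code": b"CZ"}
clean = decode_record(raw)
```

The decoded values are unicode strings, so a template can render them without going through the failing utf-8 guess.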

So, this isn't a bug in Django. It looks like you aren't very familiar with unicode in Python — you say GeoIP returns unicode characters, but the problem is precisely that it doesn't! This bug tracker isn't a support forum; it's a database of known bugs in Django. For such questions, you should seek help on #django on FreeNode or on the django-users mailing list.

comment:2 follow-up: Changed 3 years ago by anonymous

aaugustin, how should I know this isn't a bug in Django? There isn't anything about this problem in Django's documentation:
https://docs.djangoproject.com/en/1.3/ref/contrib/gis/geoip/

And as you can see, GeoIP is part of Django (from django.contrib.gis.utils import GeoIP), so I assumed it is (or should be) compatible with it. I really don't understand why you are so unfriendly to me; I was trying to be helpful to Django!

comment:3 in reply to: ↑ 2 ; follow-up: Changed 3 years ago by aaugustin

  • Component changed from Uncategorized to GIS
  • Resolution invalid deleted
  • Status changed from closed to reopened
  • Triage Stage changed from Unreviewed to Accepted

Replying to anonymous:

aaugustin, how should I know this isn't a bug in Django? There isn't anything about this problem in Django's documentation:
https://docs.djangoproject.com/en/1.3/ref/contrib/gis/geoip/
And as you can see, GeoIP is part of Django (from django.contrib.gis.utils import GeoIP), so I assumed it is (or should be) compatible with it.

Oops. I had missed that! Sorry, really.


I really don't understand why you are so unfriendly to me; I was trying to be helpful to Django!

I missed the fact that a copy of GeoIP was shipped with Django, and since your problem was a basic unicode issue, I pasted my standard "user error" template, which is intended to kill the discussion. A huge fraction of bug reports (60% to 80%) aren't bugs in Django, but user errors. In such cases, it's better to cut the discussion short; otherwise the reporter often insists and we lose a lot of time.

Your bug report contains one sentence, one traceback, and 5 lines of code. With these elements, it's difficult to tell (a) what your level of knowledge of Python and Django is and (b) how much analysis you did before reporting the bug. This could have been a newbie error.

Basically, we are volunteers doing this in our free time. We're doing our best to sift through bug reports, most of which are poorly written, and we make errors sometimes :/ I hope you understand.

comment:4 in reply to: ↑ 3 Changed 3 years ago by anonymous

Replying to aaugustin:

Your bug report contains one sentence, one traceback, and 5 lines of code.


I should have included the import line as well; I will do so next time.

Basically, we are volunteers doing this in our free time. We're doing our best to sift through bug reports, most of which are poorly written, and we make errors sometimes :/ I hope you understand.


Yes, I understand, but keep in mind that I wasn't paid for writing this bug report either.

Changed 3 years ago by jbronn

Refactor of GeoIP module.

comment:5 Changed 3 years ago by jbronn

  • Owner changed from nobody to jbronn
  • Patch needs improvement set
  • Status changed from reopened to new

The attached patch is a refactor of the GeoIP module, which fixes this bug and others (including significant memory leaks discovered while investigating this).

The patch is nearly complete, but additional documentation and deprecation information need to be added about moving GeoIP from django.contrib.gis.utils to django.contrib.gis.geoip.

Changed 3 years ago by jbronn

comment:6 Changed 3 years ago by jbronn

  • Resolution set to fixed
  • Status changed from new to closed

In [16783]:

Fixed #16553 -- Refactored the GeoIP module, moving it to django.contrib.gis.geoip; fixed memory leaks and encoding issues.
