Opened 13 years ago
Closed 12 years ago
#17566 closed Uncategorized (fixed)
RegexURLResolver.__repr__(self) is not unicode safe.
Reported by: | otto | Owned by: | |
---|---|---|---|
Component: | Core (URLs) | Version: | 1.4-alpha-1 |
Severity: | Normal | Keywords: | unicode |
Cc: | Triage Stage: | Accepted | |
Has patch: | yes | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description
This is a sister to the (fixed) tickets #15795, #16158.
RegexURLResolver.repr(self) can throw a UnicodeDecodeError. The problem comes from the parameter self.urlconf_name. The exception is "UnicodeDecodeError: 'ascii' codec can't decode byte... (etc)."
A simple (and possibly naive) solution is to replace the occurrence of self.urlconf_name with str(self.urlconf_name).decode('utf-8') in repr(self). This seems to fix it for now, but someone who is working on this module might want to take a closer look.
Attachments (3)
Change History (15)
follow-up: 2 comment:1 by , 13 years ago
Resolution: | → needsinfo |
---|---|
Status: | new → closed |
comment:2 by , 12 years ago
Keywords: | unicode added |
---|---|
Resolution: | needsinfo |
Status: | closed → reopened |
I have noticed the same bug but I believe it is consequence of having app_name that has non ascii characters in it. May be that it is not supposed to have, but then, I couldn't find other way to display localised app_name in django's admin page. It is triggered only when I try to open missing link during the generation of technical_404_response. Here is a log:
Django version 1.4, using settings 'eprod.settings' Development server is running at http://192.168.1.195:8100/ Quit the server with CONTROL-C. Traceback (most recent call last): File "/usr/lib/python2.7/wsgiref/handlers.py", line 85, in run self.result = application(self.environ, self.start_response) File "/usr/local/lib/python2.7/dist-packages/django/contrib/staticfiles/handlers.py", line 67, in __call__ return self.application(environ, start_response) File "/usr/local/lib/python2.7/dist-packages/django/core/handlers/wsgi.py", line 241, in __call__ response = self.get_response(request) File "/usr/local/lib/python2.7/dist-packages/django/core/handlers/base.py", line 146, in get_response response = debug.technical_404_response(request, e) File "/usr/local/lib/python2.7/dist-packages/django/views/debug.py", line 432, in technical_404_response 'reason': smart_str(exception, errors='replace'), File "/usr/local/lib/python2.7/dist-packages/django/utils/encoding.py", line 116, in smart_str return str(s) File "/usr/local/lib/python2.7/dist-packages/django/core/urlresolvers.py", line 235, in __repr__ return smart_str(u'<%s %s (%s:%s) %s>' % (self.__class__.__name__, self.urlconf_name, self.app_name, self.namespace, self.regex.pattern)) File "/usr/local/lib/python2.7/dist-packages/django/core/urlresolvers.py", line 235, in __repr__ return smart_str(u'<%s %s (%s:%s) %s>' % (self.__class__.__name__, self.urlconf_name, self.app_name, self.namespace, self.regex.pattern)) UnicodeDecodeError: 'ascii' codec can't decode byte 0xd0 in position 18: ordinal not in range(128)
I hope this would help to solve the problem. In case that app_name has to be ascii then what is the proper way to force django admin application to show non-ascii applications names? In fact I would prefer to have plain ascii urls but display name or verbose_name or similar displayed in admin pages.
Vitalije
follow-up: 4 comment:3 by , 12 years ago
Resolution: | → invalid |
---|---|
Status: | reopened → closed |
If app_name contains non-ascii characters, it has to be a Unicode string, which I suspect was not the case in your app.
by , 12 years ago
Attachment: | app_name_unicode_test.tar.gz added |
---|
test django project that demonstrates this bug
follow-up: 5 comment:4 by , 12 years ago
Resolution: | invalid |
---|---|
Status: | closed → reopened |
Replying to claudep:
If app_name contains non-ascii characters, it has to be a Unicode string, which I suspect was not the case in your app.
Well you suspect wrong. It is indeed unicode here is one of my models:
class Dobavljac(models.Model): class Meta: app_label = u'Продавница' db_table = u'prodavnica_dobavljac' verbose_name = u'Добављач' verbose_name_plural = u'Добављачи' ...
As matter of fact I just have started a new django project and a new application in it using django-admin.py script. Then I have added two very simple models with a unicode app_name like the one above, created admin.py and registered those two models and tested it produces same error. I attached code if you like to test it yourself. Try to open some url that is not in urlpatterns like "/admin/Продавница/rwerew" and django will try to produce technical_404_response but will throw an exception like mentioned before.
HTH Vitalije
follow-up: 6 comment:5 by , 12 years ago
Replying to vitalije:
app_label in the Meta options is only used to connect the model to the right app instance. In this case, the only valid value is "test_app_name" - and here it is not required because the models are located in a models.py file at the first level of the test_app_name package.
This is largely unrelated to the original ticket - but the app_name is derived from the import path of the app - assigning a different Meta.app_label does not actually change the app_name, but in this case the resolver isn't even getting that far.
comment:6 by , 12 years ago
Replying to ptone:
This is largely unrelated to the original ticket - but the app_name is derived from the import path of the app - assigning a different Meta.app_label does not actually change the app_name, but in this case the resolver isn't even getting that far.
I must admit I was wrong to believe that app_name is problem. After thorough investigation I have found that app_name is plain ascii = 'admin'. But I have found that self.regex.pattern is unicode. When I replaced definition of method repr
def __repr__(self): return return smart_str(u'<%s %s (%s:%s) %s>' %(self.__class__.__name__, self.urlconf_name, self.app_name, self.namespace, self.regex.pattern))
with the following definition
def __repr__(self): return '<%s %s (%s:%s) %s>' %tuple(map(smart_str, (self.__class__.__name__, self.urlconf_name, self.app_name, self.namespace, self.regex.pattern)))
the problem was solved.
follow-up: 8 comment:7 by , 12 years ago
If you have an URL pattern with Unicode characters, I think it should also be prefixed with 'u'. I still think that current Django code is correct.
follow-up: 9 comment:8 by , 12 years ago
Replying to claudep:
If you have an URL pattern with Unicode characters, I think it should also be prefixed with 'u'. I still think that current Django code is correct.
Have you tested given django project or look at the attached code, or you are just guessing that I forgot to put u before string literal?
I didn't add any unicode url pattern, but admin application did because it uses app_name as part of its url patterns. So the problem is generated inside django and IMHO should be fixed there. Besides I have placed print self.regex.pattern, type(self.regex.pattern) right before the original return statement and in console log it was written as type unicode. So it actually has proper leading "u" ie it IS really unicode, but other arguments are strings. I don't know why python has problem with coercing all those strings in unicode, but I have verified that in my test application the only source of reported error is the above method repr of the RegexURLResolver class, and inside it, the only problem is regex.pattern being a unicode.
I did change that method locally in my own installation of django and it works for me. If it is not important for other users / or developers to make that correction part of their own installations / repository that's fine with me. If it is, then here are my 2 cents contribution.
Vitalije
PS: I feel like my reporting a bug is not very welcomed. I am grateful for all the time and energy that django developers have put in it, and I want to give something back to the community. It took me some time and effort to register, to write report in English (that is not my primary language), to debug code and report back the solution I found, and I don't complain about that effort and time. But I would appreciate (and I do believe some other programmers like me would appreciate also) if our bug reports are investigated more thoroughly before being replied. The next time I (or someone like me) find a bug in django, it can happen very easily that we wouldn't care so much to report it. I am not being impatient while waiting response from developers, so let them take their time as much as they need. But when they decide to respond, please let them take our reports seriously. Let them read our reports, test them, test their own hypothesis before making a reply. It would make us more willing to report. If there is something wrong with my report then please educate me how to make better report and by doing it you would make me feel that I owe you more good reports. That way there will be soon great number of good reports for every single bug, we users find. It is very easy to turn people away (not me, I believe), but not so easy to inspire them to get involved. Having said that I do hope that we will be more and more understanding each other in the future.
comment:9 by , 12 years ago
Owner: | removed |
---|---|
Status: | reopened → new |
Replying to vitalije:
PS: I feel like my reporting a bug is not very welcomed. (...)
I'm very sorry if I gave you the impression that your report was not welcome. It's not the case. In written communication, it's so easy to overlook the tone of responses and I probably did not address your issue with enough care. I will continue to investigate, but if any other reader feels like working on the issue, I have no monopol on it :-)
comment:10 by , 12 years ago
Needs tests: | set |
---|---|
Triage Stage: | Unreviewed → Accepted |
I could (at last) reproduced the issue. The culprit is self.urlconf_name which __repr__
already returns an utf-8-encode string (when self.urlconf_name is a list of RegexURLPattern), which then cannot be re-inserted in an unicode string without being decoded first. It is even reproducible in master.
Could you please test the following patch?
comment:11 by , 12 years ago
Needs tests: | unset |
---|
comment:12 by , 12 years ago
Resolution: | → fixed |
---|---|
Status: | new → closed |
The fix committed for #17892 [28fd876bae15df747d164dcae229840d2cf135ca] should have fixed this issue also.
How did you encounter this error?
urlconf_name
is the import path of the URLconf module, and Python doesn't support unicode module names (at least until Python 3.2), so it should always be an ASCII string.