#4430 closed (fixed)
[unicode] Syndication framework cannot handle unicode description
Reported by: | Owned by: | Malcolm Tredinnick | |
---|---|---|---|
Component: | contrib.syndication | Version: | other branch |
Severity: | Keywords: | ||
Cc: | Triage Stage: | Accepted | |
Has patch: | yes | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description
I have an object with a content attribute that holds non-ASCII data. In both cases (either specifying {{ obj.content }} in the description template or adding the method
def __unicode__(self): return smart_unicode(self.content)
), I got a UnicodeDecodeError when trying to display the feed:
UnicodeDecodeError at /feeds/wiki/
'ascii' codec can't decode byte 0xc3 in position 0: ordinal not in range(128)
Request Method: GET
Request URL: http://rpgpedia.cz/feeds/wiki/
Exception Type: UnicodeDecodeError
Exception Value: 'ascii' codec can't decode byte 0xc3 in position 0: ordinal not in range(128)
Exception Location: /usr/lib/python2.5/codecs.py in write, line 303
The local variables show the object that codecs is trying to decode:
u'\xdasp\u011bch zna\u010d\xed zd\xe1rn\xe9 zavr\u0161en\xed akce, kter\xe1 je p\u0159edm\u011btem testov\xe1n\xed. Je pot\u0159ebn\xfd zejm\xe9na tehdy, kdy\u017e se n\u011bkter\xe1 ((rp postava postava)) nebo jin\xfd element v ((rp rolova_hra rolov\xe9 h\u0159e)) sna\u017e\xed n\u011bco ud\u011blat, n\u011bco zd\xe1rn\u011b zavr\u0161it, nebo n\u011bjak\xfdm zp\u016fsobem zvr\xe1tit situaci ve sv\u016fj prosp\u011bch.\r\n\r\nNakl\xe1d\xe1n\xed s \xfasp\u011bchem z\xe1vis\xed od ((rp pravidla pravidel)) hry. V n\u011bkter\xfdch hr\xe1ch je d\u016fle\u017eit\xfd tak\xe9 po\u010det \xfasp\u011bch\u016f (pokud jich m\u016f\u017ee hr\xe1\u010d v testu dos\xe1hnout v\xedce), v jin\xfdch hr\xe1ch je podstatn\xe9 jenom to, jestli hr\xe1\u010d v ((rp test testu)) usp\u011bje, nebo ne.\r\n\r\nV prvn\xedm p\u0159\xedpad\u011b m\u016f\u017ee nav\xedc p\u0159i v\xfdsledku konfliktn\xed akce mezi dv\u011bma nebo v\xedce postavami (nebo elementy) b\xfdt rozhoduj\xedc\xed i po\u010det \xfasp\u011bch\u016f jednotliv\xfdch postav a ta s nejvy\u0161\u0161\xedm po\u010dtem \xfasp\u011bch\u016f pak v dan\xe9m konfliktu zpravidla v\xedt\u011bz\xed.\r\n\r\nV n\u011bkter\xfdch hr\xe1ch existuje t\xe9\u017e """tot\xe1ln\xed \xfasp\u011bch""" (jak\xe1si zes\xedlen\xe1 varianta \xfasp\u011bchu obvykle s \u0159\xe1dov\u011b ni\u017e\u0161\xed pravd\u011bpodobnost\xed) vedouc\xed zpravida k v\xfdkon\u016fm \u010di ud\xe1lostem, kter\xe9 by za norm\xe1ln\xedch okolnost\xed byly (t\xe9m\u011b\u0159) nemo\u017en\xe9.'
(That is, a normal unicode string, which encodes without any problem using s.encode('utf-8').)
Attachments (1)
Change History (8)
comment:1 by , 18 years ago
comment:2 by , 18 years ago
Has patch: | set |
---|
Fixed as michal pointed out, and also fixed the other classes.
Adding patch.
comment:3 by , 18 years ago
Description: | modified (diff) |
---|---|
Triage Stage: | Unreviewed → Accepted |
(fixed description formatting)
The patch goes a bit too far. We should never be applying smart_unicode() to anything that is a URL. If URLs aren't already in ASCII, it's a bug on the client code's side (the client code should be using things like iri_to_uri() at the appropriate moments).
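To illustrate the point about URLs, here is a minimal sketch of an IRI to URI conversion using only the standard library. The name iri_to_uri_sketch and the exact set of safe characters are assumptions made for this example; this is not Django's iri_to_uri() itself, only the general idea of percent-encoding non-ASCII characters as UTF-8 while leaving URI-reserved characters alone.

```python
from urllib.parse import quote


def iri_to_uri_sketch(iri):
    """Hypothetical helper: percent-encode the non-ASCII parts of an IRI.

    Reserved characters and existing %-escapes are left untouched, so the
    result is plain ASCII and can be placed in a feed as a URL.
    """
    # The safe set below is an assumption chosen for illustration.
    return quote(iri, safe="/#%[]=:;$&()+,!?*@'~")


# Non-ASCII path segments become UTF-8 percent-escapes.
link = iri_to_uri_sketch("/feeds/wiki/\xdasp\u011bch/")
```

With a conversion like this applied by the client code, the URL is already ASCII by the time it reaches the feed generator, so no unicode coercion is needed there.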
I'm having a bit of trouble understanding the original report, because smart_unicode() does work on the string you posted and you don't include what's in the traceback leading up to the error.
If the patch fixes it for you, can you just drop in a comment saying so? I'll apply a version of this patch anyway, since it mostly fixes some places that have been overlooked (thanks for testing that, both of you), but I would like some confirmation that it is fixing the original report as well.
comment:4 by , 18 years ago
Owner: | changed from | to
---|
Okay, the original bug report does make sense (that is, I can repeat it) if the string passed in is a UTF-8 bytestring that uses non-ASCII characters.
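A small sketch of why a UTF-8 bytestring triggers the reported error. It is written for Python 3, where the decode has to be requested explicitly; in the Python 2 setup from the report, the same ASCII decode happens implicitly when a byte string is mixed with unicode. The helper name decode_error is made up for this example.

```python
# First word of the reported text; it starts with U+00DA.
text = "\xdasp\u011bch"
utf8_bytes = text.encode("utf-8")  # first byte is 0xc3


def decode_error(data, codec):
    """Hypothetical helper: return the UnicodeDecodeError, or None on success."""
    try:
        data.decode(codec)
    except UnicodeDecodeError as exc:
        return exc
    return None


# Decoding as ASCII fails on the very first byte, as in the traceback.
ascii_error = decode_error(utf8_bytes, "ascii")

# Decoding with the right codec succeeds.
utf8_error = decode_error(utf8_bytes, "utf-8")
```

The failing byte and position match the message in the report (byte 0xc3 in position 0), which is consistent with the description being passed in as UTF-8 bytes rather than as a unicode object.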
I'll commit a modified patch shortly that takes care of the IRI -> URI mapping as well.
comment:5 by , 18 years ago
Resolution: | → fixed |
---|---|
Status: | new → closed |
I have a similar problem (not exactly the same, but it also involves the RSS framework and Czech-language strings in UTF-8).
When I try to fetch the RSS feed, I get this error:
It looks like the RSS framework handles items incorrectly. I looked into the Django source, in the file django/utils/feedgenerator.py, and changed the code from line 160 to:
In every call of handler.addQuickElement I used the smart_unicode function to recode the content. Now my RSS feed works.
Maybe a patch is needed that does something similar in the RSS framework?
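The workaround described in this comment can be sketched roughly as follows, in Python 3 and with the standard library only. Both to_text (a stand-in for smart_unicode) and QuickXMLGenerator (a stand-in for the handler used in feedgenerator.py) are hypothetical names for this example, not the code from the patch.

```python
import io
from xml.sax.saxutils import XMLGenerator


def to_text(value, encoding="utf-8"):
    """Hypothetical stand-in for smart_unicode(): always return text."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return str(value)


class QuickXMLGenerator(XMLGenerator):
    """Illustrative handler with an addQuickElement-style helper."""

    def addQuickElement(self, name, contents=None, attrs=None):
        self.startElement(name, attrs or {})
        if contents is not None:
            # Coerce to text before writing, whatever type was passed in.
            self.characters(to_text(contents))
        self.endElement(name)


def render_description(raw):
    """Write a single <description> element and return the XML as text."""
    out = io.StringIO()
    handler = QuickXMLGenerator(out, "utf-8")
    handler.startDocument()
    handler.addQuickElement("description", raw)
    handler.endDocument()
    return out.getvalue()
```

Calling render_description with either a text string or its UTF-8 encoded bytes should give the same output, which is the behaviour the workaround is after.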