Opened 14 years ago

Closed 14 years ago

Last modified 13 years ago

#12989 closed (fixed)

URLValidator's verify_exists fails on IDN (Internationalized Domain Names)

Reported by: Fraser Nevett Owned by: Jannis Leidel
Component: Core (Other) Version: dev
Severity: Keywords:
Cc: Triage Stage: Accepted
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

Changeset [12474] introduced validation for IDN values in URLs. However, the verify_exists=True functionality doesn't work when you try to verify an IDN exists...

>>> from django.core.validators import URLValidator
>>> url = u'http://עברית.idn.icann.org'
>>> URLValidator(url)
True
>>> URLValidator(url, verify_exists=True)
ValidationError

The problem seems to be that urllib2 doesn't support IDN directly:

>>> req = urllib2.Request(url)
>>> urllib2.urlopen(req)
UnicodeEncodeError...

This can be resolved if the IDNA-encoded version is used instead:

idna_url = 'http://xn--54b7fta0cc.idn.icann.org/'
>>> req = urllib2.Request(idna_url)
>>> urllib2.urlopen(req)
<addinfourl at 3082877196L whose fp = <socket._fileobject object at 0xb7c1333c>>

ICANN has a of example names which might be useful for testing.

See also discussion at http://groups.google.com/group/django-developers/browse_frm/thread/6403db0c5229700a

Attachments (1)

12989.diff (2.8 KB ) - added by Fraser Nevett 14 years ago.

Download all attachments as: .zip

Change History (9)

comment:1 by Jannis Leidel, 14 years ago

Component: UncategorizedCore framework
milestone: 1.2
Triage Stage: UnreviewedAccepted
Version: 1.1SVN

comment:2 by Ulrich Petri, 14 years ago

I replied already to the ML but I'll repeat it here for completeness sake:

This seems only to be true for python 2.4. In 2.5 and above urlopen will happily accept IDNs.

comment:3 by Jannis Leidel, 14 years ago

The above example didn't work in 2.5. nor 2.6 for me. Are you sure it did for you?

by Fraser Nevett, 14 years ago

Attachment: 12989.diff added

comment:4 by Fraser Nevett, 14 years ago

Has patch: set

I too tried in 2.4, 2.5 and 2.6 and couldn't get it to work.

My patch and tests are now attached, but as per the django-developers thread, I think URL validation could do with a bit more thought.

The test case checks for the existence of עברית.idn.icann.org. I tend to think it would be better not to rely on a third-party domain name here. Perhaps someone could configure a subdomain such as idn-τεsτ.djangoproject.com to serve a simple 200 OK response and we could use that instead.

comment:5 by Jannis Leidel, 14 years ago

Owner: changed from nobody to Jannis Leidel
Status: newassigned

comment:6 by Jannis Leidel, 14 years ago

While I agree that we might need to work on providing the infrastructure for running these validator tests, I also think it's out of the scope of this ticket. Additionally the URL "עברית.idn.icann.org" you used in the patch is kept available even after the evaluation period of IDN has ended (ref http://idn.icann.org/#Limited_evaluation_period).

comment:7 by Jannis Leidel, 14 years ago

Resolution: fixed
Status: assignedclosed

(In [12620]) Fixed #12989 - Fixed verification of IDN URLs. Thanks to Fraser Nevett for the report and patch.

comment:8 by Jacob, 13 years ago

milestone: 1.2

Milestone 1.2 deleted

Note: See TracTickets for help on using tickets.
Back to Top