Opened 7 years ago

Closed 7 years ago

Last modified 5 years ago

#12989 closed (fixed)

URLValidator's verify_exists fails on IDN (Internationalized Domain Names)

Reported by: Fraser Nevett
Owned by: Jannis Leidel
Component: Core (Other)
Version: master
Severity:
Keywords:
Cc:
Triage Stage: Accepted
Has patch: yes
Needs documentation: no
Needs tests: no
Patch needs improvement: no
Easy pickings:
UI/UX:

Description

Changeset [12474] introduced validation for IDN values in URLs. However, the verify_exists=True functionality doesn't work when you try to verify an IDN exists...

>>> from django.core.validators import URLValidator
>>> url = u'http://עברית.idn.icann.org'
>>> URLValidator()(url)
>>> URLValidator(verify_exists=True)(url)
ValidationError

The problem seems to be that urllib2 doesn't support IDN directly:

>>> import urllib2
>>> req = urllib2.Request(url)
>>> urllib2.urlopen(req)
UnicodeEncodeError...

This can be resolved if the IDNA-encoded version is used instead:

>>> idna_url = 'http://xn--54b7fta0cc.idn.icann.org/'
>>> req = urllib2.Request(idna_url)
>>> urllib2.urlopen(req)
<addinfourl at 3082877196L whose fp = <socket._fileobject object at 0xb7c1333c>>

ICANN has a list of example names which might be useful for testing.

See also discussion at http://groups.google.com/group/django-developers/browse_frm/thread/6403db0c5229700a
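
The IDNA-encoding workaround described above can be sketched roughly as follows. This is a minimal illustration, not the attached patch: the helper name to_idna_url is hypothetical, and it is written for Python 3's urllib.parse rather than the Python 2 urllib2/urlparse modules used in this ticket. It relies on Python's built-in 'idna' codec and assumes a simple network location (a host name, optionally with a port).

```python
from urllib.parse import urlsplit, urlunsplit


def to_idna_url(url):
    """Return url with its host name converted to ASCII (IDNA) form.

    Hypothetical helper for illustration; raises UnicodeError if the
    host name cannot be IDNA-encoded.
    """
    scheme, netloc, path, query, fragment = urlsplit(url)
    # The built-in 'idna' codec converts each dot-separated label that
    # contains non-ASCII characters to its "xn--" Punycode form and
    # leaves pure-ASCII labels unchanged.
    ascii_netloc = netloc.encode("idna").decode("ascii")
    return urlunsplit((scheme, ascii_netloc, path, query, fragment))


if __name__ == "__main__":
    # The result is plain ASCII and can be handed to an HTTP client.
    print(to_idna_url("http://עברית.idn.icann.org/"))
```

Only the network location is re-encoded; the path, query string, and fragment are passed through unchanged, so any percent-encoding they need would have to be handled separately.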

Attachments (1)

12989.diff (2.8 KB) - added by Fraser Nevett 7 years ago.


Change History (9)

comment:1 Changed 7 years ago by Jannis Leidel

Component: Uncategorized → Core framework
milestone: 1.2
Triage Stage: Unreviewed → Accepted
Version: 1.1 → SVN

comment:2 Changed 7 years ago by Ulrich Petri

I already replied on the mailing list, but I'll repeat it here for completeness' sake:

This only seems to be true for Python 2.4. In 2.5 and above, urlopen will happily accept IDNs.

comment:3 Changed 7 years ago by Jannis Leidel

The above example didn't work in 2.5 or 2.6 for me. Are you sure it did for you?

Changed 7 years ago by Fraser Nevett

Attachment: 12989.diff added

comment:4 Changed 7 years ago by Fraser Nevett

Has patch: set

I too tried in 2.4, 2.5 and 2.6 and couldn't get it to work.

My patch and tests are now attached, but as per the django-developers thread, I think URL validation could do with a bit more thought.

The test case checks for the existence of עברית.idn.icann.org. I tend to think it would be better not to rely on a third-party domain name here. Perhaps someone could configure a subdomain such as idn-τεsτ.djangoproject.com to serve a simple 200 OK response and we could use that instead.

comment:5 Changed 7 years ago by Jannis Leidel

Owner: changed from nobody to Jannis Leidel
Status: new → assigned

comment:6 Changed 7 years ago by Jannis Leidel

While I agree that we might need to work on providing the infrastructure for running these validator tests, I also think it's out of the scope of this ticket. Additionally, the URL "עברית.idn.icann.org" you used in the patch will be kept available even after the IDN evaluation period has ended (ref http://idn.icann.org/#Limited_evaluation_period).

comment:7 Changed 7 years ago by Jannis Leidel

Resolution: fixed
Status: assigned → closed

(In [12620]) Fixed #12989 - Fixed verification of IDN URLs. Thanks to Fraser Nevett for the report and patch.

comment:8 Changed 5 years ago by Jacob

milestone: 1.2

Milestone 1.2 deleted
