Opened 5 years ago

Closed 5 years ago

Last modified 3 years ago

#12989 closed (fixed)

URLValidator's verify_exists fails on IDN (Internationalized Domain Names)

Reported by: frasern Owned by: jezdez
Component: Core (Other) Version: master
Severity: Keywords:
Cc: Triage Stage: Accepted
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: UI/UX:

Description

Changeset [12474] introduced validation for IDN values in URLs. However, the verify_exists=True functionality doesn't work when you try to verify an IDN exists...

>>> from django.core.validators import URLValidator
>>> url = u'http://עברית.idn.icann.org'
>>> URLValidator(url)
True
>>> URLValidator(url, verify_exists=True)
ValidationError

The problem seems to be that urllib2 doesn't support IDN directly:

>>> req = urllib2.Request(url)
>>> urllib2.urlopen(req)
UnicodeEncodeError...

This can be resolved if the IDNA-encoded version is used instead:

idna_url = 'http://xn--54b7fta0cc.idn.icann.org/'
>>> req = urllib2.Request(idna_url)
>>> urllib2.urlopen(req)
<addinfourl at 3082877196L whose fp = <socket._fileobject object at 0xb7c1333c>>

ICANN has a of example names which might be useful for testing.

See also discussion at http://groups.google.com/group/django-developers/browse_frm/thread/6403db0c5229700a

Attachments (1)

12989.diff (2.8 KB) - added by frasern 5 years ago.

Download all attachments as: .zip

Change History (9)

comment:1 Changed 5 years ago by jezdez

  • Component changed from Uncategorized to Core framework
  • milestone set to 1.2
  • Needs documentation unset
  • Needs tests unset
  • Patch needs improvement unset
  • Triage Stage changed from Unreviewed to Accepted
  • Version changed from 1.1 to SVN

comment:2 Changed 5 years ago by UloPe

I replied already to the ML but I'll repeat it here for completeness sake:

This seems only to be true for python 2.4. In 2.5 and above urlopen will happily accept IDNs.

comment:3 Changed 5 years ago by jezdez

The above example didn't work in 2.5. nor 2.6 for me. Are you sure it did for you?

Changed 5 years ago by frasern

comment:4 Changed 5 years ago by frasern

  • Has patch set

I too tried in 2.4, 2.5 and 2.6 and couldn't get it to work.

My patch and tests are now attached, but as per the django-developers thread, I think URL validation could do with a bit more thought.

The test case checks for the existence of עברית.idn.icann.org. I tend to think it would be better not to rely on a third-party domain name here. Perhaps someone could configure a subdomain such as idn-τεsτ.djangoproject.com to serve a simple 200 OK response and we could use that instead.

comment:5 Changed 5 years ago by jezdez

  • Owner changed from nobody to jezdez
  • Status changed from new to assigned

comment:6 Changed 5 years ago by jezdez

While I agree that we might need to work on providing the infrastructure for running these validator tests, I also think it's out of the scope of this ticket. Additionally the URL "עברית.idn.icann.org" you used in the patch is kept available even after the evaluation period of IDN has ended (ref http://idn.icann.org/#Limited_evaluation_period).

comment:7 Changed 5 years ago by jezdez

  • Resolution set to fixed
  • Status changed from assigned to closed

(In [12620]) Fixed #12989 - Fixed verification of IDN URLs. Thanks to Fraser Nevett for the report and patch.

comment:8 Changed 3 years ago by jacob

  • milestone 1.2 deleted

Milestone 1.2 deleted

Note: See TracTickets for help on using tickets.
Back to Top