Opened 8 years ago

Closed 8 years ago

Last modified 8 years ago

#27125 closed New feature (invalid)

URLValidator does not support internationalized domain names (IDN)

Reported by: Ramin Farajpour Cami
Owned by: nobody
Component: Core (URLs)
Version: 1.10
Severity: Normal
Keywords:
Cc:
Triage Stage: Unreviewed
Has patch: no
Needs documentation: no
Needs tests: no
Patch needs improvement: no
Easy pickings: no
UI/UX: no

Description (last modified by Ramin Farajpour Cami)

Hi,

I looked at validators.py and could not find any documentation on whether Django supports internationalized domain names (IDN).

There are Arabic domains on the internet, such as عربي.امارات

from django.core.validators import URLValidator
validate = URLValidator()

validate('http://عربی.امارات')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Python27\lib\site-packages\django\core\validators.py", line 114, in __call__
    value = force_text(value)
  File "C:\Python27\lib\site-packages\django\utils\encoding.py", line 88, in force_text
    raise DjangoUnicodeDecodeError(s, *e.args)
django.utils.encoding.DjangoUnicodeDecodeError: 'utf8' codec can't decode byte 0x9f in position 12: invalid start byte. You passed in 'http://\xe3\xa9\xa0?.\x9f\xea\x9f\xa9\x9f\xa2' (<type 'str'>)

You can use the idn2 tool to convert the domain to its ASCII form and check that it resolves:

root@ramin ~ > idn2 'عربي.امارات'
xn--ngbrx4e.xn--mgbaam7a8h
root@ramin > nslookup $(idn2 'عربي.امارات')
Server:         127.0.1.1
Address:        127.0.1.1#53

Non-authoritative answer:
Name:   xn--ngbrx4e.xn--mgbaam7a8h
Address: 79.98.120.105

root@ramin > nmap $(idn2 'عربي.امارات') -p80

Starting Nmap 6.40 ( http://nmap.org ) at 2016-08-26 12:09 +07
Nmap scan report for xn--ngbrx4e.xn--mgbaam7a8h (79.98.120.105)
Host is up (0.23s latency).
rDNS record for 79.98.120.105: web-lb0.web.308th.dubai.aeda.net.ae
PORT   STATE SERVICE
80/tcp open  http

Nmap done: 1 IP address (1 host up) scanned in 0.88 seconds  

Or use curl:

$ curl --head xn--ngbrx4e.xn--mgbaam7a8h

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
HTTP/1.0 200 OK
Date: Fri, 26 Aug 2016 05:09:07 GMT
Server: Apache
X-Powered-By: PHP/5.3.3
Content-Type: text/html
Connection: close
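
The conversion that idn2 performs above can also be sketched in Python with the built-in "idna" codec. This is only an illustration written in Python 3 syntax, and the variable names are made up for the example. Note that the built-in codec follows the older IDNA 2003 rules while idn2 follows IDNA 2008, so the two can disagree for some names.

```python
# -*- coding: utf-8 -*-
# Sketch: convert an internationalized domain name to its ASCII (Punycode)
# form with Python's built-in "idna" codec.
domain = "عربي.امارات"

# Each dot-separated label is encoded on its own and gets an "xn--" prefix.
ascii_domain = domain.encode("idna").decode("ascii")
print(ascii_domain)

# Decoding the ASCII form recovers the original Unicode name.
round_trip = ascii_domain.encode("ascii").decode("idna")
print(round_trip == domain)
```

The ASCII form is what tools such as nslookup, nmap and curl receive in the commands above.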


Please see this Stack Overflow answer.

Thanks,
Ramin

Change History (11)

comment:1 by Ramin Farajpour Cami, 8 years ago

Component: Uncategorized → Core (URLs)

comment:2 by Ramin Farajpour Cami, 8 years ago

Description: modified (diff)

comment:3 by Claude Paroz, 8 years ago

Resolution: worksforme
Status: new → closed

I cannot reproduce the issue on my system, but I guess the problem may come from the fact that you are passing a bytestring instead of a Unicode string to validate. Try prefixing your URL with the u prefix (needed on Python 2 only).
Validating non-ASCII URLs is tested in Django, see https://github.com/django/django/blob/master/tests/validators/valid_urls.txt
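
To illustrate the bytestring versus Unicode string distinction, here is a minimal sketch in Python 3 syntax. It only shows one way a UnicodeDecodeError like the one in the description can arise: bytes produced in a legacy code page are decoded as UTF-8. The choice of cp1256 is an assumption made for the example; the code page actually used by the reporter's console is not known.

```python
# -*- coding: utf-8 -*-
# Sketch: text versus bytes, and why decoding with the wrong codec fails.
url = "http://عربي.امارات"  # text (what the u prefix gives you on Python 2)

# Assumption for illustration: a console using the cp1256 code page
# would hand Python bytes like these instead of text.
raw = url.encode("cp1256")

try:
    raw.decode("utf-8")
    decoded_ok = True
except UnicodeDecodeError as exc:
    decoded_ok = False
    print("not valid UTF-8:", exc.reason)

# Decoding with the codec that produced the bytes restores the text.
restored = raw.decode("cp1256")
print(restored == url)
```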

in reply to:  3 comment:4 by Ramin Farajpour Cami, 8 years ago

Replying to claudep:

I cannot reproduce the issue on my system, but I guess the problem may come from the fact that you are passing a bytestring instead of a Unicode string to validate. Try prefixing your URL with the u prefix (needed on Python 2 only).
Validating non-ASCII URLs is tested in Django, see https://github.com/django/django/blob/master/tests/validators/valid_urls.txt

>>> validate = URLValidator()
>>> validate(u'http://عربی.امارات')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Python27\lib\site-packages\django\core\validators.py", line 132, in __call__
    super(URLValidator, self).__call__(url)
  File "C:\Python27\lib\site-packages\django\core\validators.py", line 61, in __call__
    raise ValidationError(self.message, code=self.code)
django.core.exceptions.ValidationError
>>>

comment:5 by Ramin Farajpour Cami, 8 years ago

I tested it, and this works:

>>> value = u'http://مثال.إختبار'
>>> validate(value)

But this does not work:

>>> value = u'http://عربی.امارات'
>>> validate(value)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Python27\lib\site-packages\django\core\validators.py", line 118, in __call__
    raise ValidationError(self.message, code=self.code)
django.core.exceptions.ValidationError

Please reopen my ticket. Thanks,

comment:6 by Ramin Farajpour Cami, 8 years ago

Update this message


comment:7 by Ramin Farajpour Cami, 8 years ago

I understand now: the problem only occurs in the Windows 10 command prompt, and I don't know why.

When I test in an IDE it works without any problem.

comment:8 by Marten Kenbeek, 8 years ago

What happens when you do print(u'http://عربی.امارات')? My Windows command prompt replaces invalid Unicode values with ?, so the value that is validated is u'http://????.??????', which is of course an invalid URL.

in reply to:  8 comment:9 by Ramin Farajpour Cami, 8 years ago

Replying to knbk:

What happens when you do print(u'http://عربی.امارات')? My Windows command prompt replaces invalid Unicode values with ?, so the value that is validated is u'http://????.??????', which is of course an invalid URL.

Yes, I see the characters are changed to ?. That is interesting, and it is indeed an invalid URL. Do you know why they are changed to ?

print(u'http://عربی.امارات')

Thanks,


comment:10 by Marten Kenbeek, 8 years ago

Resolution: worksforme → invalid

As far as I know this is because the command prompt doesn't really support Unicode. When encoding unsupported characters, Python has several error modes. The replace mode replaces any character that cannot be encoded with a ? when encoding a string.

This only affects your shell; your Django code should work fine. I'm afraid I don't know any workarounds for this issue, as I'm not very familiar with Unicode issues on Windows.

Either way, this seems to be a bug in how your shell handles unicode, not in Django.
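
The error modes mentioned above can be sketched as follows, in Python 3 syntax. The ASCII codec is used here only as a stand-in for a console encoding that cannot represent Arabic letters; the actual code page of the reporter's command prompt is an unknown, so treat this as an illustration rather than a reproduction.

```python
# -*- coding: utf-8 -*-
# Sketch: how the "strict" and "replace" error modes treat characters
# that the target encoding cannot represent.
url = "http://عربی.امارات"

# The default "strict" mode raises instead of producing output.
try:
    url.encode("ascii")
    strict_failed = False
except UnicodeEncodeError:
    strict_failed = True

# The "replace" mode substitutes ? for each character it cannot encode.
replaced = url.encode("ascii", errors="replace").decode("ascii")
print(replaced)  # http://????.??????
```

The replaced value matches the string quoted in comment 8, which is why the validator rejects it: the host part no longer contains a valid domain.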

comment:11 by Ramin Farajpour Cami, 8 years ago

Thanks. Yes, as I said, Django's validation works with Unicode in the PyCharm IDE. The problem was limited to the Windows shell, which fooled me into opening this ticket; I had tested other Arabic strings without any problem.

I am very sad about this ticket. Opening it was a mistake on my part, caused by the Windows shell.
