Opened 9 years ago

Last modified 7 years ago

#25418 new New feature

URL Validator to check only hostname part without domain nor tld

Reported by: FoxMaSk
Owned by: nobody
Component: Core (Other)
Version: dev
Severity: Normal
Keywords:
Cc:
Triage Stage: Accepted
Has patch: no
Needs documentation: no
Needs tests: no
Patch needs improvement: no
Easy pickings: no
UI/UX: no

Description (last modified by Tim Graham)

Hi,
On a LAN, we don't use the full host + domain + TLD form, so it would be good if we could change this

    host_re = '(' + hostname_re + domain_re + tld_re + '|localhost)'

to

    host_re = '(' + hostname_re + domain_re + tld_re + '|' + hostname_re + '|localhost)'
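
To make the effect of the extra alternative concrete, here is a minimal sketch. The three fragment patterns are simplified, ASCII-only stand-ins (assumptions made for illustration, not Django's actual definitions), and `matches_host` is a hypothetical helper.

```python
import re

# Simplified, ASCII-only stand-ins for the fragments named above.
# These are assumptions for illustration, not Django's actual patterns.
hostname_re = r"[a-z0-9](?:[a-z0-9-]{0,61}[a-z0-9])?"
domain_re = r"(?:\.(?!-)[a-z0-9-]{1,63}(?<!-))*"
tld_re = r"\.(?!-)[a-z-]{2,63}(?<!-)\.?"

# Current form: a host needs at least a TLD, unless it is exactly "localhost".
current_host_re = "(" + hostname_re + domain_re + tld_re + "|localhost)"

# Proposed form: a bare hostname is accepted as an additional alternative.
proposed_host_re = (
    "(" + hostname_re + domain_re + tld_re + "|" + hostname_re + "|localhost)"
)


def matches_host(host_re, host):
    """Return True if the whole host string matches the given pattern."""
    return re.fullmatch(host_re, host, re.IGNORECASE) is not None
```

With these stand-ins, `matches_host(current_host_re, "intranet")` is false while `matches_host(proposed_host_re, "intranet")` is true; `example.com` and `localhost` match both patterns.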

regards

Change History (9)

comment:1 by Moritz Sichert, 9 years ago

Component: Core (URLs) → Core (Other)
Triage Stage: Unreviewed → Accepted
Type: Uncategorized → Bug
Version: 1.8 → master

URLs like http://host definitely should be considered valid.
Also, is there really a need to differentiate between tld, domain and hostname? As far as DNS is concerned they don't really mean different things.

comment:2 by Tim Graham, 9 years ago

Description: modified (diff)

I believe the idea of URLValidator is to recognize URLs that usually work without some special DNS setup. I feel like this proposal has come up before and been rejected. If so, we should document the restriction (and how to lift it in your own validation) to try to prevent it from being proposed again and again.
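
As one possible shape for "lifting it in your own validation", the sketch below is a standalone check built only on the standard library. It is not part of Django's API: `is_short_host_url` and `LABEL_RE` are hypothetical names, and the rule it applies (http/https scheme, host is a single DNS label) is an assumption about what a LAN deployment might want. A project could accept a value when either its usual URL validator or this check passes.

```python
import re
from urllib.parse import urlsplit

# One DNS label: starts and ends with a letter or digit, at most 63 characters.
LABEL_RE = re.compile(r"[a-z0-9](?:[a-z0-9-]{0,61}[a-z0-9])?", re.IGNORECASE)


def is_short_host_url(value, schemes=("http", "https")):
    """Return True if value is a URL whose host is a single DNS label."""
    try:
        parts = urlsplit(value)
        host = parts.hostname
    except ValueError:
        # urlsplit rejects some malformed inputs, e.g. a broken IPv6 literal.
        return False
    if parts.scheme not in schemes or host is None:
        return False
    return LABEL_RE.fullmatch(host) is not None
```

Here `is_short_host_url("http://host")` and `is_short_host_url("http://host:8080/path")` pass, while a dotted host such as `http://example.com` is deliberately left to the regular validator.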

comment:3 by Claude Paroz, 9 years ago

Type: Bug → New feature

In 4e2e8f39d19d79a59c2696b2c40cb619a54fa745, we added some flexibility to add whitelisted hostnames for EmailValidator. It might make sense to add the same for URLValidator, but it would probably require restructuring that validator (again!).

comment:4 by FoxMaSk, 9 years ago

I'm sorry to bring this topic up again; I didn't know about its history. If it's a bother, feel free to close the ticket.
Regards

comment:5 by Claude Paroz, 9 years ago

Nothing's boring. If you want to give it a shot and try a patch that matches the domain_whitelist behavior of EmailValidator, we'll happily review it. If this fulfills your use case, of course.
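
For discussion, a whitelist-style check could look roughly like the sketch below. It is a standalone illustration, not a patch against URLValidator: `HostChecker` and `DOTTED_HOST_RE` are hypothetical names, and the dotted-host pattern is a simplified ASCII-only assumption standing in for the existing host rules.

```python
import re

# Hypothetical stand-in for the existing rule: a host with at least one dot.
_LABEL = r"[a-z0-9](?:[a-z0-9-]{0,61}[a-z0-9])?"
DOTTED_HOST_RE = re.compile(_LABEL + r"(?:\." + _LABEL + r")+", re.IGNORECASE)


class HostChecker:
    """Accept dotted hosts, plus any short host listed in the whitelist."""

    def __init__(self, whitelist=("localhost",)):
        # Compare case-insensitively, since hostnames are not case-sensitive.
        self.whitelist = frozenset(h.lower() for h in whitelist)

    def __call__(self, host):
        host = host.lower()
        if host in self.whitelist:
            return True
        return DOTTED_HOST_RE.fullmatch(host) is not None
```

With the default whitelist, `HostChecker()("intranet")` is rejected; passing `whitelist=["localhost", "intranet"]` lets that one short name through while dotted hosts keep working as before.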

comment:6 by Josh Schneier, 9 years ago

I just ran into this as well. There is also no whitelist of hostnames that I can hardcode (so that solution is out). Unfortunately the data is entered via a CMS which is using URLValidator internally and leaving users quite frustrated.

Is changing host_re to

    host_re = '(' + hostname_re + domain_re + tld_re + '|' + hostname_re + '|localhost)'

as suggested by the original reporter of this issue out of the question? That seems like the simplest way to fix this, and it would bring Django more in line with the RFC that lays out URLs.

comment:7 by Tim Graham, 9 years ago

I believe that conflicts with what I mentioned in comment 2.

comment:8 by Chris Withers, 7 years ago

Tim, the fact that you've had a duplicate from me suggests that the regex is wrong. https://tools.ietf.org/html/rfc3986#section-3.2.2 does not specify the need for either a domain or a TLD.
That's the technicality; the practicality is that most intranet URLs are left in short form, without a domain or TLD. These are not a "special" DNS setup, and I can't see a reason to simplify the regex.

Please can you provide more justification for your position?

comment:9 by Tim Graham, 7 years ago

There's a past discussion about it in #9202. That being said, we might instead drastically simplify the regex in favor of HTML5 validation as suggested on django-developers.
