#18119 closed New feature (fixed)
add DomainNameValidator to validate Internet Domain Names
| Reported by: | michele | Owned by: | Nina Menezes |
|---|---|---|---|
| Component: | Core (Other) | Version: | dev |
| Severity: | Normal | Keywords: | validators |
| Cc: | Triage Stage: | Ready for checkin | |
| Has patch: | yes | Needs documentation: | no |
| Needs tests: | no | Patch needs improvement: | no |
| Easy pickings: | no | UI/UX: | no |
Description
Internet Domain Names are very present throughout, e.g. in URLs and e-mail addresses.
This patch adds django.core.validators.DomainNameValidator to check Internet Domain Names.
DomainNameValidator supports IDNA (utf) domain names, but it can reject them if optional accept_idna=False is passed to the constructor (defaults to accept).
DomainNameValidator is implemented as follows:
- derives RegexValidator
- uses the same RegEx of URLValidator (with the additional constrain to accept only 127 labels / 255 chars, as per RFC)
- if plain validation fails, converts UTF -> ASCII with IDNA encoding and attempts to validate again (unless accept_idna=False)
Based on this Validator, I will submit separate patches to add DomainNameField support for models.
Attachments (2)
Change History (26)
by , 14 years ago
| Attachment: | domainnamevalidator.txt added |
|---|
comment:1 by , 14 years ago
| Needs documentation: | set |
|---|---|
| Needs tests: | set |
| Patch needs improvement: | set |
| Triage Stage: | Unreviewed → Accepted |
by , 14 years ago
| Attachment: | domainnamevalidator_2.txt added |
|---|
DomainNameValidator patch #2 - remove validator-own attributes before super()
comment:2 by , 14 years ago
In attachment:domainnamevalidator_2.txt :
- fix the constructor argument as per claudep suggestion
- return a specific error message based on IDNA being accepted or not
comment:3 by , 11 years ago
| Needs documentation: | unset |
|---|---|
| Needs tests: | unset |
| Patch needs improvement: | unset |
| Version: | 1.4 → master |
comment:4 by , 11 years ago
| Patch needs improvement: | set |
|---|
This is blocked on #20003. Patch will need to be updated after that change.
comment:7 by , 9 years ago
| Patch needs improvement: | set |
|---|
comment:8 by , 22 months ago
| Owner: | changed from to |
|---|---|
| Status: | new → assigned |
comment:10 by , 20 months ago
| Patch needs improvement: | unset |
|---|
Thank you Nina for working on this! I have unset the patch needs improvement flag so this is listed in the "needs review" list in the dashboard.
comment:11 by , 19 months ago
| Needs documentation: | set |
|---|---|
| Patch needs improvement: | set |
comment:12 by , 19 months ago
| Needs documentation: | unset |
|---|---|
| Patch needs improvement: | unset |
Documentation now updated.
comment:13 by , 18 months ago
| Patch needs improvement: | set |
|---|
comment:14 by , 18 months ago
| Patch needs improvement: | unset |
|---|
comment:15 by , 18 months ago
| Patch needs improvement: | set |
|---|
follow-up: 18 comment:16 by , 18 months ago
Somewhat naive question: are there still use cases to exclude IDNA domain names from such a validator, 12 years after the initial report?
comment:17 by , 18 months ago
| Patch needs improvement: | unset |
|---|
comment:18 by , 18 months ago
Replying to Claude Paroz:
Somewhat naive question: are there still use cases to exclude IDNA domain names from such a validator, 12 years after the initial report?
Good question, perhaps it makes sense to remove the accept_idna argument and not support excluding IDNA domain names for now. Then if someone has a use case this can be raised in a new ticket?
Nina - what do you think?
comment:19 by , 18 months ago
| Patch needs improvement: | set |
|---|
(open question as to whether we should drop accept_idna)
follow-up: 21 comment:20 by , 18 months ago
The reason for adding accept_idna had nothing to do with how "modern" the format was. It's a way to allow the developer to enforce ascii names.
Although python beautifully supports UTF, the developer is often required to handle non-ascii characters specially. For example, when checking their length, or storing them somewhere, or serializing them to somewhere else. The exceeding rarity of IDNA causes many to assume the domain is ASCII, only to run into exception later in production.
If nothing else, the presence of the accept_idna option forces the developer to think whether that's a relevant case for them, and what to do about it.
I wouldn't necessarily consider adding it if the patch was without, but I certainly see value in keeping it now it's already there.
PS: I'm the original submitter. Trac wouldn't send me a reset-password link.
comment:21 by , 18 months ago
| Patch needs improvement: | unset |
|---|---|
| Triage Stage: | Accepted → Ready for checkin |
Replying to michele:
I wouldn't necessarily consider adding it if the patch was without, but I certainly see value in keeping it now it's already there.
PS: I'm the original submitter. Trac wouldn't send me a reset-password link.
Thank you for your input. In which case, I am marking the patch as "Ready for checkin".
I think it is a good idea to centralize domain name validation. I detected at least three points where we could use this validator (django/core/mail/message.py, django/core/validators.py (URLValidator), django/utils/html.py).
For DRY reasons, the domain regex part should be shared with URLValidator. In
__init__, just useself.accept_idna = kwargs.pop('accept_idna', True)