Opened 13 years ago

Closed 7 months ago

Last modified 9 days ago

#18119 closed New feature (fixed)

add DomainNameValidator to validate Internet Domain Names

Reported by: michele Owned by: Nina Menezes
Component: Core (Other) Version: dev
Severity: Normal Keywords: validators
Cc: Triage Stage: Ready for checkin
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

Internet Domain Names are very present throughout, e.g. in URLs and e-mail addresses.

This patch adds django.core.validators.DomainNameValidator to check Internet Domain Names.

DomainNameValidator supports IDNA (utf) domain names, but it can reject them if optional accept_idna=False is passed to the constructor (defaults to accept).

DomainNameValidator is implemented as follows:

  • derives RegexValidator
  • uses the same RegEx of URLValidator (with the additional constrain to accept only 127 labels / 255 chars, as per RFC)
  • if plain validation fails, converts UTF -> ASCII with IDNA encoding and attempts to validate again (unless accept_idna=False)

Based on this Validator, I will submit separate patches to add DomainNameField support for models.

Attachments (2)

domainnamevalidator.txt (1.6 KB ) - added by michele 13 years ago.
domainnamevalidator_2.txt (1.7 KB ) - added by michele 13 years ago.
DomainNameValidator patch #2 - remove validator-own attributes before super()

Download all attachments as: .zip

Change History (26)

by michele, 13 years ago

Attachment: domainnamevalidator.txt added

comment:1 by Claude Paroz, 13 years ago

Needs documentation: set
Needs tests: set
Patch needs improvement: set
Triage Stage: UnreviewedAccepted

I think it is a good idea to centralize domain name validation. I detected at least three points where we could use this validator (django/core/mail/message.py, django/core/validators.py (URLValidator), django/utils/html.py).

For DRY reasons, the domain regex part should be shared with URLValidator. In __init__, just use self.accept_idna = kwargs.pop('accept_idna', True)

by michele, 13 years ago

Attachment: domainnamevalidator_2.txt added

DomainNameValidator patch #2 - remove validator-own attributes before super()

comment:2 by michele, 13 years ago

In attachment:domainnamevalidator_2.txt :

  • fix the constructor argument as per claudep suggestion
  • return a specific error message based on IDNA being accepted or not

comment:3 by Berker Peksag, 10 years ago

Needs documentation: unset
Needs tests: unset
Patch needs improvement: unset
Version: 1.4master

comment:4 by Tim Graham, 10 years ago

Patch needs improvement: set

This is blocked on #20003. Patch will need to be updated after that change.

comment:5 by Thomas Güttler, 9 years ago

would be nice to have

comment:6 by Berker Peksag, 8 years ago

Patch needs improvement: unset

Updated PR.

comment:7 by Tim Graham, 8 years ago

Patch needs improvement: set

comment:8 by Nina Menezes, 11 months ago

Owner: changed from nobody to Nina Menezes
Status: newassigned

comment:10 by Natalia Bidart, 9 months ago

Patch needs improvement: unset

Thank you Nina for working on this! I have unset the patch needs improvement flag so this is listed in the "needs review" list in the dashboard.

comment:11 by Sarah Boyce, 9 months ago

Needs documentation: set
Patch needs improvement: set

comment:12 by Nina Menezes, 8 months ago

Needs documentation: unset
Patch needs improvement: unset

Documentation now updated.

comment:13 by Sarah Boyce, 8 months ago

Patch needs improvement: set

comment:14 by Nina Menezes, 7 months ago

Patch needs improvement: unset

comment:15 by Sarah Boyce, 7 months ago

Patch needs improvement: set

comment:16 by Claude Paroz, 7 months ago

Somewhat naive question: are there still use cases to exclude IDNA domain names from such a validator, 12 years after the initial report?

comment:17 by Nina Menezes, 7 months ago

Patch needs improvement: unset

in reply to:  16 comment:18 by Sarah Boyce, 7 months ago

Replying to Claude Paroz:

Somewhat naive question: are there still use cases to exclude IDNA domain names from such a validator, 12 years after the initial report?

Good question, perhaps it makes sense to remove the accept_idna argument and not support excluding IDNA domain names for now. Then if someone has a use case this can be raised in a new ticket?
Nina - what do you think?

comment:19 by Sarah Boyce, 7 months ago

Patch needs improvement: set

(open question as to whether we should drop accept_idna)

comment:20 by michele, 7 months ago

The reason for adding accept_idna had nothing to do with how "modern" the format was. It's a way to allow the developer to enforce ascii names.

Although python beautifully supports UTF, the developer is often required to handle non-ascii characters specially. For example, when checking their length, or storing them somewhere, or serializing them to somewhere else. The exceeding rarity of IDNA causes many to assume the domain is ASCII, only to run into exception later in production.

If nothing else, the presence of the accept_idna option forces the developer to think whether that's a relevant case for them, and what to do about it.

I wouldn't necessarily consider adding it if the patch was without, but I certainly see value in keeping it now it's already there.

PS: I'm the original submitter. Trac wouldn't send me a reset-password link.

Last edited 7 months ago by michele (previous) (diff)

in reply to:  20 comment:21 by Sarah Boyce, 7 months ago

Patch needs improvement: unset
Triage Stage: AcceptedReady for checkin

Replying to michele:

I wouldn't necessarily consider adding it if the patch was without, but I certainly see value in keeping it now it's already there.

PS: I'm the original submitter. Trac wouldn't send me a reset-password link.

Thank you for your input. In which case, I am marking the patch as "Ready for checkin".

comment:22 by Claude Paroz, 7 months ago

OK, let's go for it, at least the parameter default is True.

comment:23 by Sarah Boyce <42296566+sarahboyce@…>, 7 months ago

Resolution: fixed
Status: assignedclosed

In 4971a9a:

Fixed #18119 -- Added a DomainNameValidator validator.

Thanks Claude Paroz for the review.

Co-authored-by: Nina Menezes <77671865+nmenezes0@…>

comment:24 by Sarah Boyce <42296566+sarahboyce@…>, 9 days ago

In 5405912:

Fixed #36007 -- Removed dead code from URLValidator.

The "Trivial case failed. Try for possible IDN domain" handling was
obsoleted by ticket-20003, which adjusted the regular expressions to
allow all international domain names (Refs #20003).

Uses of ul were moved to DomainNameValidator in ticket-18119
(Refs #18119).

Note: See TracTickets for help on using tickets.
Back to Top