Opened 2 months ago

Last modified 4 weeks ago

#36809 assigned Cleanup/optimization

Allow EmailValidator to customize error messages depending on the part that failed validation

Reported by: Daniel E Onetti Owned by: jaffar Khan
Component: Core (Other) Version: dev
Severity: Normal Keywords: EmailValidator
Cc: Mike Edmunds, jaffar Khan Triage Stage: Accepted
Has patch: yes Needs documentation: yes
Needs tests: yes Patch needs improvement: yes
Easy pickings: no UI/UX: no

Description

Currently, the EmailValidator only provides the full 'value' in the params dictionary when a ValidationError is raised. However, in many use cases, developers need to customize error messages based specifically on the domain part of the email (e.g., "The domain %(domain_part)s is not allowed").

This change adds 'domain_part' to the params dictionary in EmailValidator.call, bringing it in line with how other validators provide decomposed parts of the validated value. This allows for more granular and helpful error messages for end users.

I have already prepared a patch with tests and verified that it passes all style (flake8) and functional checks.

Change History (13)

comment:1 by Kundan Yadav, 2 months ago

Owner: set to Kundan Yadav
Status: newassigned

comment:2 by Kundan Yadav, 2 months ago

hey can i work on it or is it already being worked on by you

in reply to:  description ; comment:3 by Natalia Bidart, 2 months ago

Has patch: set
Keywords: EmailValidator added
Needs documentation: set
Needs tests: set
Owner: changed from Kundan Yadav to Daniel E Onetti
Patch needs improvement: set
Triage Stage: UnreviewedAccepted
Type: UncategorizedCleanup/optimization

Hello Daniel! Thanks for the report, for taking the time to prepare a patch, and for following the contributing guide. I appreciate that.

I agree that the use case is valid and that being able to reference the domain in error messages can be genuinely useful. That said, I am not convinced that simply adding domain_part to ValidationError.params is the best fit for Django core. It solves the immediate problem, but does so in a way that feels quite specific rather than structural. It also risks committing to an API that only partially reflects how EmailValidator actually works.

Looking at the other validators in this module, there is a clear pattern of separating parsing, normalization, and validation into distinct steps or methods, even when the final error surface remains simple. EmailValidator is somewhat inconsistent here: it already decomposes the value into user and domain components, but does so inline inside __call__, with no clear extension points. Adding a single extra param addresses the symptom rather than the structure.

I think a more Django aligned approach would be to re-evaluate EmailValidator as a whole and give it clearer, overridable validation hooks, while keeping full backward compatibility. For example, introducing explicit methods like validate_recipient() and validate_domain(), called from __call__, would allow customization via subclassing without re-parsing the email or wrapping errors. The existing validate_domain_part() could be kept for compatibility and potentially deprecated later, since it currently returns a boolean.

So in short, I agree with the motivation, but I think the solution needs to be broader and more structural than the current proposal in order to align better with Django's existing validator patterns.

in reply to:  2 comment:4 by Natalia Bidart, 2 months ago

Replying to Kundan Yadav:

hey can i work on it or is it already being worked on by you

Hello Kundan, there is an initial PR already for this issue by Daniel, so I have re-assigned this to them,

comment:5 by Natalia Bidart, 2 months ago

Summary: Allow EmailValidator to return 'domain_part' in ValidationError paramsAllow EmailValidator to customize error messages depending on the part that failed validation

comment:6 by Mike Edmunds, 2 months ago

Cc: Mike Edmunds added

Natalia's proposal would also simplify implementation of #27029 EAI address validation. (Or help developers who need that functionality sooner subclass EmailValidator themselves.)

Django already uses "recipient" to mean a complete email address, possibly including a friendly display name (e.g., django.core.mail.EmailMessage.all_recipients()). The official RFC 5322 term for the part before the @ is local-part, but other common terms are user (e.g., existing EmailValidator code), username (e.g., Python's email.headerregistry.Address.username), or mailbox. I'd maybe go with validate_username() and validate_domain() to align with Python's email package.

Version 0, edited 2 months ago by Mike Edmunds (next)

comment:7 by Natalia Bidart, 2 months ago

Thank you Mike for the validation! I appreciate it. Your naming proposal sounds great!

comment:8 by jaffar Khan, 4 weeks ago

Cc: jaffar Khan added

I want to work on this ticket as the current owner seem to no longer active.
I think it is more genuine way to separate checks of username and domain as Natalia proposed. I defined two methods validate_username() and validate_domain() as:

    def validate_username(self, user_part):
        if not self.user_regex.match(user_part):
            raise ValidationError(self.message, code=self.code, params={"value": user_part})

    def validate_domain(self, domain_part):
        if domain_part not in self.domain_allowlist and not self.domain_regex.match(domain_part):
            raise ValidationError(self.message, code=self.code, params={"value": domain_part})

then inside call, I called these two methods and removed the username and domain checks as it looks no longer necessary:

    def __call__(self, value):
        # The maximum length of an email is 320 characters per RFC 3696
        # section 3.
        if not value or "@" not in value or len(value) > 320:
            raise ValidationError(self.message, code=self.code, params={"value": value})

        user_part, domain_part = value.rsplit("@", 1)

        self.validate_username(user_part)
        self.validate_domain(domain_part)

I think it will be a more developer-friendly way. Please let me know whether to create a PR.

comment:9 by jaffar Khan, 4 weeks ago

I checked docs/ref/validators.txt, and there is no description about methods of classes, so I don't think documentation changes are needed.

comment:10 by jaffar Khan, 4 weeks ago

I opened a PR https://github.com/django/django/pull/20616 based on above description.
There are some linter failures, If the modified changes are acceptable then I will polish the PR.

comment:11 by jaffar Khan, 4 weeks ago

Owner: changed from Daniel E Onetti to jaffar Khan

in reply to:  3 comment:12 by Mike Edmunds, 4 weeks ago

Natalia, in attempting to review jaffar Khan's PR, I realized that while I agree with these statements in theory, I'm having trouble putting them into practice:

Replying to Natalia Bidart:

… Looking at the other validators in this module, there is a clear pattern of separating parsing, normalization, and validation into distinct steps or methods, even when the final error surface remains simple. EmailValidator is somewhat inconsistent here: it already decomposes the value into user and domain components, but does so inline inside __call__, with no clear extension points. Adding a single extra param addresses the symptom rather than the structure.

I think a more Django aligned approach would be to re-evaluate EmailValidator as a whole and give it clearer, overridable validation hooks, while keeping full backward compatibility. For example, introducing explicit methods like validate_recipient() and validate_domain(), called from __call__, would allow customization via subclassing without re-parsing the email or wrapping errors. …

I couldn't tell which validators you were referring to that separate parsing, normalization and validation. EmailValidator follows the pattern of the RegexValidators: it has some (overridable) regex and list properties, and it does pretty much everything in its __call__() method. Unlike the RegexValidators, it also has one extension point method: validate_domain_part() allows replacing the domain validation logic (but not domain_allowlist exceptions or ValidationError construction). I believe the intent was to support stricter domain validation via subclassing (e.g., for users who felt DomainNameValidator was too lax or wanted to disallow numeric domain literals).

If a refactored EmailValidator looks something like:

class EmailValidator:
    def __call__(self, value):
        ...  # pre-validation omitted
        username, domain = value.rsplit("@", 1)  # or self.parse_parts(value) ?
        self.validate_username(username, value)
        self.validate_domain(domain, value)        

    def validate_domain(self, domain, value):
        if domain not in self.domain_allowlist and not some_other_logic_on(domain):
            raise ValidationError(self.message, code=self.code, params={"value": value})

… then the original ticket request to have access to the domain in the error message does seem to require wrapping errors (or duplicating logic from the superclass):

class EmailValidatorWithBetterErrorMessages(EmailValidator):
    def validate_domain(self, domain, value):
        try:
            super().validate_domain(domain, value)
        except ValidationError as error:
            # change "code" and add domain to the error params
            raise ValidationError(self.message, code="invalid-domain",
                params={"value": value, "domain": domain}) from error

… and substituting the domain validation logic requires duplicating some of the superclass code:

class EmailValidatorWithCustomDomainValidation(EmailValidator):
    def validate_domain(self, domain, value):
        if domain in self.domain_allowlist:
            return  # duplicate allowlist logic from superclass
        if not idna.utils.is_valid_domain(domain):
            raise ValidationError(self.message, code=self.code, params={"value": value})

… which makes me think I've misunderstood what you were envisioning in reworking EmailValidator to improve subclassing hooks.

comment:13 by jaffar Khan, 4 weeks ago

Thanks Mike for the review. I think defining subclass is not a good option which makes duplication of code as you describe. Defining explicit methods like I did will makes the code more readable and will avoid duplication.
In my current PR I made some changes like the methods validate_domain() and validate_username() will return boolean so we need to wrap the methods inside call. Now I am confused about do I have to make these two methods to raise ValidationError then no need for wrapping inside call or left as it is?

Last edited 4 weeks ago by jaffar Khan (previous) (diff)
Note: See TracTickets for help on using tickets.
Back to Top