Opened 6 years ago

Last modified 3 years ago

#11716 assigned Bug

Various methods in django.db.models.fields don't wrap ValueErrors and allow them to escape

Reported by: Leo Owned by: Leo
Component: Database layer (models, ORM) Version: master
Severity: Normal Keywords:
Cc: danfairs Triage Stage: Accepted
Has patch: yes Needs documentation: yes
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

Passing in a '---' as the value for a ModelChoiceField causes an exception down in query.py that filters up through the form.

Here is specifically what I'm doing:

class Bar(Model):
    name = TextField()

class Foo(Model):
    bar = ForeignKey(Bar, blank=True, null=True)

class FooForm(ModelForm):
    def __init__(self, *args, **kwargs):
        super(FooForm, self).__init__(*args, **kwargs)

        bars = list(Bar.objects.order_by('name'))
        self.fields['bar'].choices = [('---','---')] + [(b.id, b.name) for b in bars]

If someone submits the form without selecting a Bar in the dropdown, the view does the standard is_valid() call and blows up.

But even if setting the field up this way that isn't a good idea (which it may not be), a client can submit whatever they want to the view that processes the form so this is a situation where submitting an invalid value creates a 500 rather than being caught in the field and reported as a form error.

Here is the full traceback

  File "C:\ws\fuzzy\EvoworxSite\src\testapp\views.py", line 13, in do_test
    if form.is_valid():

  File "C:\Python25\lib\site-packages\django\forms\forms.py", line 120, in is_valid
    return self.is_bound and not bool(self.errors)

  File "C:\Python25\lib\site-packages\django\forms\forms.py", line 111, in _get_errors
    self.full_clean()

  File "C:\Python25\lib\site-packages\django\forms\forms.py", line 240, in full_clean
    value = field.clean(value)

  File "C:\Python25\lib\site-packages\django\forms\models.py", line 993, in clean
    value = self.queryset.get(**{key: value})

  File "C:\Python25\lib\site-packages\django\db\models\query.py", line 299, in get
    clone = self.filter(*args, **kwargs)

  File "C:\Python25\lib\site-packages\django\db\models\query.py", line 498, in filter
    return self._filter_or_exclude(False, *args, **kwargs)

  File "C:\Python25\lib\site-packages\django\db\models\query.py", line 516, in _filter_or_exclude
    clone.query.add_q(Q(*args, **kwargs))

  File "C:\Python25\lib\site-packages\django\db\models\sql\query.py", line 1675, in add_q
    can_reuse=used_aliases)

  File "C:\Python25\lib\site-packages\django\db\models\sql\query.py", line 1614, in add_filter
    connector)

  File "C:\Python25\lib\site-packages\django\db\models\sql\where.py", line 56, in add
    obj, params = obj.process(lookup_type, value)

  File "C:\Python25\lib\site-packages\django\db\models\sql\where.py", line 269, in process
    params = self.field.get_db_prep_lookup(lookup_type, value)

  File "C:\Python25\lib\site-packages\django\db\models\fields\__init__.py", line 210, in get_db_prep_lookup
    return [self.get_db_prep_value(value)]

  File "C:\Python25\lib\site-packages\django\db\models\fields\__init__.py", line 361, in get_db_prep_value
    return int(value)

ValueError: invalid literal for int() with base 10: '---'

Digging into the code it looks like django.db.models.fields.AutoField.get_db_prep_value should look something like this:

    def get_db_prep_value(self, value):
        if value is None:
            return value
        try:
            return int(value)
        except (TypeError, ValueError):
            raise exceptions.ValidationError(
                _("This value must be an integer."))

Which is just like the to_python() method above it.

Further, it looks like IntegerField and several others in that file have the same problem and let the ValueErrors escape.

Attachments (4)

11716.patch (2.9 KB) - added by Leo 5 years ago.
Patch against r11526
11716-2.patch (8.6 KB) - added by Leo 5 years ago.
updated patch against r12623
failure_case.patch (2.4 KB) - added by Leo 5 years ago.
Example failure case from not catching ValidationError
11716.diff (12.9 KB) - added by timo 4 years ago.
updated patch to trunk + tests passing

Download all attachments as: .zip

Change History (40)

comment:1 Changed 6 years ago by Leo

  • Needs documentation unset
  • Needs tests unset
  • Patch needs improvement unset
  • Summary changed from ModelChoiceField blows up on non-integer input to Various methods in django.db.models.fields don't wrap ValueErrors and allow them to escape

comment:2 Changed 6 years ago by Leo

  • Has patch set
  • Needs tests set

I've attached a patch, and I'm happy to add some tests but I can't figure out where they would go. The similar behavior for DecimalField has no test that I could find.

Changed 5 years ago by Leo

Patch against r11526

comment:3 Changed 5 years ago by Leo

  • Needs tests unset
  • Owner changed from nobody to Leo
  • Status changed from new to assigned

Thanks to mattmcc for pointing me in the right direction, I've attached a patch with the tests.

comment:4 Changed 5 years ago by adamnelson

Would this also more gracefully handle an invalid value like 'f' when an int is expected?

Model.objects.get(id='f')

comment:5 Changed 5 years ago by Leo

int('f') and int('--') result in the same exception being thrown, which this patch would then convert into a ValidationError which gets turned into a normal form error.

comment:6 Changed 5 years ago by Leo

Ah, I see what you're asking now. Doing get() takes you through the same code path. This patch changes the behavior to be consistent with how it would be if you tried to do get(some_boolean_field='foo') and the documentation is silent on what the proper response should be if the argument is incorrect.

ValidationError comes from django.core.exceptions so it seems like its fair game to be used here (and is used for the other fields). If one of the committers feels its needed, I can try to add the behavior to the docs in the appropriate places.

comment:7 Changed 5 years ago by Leo

  • Version changed from 1.1 to SVN

comment:8 Changed 5 years ago by russellm

  • milestone set to 1.2
  • Patch needs improvement set
  • Triage Stage changed from Unreviewed to Accepted

I suspect the patch needs to be updated in the light of the changes for multidb.

comment:9 Changed 5 years ago by Leo

Thanks for accepting it Russell. I suspect you're right, I'll take a look.

It would have been fabulous to have had the patch merged in to trunk a little bit earlier in the cycle to prevent that though.

Changed 5 years ago by Leo

updated patch against r12623

comment:10 Changed 5 years ago by Leo

  • Patch needs improvement unset

Ok, I've updated the patch. There were a few more details involved here:

  • several functions in BooleanField would accept a value of None silently, that's no longer the case - this is backed up by the thinking in #5563
  • IntegerField's default error message had a typo in it - 'invalid': _("This value must be a float.") - it's been changed to read correctly - This value must be an integer.

Please let me know if there are any more issues that need to be resolved. I'd really love to get this into 1.2.

comment:11 Changed 5 years ago by kmtracey

A newly-added test fails with the current patch:

======================================================================
ERROR: testCommentDoneReSubmitWithInvalidParams (regressiontests.comment_tests.tests.comment_view_tests.CommentViewTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/kmt/tmp/django/trunk/tests/regressiontests/comment_tests/tests/comment_view_tests.py", line 235, in testCommentDoneReSubmitWithInvalidParams
    response = self.client.get(broken_location)
  File "/home/kmt/tmp/django/trunk/django/test/client.py", line 286, in get
    response = self.request(**r)
  File "/home/kmt/tmp/django/trunk/django/core/handlers/base.py", line 101, in get_response
    response = callback(request, *callback_args, **callback_kwargs)
  File "/home/kmt/tmp/django/trunk/django/contrib/comments/views/utils.py", line 41, in confirmed
    comment = comments.get_model().objects.get(pk=request.GET['c'])
  File "/home/kmt/tmp/django/trunk/django/db/models/manager.py", line 132, in get
    return self.get_query_set().get(*args, **kwargs)
  File "/home/kmt/tmp/django/trunk/django/db/models/query.py", line 331, in get
    clone = self.filter(*args, **kwargs)
  File "/home/kmt/tmp/django/trunk/django/db/models/query.py", line 545, in filter
    return self._filter_or_exclude(False, *args, **kwargs)
  File "/home/kmt/tmp/django/trunk/django/db/models/query.py", line 563, in _filter_or_exclude
    clone.query.add_q(Q(*args, **kwargs))
  File "/home/kmt/tmp/django/trunk/django/db/models/sql/query.py", line 1100, in add_q
    can_reuse=used_aliases)
  File "/home/kmt/tmp/django/trunk/django/db/models/sql/query.py", line 1040, in add_filter
    connector)
  File "/home/kmt/tmp/django/trunk/django/db/models/sql/where.py", line 66, in add
    value = obj.prepare(lookup_type, value)
  File "/home/kmt/tmp/django/trunk/django/db/models/sql/where.py", line 275, in prepare
    return self.field.get_prep_lookup(lookup_type, value)
  File "/home/kmt/tmp/django/trunk/django/db/models/fields/__init__.py", line 318, in get_prep_lookup
    return self.get_prep_value(value)
  File "/home/kmt/tmp/django/trunk/django/db/models/fields/__init__.py", line 498, in get_prep_value
    return self.to_python(value)
  File "/home/kmt/tmp/django/trunk/django/db/models/fields/__init__.py", line 492, in to_python
    raise exceptions.ValidationError(self.error_messages['invalid'])
ValidationError: [u'This value must be an integer.']

----------------------------------------------------------------------
Ran 82 tests in 13.539s

FAILED (errors=1)

I kind of thought it was going to and nearly didn't add it, pending some resolution between that ticket (#12151) and this one, but on further reflection I think the problem is here. If there is existing code that handles errors in submitted data by catching ValueError raised on a get, that's going to break if the same error condition now raises ValidationError instead.

I agree that it's ugly that bad data can cause a plethora of different exceptions to be raised -- but switching to one and only one, if that is not one that was previously being raised, seems bckwards-incompatible. Am I missing something here?

comment:12 Changed 5 years ago by Leo

I agree that this is a behavior change (and in principle backwards incompatible) but virtually any bugfix is going to be.

This is my reasoning as to why this bugfix is worth the behavior change.

IMO, the issue here isn't that bad data can cause a plethora of different exceptions. IMO the issue is that the API for a Field's behavior is to throw ValidationErrors and not allow other errors to percolate up through the call stack. Most of the other fields already do this properly and these few do not.

The actual behavior in the current codebase is inconsistent with the expected API and therefore broken which forced users (and django itself - see the fix to django/forms/models.py in the patch) to have to code defensively. Yes, the behavior does change, however the rest of the stack is already written with the expectation of getting a ValidationError from fields when they're cleaned or when invalid input is given to them.

In the test that was added in r12681, the catch statement should be catching a ValidationError rather than a ValueError. In that specific test if for some reason request.GET['c'] returned something of a type that AutoField couldn't handle, you'd have to check for TypeError as well. Those exceptions are implementation details of AutoField and should be encapsulated by AutoField. For that specific example, if pk could ever refer to something other than an IntegerField (its not entirely crazy that a primary key might not be an integer) then you would have to catch any exceptions that field threw as well.

comment:13 Changed 5 years ago by kmtracey

But we can't introduce a change where previous application code that worked will now raise an exception. If yesterday code had to catch ValueError (and TypeError, and whatever else if it was to be 100% robust) in order to properly recover from being handed bad data, we cannot now start raising a different exception and make previously robust code start throwing exceptions.

comment:14 Changed 5 years ago by Leo

I'm not sure if you're stating that as a policy or as an argument for why this fix shouldn't be taken. I don't have sufficient knowledge to argue against it as a Django policy (although if that's the case, I will try and figure out the best way to do so). I can, however, make a strong argument for making this change on its own merits.

I agree with the principle of keeping the API consistent for a given release but this is a case where the implementation of the API was broken. If the point that you're making is that any code that relied on this broken behavior will be broken once this bug is fixed, well, yes. Any bugfix changes behavior and IMO a bugfix that causes an explicit loud exception to get thrown is much better than one that may have not immediately obvious side effects. As a random example of a bug that I've worked on, take #5605 - someone relying on the 'incorrect' lowercasing behavior and using a case sensitive DB backend may now have a very subtle bug in their code when they upgrade to 1.2. One could easily find such cases for pretty much any bug.

The consequences of not fixing this bug are fairly significant as described in the example above and require client code to look significantly uglier as well as become tied to the actual implementation of any specific field. I just double checked the docs and primary_key is just a method on Field so it can be applied to ANY field. Which in turn means that if you abandon the API of requiring the Field to encapsulate its internal working in a ValidationError, the catch statement in the comments code has to catch pretty much any exception that could possibly come out of the field. That should clearly not be the responsibility of that client code.

comment:15 follow-up: Changed 5 years ago by kmtracey

I'm saying it's policy not to introduce backwards-incompatible changes in a micro or minor release update, and therefore I have concerns about this patch, because it seems to me it may likely cause previously robust application code to start raising exceptions.

You say the consequences of not fixing this bug are significant. What I'm saying is we cannot now break code in the wild that has already been written to accommodate those consequences. Code that likely looks similar to the comments code:

def confirmed(request):
    comment = None
    if 'c' in request.GET:
        try:
            comment = comments.get_model().objects.get(pk=request.GET['c'])
        except (ObjectDoesNotExist, ValueError):
            pass
    return render_to_response(template,
        {'comment': comment},
        context_instance=RequestContext(request)
    )

This code catches ValueError there because that is the exception that has in fact been raised in observed cases where bad data is present in request.GET['c']. It doesn't catch ValidationError because that has not been an exception observed to be raised.

Now, it may well be possible to cause a ValidationError here, since a custom comments model may have a custom pk field of some non-standard type, a type that raises ValidationError on being handed invalid data. But for a typical auto-generated pk field, I cannot find any case where a querystring parameter value (which will be a unicode string) containing 'bad' data generates anything other than a ValueError. (Based on prior experience I thought TypeError was another possibility but I can't even get that one to be raised.)

So my concern is that there may be Django users with apps with code written much like this, and I do not think they will be pleased if we now change what gets raised there from a ValueError to a ValidationError. That change will cause their code, which previously gracefully handled being handed bad data values, to now start returning server error pages. I don't think they will see that change as an improvement.

comment:16 in reply to: ↑ 15 Changed 5 years ago by Leo

Replying to kmtracey:

I'm saying it's policy not to introduce backwards-incompatible changes in a micro or minor release update, and therefore I have concerns about this patch, because it seems to me it may likely cause previously robust application code to start raising exceptions.

I agree that it will cause some applications to start raising exceptions. I think your concerns are vary valid and critical to consider for a project like Django. I believe that its a better choice for Django to make this change in this release and take the impact of introducing this behavior change especially since non-trivial parts of the DB architecture have been worked on for this release (including the very functions this patch changes).

You say the consequences of not fixing this bug are significant. What I'm saying is we cannot now break code in the wild that has already been written to accommodate those consequences.

Sure, but that's true for virtually any bugfix so this isn't an absolute. This is a discussion with a range of impacts of either making the change or not making it.

This code catches ValueError there because that is the exception that has in fact been raised in observed cases where bad data is present in request.GET['c'].
It doesn't catch ValidationError because that has not been an exception observed to be raised.

In my experience the approach of seeing what exceptions I can observe has not been a great way figure out what exceptions to catch in the code that I write. I've found that fairly often I either overlook a potential failure case or do not fully understand (or sometimes don't even have access to) how the code I'm working with behaves. This has resulted in me preferring to think about interfaces and how the code should behave and defining those interfaces in one form or another.

Now, it may well be possible to cause a ValidationError here, since a custom comments model may have a custom pk field of some non-standard type, a type that raises ValidationError on being handed invalid data. But for a typical auto-generated pk field, I cannot find any case where a querystring parameter value (which will be a unicode string) containing 'bad' data generates anything other than a ValueError. (Based on prior experience I thought TypeError was another possibility but I can't even get that one to be raised.)

It doesn't need to be a very non-standard field. A DecimalField will cause the problem. It's a slightly larger test case to create a custom Comments model but I've attached a very simple one that breaks another part of the comments application that also doesn't catch ValidationError properly.

So my concern is that there may be Django users with apps with code written much like this, and I do not think they will be pleased if we now change what gets raised there from a ValueError to a ValidationError. That change will cause their code, which previously gracefully handled being handed bad data values, to now start returning server error pages. I don't think they will see that change as an improvement.

I think you're right. Very few developers are happy when their code doesn't work as they expect. In my experience the best way to prevent that is to have interfaces on commonly used components. If this change is properly messaged in the release notes I believe that it will make Django users happy when they can replace rather ugly and patched code with much cleaner code that represents a contract and gives them more assurance that their app won't break.

I respect your concerns about this patch and I think we're both striving to make Django better but I suspect we may not come to agreement on this issue so I'm wondering what the right next step is given Django's process. Should this be brought up for discussion on django-dev? is there a vote by the core committees? does the release manager decide this?

Changed 5 years ago by Leo

Example failure case from not catching ValidationError

comment:17 Changed 5 years ago by adamnelson

@kmtracey While it may in theory be true that milestone:1.2 is 'minor', that's not true in practice. Python 2.3 support is being dropped as well as all sorts of major backwards incompatible changes. Anybody who wrote a workaround for this behavior can stick with milestone:1.1 for an extended period of time while updating their codebase. Well maintained apps can move to this format in short order without losing milestone:1.0 compatibility. I would understand not making a backwards incompatible change in milestone:1.1.1 , but for something as big as milestone:1.2 with so many other major incompatibilities, we're wasting time.

In addition, russelm accepted the ticket and assigned it to milestone:1.2 . That should be sufficient to get the ticket merged.

comment:18 Changed 5 years ago by adamnelson

Maybe the compromise is to revert the ValidationError to ValueError change and leave everything else as is? ValidationError isn't in the standard library, so it's not the craziest thing to just stick with ValueError for milestone:1.2 and make a new ticket to rationalize exceptions in Django.

comment:19 Changed 5 years ago by kmtracey

In my view it is not a waste of time to ensure we don't blindside users with backwards-incompatible changes. Note we do attempt to document all backwards incompatible changes, so that people who might run into them may have a clue, if they read the release notes, that they could run into them when updating. There are no such notes included with this patch, nor was there even mention of the possibility of this introducing a backwards-incompatibility, until I brought it up. If this is to be done for 1.2, it needs documentation of the backwards incompatibility introduced.

(Note claiming it's perfectly OK to add more backwards-incompatible changes into 1.2 because it already has so many is fallacious: each one is considered on its own merits. If anything I'd argue that the reasons needs to be even stronger as the list gets longer, not the other way around.)

Alternatively, if there is a way to accomplish the bulk of what this ticket was aiming for without introducing a backwards-incompatibility, that would be even better, in my opinion. Hopefully I've clarified above what I mean by backwards-incompatible. (It is not true that every behavior change introduced by a bugfix introduces a backwards-incompatibility.)

As for process, what's needed is consensus among the core devs. What we've got now is one (me) who has expressed concerns and none who have come forward to say my concerns are unfounded or overblown. While Russ accepted the ticket, I do not know if he was aware of the backwards-incompatibility introduced and did not think it enough of a problem to worry about, or if he did not notice it.

I won't stand in the way if others on the core think this is a change worth making, even in light of the problems it may cause for existing code. I'll just strongly urge that it at least needs documentation of the backwards incompatibility.

comment:20 Changed 5 years ago by adamnelson

Is all that's being asked is whether to change ValidationError back to ValueError ? If so that seems like the way to go.

comment:21 Changed 5 years ago by Leo

@adamnelson Thanks for weighing in, however making the code not throw a ValueError is the heart of the patch. All that does is wrap the TypeErrors but doesn't fix the implementation from an API perspective.

@kmtracey, I strongly agree that this behavior change needs documentation. I left it out of the original patch since I strongly suspect that what I write will need to be re-written by a core dev anyways to fit style and consistency guidelines. If that's not the case, I'm happy to take a crack at it.

comment:22 Changed 5 years ago by kmtracey

Replying to adamnelson:

Is all that's being asked is whether to change ValidationError back to ValueError ? If so that seems like the way to go.

I don't know what you are proposing there exactly.

I'd object to changing fields (such as DecimalField) that currently raise ValidationError for bad data to instead raise ValueError, for the same reason as I've raised concerns about the reverse change: it would break previously working code.

If you're not proposing that but rather proposing that these particular fields that currently unfortunately raise ValueError instead of the ValidationError that would be more proper, continue to raise ValueError while other fields continue to raise ValidationError, then I'm not sure what the change buys us, exactly. I thought one of the aims was to bring consistency to the API across the field types?

Having ValidationError inherit from ValueError might be a possibility, but I haven't had time to think through whether that introduces any of its own problems.

comment:23 Changed 5 years ago by russellm

  • milestone changed from 1.2 to 1.3

Not critical for 1.3

comment:24 Changed 5 years ago by Leo

I understand that this ticket isn't crucial to the 1.2 release (even though it fixes functionality that was changed in 1.2) but it would be really fantastic if it got addressed/landed early in the 1.3 cycle. It would be incredibly frustrating to do a third version of this patch if there were significant DB layer changes as part of 1.3.

comment:25 Changed 4 years ago by sirex

Same here. Instead of getting normal form validation error, I getting unhandled exception, what leads to 500 error.

Why this error is not already fixed?

Changed 4 years ago by timo

updated patch to trunk + tests passing

comment:26 Changed 4 years ago by timo

This has always bugged me too. When submitting non-integer data to a ModelChoiceField having a server error instead of a validation error isn't nice. I updated the existing patch to apply cleanly and get the test suite passing with the hopes that this might get some consideration for 1.3.

comment:27 Changed 4 years ago by Leo

timo, thanks for cleaning up the patch! The path in your diff, however, seems incorrect, all of the diffs start with 'a/django/' instead of 'django/'

I'm continually trying to get a core dev's attention to this ticket/bug/issue.

comment:28 Changed 4 years ago by lukeplant

  • Needs documentation set

The patch is fine regarding the format ('a/django' is just an artifact of using git/hg to create the diff, and can be handled easily using 'patch -p1').

The patch lacks documentation of the backwards incompatibilities, as already noted by Karen. Every circumstance where previously working code might fail needs to be documented in the 1.3 release notes - and there seem to be quite a few, given the number that cropped up just in the test suite. Even if this documentation is re-written, it is always useful to have a starting point.

comment:29 Changed 4 years ago by kmtracey

Is there no way to do this is a backwards-compatible way, such that code which currently does not raise exceptions when handed bad values continues to not raise exceptions when handed the same bad values? Even with documentation this is the kind of change that is hard to easily "upgrade to" and be sure that you have actually covered all the places where you (or some piece of code you have inherited) may have used "except ValueError" or whatever where now some other exception is more properly going to be raised. There was mention of possibly having ValidationError inherit from ValueError above -- I would like to see some investigation of the feasibility of an approach like that or at least some explanation of why it can't possibly work before going ahead and adding a backwards-incompatible alternative.

comment:30 Changed 4 years ago by timo

Actually, r13751 fixed the particular use case I had (posting a non-integer value to a ForeignKey field would raise an exception rather than a form field error). I'm not sure if there is anything else that Leo is looking for.

comment:31 Changed 4 years ago by Leo

timo, thanks for pointing that out. r13751 fixes the initial problem that I posted, however it changes the behavior of the to_python function in the same way as my proposed patch - which means that according to the discussion in this ticket it requires the same backwards incompatible documentation. Code that caught ValueError on that field before will now not catch it.

The patch for this ticket makes the same fix across several other models and makes the API here consistent. @kmtracey, I don't think that there's a backward compatible solution here, precisely because it's trying to fix behavior that's broken. Having ValidationError extend from ValueError is a bad idea that's going to affect a LOT more of the codebase and client code.

@lukeplant all of the test changes were necessary because the tests were testing the broken API, just like they were in r13751. The expected exception is a ValidationError not a ValueError.

I will spend time this week to add a note to the backwards compatibility for this patch. Hopefully it's not too late for 1.3.

comment:32 Changed 4 years ago by danfairs

  • Cc danfairs added

comment:33 Changed 4 years ago by julien

  • Severity set to Normal
  • Type set to Bug

comment:34 Changed 3 years ago by jacob

  • milestone 1.3 deleted

Milestone 1.3 deleted

comment:11 Changed 3 years ago by aaugustin

  • UI/UX unset

Change UI/UX from NULL to False.

comment:12 Changed 3 years ago by aaugustin

  • Easy pickings unset

Change Easy pickings from NULL to False.

Note: See TracTickets for help on using tickets.
Back to Top