#7777 closed Uncategorized (fixed)
DecimalField validation ignores infinity and nan
Reported by: | Farhan Ahmad | Owned by: | Farhan Ahmad |
---|---|---|---|
Component: | Core (Other) | Version: | SVN |
Severity: | Normal | Keywords: | |
Cc: | farhan@… | Triage Stage: | Accepted |
Has patch: | yes | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description
I have noticed that the DecimalField validation doesn't validate inf and nan. It does check the digits and decimals places, which could be one way to get inf, but user can enter literal 'inf' and 'nan' strings to get these special values. decimal.Decimal() doesn't raise the DecimalException in these cases because these are valid Decimal values. This validation is done in several different places.
grep -Ir "Decimal(" * | grep -v "\.svn" | less core/validators.py: val = Decimal(field_data) db/models/fields/__init__.py: return decimal.Decimal(value) db/backends/util.py: return decimal.Decimal(s) newforms/fields.py: value = Decimal(value) oldforms/__init__.py: v = validators.IsValidDecimal(self.max_digits, self.decimal_places) oldforms/__init__.py: return decimal.Decimal(data)
I noticed that pgsql can handle 'nan' value, but it doesn't handle 'inf' so I am not sure if this check is ignored intentionally or not. If this is intentional then we should add a note to the DecimalField documentation telling the users that they should account for these separately.
Just as an FYI, these values can be checked by using the is_nan or is_infinite functions...
# Example if Decimal.is_nan(value) or Decimal.is_infinite(value): ...
Attachments (1)
Change History (16)
comment:1 by , 16 years ago
Cc: | added |
---|
comment:2 by , 16 years ago
Component: | Validators → Core framework |
---|---|
Triage Stage: | Unreviewed → Accepted |
comment:3 by , 16 years ago
milestone: | 1.0 |
---|---|
Triage Stage: | Accepted → Design decision needed |
It's not obvious to me at the moment that this is either a clear bug or what the right behaviour should be. The problem is that IEEE bounds behaviour (inf and nan) is both platform (CPU) and database dependent. So it's possible that installation-specific behaviour might just be what happens here. Certainly needs some investigation about what happens, but only then can we make a resolution decision.
(Please log in before triaging, since we have no way of judging the quality of anonymous triagers).
follow-up: 5 comment:4 by , 16 years ago
In most common cases, a user expects DecimalField to validate against "real numbers", not including inf and nan. What's about adding some optional parameter to DecimalField? Something like allow_inf and allow_nan should cover special needs and doesn't break backwards compatibility.
comment:5 by , 16 years ago
Replying to kratorius:
In most common cases, a user expects DecimalField to validate against "real numbers", not including inf and nan. What's about adding some optional parameter to DecimalField? Something like allow_inf and allow_nan should cover special needs and doesn't break backwards compatibility.
From your reply it isn't clear whether the default for allow_{inf,nan} should be True or False, so I would like to clarify. For backwards compatibility, these would have to be True (i.e. allow inf and nan), but I think this is a case where it is probably OK to break backwards compatibility and default to False (i.e. don't allow inf and nan) since, like you said, that is probably what most people are expecting anyways.
comment:7 by , 16 years ago
Is there any progress on this? I just ran into this problem when using the jquery library -- when marshaling a string argument as an integer to django, jquery makes the argument into NaN which crashes the django process.
comment:9 by , 15 years ago
Has patch: | set |
---|---|
Owner: | changed from | to
Status: | new → assigned |
by , 15 years ago
Attachment: | code_and_tests.diff added |
---|
Code for checking these cases and the accompanying tests.
comment:10 by , 15 years ago
Adding some useful discussion from the mailing list.
That said: the patch itself looks like a fine implementation of a given behaviour.
What isn't clear is that the behavior implemented is correct. Malcolm flagged this in his original comments - NaN and Inf are CPU and database-dependent values. Postgres can't support Inf, but can support NaN; I don't know what the situation is with other databases. Inf and NaN are both Decimal values - unconditionally raising a validation error doesn't seem like the right approach to me.
So - the patch is fine, as long as we accept the design it implements.
Now you need to convince us that the design is correct.
Yours,
Russ Magee %-)
I need to revise the patch, but I need some confirmation. I am not sure if it is worth determining what all the different databases support. I think optional parameters (allow_{inf,nan}=False) that let the user specify that django should allow these two cases would be sufficient. Note that I am suggesting setting these to False, which will break compatibility, but I believe that this is really what the average user is expecting anyways.
If I can get a confirmation that this behavior will be acceptable then I will go ahead and submit an updated patch that reflects this change. This time I will ping again in one week :)
- Farhan
How could it break compatibility? From my testing (i.e., running your test cases), passing in NaN or Inf raises exceptions at the moment. At this point, it's impossible for any patch dealing with Inf and NaN to break backwards compatibility, because there is no working baseline behavior.
Adding a configuration doesn't always solve a problem. More often than not, it adds a new problem. This is one of those cases.
Putting a choice in the hands of a user which can come back and bite them is not in the style of Django. Defining a DecimalField with allow_inf=True will cause errors if you use a Postgres database. I'm not about to commit code that will allow you to have a valid Django configuration that throws errors in production.
This is why a survey of the current level of support in databases is important. If, for example, no database supports Inf, then there is no reason to even have an allow_inf setting. If support is varied, then we need to have a discussion about the right way to handle the variation.
There is also the issue of good API design to consider. From a practical standpoint, I'm having difficulty thinking of a reason why allow_nan needs to exist - why should an end user be allowed to submit as a form value that is, a priori, known to be invalid? Essentially, you're allowing the user to submit "divide by zero" as a valid form value - what is the use case for this?
In summary, the process goes like this:
- Get the API right
- Implement it.
We're still on step 1. :-)
Yours,
Russ Magee %-)
OK, here is what I have gathered about the databases listed under DATABASE_ENGINE [4].
Postgresql 8: Supports all three, +/-Inf and NaN [0] MySQL: No support for either NaN or Inf [1] sqlite3: No support for either NaN or Inf [2] Oracle: Supports all three [3]
So, we have a 50/50 split. What do you say?
- Farhan
[0] http://www.postgresql.org/docs/8.2/static/datatype-numeric.html [1] http://dev.mysql.com/doc/refman/5.4/en/numeric-types.html [2] http://www.sqlite.org/datatype3.html [3] http://dev.mysql.com/doc/refman/5.4/en/numeric-types.html [4] http://docs.djangoproject.com/en/dev/ref/settings/#database-engine
Combine the complete absence of support in SQLite and MySQL with my previous comment about the nonsensical nature of NaN and Inf as user inputs, and my gut reaction is that NaN and Inf should be rejected as invalid inputs. I'm willing to be convinced otherwise, though.
Yours,
Russ Magee %-)
comment:13 by , 15 years ago
Triage Stage: | Design decision needed → Accepted |
---|
comment:14 by , 15 years ago
Resolution: | → fixed |
---|---|
Status: | assigned → closed |
comment:15 by , 15 years ago
comment:17 by , 7 years ago
Easy pickings: | unset |
---|---|
Severity: | → Normal |
Type: | → Uncategorized |
UI/UX: | unset |
Looks like this is back in master:
File "/usr/local/lib/python3.5/dist-packages/django/db/models/base.py", line 1226, in full_clean self.clean_fields(exclude=exclude) File "/usr/local/lib/python3.5/dist-packages/django/db/models/base.py", line 1268, in clean_fields setattr(self, f.attname, f.clean(raw_value, self)) File "/usr/local/lib/python3.5/dist-packages/django/db/models/fields/__init__.py", line 603, in clean self.run_validators(value) File "/usr/local/lib/python3.5/dist-packages/django/db/models/fields/__init__.py", line 555, in run_validators v(value) File "/usr/local/lib/python3.5/dist-packages/django/core/validators.py", line 421, in __call__ decimals = abs(exponent) TypeError: bad operand type for abs(): 'str'
comment:18 by , 7 years ago
Please open a new ticket with steps to reproduce. The tests from the original fix are still present (see tests/field_tests/test_decimalfield.py
) so it's most likely a different issue.
Makes sense. Seems like a clear bug to me. We should either allow both nan and inf or disallow both.