Opened 7 years ago

Closed 6 years ago

Last modified 4 years ago

#7777 closed (fixed)

DecimalField validation ignores infinity and nan

Reported by: thebitguru Owned by: thebitguru
Component: Core (Other) Version: master
Severity: Keywords:
Cc: farhan@… Triage Stage: Accepted
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: UI/UX:

Description

I have noticed that the DecimalField validation doesn't validate inf and nan. It does check the digits and decimals places, which could be one way to get inf, but user can enter literal 'inf' and 'nan' strings to get these special values. decimal.Decimal() doesn't raise the DecimalException in these cases because these are valid Decimal values. This validation is done in several different places.

grep -Ir "Decimal(" * | grep -v "\.svn" | less
core/validators.py:            val = Decimal(field_data)
db/models/fields/__init__.py:            return decimal.Decimal(value)
db/backends/util.py:    return decimal.Decimal(s)
newforms/fields.py:            value = Decimal(value)
oldforms/__init__.py:        v = validators.IsValidDecimal(self.max_digits, self.decimal_places)
oldforms/__init__.py:            return decimal.Decimal(data)

I noticed that pgsql can handle 'nan' value, but it doesn't handle 'inf' so I am not sure if this check is ignored intentionally or not. If this is intentional then we should add a note to the DecimalField documentation telling the users that they should account for these separately.

Just as an FYI, these values can be checked by using the is_nan or is_infinite functions...

# Example
if Decimal.is_nan(value) or Decimal.is_infinite(value):
  ...

Attachments (1)

code_and_tests.diff (1.3 KB) - added by thebitguru 6 years ago.
Code for checking these cases and the accompanying tests.

Download all attachments as: .zip

Change History (17)

comment:1 Changed 7 years ago by thebitguru

  • Cc farhan@… added
  • Needs documentation unset
  • Needs tests unset
  • Patch needs improvement unset

comment:2 Changed 7 years ago by anonymous

  • Component changed from Validators to Core framework
  • Triage Stage changed from Unreviewed to Accepted

Makes sense. Seems like a clear bug to me. We should either allow both nan and inf or disallow both.

comment:3 Changed 7 years ago by mtredinnick

  • milestone 1.0 deleted
  • Triage Stage changed from Accepted to Design decision needed

It's not obvious to me at the moment that this is either a clear bug or what the right behaviour should be. The problem is that IEEE bounds behaviour (inf and nan) is both platform (CPU) and database dependent. So it's possible that installation-specific behaviour might just be what happens here. Certainly needs some investigation about what happens, but only then can we make a resolution decision.

(Please log in before triaging, since we have no way of judging the quality of anonymous triagers).

comment:4 follow-up: Changed 7 years ago by kratorius

In most common cases, a user expects DecimalField to validate against "real numbers", not including inf and nan. What's about adding some optional parameter to DecimalField? Something like allow_inf and allow_nan should cover special needs and doesn't break backwards compatibility.

comment:5 in reply to: ↑ 4 Changed 7 years ago by thebitguru

Replying to kratorius:

In most common cases, a user expects DecimalField to validate against "real numbers", not including inf and nan. What's about adding some optional parameter to DecimalField? Something like allow_inf and allow_nan should cover special needs and doesn't break backwards compatibility.

From your reply it isn't clear whether the default for allow_{inf,nan} should be True or False, so I would like to clarify. For backwards compatibility, these would have to be True (i.e. allow inf and nan), but I think this is a case where it is probably OK to break backwards compatibility and default to False (i.e. don't allow inf and nan) since, like you said, that is probably what most people are expecting anyways.

comment:6 Changed 7 years ago by thebitguru

Can we please make a decision on this? Thanks.

comment:7 Changed 6 years ago by anonymous

Is there any progress on this? I just ran into this problem when using the jquery library -- when marshaling a string argument as an integer to django, jquery makes the argument into NaN which crashes the django process.

comment:8 Changed 6 years ago by ramiro

#11411 was a duplicate and included a patch.

comment:9 Changed 6 years ago by thebitguru

  • Has patch set
  • Owner changed from nobody to thebitguru
  • Status changed from new to assigned

Changed 6 years ago by thebitguru

Code for checking these cases and the accompanying tests.

comment:10 Changed 6 years ago by thebitguru

Adding some useful discussion from the mailing list.


That said: the patch itself looks like a fine implementation of a given behaviour.

What isn't clear is that the behavior implemented is correct. Malcolm flagged this in his original comments - NaN and Inf are CPU and database-dependent values. Postgres can't support Inf, but can support NaN; I don't know what the situation is with other databases. Inf and NaN are both Decimal values - unconditionally raising a validation error doesn't seem like the right approach to me.

So - the patch is fine, as long as we accept the design it implements.
Now you need to convince us that the design is correct.

Yours,
Russ Magee %-)


I need to revise the patch, but I need some confirmation. I am not sure if it is worth determining what all the different databases support. I think optional parameters (allow_{inf,nan}=False) that let the user specify that django should allow these two cases would be sufficient. Note that I am suggesting setting these to False, which will break compatibility, but I believe that this is really what the average user is expecting anyways.

If I can get a confirmation that this behavior will be acceptable then I will go ahead and submit an updated patch that reflects this change. This time I will ping again in one week :)

  • Farhan

How could it break compatibility? From my testing (i.e., running your test cases), passing in NaN or Inf raises exceptions at the moment. At this point, it's impossible for any patch dealing with Inf and NaN to break backwards compatibility, because there is no working baseline behavior.

Adding a configuration doesn't always solve a problem. More often than not, it adds a new problem. This is one of those cases.

Putting a choice in the hands of a user which can come back and bite them is not in the style of Django. Defining a DecimalField with allow_inf=True will cause errors if you use a Postgres database. I'm not about to commit code that will allow you to have a valid Django configuration that throws errors in production.

This is why a survey of the current level of support in databases is important. If, for example, no database supports Inf, then there is no reason to even have an allow_inf setting. If support is varied, then we need to have a discussion about the right way to handle the variation.

There is also the issue of good API design to consider. From a practical standpoint, I'm having difficulty thinking of a reason why allow_nan needs to exist - why should an end user be allowed to submit as a form value that is, a priori, known to be invalid? Essentially, you're allowing the user to submit "divide by zero" as a valid form value - what is the use case for this?

In summary, the process goes like this:

  1. Get the API right
  2. Implement it.

We're still on step 1. :-)

Yours,
Russ Magee %-)


OK, here is what I have gathered about the databases listed under DATABASE_ENGINE [4].

Postgresql 8: Supports all three, +/-Inf and NaN [0] 
MySQL: No support for either NaN or Inf [1] 
sqlite3: No support for either NaN or Inf [2] 
Oracle: Supports all three [3] 

So, we have a 50/50 split. What do you say?

  • Farhan
[0] http://www.postgresql.org/docs/8.2/static/datatype-numeric.html 
[1] http://dev.mysql.com/doc/refman/5.4/en/numeric-types.html 
[2] http://www.sqlite.org/datatype3.html 
[3] http://dev.mysql.com/doc/refman/5.4/en/numeric-types.html 
[4] http://docs.djangoproject.com/en/dev/ref/settings/#database-engine 

Combine the complete absence of support in SQLite and MySQL with my previous comment about the nonsensical nature of NaN and Inf as user inputs, and my gut reaction is that NaN and Inf should be rejected as invalid inputs. I'm willing to be convinced otherwise, though.

Yours,
Russ Magee %-)

comment:11 Changed 6 years ago by thebitguru

  • Version changed from SVN to 1.2-alpha

comment:12 Changed 6 years ago by russellm

  • milestone set to 1.2
  • Version changed from 1.2-alpha to SVN

comment:13 Changed 6 years ago by russellm

  • Triage Stage changed from Design decision needed to Accepted

comment:14 Changed 6 years ago by russellm

  • Resolution set to fixed
  • Status changed from assigned to closed

(In [12490]) Fixed #7777 -- Added validation handling for NaN, Inf and -Inf in DecimalFields. Thanks to thebitguru for the patch.

comment:15 Changed 6 years ago by russellm

(In [12491]) [1.1.X] Fixed #7777 -- Added validation handling for NaN, Inf and -Inf in DecimalFields. Thanks to thebitguru for the patch.

Backport of r12490 from trunk.

comment:16 Changed 4 years ago by jacob

  • milestone 1.2 deleted

Milestone 1.2 deleted

Note: See TracTickets for help on using tickets.
Back to Top