Code

Opened 2 years ago

Last modified 5 months ago

#17653 new Bug

using id = 0 on get_or_create

Reported by: sylvain.lebon@… Owned by: nobody
Component: Database layer (models, ORM) Version: 1.3
Severity: Normal Keywords:
Cc: krzysiumed@…, anssi.kaariainen@…, olau@… Triage Stage: Accepted
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

When using get_or_create with id=0, django does create an object, but doesn't write the id back in the returned object. The object is therefore not usable. We should get the new object's id, or at least get an error message to prevent using 0 as an id.

Attachments (3)

ticket_17653.diff (3.0 KB) - added by akaariai 2 years ago.
Some initial work
ticket_17653_v2.diff (3.4 KB) - added by krzysiumed 2 years ago.
ticket_17653_v3.diff (4.7 KB) - added by krzysiumed 2 years ago.

Download all attachments as: .zip

Change History (31)

comment:1 follow-up: Changed 2 years ago by charettes

  • Needs documentation unset
  • Needs tests unset
  • Patch needs improvement unset

Which backend are you using? I'm asking because I know MySQL has issues with those kind of ids.

comment:2 in reply to: ↑ 1 Changed 2 years ago by anonymous

Replying to charettes:

Which backend are you using? I'm asking because I know MySQL has issues with those kind of ids.

Indeed, I'm using MySQL. good guess :)
MySQL does create an object with a random id when given 0. Is that to say that this issue is to be considered as a mysql bug?
Shouldn't Django work that around?

comment:3 Changed 2 years ago by krzysiumed

  • Cc krzysiumed@… added
  • Triage Stage changed from Unreviewed to Accepted

The same problem appears when you pass argument id=0 to __init__ of your model (in my example it's Text), see code below.

>>> t = Text(id=0, text="lorem ipsum")
>>> t.id
0
>>> t.save()
>>> i.id
0
>>> Text.objects.all()
[<Text: Text object>]
>>> Text.objects.all()[0].id
1

The problem appears with MySQL. sqlite3 works properly. I've not checked other backends.

I think that the problem is in django.db.models.fields.AutoField which is integer (see https://docs.djangoproject.com/en/1.3/ref/models/fields/#autofield) especially it can be zero. It should works like PositiveIntegerField or something like that. Supporting AutoField validation can be a solution, but it's backward incompatible. What's more, the validation is not necessary for backends other than MySQL because zero is valid value for id in most databases. We can check if id is not zero if the backend is MySQL otherwise don't validate id, but it makes the issue more comlex and it's still backward incompatible.

comment:4 Changed 2 years ago by akaariai

  • Cc anssi.kaariainen@… added

You could do a check in model.base_save(): connection.validate_pk(val), which should just raise an exception if the backend doesn't work with the given value. So, if you try to save a model with id=0, you would get an exception on MySQL on 0-valued PK value, and otherwise things would work as expected. This change should be simple to implement and safe to use. I don't think backwards compatibility for MySQL 0-valued PKs is worth worrying about: the current behavior can be considered a bug.

Another option is to just skip this issue by saying this is a bug in MySQL, not in Django. But as in this case it is pretty easy to prevent the confusing behavior, then why not?

comment:5 Changed 2 years ago by krzysiumed

  • Owner changed from nobody to krzysiumed
  • Status changed from new to assigned

comment:6 Changed 2 years ago by akaariai

  • Has patch set
  • Patch needs improvement set

I was just writing a patch for this :)

I will post what I have in case that is helpful and leave this to krzysiumed.

Changed 2 years ago by akaariai

Some initial work

comment:7 Changed 2 years ago by charettes

FYI django's db backend features already keeps track of this misbehaviour for mysql using the allows_primary_key_0 feature. I haven't look where it's used but that's the reason why I did comment:1. It might be worth investigating where this flag is used.

comment:8 Changed 2 years ago by akaariai

It is used in one place currently:

tests/regressiontests/serializers_regress/tests.py:if connection.features.allows_primary_key_0:

comment:9 Changed 2 years ago by charettes

Then maybe the BaseDatabaseOperations.validate_autopk_value could use it somehow returning allows_primary_key_0 or value !== 0.

comment:10 Changed 2 years ago by akaariai

Or the if connection.features.allows_primary_key_0 could be changed to if connection.ops.validate_autopk_value(0).

Now that I look at it, it seems the name for the validate_autopk_value is wrong, maybe check_autopk_value()?

I quickly tested primary key value of 0 using MySQL, and it seems to work. It is the automatic PK which doesn't work for zero value. So, it really isn't about allowing 0 as primary key value, it is about autopk values. I am no MySQL expert, so I might be wrong here.

In general I am opposed to these boolean allows_primary_key_0 type features. An external backend might disallow all non-positive values and they have no way to express that with the boolean. Granted, this one isn't particularly bad. And this really isn't this ticket's problem.

comment:11 Changed 2 years ago by krzysiumed

  • Owner changed from krzysiumed to nobody
  • Patch needs improvement unset
  • Status changed from assigned to new

Sorry for delay, but the ticket is not as easy as I thought.

I attached a patch similar to akaariai's patch. It validates value for primary key in new method BaseDatabaseOperations.validate_autopk_value in module db.backends. The method is overriden in DatabaseOperations in module db.backends.mysql, so it does not use allows_primary_key_0 flag. The method is called by AutoField.get_db_prep_value from module db.models.fields. The patch includes simple test.

I'm not sure if get_db_prep_value is right place for validation. The doc of the method says:

        Returns field's value prepared for interacting with the database
        backend.

        Used by the default implementations of ``get_db_prep_save``and
        `get_db_prep_lookup```

so maybe we should add note that get_db_prep_value do validation and may raise error?

Note that both patches (mine and akaariai's) works for AutoField as well as for ForeignKey and ManyToManyField, because related fields calls AutoField.get_db_prep_value during executing their get_db_prep_value method (see https://code.djangoproject.com/browser/django/trunk/django/db/models/fields/related.py#L962).

Version 2, edited 2 years ago by krzysiumed (previous) (next) (diff)

Changed 2 years ago by krzysiumed

comment:12 Changed 2 years ago by akaariai

  • Patch needs improvement set

The get_db_prep_value seems like the right place, because that is called in all the right places. The only other option I can see is to create a new method for this.

The get_db_prep_value is already doing some validation, it calls get_prep_value which calls int(value), and that can raise a ValueError.

There is a small mistake in the error message. It says zero is not accepted for primary key. This isn't true for MySQL.

mysql> create table testtbl(id integer primary key);
Query OK, 0 rows affected (0.05 sec)

mysql> insert into testtbl values(0);
Query OK, 1 row affected (0.01 sec)

mysql> select * from testtbl;
+----+
| id |
+----+
|  0 |
+----+
1 row in set (0.00 sec)

I know this is a minor nitpick. But it is also easy to correct. It should say that zero is not allowed for AutoFields.

There should be a test for bulk_create, too. Maybe, as this ticket is about get_or_create add one for that too, although that might be a bit redundant.

Changed 2 years ago by krzysiumed

comment:13 Changed 2 years ago by krzysiumed

  • Patch needs improvement unset

comment:14 Changed 2 years ago by claudep

  • Triage Stage changed from Accepted to Ready for checkin

The patch looks good. I've created the ticket #17713 to fix the misnaming of allows_primary_key_0.

comment:15 Changed 2 years ago by akaariai

I did some minor cleanups to the patch. Results here: https://github.com/akaariai/django/commit/bc206a3e9ca90e40d626c7d20a07d644418aadfc

I will commit this in a day or two unless somebody objects.

comment:16 Changed 2 years ago by akaariai

  • Resolution set to fixed
  • Status changed from new to closed

In [17933]:

Fixed #17653 -- Changed MySQL backend to raise a ValueError if zero is used as an AutoField value.

Thanks to Sylvain Lebon for the report, krzysiumed for the patch and charettes and claudep for reviews.

comment:17 Changed 2 years ago by carneiro.be@…

  • Resolution fixed deleted
  • Status changed from closed to reopened

I am now getting the error "The database backend does not accept 0 as a value for AutoField." using the latest SVN version. What am I doing wrong? I'm not even using any AutoField (besides the django automatic id on all models)

comment:18 Changed 2 years ago by aaugustin

  • Resolution set to fixed
  • Status changed from reopened to closed

Yes, you're seeing the expected behavior introduced by the patch, ie. raising an explicit value error.

comment:19 Changed 2 years ago by carneiro.be@…

Alright, I've managed to track what's my problem. I have a ForeignKey to the user model. If this value is 0 it will trigger this exception. I believe the patch isn't checking anywhere if it is in fact an autofield. Or this the desired behavior?

comment:20 Changed 2 years ago by anonymous

  • Resolution fixed deleted
  • Status changed from closed to reopened

comment:21 Changed 2 years ago by akaariai

  • Resolution set to fixed
  • Status changed from reopened to closed

It seems you are trying to save 0 to a ForeignKey which points to an AutoField. But, this is illegal, as the AutoField will not accept that value and so the ForeignKey can not hold that value either.

I am closing this still as fixed - you can again open this, but please provide more details if you do. The above explanation leaves us guessing if there is some bug or not in the newly introduced behavior. A testcase/testproject would be really valuable.

EDIT: Didn't spot the error message earlier. That might be a little confusing... It says AutoField while the field infact is a ForeignKey. Is this the problem?

Last edited 2 years ago by akaariai (previous) (diff)

comment:22 Changed 2 years ago by anonymous

Sorry that I opened it twice. Yes, I guess I found it a bit confusing that it said AutoField. My first reaction was to check the id of the object, which was not 0 (in fact I was editing something).

But why is it illegal? I mean, it had worked before with no problem. I know it's not a good practice to have a foreign key as 0, but I don't understand how it can be illegal. Pointing a Foreign Key to 0 for me would be just like pointing to an id that doesn't exist, I don't see the difference. I did understand the bug you were trying to fix, but should this also apply to ForeignKeys?

Thanks in advance!

comment:23 Changed 2 years ago by akaariai

So, you are creating ForeignKeys with value 0 knowing they point nowhere? You should set the ForeignKey as null=True and use None instead. You could get incorrect results from some queries if you use 0 instead of null=True.

I don't think ForeignKeys should be fixed to allow 0 as a value. Django's foreign keys are really meant to point at things if they have a value set.

comment:24 Changed 10 months ago by Montyskew

So this fix creates another problem when you need to be able to accept a value of 0 (or if you are working on a DB that already has a value of 0!) in an autofield. In my case, I need to accept an ID of zero so that my foreign key can point to zero so that a unique constraint can work properly. (see http://dev.mysql.com/doc/refman/5.6/en/innodb-table-and-index.html , http://mechanics.flite.com/blog/2013/06/18/closing-the-unique-index-null-loophole/ , and other pages that discuss)

Here is an example of my situation. I am collecting school test score data with a marker for demographics (e.g., Race/Ethnicity). Each school should only be allowed to have a single test score entry for each race. There should also be an allowance for a single entry for overall (no race). Logic suggests using a unique constraint on race and school and allowing race to be nullable. However, the SQL rules specify that null values within unique indexes are not counted as duplicates. (That was an unexpected discovery, BTW). So, I either have to create my own unique index field or take off the null constraint on the race field (shown below).

class Race(models.Model):
    id = models.AutoField(primary_key=True)
    Name = models.CharField(max_length=32)

class School(models.Model)
    id = models.AutoField(primary_key=True)
    Name = models.CharField(max_length=32)

class TestScore(models.Model)
    id = models.AutoField(primary_key=True)
    race = models.ForeignKey(Race)
    school = models.ForeignKey(School)
    score = models.DecimalField(max_digits=4, decimal_places=1)

    class Meta():
        unique_together = [
            ["school", 'race'],
        ]

A reasonable value for this special fake-null field would be 0. If you create SQL directly without going through the django cursor, you can create a new field and then change the auto-generated field back to 0.

INSERT INTO race ('name') VALUE ('None')
UPDATE race SET id=0 WHERE id = <autogenerated ID>

Performing that sequence through a combination of the ORM and cursor connection works

r = Race(name='None')
r.save()
connection.cursor().execute('UPDATE race SET id=0 WHERE id=%s', [r.id,])
r = Race.objects.get(id=0)

HOWEVER, creating a record based on that new object raises an error!

>>> test = TestScore(school=school, race=r, score=175.2)
  File "python2.7/site-packages/django/db/models/base.py", line 546, in save
    force_update=force_update, update_fields=update_fields)
  File "python2.7/site-packages/django/db/models/base.py", line 650, in save_base
    result = manager._insert([self], fields=fields, return_id=update_pk, using=using, raw=raw)
  File "python2.7/site-packages/django/db/models/manager.py", line 215, in _insert
    return insert_query(self.model, objs, fields, **kwargs)
  File "python2.7/site-packages/django/db/models/query.py", line 1661, in insert_query
    return query.get_compiler(using=using).execute_sql(return_id)
  File "python2.7/site-packages/django/db/models/sql/compiler.py", line 936, in execute_sql
    for sql, params in self.as_sql():
  File "python2.7/site-packages/django/db/models/sql/compiler.py", line 894, in as_sql
    for obj in self.query.objs
  File "python2.7/site-packages/django/db/models/fields/related.py", line 1047, in get_db_prep_save
    connection=connection)
  File "python2.7/site-packages/django/db/models/fields/__init__.py", line 304, in get_db_prep_save
    prepared=False)
  File "python2.7/site-packages/django/db/models/fields/__init__.py", line 549, in get_db_prep_value
    value = connection.ops.validate_autopk_value(value)
  File "python2.7/site-packages/django/db/backends/mysql/base.py", line 288, in validate_autopk_value
    raise ValueError('The database backend does not accept 0 as a '
ValueError: The database backend does not accept 0 as a value for AutoField.

So, I guess there are two issues here.
1) It is possible to insert a new row with ID 0, but it requires a two-step process of insert followed by update.
2) Once you have this auto value of 0 (whether it was generated through Django or otherwise), you can't actually point at it!

comment:25 Changed 6 months ago by olau

  • Resolution fixed deleted
  • Status changed from closed to new

You used to be able to set the id by creating the object, then do a .filter(pk=obj.pk).update(pk=0).

I just want to add that while this may seem a bit far-fetched, this change did indeed break backwards compatibility in a pretty annoying way because of the ForeignKey problem.

It seems to me there would be alternative ways of solving the problem that would keep MySQL in line, e.g.

SET SQL_MODE = "NO_AUTO_VALUE_ON_ZERO";

http://cristian-radulescu.ro/article/insert-value-0-for-primary-keys-with-auto-increment-in-mysql.html

This would actually fix the error, i.e. make everything work, rather than exporting the problem as a validation error. So I will try reopening the bug. If you disagree, close it again and I will go away.

comment:26 Changed 6 months ago by olau

  • Cc olau@… added

comment:27 Changed 5 months ago by akaariai

  • Triage Stage changed from Ready for checkin to Accepted

Removing ready for checkin as there isn't anything to commit in this ticket.

Maybe we could allow users to set SQL_MODE to NO_AUTO_VALUE_ON_ZERO and then allow usage of zero as autopk value, or maybe we could just allow usage of zero in .update(). However the .update() option seems a bit hard to implement, as we would need information that we are updating to get_db_prep_value(), but there doesn't seem to be any way to do that.

To me it seems a database settings option ALLOW_AUTO_VALUE_ZERO for MySQL to switch the SQL mode, and to also switch off the autopk checking is the best approach.

comment:28 Changed 5 months ago by timo

  • Has patch unset

Add Comment

Modify Ticket

Change Properties
<Author field>
Action
as new
The owner will be changed from nobody to anonymous. Next status will be 'assigned'
as The resolution will be set. Next status will be 'closed'
Author


E-mail address and user name can be saved in the Preferences.

 
Note: See TracTickets for help on using tickets.