Opened 7 months ago

Last modified 5 months ago

#34907 assigned Bug

loaddata crashes on objects with natural keys when don't exist on passed database.

Reported by: Florian Owned by: Gaurav Jain
Component: Core (Serialization) Version: 4.2
Severity: Normal Keywords: natural key, multi-db, loaddata
Cc: Triage Stage: Accepted
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: yes
Easy pickings: no UI/UX: no

Description

Hi all,

I'm running into a bug when trying to use fixtures with natural keys and using multiple databases.

If I have two tables A and B, that are routed by a database router to db1 and db2 respectively, then I can load 'regular' fixtures of A in db2. Django will correctly recognize that A data does not need to be loaded into db2, and hence give the (correct) output that 0 objects are loaded from 1 fixture.
However, if I keep everything the same, except that I use natural keys for the data, then loaddata will give an OperationalError exception.

I've created a minimal reproduction here:
https://github.com/fvkluck/improved-dollop

In its readme I've put the reproduction, the full stacktrace, and explain where I got my data. Shortest version of the repro is:

  1. git clone https://github.com/fvkluck/improved-dollop
  2. cd improved-dollop/repro
  3. python manage.py migrate
  4. python manage.py migrate --database=secondary
  5. python manage.py loaddata someprimarymodel_natural.json --database=secondary

This leads to an exception (OperationalError). Expected is: '0 objects were loaded from 1 fixture'

Best,
Florian

Change History (9)

comment:2 by Mariusz Felisiak, 7 months ago

Component: Database layer (models, ORM)Core (Serialization)
Summary: OperationalError during loaddata of fixtures with natural keys on multi-dbloaddata crashes on objects with natural keys when don't exist on passed database.
Triage Stage: UnreviewedAccepted

Thanks for the report.

Django doesn't need to search for an object in the database when you don't use natural key, that's why it works.

When natural keys are used Django tries to find a matching object in the database when building an instance based on JSON values. We could probably avoid this with:

  • django/core/serializers/python.py

    diff --git a/django/core/serializers/python.py b/django/core/serializers/python.py
    index 7ec894aa00..47606aa248 100644
    a b def Deserializer(  
    117117                continue
    118118            else:
    119119                raise
     120        if not router.allow_migrate_model(self.using, Model):
     121            continue
    120122        data = {}
    121123        if "pk" in d:
    122124            try:

Does it work for you?

comment:3 by Florian, 6 months ago

Thanks for looking into it. The patch seems to fix it for me (filling in some small details like importing the router):

  • django/core/serializers/python.py

    diff --git a/django/core/serializers/python.py b/django/core/serializers/python.py
    index 7ec894aa00..2ef5a1765a 100644
    a b other serializers.  
    66
    77from django.apps import apps
    88from django.core.serializers import base
    9 from django.db import DEFAULT_DB_ALIAS, models
     9from django.db import DEFAULT_DB_ALIAS, models, router
    1010from django.utils.encoding import is_protected_type
    1111
    1212
    def Deserializer(  
    118118            else:
    119119                raise
    120120        data = {}
     121        if not router.allow_migrate(using, Model):
     122            continue
    121123        if "pk" in d:
    122124            try:
    123125                data[Model._meta.pk.attname] = Model._meta.pk.to_python(d.get("pk"))

The test script outputs 'Ok' when running all tests with this patch applied.

I can't really figure out what would be a logical unit test for the bug. I've been looking at tests/serializers/test_natural.py for inspiration. But I guess I'm not familiar enough with this part to come up with a good test case that fails without this patch. What I've tried is modifying natural_key_test to use a different database (modifications marked with comments):

class TestRouter:
    def allow_migrate(self, db, app_label, model_name=None, **hints):
        return False


@override_settings(DATABASE_ROUTES=[TestRouter])  # added router
def natural_key_test_multidb(self, format):
    book1 = {
        "data": "978-1590597255",
        "title": "The Definitive Guide to Django: Web Development Done Right",
    }
    book2 = {"data": "978-1590599969", "title": "Practical Django Projects"}

    # Create the books.
    adrian = NaturalKeyAnchor.objects.create(**book1)
    james = NaturalKeyAnchor.objects.create(**book2)

    # Serialize the books.
    string_data = serializers.serialize(
        format,
        NaturalKeyAnchor.objects.all(),
        indent=2,
        use_natural_foreign_keys=True,
        use_natural_primary_keys=True,
    )

    # Delete one book (to prove that the natural key generation will only
    # restore the primary keys of books found in the database via the
    # get_natural_key manager method).
    james.delete()

    # Deserialize and test.
    books = list(serializers.deserialize(format, string_data, using='other'))  # added using
    self.assertCountEqual(
        [(book.object.title, book.object.pk) for book in books],
        [
            (book1["title"], None),  # None i.s.o. adrian.pk
            (book2["title"], None),
        ],
    )

It feels natural to me that None is returned when using a different database in which the object doesn't exist. This test succeeds already without the patch. I guess what's missing is that the router is not used when setting up the database, but I'm currently somewhat lost in getting it to do that. Any hints?

I'm going to look into it some more tomorrow.

comment:4 by Gaurav Jain, 5 months ago

Owner: changed from nobody to Gaurav Jain
Status: newassigned

comment:6 by Gaurav Jain, 5 months ago

Cc: Gaurav Jain added
Has patch: unset
Owner: Gaurav Jain removed
Status: assignednew

comment:7 by Gaurav Jain, 5 months ago

Hey @Florian, it looks like there's a typo in DATABASE_ROUTES. Although the test works without the patch, applying it breaks other tests connected to loaddata, particularly those linked to a multidb setup, such as django/tests/multiple_database/tests.py: FixtureTestCase.test_pseudo_empty_fixtures.

Due to the patch suppressing this error, loaddata proceeds silently with the subsequent steps. Consequently, it fails to assert the expected string in the mentioned test case and We're encountering a ''RuntimeWarning: No fixture data found for 'pets'. (The file format might be invalid.)

Here is the patchset for all fixes and updates. ​https://github.com/django/django/pull/17528.

Last edited 5 months ago by Gaurav Jain (previous) (diff)

comment:8 by Gaurav Jain, 5 months ago

Cc: Gaurav Jain removed
Has patch: set
Owner: set to Gaurav Jain
Status: newassigned

comment:9 by Mariusz Felisiak, 5 months ago

Patch needs improvement: set

comment:10 by Gaurav Jain, 5 months ago

@Mariusz Felisiak Patch has been updated. All checks have passed.

Note: See TracTickets for help on using tickets.
Back to Top