Opened 8 months ago

Last modified 4 months ago

#34819 assigned Bug

GenericForeignKey.get_prefetch_queryset()

Reported by: Richard Laager Owned by: rajdesai24
Component: contrib.contenttypes Version: 3.2
Severity: Normal Keywords:
Cc: Triage Stage: Accepted
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

get_prefetch_queryset() returns two functions. The caller calls them rel_obj_attr and instance_attr. They return a tuple of (pk, cls). They need to match so that objects can be pulled from the cache. In GenericForeignKey.get_prefetch_query(), there is a bug where gfk_key() (the rel_obj_attr) returns a get_prep_value()d value but the instance_attr lambda returns obj.pk.

Accordingly, for objects where the prepped value and the Python representation are not the same (e.g. the database uses a string and the Python representation is an object instance), these will not match. This makes things like Model.objects.prefetch_related('content_object').get(id=123) clear (set to None) the content_object.

The fix is to call get_prep_value() in the instance_attr code path.

Attachments (1)

gfk-prefetch.patch (513 bytes ) - added by Richard Laager 8 months ago.

Download all attachments as: .zip

Change History (8)

by Richard Laager, 8 months ago

Attachment: gfk-prefetch.patch added

comment:1 by Natalia Bidart, 8 months ago

Hello Richard, thanks for the ticket. In order for us to properly triage this report, would you please send a reproducer that we can use? Either a test case or a minimal Django project would suffice.

comment:2 by Mariusz Felisiak, 8 months ago

Has patch: unset
Triage Stage: UnreviewedAccepted

Thanks , sounds valid. Would you like to prepare a patch (via GitHub PR)? a regression test is needed.

in reply to:  description comment:3 by Oguzhan Akan, 8 months ago

I was able to reproduce the issue, but am not 100% sure if the suggested fix is correct.

First let's see the following example:

# ../tests/generic_relations_regress/models.py
class CustomCharField(models.CharField):
    def get_prep_value(self, value):
        value = super().get_prep_value(value)
        return self.to_python(value) + "-pk"


class Foo1(models.Model):
    id = models.CharField(primary_key=True, max_length=25)


class Bar1(models.Model):
    content_type = models.ForeignKey(ContentType, models.CASCADE)
    object_id = CustomCharField(max_length=25)
    content_object = GenericForeignKey()


class Foo2(models.Model):
    id = models.CharField(primary_key=True, max_length=25)


class Bar2(models.Model):
    content_type = models.ForeignKey(ContentType, models.CASCADE)
    object_id = models.CharField(max_length=25)
    content_object = GenericForeignKey()

with the test functions

# ../tests/generic_relations_regress/tests.py
# ..imports
class GenericRelationTests(TestCase):
    def test_custom_prep_value_access_1(self):
        foo = Foo1.objects.create(pk="some-test")
        bar = Bar1.objects.create(content_object=foo)
        self.assertEqual(
            foo,
            Bar1.objects.prefetch_related("content_object").get(pk=1).content_object,
        )  # fails

    def test_custom_prep_value_access_2(self):
        foo = Foo2.objects.create(pk="some-test")
        bar = Bar2.objects.create(content_object=foo)
        self.assertEqual(
            foo,
            Bar2.objects.prefetch_related("content_object").get(pk=1).content_object,
        )  # passes


From the example above, the first test fails while the second passes.

  1. In the first test, Bar1's object_id is a CustomCharField, which has its get_prep_value function customized.
  2. Then, the content_object is set to None as GenericForeignKey.get_prefetch_queryset function returns empty list for ret_val. This happens because the get_all_objects_for_this_type function (#L200) uses fkeys as the filter. The fkeys have the prepd values where the database stores the original strings as the primary key.
  3. The suggested solution (running get_prep_value in instance_attr function) does not solve the problem as the prep function for Foo1 is different than the Bar1's.
  1. In the second test, Bar2's object_id is a default CharField, while Foo2 has a CustomCharField as its primary key.
  2. The test passes as the GenericForeignKey.get_prefetch_queryset returns a non-empty return value.


I believe one should not use a custom field for object_id referring a foreign key. Maybe a warning in docs would suffice? I can work more on this upon your suggestions.

comment:4 by Richard Laager, 6 months ago

That's an interesting point. That's the opposite of the scenario I care about. My scenario is real world, and IMHO reasonable.

I believe one should not use a custom field for object_id referring a foreign key.

I agree. I don't think it is reasonable to be messing with an object_id for a GenericForeignKey. That's supposed to be a plain old string so that it can represent primary keys of other objects.

Maybe a warning in docs would suffice?

For that part of it, yes.

So I think a reasonable fix here is:

  1. The patch I've provided.
  2. A test case. That can be similar to your test case, except that CustomCharField() should be used on the object pointed to by the GenericForeignKey: i.e. the Foo object, not the Bar object.
  3. A doc warning that the object_id for a GenericForeignKey() should be an ordinary CharField or the same field type as the primary key of the objects it will point to.

comment:5 by rajdesai24, 4 months ago

Hey Mariusz, can I work on this issue if the patch is not yet provided?

in reply to:  5 comment:6 by Mariusz Felisiak, 4 months ago

Replying to rajdesai24:

Hey Mariusz, can I work on this issue if the patch is not yet provided?

Sure, if you have an idea how to fix it.

comment:7 by rajdesai24, 4 months ago

Owner: changed from nobody to rajdesai24
Status: newassigned

I agree with what Richard is suggesting, and it works. I will generate a PR regarding the solution provided and a valid test case

Note: See TracTickets for help on using tickets.
Back to Top