Opened 3 years ago

Last modified 5 months ago

#24272 new Cleanup/optimization

Better error messages for prefetch_related

Reported by: Todor Velichkov
Owned by: nobody
Component: Database layer (models, ORM)
Version: master
Severity: Normal
Keywords: prefetch_related, GenericRelation, related_query_name
Cc:
Triage Stage: Accepted
Has patch: no
Needs documentation: no
Needs tests: no
Patch needs improvement: no
Easy pickings: no
UI/UX: no

Description

Consider the following model structure:

from django.db import models
from django.contrib.contenttypes.fields import GenericForeignKey, GenericRelation
from django.contrib.contenttypes.models import ContentType


class TaggedItem(models.Model):
    tag = models.SlugField()
    content_type = models.ForeignKey(ContentType)
    object_id = models.PositiveIntegerField()
    content_object = GenericForeignKey('content_type', 'object_id')

    def __unicode__(self):
        return self.tag


class Director(models.Model):
    name = models.CharField(max_length=100)

    def __unicode__(self):
        return self.name


class Movie(models.Model):
    name = models.CharField(max_length=100)
    director = models.ForeignKey(Director)
    tags = GenericRelation(TaggedItem, related_query_name='movies')

    def __unicode__(self):
        return self.name


class Author(models.Model):
    name = models.CharField(max_length=100)

    def __unicode__(self):
        return self.name


class Book(models.Model):
    name = models.CharField(max_length=100)
    author = models.ForeignKey(Author)
    tags = GenericRelation(TaggedItem, related_query_name='books')

    def __unicode__(self):
        return self.name

And some initial data:

>>> a = Author.objects.create(name='E L James')
>>> b1 = Book.objects.create(name='Fifty Shades of Grey', author=a)
>>> b2 = Book.objects.create(name='Fifty Shades Darker', author=a)
>>> b3 = Book.objects.create(name='Fifty Shades Freed', author=a)
>>> d = Director.objects.create(name='James Gunn')
>>> m1 = Movie.objects.create(name='Guardians of the Galaxy', director=d)
>>> t1 = TaggedItem.objects.create(content_object=b1, tag='roman')
>>> t2 = TaggedItem.objects.create(content_object=b2, tag='roman')
>>> t3 = TaggedItem.objects.create(content_object=b3, tag='roman')
>>> t4 = TaggedItem.objects.create(content_object=m1, tag='action movie')

Now using the GenericForeignKey we are able to:

  1. Prefetch one level deep from querysets containing different types of content_object:
    >>> TaggedItem.objects.all().prefetch_related('content_object')
    [<TaggedItem: roman>, <TaggedItem: roman>, <TaggedItem: roman>, <TaggedItem: action movie>]
    
  2. Prefetch many levels deep, but only from querysets containing a single type of content_object:
    >>> TaggedItem.objects.filter(books__author__name='E L James').prefetch_related('content_object__author')
    [<TaggedItem: roman>, <TaggedItem: roman>, <TaggedItem: roman>]
    

But we can't combine 1) and 2), i.e. prefetch many levels deep from a queryset containing different types of content_object:

>>> TaggedItem.objects.all().prefetch_related('content_object__author')
Traceback (most recent call last):
  ...
AttributeError: 'Movie' object has no attribute 'author_id'
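
The failure above can be illustrated outside the ORM. The following is a minimal plain-Python sketch, not Django's actual prefetch code: the Book and Movie stand-ins and the collect_fk_ids helper are hypothetical, and it assumes that a second-level lookup reads the same foreign key attribute from every first-level object.

```python
from dataclasses import dataclass


@dataclass
class Book:
    name: str
    author_id: int


@dataclass
class Movie:
    name: str
    director_id: int


def collect_fk_ids(objects, attname):
    """Read the same foreign key attribute from every object (hypothetical helper)."""
    return [getattr(obj, attname) for obj in objects]


# A mixed list, like the content_objects of TaggedItem.objects.all().
content_objects = [
    Book("Fifty Shades of Grey", 1),
    Movie("Guardians of the Galaxy", 1),
]

try:
    collect_fk_ids(content_objects, "author_id")
except AttributeError as exc:
    # The Movie stand-in has director_id but no author_id.
    error = str(exc)
    print(error)
```

With a list containing only Book objects the same call succeeds, which matches case 2) above.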

For such tasks this API is inconvenient, and it becomes unworkable in more complex cases. For example, suppose we want all TaggedItems with their movies and each movie's director prefetched, together with their books and each book's author.
A naive attempt would look like this:

>>> TaggedItem.objects.all().prefetch_related(
...     'content_object__author',
...     'content_object__director',
... )
Traceback (most recent call last):
  ...
AttributeError: 'Movie' object has no attribute 'author_id'

Or like this:

>>> TaggedItem.objects.all().prefetch_related(
...     Prefetch('content_object', queryset=Book.objects.all().select_related('author')),
...     Prefetch('content_object', queryset=Movie.objects.all().select_related('director')),
... )
Traceback (most recent call last):
  ...
ValueError: Custom queryset can't be used for this lookup.

What I suggest is to reuse the API that we already use to filter TaggedItems by their book's author, i.e. the related_query_name of the GenericRelation. This does not work right now:

>>> TaggedItem.objects.filter(books__author__name='E L James').prefetch_related('books')
Traceback (most recent call last):
  ...
AttributeError: 'Book' object has no attribute 'object_id'

This way we would also have a nice solution for the more complex example mentioned above:

>>> TaggedItem.objects.all().prefetch_related(
...     'books__author',
...     'movies__director',
... )
Traceback (most recent call last):
  ...
AttributeError: 'Book' object has no attribute 'object_id'

Or like this:

>>> TaggedItem.objects.all().prefetch_related(
...     Prefetch('books', queryset=Book.objects.all().select_related('author')),
...     Prefetch('movies', queryset=Movie.objects.all().select_related('director')),
... )
Traceback (most recent call last):
  ...
AttributeError: 'Book' object has no attribute 'object_id'
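
The behaviour requested here amounts to handling each content type separately and following a different relation for each one. The following is a plain-Python sketch of that idea under stated assumptions: the in-memory tables, the RELATIONS mapping, and prefetch_by_type are all hypothetical and do not correspond to any Django API.

```python
from collections import defaultdict

# Hypothetical in-memory tables standing in for database rows.
AUTHORS = {1: {"name": "E L James"}}
DIRECTORS = {1: {"name": "James Gunn"}}
BOOKS = {
    1: {"name": "Fifty Shades of Grey", "author_id": 1},
    2: {"name": "Fifty Shades Darker", "author_id": 1},
}
MOVIES = {1: {"name": "Guardians of the Galaxy", "director_id": 1}}

# Per content type: (table, foreign key attribute, related table, relation name).
RELATIONS = {
    "book": (BOOKS, "author_id", AUTHORS, "author"),
    "movie": (MOVIES, "director_id", DIRECTORS, "director"),
}


def prefetch_by_type(tagged_items):
    """Attach content_object and its per-type relation to each tagged item."""
    # Step 1: partition the items by content type.
    groups = defaultdict(list)
    for item in tagged_items:
        groups[item["content_type"]].append(item)

    # Step 2: resolve each group with the relation that exists for its type.
    for content_type, items in groups.items():
        table, fk_attr, related_table, relation = RELATIONS[content_type]
        for item in items:
            obj = dict(table[item["object_id"]])  # copy so the table is not mutated
            obj[relation] = related_table[obj[fk_attr]]
            item["content_object"] = obj
    return tagged_items


tags = prefetch_by_type([
    {"tag": "roman", "content_type": "book", "object_id": 1},
    {"tag": "roman", "content_type": "book", "object_id": 2},
    {"tag": "action movie", "content_type": "movie", "object_id": 1},
])
```

Because each group only follows the relation defined for its own type, a movie is never asked for an author.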

Change History (5)

comment:1 Changed 3 years ago by Tim Graham

I don't think what you have asked for can be implemented, but I'll leave this open for confirmation by an ORM expert. See #21422 which is to document the limitation.

comment:3 Changed 3 years ago by Tim Graham

Summary: changed from "prefetch_related GenericRelation via related_query_name" to "Better error messages for prefetch_related"
Triage Stage: changed from Unreviewed to Accepted
Type: changed from New feature to Cleanup/optimization
Version: changed from 1.7 to master

If the proposal can't be implemented, I think it would be helpful to at least throw a more helpful error message.
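
As an illustration of what a more helpful error could look like, here is a plain-Python sketch. The check_prefetch_lookup helper, the Book and Movie stand-ins, and the wording of the message are all hypothetical; this is not how Django validates lookups.

```python
def check_prefetch_lookup(objects, attname, lookup):
    """Raise a descriptive error when some objects cannot follow the lookup."""
    missing = sorted(
        {type(obj).__name__ for obj in objects if not hasattr(obj, attname)}
    )
    if missing:
        raise ValueError(
            "Cannot prefetch %r: model(s) %s have no %r attribute. "
            "The lookup spans objects of different types."
            % (lookup, ", ".join(missing), attname)
        )


class Book:
    author_id = 1


class Movie:
    director_id = 1


try:
    check_prefetch_lookup([Book(), Movie()], "author_id", "content_object__author")
except ValueError as exc:
    message = str(exc)
    print(message)
```

The message names the lookup and the offending model, so the user does not have to work back from a bare AttributeError.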

comment:4 Changed 2 years ago by Todor Velichkov

I think the proposal is implementable. Here is a Django app I just found that seems to implement the main part of the problem (where Django gets confused when prefetching different types of foreign keys through 'content_object').

django-deep-prefetch

I found the app via this ticket: #22014

comment:5 Changed 5 months ago by Todor Velichkov

I did some debugging and I think I found out why prefetch_related on a GenericRelation using related_query_name is not working, i.e.:

TaggedItem.objects.filter(books__author__name='E L James').prefetch_related('books')

It starts in get_prefetch_queryset in related_descriptors, where self.field is a GenericRelation field (<django.contrib.contenttypes.fields.GenericRelation: tags> in this example). The GenericRelation class inherits from ForeignObject but does not implement the get_local_related_value and get_foreign_related_value methods, and the inherited versions look incompatible with the GenericRelation interface, because they look for an object_id attribute on the Book model.

I would love to try to fix this, but I still can't fully understand the code. I'm not even sure what needs to be returned here: all tags related to this book, maybe? If that's the case, the code in get_prefetch_queryset looks like it expects only a single instance to be returned, not many, which confuses me.
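
The mismatch described in this comment can be sketched in plain Python. This is a simplified stand-in, not Django's implementation: the related_value helper and the two dataclasses are hypothetical, and comparing the tag's object_id with the book's primary key is only an assumption about what a working reverse prefetch would need.

```python
from dataclasses import dataclass


@dataclass
class TaggedItem:
    tag: str
    content_type: str
    object_id: int


@dataclass
class Book:
    pk: int
    name: str


def related_value(instance, attnames):
    """Return a tuple of attribute values used as a join key (hypothetical helper)."""
    return tuple(getattr(instance, name) for name in attnames)


tag = TaggedItem("roman", "book", 1)
book = Book(1, "Fifty Shades of Grey")

# Reading object_id from the tag works.
tag_key = related_value(tag, ["object_id"])

# Reading the same attribute from the book reproduces the reported error.
try:
    related_value(book, ["object_id"])
except AttributeError as exc:
    book_error = str(exc)

# Assumption: the book side would have to be keyed by its primary key instead.
book_key = related_value(book, ["pk"])
matches = tag_key == book_key
```

In this sketch several tags can share the same key as one book, which is the one-to-many case the comment asks about.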
