Opened 6 years ago

Closed 6 years ago

Last modified 4 years ago

#29551 closed Uncategorized (invalid)

My use-case of QuerySet.extra(): Join two models without explicit relation

Reported by: Bowser Owned by: nobody
Component: Uncategorized Version: 2.0
Severity: Normal Keywords: QuerySet.extra
Cc: Triage Stage: Unreviewed
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

Hello,
I'm reporting here, as suggested in the docs, my use-case of QuerySet.extra() method, looking if there is any way to make it in a different way.

My django app is inter-operating with an independent third-party email program which created and filled independently some tables in a separate DB.
In order to obtain data from that independent tables in my django system I needed to map those tables in a new django model using the command inspectdb of manage.py. This worked very well with minimal corrections and let me the opportunity to get successfully the data from that tables in my Django system.

My need of QuerySet.extra() came out when I needed to join that special model with other (regular) models of my django system with a field (the email address the sender used) that can be put in relation with some of the customers of my system (but of course it may not be too if there is no relation if the sender email is unknown)
Obviuosly I counldn't put a normal ForeignKey on that model nor I can change the structure of the tables because it could block my third party email system from working properly so I couldn't put a direct regular relation between those models.

This ended in the insertion of the .extra() method in the main queryset of the ModelAdmin (I'm testing that system in the admin panels for now and is working properly with the target of using it in my main app too):

def get_queryset(self, request):
      return super(InboxAdmin,self).get_queryset(request).extra(
           select={'customer': 'customer_id'},
           where=["SenderEmail = email"],
           tables=['Customers_customer']
      )

This is working well but I wonder if this can be done in a better way.

So my question practically is if there is any way in Django to join two models even when there is no explicit relation between them (OneToOne, ManyToMany or ForeignKey) that I can't insert.

Any idea?
Thank you

Change History (2)

comment:1 by Simon Charette, 6 years ago

Resolution: invalid
Status: newclosed

Hey Bowser,

I think the best approach here would be to define a django.db.models.related.ForeignObject on your model by making sure to define appropriate from_fields and to_fields. Such objects will behave just like ForeignKeys with regards to ORM interactions (filter, annotate, values, ...) but aren't backed by a implicit concrete column. Under the hood ForeignKey is a subclass of ForeignKey that creates a concrete a column to use it as from_fields and defaults to_fields to the referenced model's primary key.

For example on your Inbox model.

class Inbox(models.Model):
    ...
    sender_email = models.EmailField(db_column='SenderEmail')
    customer = ForeignObject(Customer, models.CASCADE, from_fields=['sender_email'], to_fields=['email'])

    class Meta:
        managed = False
        ...

I'll close this ticket as invalid for now but feel free to re-open it if the provided alternative to your usage of extra() isn't appropriate.

comment:2 by jonathan-golorry, 4 years ago

ForeignObject requires that you join on a unique combination of fields. I don't think there's a ForeignObject equivalent for a ManyToMany relationship. For filtering functionality, you can get pretty much whatever you need by annotating subqueries. I haven't looked into prefetching.

In my case, I have customers and purchases. I want to see when a customer has cancelled a purchase without later making another purchase in the same category. Naively:

Purchase.objects.filter(cancelled=True).exclude(
    customer__purchases__category=F("category"),
    customer__purchases__id__gt=F("id"),
)

This doesn't work because django creates separate subqueries for each exclude condition (this might be a django bug, since django normally combines conditions in the same exclude call).

My working implementation:

later = Purchase.objects.filter(
    category=OuterRef("category"),
    customer_id=OuterRef("customer_id"),
    id__gt=OuterRef("id"),
)
Purchase.objects.annotate(latest=~Exists(later)).filter(latest=True, cancelled=True)

ForeignObject generates a correct sql query using an inner join instead of a subquery, but the system check fails. Customers very rarely purchase something in the same category multiple times, so the inner join on customer_id and category is around as fast as the subquery. I actually managed to get a faster sql query putting the ID comparison directly in a left join, but I'm not sure why that worked.

same_category = models.ForeignObject(
    "self",
    on_delete=models.PROTECT,
    from_fields=["customer", "category"],
    to_fields=["customer", "category"],
)
Purchase.objects.filter(cancelled=True).exclude(
    same_category__id__gt=F("id"),
)
Note: See TracTickets for help on using tickets.
Back to Top