Opened 7 years ago

Closed 7 years ago

Last modified 7 years ago

#28422 closed New feature (duplicate)

Allow adding joins to other querysets (or models) to a queryset with extra join conditions — at Version 3

Reported by: Debanshu Kundu Owned by: nobody
Component: Database layer (models, ORM) Version: 1.11
Severity: Normal Keywords:
Cc: Triage Stage: Unreviewed
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description (last modified by Debanshu Kundu)

In one of our project we had a need to join our sub-queries to our main query. As Django ORM doesn't support this we had to write SQL queries. But after some time those SQL queries became difficult to maintain and our project was under active development and we were doing additions/changes to models and query logic.

So we started to look for alternatives and found some ways to hack Django ORM using which we can add joins to queryset. I have created this gist with utility functions and related helper code which are we using to add join to another queryset (or model) to a queryset: https://gist.github.com/debanshuk/6fd9398cff0fab59e7093fe98b8a9152.

These functions are named join_to_queryset() and join_to_table() respectively. They also allow adding extra conditions to the join added by them using get_active_extra_restriction() helper function.

Following is an example showing use of join_to_queryset() function:

class Snake(models.Model):
    name = models.TextField()
    age = models.PositiveIntegerField()
    length = models.FloatField()
    sex = models.TextField()

class Egg(models.Model):
    snake = models.ForeignKey(Snake)

class Kill(models.Model):
    snake = models.ForeignKey(Snake)

queryset = Snake.objects.filter(pk__in=snake_pks)

queryset = join_to_queryset(
    table=Snake,
    subquery=Egg.objects.values('snake').annotate(egg_count=Count('pk')),
    table_field='id',
    subquery_field='snake_id',
    queryset=queryset,
    alias='SnakeEggAggr'
).extra(select={'egg_count': 'SnakeEggAggr.egg_count'})

queryset = join_to_queryset(
    table=Snake,
    subquery=Kill.objects.values('snake').annotate(kill_count=Count('pk')),
    table_field='id',
    subquery_field='snake_id',
    queryset=queryset,
    alias='SnakeKillAggr'
).extra(select={'kill_count': 'SnakeKillAggr.kill_count'})

print(queryset.values('name', 'age', 'length', 'sex', 'egg_count', 'kill_count'))

Output of above code would be something like:

[{'name': 'John', 'age': 5, 'length': 20.1, 'sex': 'male', 'egg_count': 10, 'kill_count': 5}, {'name': 'Jane', 'age': 8, 'length': 25.5, 'sex': 'female', 'egg_count': 5, 'kill_count': 1}, {'name': 'Jack', 'age': 2, 'length': 10.2, 'sex': 'hermaphrodite', 'egg_count': 0, 'kill_count': 0}]

Above result can also be obtained by doing following query:

print(Snake.objects.filter(pk__in=snake_pks).annotate(egg_count=Count('egg__id'), kill_count=Count('kill__id')).values('name', 'age', 'length', 'sex', 'egg_count', 'kill_count''egg_count', 'kill_count'))

But this query will take more time to execute than previous one as 'name', 'age', 'length' and 'sex' all four fields would be in the GROUP BY clause of SQL query and the time will increase more and more as the number of such fields increases (this is the vary reason due to which we had to use sub-queries for aggreagation).

It would be nice if such functionality can be added to the Django ORM itself. It seems doable to as we were able to hack the ORM to do the same.

Change History (3)

comment:1 by Debanshu Kundu, 7 years ago

Description: modified (diff)

comment:2 by Tim Graham, 7 years ago

Component: UncategorizedDatabase layer (models, ORM)
Resolution: duplicate
Status: newclosed
Type: UncategorizedNew feature

Duplicate of #26426 and/or #27332? If not, perhaps you can make a more specific API you're proposal.

comment:3 by Debanshu Kundu, 7 years ago

Description: modified (diff)
Note: See TracTickets for help on using tickets.
Back to Top