Opened 6 months ago

Last modified 6 months ago

#27936 new Cleanup/optimization

Add some clarifications to "Spanning multi-valued relationships"

Reported by: Thomas Güttler Owned by: nobody
Component: Documentation Version: 1.10
Severity: Normal Keywords:
Cc: tzanke@… Triage Stage: Accepted
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

First of all: Thank you for the great docs.
Since it took some time until we got the difference:

filter(entry__headline__contains='Lennon').filter(entry__pub_date__year=2008)

 vs

filter(entry__headline__contains='Lennon', entry__pub_date__year=2008)

The docs are great:

https://docs.djangoproject.com/en/dev/topics/db/queries/#spanning-multi-valued-relationships

But maybe the ascii art below helps to understand it better?

What do you think?

                      +--------------------+                                                                                      
                      |           Lennon   |                                                                                      
                    - |  Entry 1           |                                                                                      
+---------------+  /  |           2008     |                                                                                      
|               | /   +--------------------+                                                                                      
| Blog 1        |/                                 filter(entry__headline__contains='Lennon', entry__pub_date__year=2008)         
|               |\                               -                                                                                
+---------------+ \ -                                                                                                             
                   \  +--------------------+                                                                                      
                    \ |                    |                                                                                      
                     -|  Entry 2           |                                                                                      
                      |                    |                                                                                      
                      +--------------------+                                                                                      
                                                                                                                                  
                                                                                                                                  
                      +--------------------+                                                                                      
                      |           Lennon   |                                                                                      
                    - |  Entry 3           |                                                                                      
+---------------+  /  |                    |                                                                                      
|               | /   +--------------------+                                                                                      
| Blog 2        |/                                   filter(entry__headline__contains='Lennon').filter(entry__pub_date__year=2008)
|               |\                                                                                                                
+---------------+ \ -                                                                                                             
                   \  +--------------------+                                                                                      
                    \ |           2008     |                                                                                      
                     -|  Entry 4           |                                                                                      
                      |                    |                                                                                      
                      +--------------------+                                                                                      

Ascii Art: https://textik.com/#100375b764993664

The ascii art could get improved, I wanted to ask you first before polishing it.

Change History (12)

comment:1 Changed 6 months ago by Josh Smeaton

I think there's definitely scope to improve the docs around multi valued relationships, but I don't think ASCII art (or that diagram) is really the right way of doing it. For some added confusion, the docs fail to mention that duplicates are possible with the second query (with multiple filters) if there is an entry with the headline Lennon AND it was posted in 2008. It seems the docs go through a lot of effort to avoid mentioning that a second filter causes the query to create a second join to the same table.

I'm not proposing new language for this myself, but I've seen many an experienced developer get caught by this without understanding what was actually happening with the underlying query. I'd very much like to see these docs improved in some way.

Here's some shell output for those curious about what's happening:

In [1]: from datetime import date
In [2]: d2008 = date(2008, 6, 6)
In [3]: d2009 = date(2009, 6, 6)
In [4]: both = Blog.objects.create(name='Match Both')

In [5]: Entry.objects.create(blog=both, headline='1 Lennon 1', body_text='body', pub_date=d2008)
Out[5]: <Entry: Entry object>

In [6]: Entry.objects.create(blog=both, headline='2 Lennon 2', body_text='body', pub_date=d2009)
Out[6]: <Entry: Entry object>

In [7]: Entry.objects.create(blog=both, headline='3 Blah 3', body_text='body', pub_date=d2008)
Out[7]: <Entry: Entry object>

In [8]: Blog.objects.filter(entry__headline__contains='Lennon', entry__pub_date__year=2008)
Out[8]: <QuerySet [<Blog: Match Both>]>

In [9]: justdate = Blog.objects.create(name='Match Date Only')

In [10]: Entry.objects.create(blog=justdate, headline='4 Blah 4', body_text='body', pub_date=d2008)
Out[10]: <Entry: Entry object>

In [11]: justheadline = Blog.objects.create(name='Match Headline Only')

In [12]: Entry.objects.create(blog=justheadline, headline='5 Lennon 5', body_text='body', pub_date=d2009)
Out[12]: <Entry: Entry object>

In [13]: Blog.objects.filter(entry__headline__contains='Lennon', entry__pub_date__year=2008)
Out[13]: <QuerySet [<Blog: Match Both>]>

In [14]: Blog.objects.filter(entry__headline__contains='Lennon').filter(entry__pub_date__year=2008)
Out[14]: <QuerySet [<Blog: Match Both>, <Blog: Match Both>, <Blog: Match Both>, <Blog: Match Both>]>

And the queries:

# Blog.objects.filter(entry__headline__contains='Lennon', entry__pub_date__year=2008)

SELECT
  "scratch_blog"."id",
  "scratch_blog"."name",
  "scratch_blog"."tagline"
FROM "scratch_blog"
  INNER JOIN "scratch_entry" ON ("scratch_blog"."id" = "scratch_entry"."blog_id")
WHERE (
  "scratch_entry"."pub_date" BETWEEN '2008-01-01' :: DATE AND '2008-12-31' :: DATE)
  AND "scratch_entry"."headline" LIKE '%Lennon%'
);


# Blog.objects.filter(entry__headline__contains='Lennon').filter(entry__pub_date__year=2008)

SELECT
  "scratch_blog"."id",
  "scratch_blog"."name",
  "scratch_blog"."tagline"
FROM "scratch_blog"
  INNER JOIN "scratch_entry" ON ("scratch_blog"."id" = "scratch_entry"."blog_id")
  INNER JOIN "scratch_entry" T3 ON ("scratch_blog"."id" = T3."blog_id")
WHERE (
    "scratch_entry"."headline" LIKE '%Lennon%'
AND T3."pub_date" BETWEEN '2008-01-01'::date AND '2008-12-31'::Date)

comment:2 Changed 6 months ago by TZanke

Cc: tzanke@… added

comment:3 Changed 6 months ago by Tim Graham

Summary: ASCII Art for docs "Spanning multi-valued relationships"Add some clarifications to "Spanning multi-valued relationships"
Triage Stage: UnreviewedAccepted
Type: UncategorizedCleanup/optimization

comment:4 in reply to:  1 Changed 6 months ago by Thomas Güttler

Replying to Josh Smeaton:

I think there's definitely scope to improve the docs around multi valued relationships, but I don't think ASCII art (or that diagram) is really the right way of doing it. For some added confusion, the docs fail to mention that duplicates are possible with the second query (with multiple filters) if there is an entry with the headline Lennon AND it was posted in 2008. It seems the docs go through a lot of effort to avoid mentioning that a second filter causes the query to create a second join to the same table.

Hi Josh,

I think your proposal to change the docs are valid.

This issue is about the ascii art.

Why not open a new issue for your proposal?

comment:5 Changed 6 months ago by Tim Graham

I don't think there would be consensus to use ASCII art in the Django documentation. If you think some diagram might be helpful (even though Josh said he didn't think the diagram is the right way to clarify the situation), please follow the pattern used by existing images. For simplicity, I retitled this ticket rather than closing it and creating a new one.

comment:6 Changed 6 months ago by Thomas Güttler

Yes, you are right. the ASCII art is not a perfect solution.

If I would provide a SVG diagram instead of the ascii art, would you include it into the docs?

comment:7 Changed 6 months ago by Tim Graham

I agree with Josh that a diagram probably isn't the best way to clarify things. The " To select all blogs" sentences seem clear to me but perhaps you can help state them more clearly if you found them confusing.

comment:8 Changed 6 months ago by Thomas Güttler

Experts like you are, don't need a diagram.

Last weekend I taught 32 people the joy of python programming who had few or no experience with this language. I do this yearly since about 13 years.

Trust me, this helps to see the IT world from a different perspective.

My goal is to make software development newbee friendly.

I think a diagram like this would help.

comment:9 Changed 6 months ago by Tim Graham

Diagrams are used sparsely because they are more difficult to maintain. I don't think the diagram offers advantages compared to an example shell session that creates objects, runs queries, and shows the results.

comment:10 Changed 6 months ago by Thomas Güttler

Yes, now I see. The real problem is that diagrams are hard to maintain. If maintaining them would be easier, then there would be more.

There are several extensions for sphinx which could be used (I needed to remove a link to the sphinx plugin overview page, otherwise trac thinks my post is spam)

But this gets off topic for this particular issue.

Tim, what do you think?

comment:11 Changed 6 months ago by Tim Graham

I'm not convinced that a diagram is advantageous for this but if you want to create one, you can get other opinions on the DevelopersMailingList.

comment:12 Changed 6 months ago by Thomas Güttler

Yes, know I think I understood you. The real problem is that diagrams are hard to maintain.

If maintaining them would be easier, then there would be more diagrams. The problem is the media break between human editable ascii (which we all love) and some svg which looks like some binary randomly encoded to xml.

There are several extensions for sphinx which could be used: http://www.sphinx-doc.org/en/stable/develop.html

But this gets off topic for this particular issue.

Tim, what do you think?

Note: See TracTickets for help on using tickets.
Back to Top