Opened 8 years ago

Closed 3 years ago

Last modified 3 years ago

#27936 closed Cleanup/optimization (fixed)

Add some clarifications to "Spanning multi-valued relationships"

Reported by: Thomas Güttler
Owned by: Jacob Walls
Component: Documentation
Version: dev
Severity: Normal
Keywords:
Cc: tzanke@…, Simon Charette, Zach Borboa
Triage Stage: Ready for checkin
Has patch: yes
Needs documentation: no
Needs tests: no
Patch needs improvement: no
Easy pickings: no
UI/UX: no

Description

First of all: Thank you for the great docs.
It took us some time to understand the difference between:

filter(entry__headline__contains='Lennon').filter(entry__pub_date__year=2008)

 vs

filter(entry__headline__contains='Lennon', entry__pub_date__year=2008)

The docs are great:

https://docs.djangoproject.com/en/dev/topics/db/queries/#spanning-multi-valued-relationships

But maybe the ascii art below helps to understand it better?

What do you think?

                      +--------------------+                                                                                      
                      |           Lennon   |                                                                                      
                    - |  Entry 1           |                                                                                      
+---------------+  /  |           2008     |                                                                                      
|               | /   +--------------------+                                                                                      
| Blog 1        |/                                 filter(entry__headline__contains='Lennon', entry__pub_date__year=2008)         
|               |\                               -                                                                                
+---------------+ \ -                                                                                                             
                   \  +--------------------+                                                                                      
                    \ |                    |                                                                                      
                     -|  Entry 2           |                                                                                      
                      |                    |                                                                                      
                      +--------------------+                                                                                      
                                                                                                                                  
                                                                                                                                  
                      +--------------------+                                                                                      
                      |           Lennon   |                                                                                      
                    - |  Entry 3           |                                                                                      
+---------------+  /  |                    |                                                                                      
|               | /   +--------------------+                                                                                      
| Blog 2        |/                                   filter(entry__headline__contains='Lennon').filter(entry__pub_date__year=2008)
|               |\                                                                                                                
+---------------+ \ -                                                                                                             
                   \  +--------------------+                                                                                      
                    \ |           2008     |                                                                                      
                     -|  Entry 4           |                                                                                      
                      |                    |                                                                                      
                      +--------------------+                                                                                      

Ascii Art: https://textik.com/#100375b764993664

The ASCII art could be improved; I wanted to ask you first before polishing it.

Change History (23)

comment:1 by Josh Smeaton, 8 years ago

I think there's definitely scope to improve the docs around multi valued relationships, but I don't think ASCII art (or that diagram) is really the right way of doing it. For some added confusion, the docs fail to mention that duplicates are possible with the second query (with multiple filters) if there is an entry with the headline Lennon AND it was posted in 2008. It seems the docs go through a lot of effort to avoid mentioning that a second filter causes the query to create a second join to the same table.

I'm not proposing new language for this myself, but I've seen many an experienced developer get caught by this without understanding what was actually happening with the underlying query. I'd very much like to see these docs improved in some way.

Here's some shell output for those curious about what's happening:

In [1]: from datetime import date
In [2]: d2008 = date(2008, 6, 6)
In [3]: d2009 = date(2009, 6, 6)
In [4]: both = Blog.objects.create(name='Match Both')

In [5]: Entry.objects.create(blog=both, headline='1 Lennon 1', body_text='body', pub_date=d2008)
Out[5]: <Entry: Entry object>

In [6]: Entry.objects.create(blog=both, headline='2 Lennon 2', body_text='body', pub_date=d2009)
Out[6]: <Entry: Entry object>

In [7]: Entry.objects.create(blog=both, headline='3 Blah 3', body_text='body', pub_date=d2008)
Out[7]: <Entry: Entry object>

In [8]: Blog.objects.filter(entry__headline__contains='Lennon', entry__pub_date__year=2008)
Out[8]: <QuerySet [<Blog: Match Both>]>

In [9]: justdate = Blog.objects.create(name='Match Date Only')

In [10]: Entry.objects.create(blog=justdate, headline='4 Blah 4', body_text='body', pub_date=d2008)
Out[10]: <Entry: Entry object>

In [11]: justheadline = Blog.objects.create(name='Match Headline Only')

In [12]: Entry.objects.create(blog=justheadline, headline='5 Lennon 5', body_text='body', pub_date=d2009)
Out[12]: <Entry: Entry object>

In [13]: Blog.objects.filter(entry__headline__contains='Lennon', entry__pub_date__year=2008)
Out[13]: <QuerySet [<Blog: Match Both>]>

In [14]: Blog.objects.filter(entry__headline__contains='Lennon').filter(entry__pub_date__year=2008)
Out[14]: <QuerySet [<Blog: Match Both>, <Blog: Match Both>, <Blog: Match Both>, <Blog: Match Both>]>
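For readers without a Django project at hand, the row counts above can be reproduced with a plain-Python model of the two behaviours. This is only a sketch of the semantics, not of how the ORM is implemented; the tuples and the helper names `is_lennon` and `is_2008` are hypothetical stand-ins for the rows created in the session.

```python
from datetime import date

# Hypothetical in-memory stand-ins for the Entry rows created in the session:
# (blog name, headline, pub_date)
entries = [
    ("Match Both", "1 Lennon 1", date(2008, 6, 6)),
    ("Match Both", "2 Lennon 2", date(2009, 6, 6)),
    ("Match Both", "3 Blah 3", date(2008, 6, 6)),
    ("Match Date Only", "4 Blah 4", date(2008, 6, 6)),
    ("Match Headline Only", "5 Lennon 5", date(2009, 6, 6)),
]


def is_lennon(entry):
    return "Lennon" in entry[1]


def is_2008(entry):
    return entry[2].year == 2008


# One filter() call: both conditions must hold for the same entry row.
single_call = [e[0] for e in entries if is_lennon(e) and is_2008(e)]

# Chained filter() calls: each condition is checked against its own copy of
# the entries, so every (Lennon entry, 2008 entry) pair that shares a blog
# produces one result row.
chained = [
    a[0]
    for a in entries if is_lennon(a)
    for b in entries if is_2008(b) and b[0] == a[0]
]

print(single_call)           # one row for "Match Both"
print(chained)               # four rows for "Match Both" (2 x 2 pairs)
print(sorted(set(chained)))  # duplicates collapsed
```

"Match Both" has two Lennon entries and two 2008 entries, which gives the 2 x 2 = 4 rows seen in Out[14].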

And the queries:

# Blog.objects.filter(entry__headline__contains='Lennon', entry__pub_date__year=2008)

SELECT
  "scratch_blog"."id",
  "scratch_blog"."name",
  "scratch_blog"."tagline"
FROM "scratch_blog"
  INNER JOIN "scratch_entry" ON ("scratch_blog"."id" = "scratch_entry"."blog_id")
WHERE (
  "scratch_entry"."pub_date" BETWEEN '2008-01-01' :: DATE AND '2008-12-31' :: DATE
  AND "scratch_entry"."headline" LIKE '%Lennon%'
);


# Blog.objects.filter(entry__headline__contains='Lennon').filter(entry__pub_date__year=2008)

SELECT
  "scratch_blog"."id",
  "scratch_blog"."name",
  "scratch_blog"."tagline"
FROM "scratch_blog"
  INNER JOIN "scratch_entry" ON ("scratch_blog"."id" = "scratch_entry"."blog_id")
  INNER JOIN "scratch_entry" T3 ON ("scratch_blog"."id" = T3."blog_id")
WHERE (
  "scratch_entry"."headline" LIKE '%Lennon%'
  AND T3."pub_date" BETWEEN '2008-01-01' :: DATE AND '2008-12-31' :: DATE
);
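The two queries can also be run outside Django. The sketch below uses Python's standard `sqlite3` module with an in-memory database and the same table names and data as the session above. It makes several assumptions: the schema is reduced to the columns the queries touch, dates are stored as ISO strings (so the PostgreSQL `:: DATE` casts are dropped), the aliases `b`, `e` and `t3` are introduced for brevity, and SQLite's `LIKE` is case-insensitive for ASCII, unlike PostgreSQL's.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE scratch_blog (id INTEGER PRIMARY KEY, name TEXT, tagline TEXT);
    CREATE TABLE scratch_entry (
        id INTEGER PRIMARY KEY,
        blog_id INTEGER REFERENCES scratch_blog (id),
        headline TEXT,
        pub_date TEXT  -- ISO date string; an assumption of this sketch
    );
    INSERT INTO scratch_blog (id, name, tagline) VALUES
        (1, 'Match Both', ''),
        (2, 'Match Date Only', ''),
        (3, 'Match Headline Only', '');
    INSERT INTO scratch_entry (blog_id, headline, pub_date) VALUES
        (1, '1 Lennon 1', '2008-06-06'),
        (1, '2 Lennon 2', '2009-06-06'),
        (1, '3 Blah 3', '2008-06-06'),
        (2, '4 Blah 4', '2008-06-06'),
        (3, '5 Lennon 5', '2009-06-06');
    """
)

# Single filter() call: one join, both conditions on the same entry row.
SINGLE_JOIN = """
    SELECT b.name FROM scratch_blog b
    INNER JOIN scratch_entry e ON b.id = e.blog_id
    WHERE e.headline LIKE '%Lennon%'
      AND e.pub_date BETWEEN '2008-01-01' AND '2008-12-31'
"""

# Chained filter() calls: a second join to the same table, one condition each.
DOUBLE_JOIN = """
    SELECT b.name FROM scratch_blog b
    INNER JOIN scratch_entry e ON b.id = e.blog_id
    INNER JOIN scratch_entry t3 ON b.id = t3.blog_id
    WHERE e.headline LIKE '%Lennon%'
      AND t3.pub_date BETWEEN '2008-01-01' AND '2008-12-31'
"""

single_rows = [name for (name,) in conn.execute(SINGLE_JOIN)]
double_rows = [name for (name,) in conn.execute(DOUBLE_JOIN)]
distinct_rows = [
    name
    for (name,) in conn.execute(DOUBLE_JOIN.replace("SELECT", "SELECT DISTINCT", 1))
]

print(single_rows)    # one row
print(double_rows)    # four rows, all for the same blog
print(distinct_rows)  # duplicates removed by DISTINCT
```

The double join returns one row per matching pair of entries, which is where the duplicates come from. On the Django side, calling `.distinct()` on the chained queryset is the usual way to collapse them.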

comment:2 by TZanke, 8 years ago

Cc: tzanke@… added

comment:3 by Tim Graham, 8 years ago

Summary: ASCII Art for docs "Spanning multi-valued relationships" → Add some clarifications to "Spanning multi-valued relationships"
Triage Stage: Unreviewed → Accepted
Type: Uncategorized → Cleanup/optimization

comment:4 by Thomas Güttler, 8 years ago, in reply to comment:1

Replying to Josh Smeaton:

I think there's definitely scope to improve the docs around multi valued relationships, but I don't think ASCII art (or that diagram) is really the right way of doing it. For some added confusion, the docs fail to mention that duplicates are possible with the second query (with multiple filters) if there is an entry with the headline Lennon AND it was posted in 2008. It seems the docs go through a lot of effort to avoid mentioning that a second filter causes the query to create a second join to the same table.

Hi Josh,

I think your proposal to change the docs is valid.

This issue is about the ascii art.

Why not open a new issue for your proposal?

comment:5 by Tim Graham, 8 years ago

I don't think there would be consensus to use ASCII art in the Django documentation. If you think some diagram might be helpful (even though Josh said he didn't think the diagram is the right way to clarify the situation), please follow the pattern used by existing images. For simplicity, I retitled this ticket rather than closing it and creating a new one.

comment:6 by Thomas Güttler, 8 years ago

Yes, you are right. The ASCII art is not a perfect solution.

If I provided an SVG diagram instead of the ASCII art, would you include it in the docs?

comment:7 by Tim Graham, 8 years ago

I agree with Josh that a diagram probably isn't the best way to clarify things. The "To select all blogs" sentences seem clear to me, but perhaps you can help state them more clearly if you found them confusing.

comment:8 by Thomas Güttler, 8 years ago

Experts like you don't need a diagram.

Last weekend I taught the joy of Python programming to 32 people who had little or no experience with the language. I have been doing this yearly for about 13 years.

Trust me, this helps to see the IT world from a different perspective.

My goal is to make software development newbie-friendly.

I think a diagram like this would help.

comment:9 by Tim Graham, 8 years ago

Diagrams are used sparsely because they are more difficult to maintain. I don't think the diagram offers advantages compared to an example shell session that creates objects, runs queries, and shows the results.

comment:10 by Thomas Güttler, 8 years ago

Yes, now I see. The real problem is that diagrams are hard to maintain. If maintaining them were easier, there would be more of them.

There are several extensions for Sphinx which could be used. (I had to remove a link to the Sphinx plugin overview page, otherwise Trac thinks my post is spam.)

But this gets off topic for this particular issue.

Tim, what do you think?

comment:11 by Tim Graham, 8 years ago

I'm not convinced that a diagram is advantageous for this but if you want to create one, you can get other opinions on the DevelopersMailingList.

comment:12 by Thomas Güttler, 8 years ago

Yes, now I think I understand you. The real problem is that diagrams are hard to maintain.

If maintaining them were easier, there would be more diagrams. The problem is the gap between human-editable ASCII (which we all love) and SVG, which looks like binary data randomly encoded as XML.

There are several extensions for Sphinx which could be used: http://www.sphinx-doc.org/en/stable/develop.html

But this gets off topic for this particular issue.

Tim, what do you think?

comment:13 by Simon Charette, 7 years ago

Cc: Simon Charette added
Version: 1.10 → master

Given how often the multi-valued filter() chaining behavior is reported as a bug I think this might be worth a shot.

I'm not a big fan of diagrams either and I think a simplified shell session would be a good step forward. I guess mixing both is also an option.

If we really want to go with graphs, I'd suggest we use the Graphviz extension, which makes diagrams easy to maintain, generates SVGs, and should be flexible enough to express the previously mentioned ASCII graph.

comment:14 by Thomas Güttler, 7 years ago

I personally prefer the ASCII art (https://textik.com/#100375b764993664), but with Graphviz you can do much more. On the other hand, the ASCII art is straightforward. It is WYSIWYG :-)


comment:15 by Josh Smeaton, 6 years ago

My objection was not to any diagram, but to that one specifically. I think the right diagram definitely would help, but it should clearly show items that match and items that don't match for each query type, perhaps with colors. Bonus points if it's able to call out the duplicates problem, but that could be documented separately.

I don't personally mind which technology choice is used.

comment:16 by Adam Johnson, 6 years ago

Coming here from django-developers post

I side with Josh that the proposed diagram probably isn't the best, and if we were to include one, another could explain the problem better. I find the existing text fairly clear - I've forgotten and re-learned this distinction from the docs a few times.

On technology, graphviz isn't the most flexible as it can only do graphs. Other diagrams we might add to the docs in future might need more flexibility, so I think SVG is a better choice. There's nothing stopping the first version of an SVG diagram being generated with graphviz.

comment:17 by Claude Paroz, 6 years ago

WRT the format, +1 for SVG, but *clean* SVG if possible for better maintainership. The SVG syntax of the OmniGraffle-generated ones is awful.

comment:18 by Zach Borboa, 6 years ago

Cc: Zach Borboa added

comment:19 by Jacob Walls, 3 years ago, in reply to comment:1

Owner: changed from nobody to Jacob Walls
Status: new → assigned

I linked to this doc in the release note I wrote for #16063. I think the main flaw of the doc is that it's a bit wordy. It starts with examples, but somewhat hypothetically, then discusses the implementation, then gives concrete examples. It could be streamlined, and the saved space used to discuss duplicates and joins (that's where I could see the shell session coming in). I'll try to give it a go.

comment:20 by Jacob Walls, 3 years ago

Has patch: set

comment:21 by Mariusz Felisiak, 3 years ago

Triage Stage: Accepted → Ready for checkin

comment:22 by Mariusz Felisiak <felisiak.mariusz@…>, 3 years ago

Resolution: fixed
Status: assigned → closed

In 6174814d:

Fixed #27936 -- Rewrote spanning multi-valued relationships docs.

comment:23 by Mariusz Felisiak <felisiak.mariusz@…>, 3 years ago

In c46e996:

[4.0.x] Fixed #27936 -- Rewrote spanning multi-valued relationships docs.

Backport of 6174814dbe04fb6668aa212a6cdbca765a8b0522 from main
