Opened 5 years ago

Closed 5 years ago

Last modified 5 years ago

#30179 closed Bug (fixed)

Merging 3 or more media objects can throw unnecessary MediaOrderConflictWarnings

Reported by: Matt Westcott Owned by: nobody
Component: Forms Version: dev
Severity: Normal Keywords:
Cc: Johannes Maron, Matthias Kestenholz Triage Stage: Accepted
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: yes
Easy pickings: no UI/UX: no

Description

Consider the following form definition, where text-editor-extras.js depends on text-editor.js but all other JS files are independent:

from django import forms


class ColorPicker(forms.Widget):
    class Media:
        js = ['color-picker.js']


class SimpleTextWidget(forms.Widget):
    class Media:
        js = ['text-editor.js']


class FancyTextWidget(forms.Widget):
    class Media:
        js = ['text-editor.js', 'text-editor-extras.js', 'color-picker.js']


class MyForm(forms.Form):
    background_color = forms.CharField(widget=ColorPicker())
    intro = forms.CharField(widget=SimpleTextWidget())
    body = forms.CharField(widget=FancyTextWidget())

Django should be able to resolve the JS files for the final form into the order text-editor.js, text-editor-extras.js, color-picker.js. However, accessing MyForm().media results in:

/projects/django/django/forms/widgets.py:145: MediaOrderConflictWarning: Detected duplicate Media files in an opposite order:
text-editor-extras.js
text-editor.js
  MediaOrderConflictWarning,
Media(css={}, js=['text-editor-extras.js', 'color-picker.js', 'text-editor.js'])

The MediaOrderConflictWarning is a result of the order that the additions happen in: ColorPicker().media + SimpleTextWidget().media produces Media(css={}, js=['color-picker.js', 'text-editor.js']), which (wrongly) imposes the constraint that color-picker.js must appear before text-editor.js.

The final result is particularly unintuitive here, as it's worse than the "naïve" result produced by Django 1.11 before order-checking was added (color-picker.js, text-editor.js, text-editor-extras.js), and the pair of files reported in the warning message seems wrong too (aren't color-picker.js and text-editor.js the wrong-ordered ones?)

Change History (24)

comment:1 by Matt Westcott, 5 years ago

As a tentative fix, I propose that media objects should explicitly distinguish between cases where we do / don't care about ordering, notionally something like:

class FancyTextWidget(forms.Widget):
    class Media:
        js = {
            ('text-editor.js', 'text-editor-extras.js'),  # tuple = order is important
            'color-picker.js'  # set = order is unimportant
        }

(although using a set for this is problematic due to the need for contents to be hashable), and the result of adding two media objects should be a "don't care" so that we aren't introducing dependencies where the original objects didn't have them. We would then defer assembling them into a flat list until the final render call. I haven't worked out the rest of the algorithm yet, but I'm willing to dig further if this sounds like a sensible plan of attack...

comment:2 by Tim Graham, 5 years ago

Are you testing with the fix from #30153?

comment:3 by Matt Westcott, 5 years ago

Yes, testing against current master (b39bd0aa6d5667d6bbcf7d349a1035c676e3f972).

comment:4 by Tim Graham, 5 years ago

Cc: Johannes Maron Matthias Kestenholz added
Triage Stage: UnreviewedAccepted

comment:5 by Matthias Kestenholz, 5 years ago

So https://github.com/django/django/commit/959d0c078a1c903cd1e4850932be77c4f0d2294d (the fix for #30153) didn't make this case worse, it just didn't improve on it. The problem is actually the same I encountered, with the same unintuitive error message too. There is still a way to produce a conflicting order but it's harder to trigger in the administration interface now but unfortunately still easy. Also, going back to the state of things pre 2.0 was already discussed previously and rejected.

Here's a failing test and and an idea to make this particular test pass: Merge the JS sublists starting from the longest list and continuing with shorter lists. The CSS case is missing yet. The right thing to do would be (against worse is better) to add some sort of dependency resolution solver with backtracking but that's surely a bad idea for many other reasons.

The change makes some old tests fail (I only took a closer look at test_merge_js_three_way and in this case the failure is fine -- custom_widget.js is allowed to appear before jquery.js.)

diff --git a/django/forms/widgets.py b/django/forms/widgets.py
index 02aa32b207..d85c409152 100644
--- a/django/forms/widgets.py
+++ b/django/forms/widgets.py
@@ -70,9 +70,15 @@ class Media:
 
     @property
     def _js(self):
-        js = self._js_lists[0]
+        sorted_by_length = list(sorted(
+            filter(None, self._js_lists),
+            key=lambda lst: -len(lst),
+        ))
+        if not sorted_by_length:
+            return []
+        js = sorted_by_length[0]
         # filter(None, ...) avoids calling merge() with empty lists.
-        for obj in filter(None, self._js_lists[1:]):
+        for obj in filter(None, sorted_by_length[1:]):
             js = self.merge(js, obj)
         return js
 
diff --git a/tests/forms_tests/tests/test_media.py b/tests/forms_tests/tests/test_media.py
index 8cb484a15e..9d17ad403b 100644
--- a/tests/forms_tests/tests/test_media.py
+++ b/tests/forms_tests/tests/test_media.py
@@ -571,3 +571,12 @@ class FormsMediaTestCase(SimpleTestCase):
         # was never specified.
         merged = widget3 + form1 + form2
         self.assertEqual(merged._css, {'screen': ['a.css', 'b.css'], 'all': ['c.css']})
+
+    def test_merge_js_some_more(self):
+        widget1 = Media(js=['color-picker.js'])
+        widget2 = Media(js=['text-editor.js'])
+        widget3 = Media(js=['text-editor.js', 'text-editor-extras.js', 'color-picker.js'])
+
+        merged = widget1 + widget2 + widget3
+
+        self.assertEqual(merged._js, ['text-editor.js', 'text-editor-extras.js', 'color-picker.js'])

comment:6 by Matthias Kestenholz, 5 years ago

Thinking some more:

  • sorted() is more likely to break existing code because people probably haven't listed all dependencies in their js attributes now. Yes, that's not what they should have done, but breaking peoples' projects sucks and I don't really want to do that (even if introducing sorted() might be the least disruptive and at the same time most correct change)
  • wanting to handle the jquery, widget1, noConflict and jquery, widget2, noConflict case has introduced an unexpected amount of complexity
  • introducing a complex solving framework will have a really bad impact on runtime and will introduce even more complexity and is out of the question to me

I'm happy to help fixing this but right now I only see bad and worse choices.

comment:7 by Matt Westcott, 5 years ago

I don't think sorting by length is the way to go - it would be trivial to make the test fail again by extending the first list with unrelated items. It might be a good real-world heuristic for finding a solution more often, but that's just trading a reproducible bug for an unpredictable one.

(I'm not sure I'd trust it as a heuristic either: we've encountered this issue on Wagtail CMS, where we're making extensive use of form media on hierarchical form structures, and so those media definitions will tend to bubble up several layers to reach the top level. At that point, there's no way of knowing whether the longer list is the one with more complex dependencies, or just one that collected more unrelated files on the way up the tree...)

I'll do some more thinking on this. My hunch is that even if it does end up being a travelling-salesman-type problem, it's unlikely to be run on a large enough data set for performance to be an issue.

comment:8 by Matthias Kestenholz, 5 years ago

I don't think sorting by length is the way to go - it would be trivial to make the test fail again by extending the first list with unrelated items. It might be a good real-world heuristic for finding a solution more often, but that's just trading a reproducible bug for an unpredictable one.

Well yes, if the ColorPicker itself would have a longer list of JS files it depends on then it would fail too. If, on the other hand, it wasn't a ColorPicker widget but a ColorPicker formset or form the initially declared lists would still be preserved and sorting the lists by length would give the correct result.

Since #30153 the initially declared lists (or tuples) are preserved so maybe you have many JS and CSS declarations but as long as they are unrelated there will not be many long sublists.

I'm obviously happy though if you're willing to spend the time finding a robust solution to this problem.

(For the record: Personally I was happy with the state of things pre-2.0 too... and For the record 2: I'm also using custom widgets and inlines in feincms3/django-content-editor. It's really surprising to me that we didn't stumble on this earlier since we're always working on the latest Django version or even on pre-release versions if at all possible)

comment:9 by Johannes Maron, 5 years ago

Hi there,

I'm the dude who implemented the warning.

I am not so sure this is a bug. Let's try tackle this step by step.

The new merging algorithm that was introduced in version 2 is an improvement. It is the most accurate way to merge two sorted lists. It's not the simplest way, but has been reviewed plenty times.

The warning is another story. It is independent from the algorithm. It merely tells you that the a certain order could not be maintained. We figured back than, that this would be a good idea. It warns a developer about a potential issue, but does not raise an exception. With that in mind, the correct way to deal with the issue described right now, is to ignore the warning.

BUT, that doesn't mean that you don't have a valid point. There are implicit and explicit orders. Not all assets require ordering and (random) orders that only exist because of Media merging don't matter at all.

This brings me back to a point that I have [previously made](https://code.djangoproject.com/ticket/30153#comment:6). It would make sense to store the original lists, which is now the case on master, and only raise if the order violates the original list.

The current implementation on master could also be improved by removing duplicates.

Anyways, I would considers those changes improvements, but not bug fixes.

I didn't have time yet to look into this. But I do have some time this weekend. If you want I can take another look into this and propose a solution that solves this issue.

Best
-Joe

comment:10 by Matt Westcott, 5 years ago

"Ignore the warning" doesn't work here - the order-fixing has broken the dependency between text-editor.js and text-editor-extras.js. I can (reluctantly) accept an implementation that produces false warnings, and I can accept that a genuine dependency loop might produce undefined behaviour, but the combination of the two - breaking the ordering as a result of seeing a loop that isn't there - is definitely a bug.

(To be clear, I'm not suggesting that the 2.x implementation is a step backwards from not doing order checking at all - but it does introduce a new failure case, and that's what I'm keen to fix.)

comment:11 by Matt Westcott, 5 years ago

To summarise: Even with the new strategy in #30153 of holding on to the un-merged lists as long as possible, the final merging is still done by adding one list at a time. The intermediate results are lists, which are assumed to be order-critical; this means the intermediate results have additional constraints that are not present in the original lists, causing it to see conflicts where there aren't any. Additionally, we should try to preserve the original sequence of files as much as possible, to avoid unnecessarily breaking user code that hasn't fully specified its dependencies and is relying on the 1.x behaviour.

I think we need to approach this as a graph problem (which I realise might sound like overkill, but I'd rather start with something formally correct and optimise later as necessary): a conflict occurs whenever the dependency graph is cyclic. #30153 is a useful step towards this, as it ensures we have the accurate dependency graph up until the point where we need to assemble the final list.

I suggest we replace Media.merge with a new method that accepts any number of lists (using *args if we want to preserve the existing method signature for backwards compatibility). This would work as follows:

  • Iterate over all items in all sub-lists, building a dependency graph (where a dependency is any item that immediately precedes it within a sub-list) and a de-duplicated list containing all items indexed in the order they are first encountered
  • Starting from the first item in the de-duplicated list, backtrack through the dependency graph, following the lowest-indexed dependency each time until we reach an item with no dependencies. While backtracking, maintain a stack of visited items. If we encounter an item already on the stack, this is a dependency loop; throw a MediaOrderConflictWarning and break out of the backtracking loop
  • Output the resulting item, then remove it from the dependency graph and the de-duplicated list
  • If the 'visited items' stack is non-empty, pop the last item off it and repeat the backtracking step from there. Otherwise, repeat the backtracking step starting from the next item in the de-duplicated list
  • Repeat until no items remain

comment:12 by Matthias Kestenholz, 5 years ago

This sounds correct. I'm not sure it's right though. It does sound awfully complex for what there is to gain. Maintaining this down the road will not get easier. Finding, explaining and understanding the fix for #30153 did already cost a lot of time which could also have been invested elsewhere.

If I manually assign widget3's JS lists (see https://code.djangoproject.com/ticket/30179#comment:5) then everything just works and the final result is correct:

# widget3 = Media(js=['text-editor.js', 'text-editor-extras.js', 'color-picker.js'])
widget3 = Media()
widget3._js_lists = [['text-editor.js', 'text-editor-extras.js'], ['color-picker.js']]

So what you proposed first (https://code.djangoproject.com/ticket/30179#comment:1) might just work fine and would be good enough (tm).

Something like https://github.com/django/django/blob/543fc97407a932613d283c1e0bb47616cf8782e3/django/forms/widgets.py#L52

# Instead of self._js_lists = [js]:
self._js_lists = list(js) if isinstance(js, set) else [js]

comment:13 by Matt Westcott, 5 years ago

@Matthias: I think that solution will work, but only if:

1) we're going to insist that users always use this notation wherever a "non-dependency" exists - i.e. it is considered user error for the user to forget to put color-picker.js in its own sub-list
2) we have a very tight definition of what a dependency is - e.g. color-picker.js can't legally be a dependency of text-editor.js / text-editor-extras.js, because it exists on its own in ColorPicker's media - which also invalidates the [jquery, widget1, noconflict] + [jquery, widget2, noconflict] case (does noconflict depend on widget1 or not?)

I suspect you only have to go slightly before the complexity of [jquery, widget1, noconflict] + [jquery, widget2, noconflict] before you start running into counter-examples again.

comment:14 by Matt Westcott, 5 years ago

Has patch: set

PR: https://github.com/django/django/pull/11010

I encountered another subtle bug along the way (which I suspect has existed since 1.x): #12879 calls for us to strip duplicates from the input lists, but in the current implementation the only de-duplication happens during Media.merge, so this never happens in the case of a single list. I've now extended the tests to cover this: https://github.com/django/django/pull/11010/files#diff-7fc04ae9019782c1884a0e97e96eda1eR154 . As a minor side effect of this extra de-duplication step, tuples get converted to lists more often, so I've had to fix up some existing tests accordingly - hopefully that's acceptable fall-out :-)

comment:15 by Johannes Maron, 5 years ago

Matt, great work. I believe it is best to merge all lists at once and not sequentially as I did. Based on your work, I would suggest to simply use the algorithms implemented in Python. Therefore the whole merge function can be replaced with a simple one liner:

import heapq

from collections import OrderedDict

def merge(*sublists):
    return list(OrderedDict.fromkeys(heapq.merge(*sublists)))


# >>> merge([3],[1],[1,2],[2,3])
# [1, 2, 3]

comment:16 by Johannes Maron, 5 years ago

It actually behaves different. I will continue to review your pull-request.

As stated there, it would be helpful if there is some kind of resource to understand what strategy you implemented. For now I will try to review it without it.

comment:17 by Tim Graham, 5 years ago

A PR from matthiask proposes to use topological sort instead.

comment:18 by Tim Graham <timograham@…>, 5 years ago

In e04209e1:

Refs #30179 -- Moved topological sort functions to django.utils.

comment:19 by Johannes Maron, 5 years ago

Patch needs improvement: set

comment:20 by Tim Graham <timograham@…>, 5 years ago

Resolution: fixed
Status: newclosed

In 231b5139:

Fixed #30179 -- Fixed form Media merging when pairwise merging is insufficient.

Thanks gasman for the tests, and codingjoe and timgraham for the review.

comment:21 by Tim Graham <timograham@…>, 5 years ago

In 77e53da1:

[2.2.x] Refs #30179 -- Moved topological sort functions to django.utils.

Backport of e04209e181c99ac16ca769d115ac640015a83757 from master.

comment:22 by Tim Graham <timograham@…>, 5 years ago

In 459f7c8:

[2.2.x] Fixed #30179 -- Fixed form Media merging when pairwise merging is insufficient.

Thanks gasman for the tests, and codingjoe and timgraham for the review.

Backport of 231b513926f2bfd71f08058ce5013bd81678ac01 from master.

comment:23 by GitHub <noreply@…>, 5 years ago

In 418263c4:

Fixed #30263 -- Doc'd changes to form Media sorting (refs #30179).

Thanks to Tim Graham for review.

comment:24 by Carlton Gibson <carlton.gibson@…>, 5 years ago

In 19ab6989:

[2.2.x] Fixed #30263 -- Doc'd changes to form Media sorting (refs #30179).

Thanks to Tim Graham for review.
Backport of 418263c457636d3301f2068c47f09a0f42e15c52 from master

Note: See TracTickets for help on using tickets.
Back to Top