#30179 closed Bug (fixed)
Merging 3 or more media objects can throw unnecessary MediaOrderConflictWarnings
Reported by: | Matt Westcott | Owned by: | nobody |
---|---|---|---|
Component: | Forms | Version: | dev |
Severity: | Normal | Keywords: | |
Cc: | Johannes Maron, Matthias Kestenholz | Triage Stage: | Accepted |
Has patch: | yes | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | yes |
Easy pickings: | no | UI/UX: | no |
Description
Consider the following form definition, where text-editor-extras.js
depends on text-editor.js
but all other JS files are independent:
from django import forms class ColorPicker(forms.Widget): class Media: js = ['color-picker.js'] class SimpleTextWidget(forms.Widget): class Media: js = ['text-editor.js'] class FancyTextWidget(forms.Widget): class Media: js = ['text-editor.js', 'text-editor-extras.js', 'color-picker.js'] class MyForm(forms.Form): background_color = forms.CharField(widget=ColorPicker()) intro = forms.CharField(widget=SimpleTextWidget()) body = forms.CharField(widget=FancyTextWidget())
Django should be able to resolve the JS files for the final form into the order text-editor.js, text-editor-extras.js, color-picker.js. However, accessing MyForm().media
results in:
/projects/django/django/forms/widgets.py:145: MediaOrderConflictWarning: Detected duplicate Media files in an opposite order: text-editor-extras.js text-editor.js MediaOrderConflictWarning, Media(css={}, js=['text-editor-extras.js', 'color-picker.js', 'text-editor.js'])
The MediaOrderConflictWarning
is a result of the order that the additions happen in: ColorPicker().media + SimpleTextWidget().media
produces Media(css={}, js=['color-picker.js', 'text-editor.js'])
, which (wrongly) imposes the constraint that color-picker.js must appear before text-editor.js.
The final result is particularly unintuitive here, as it's worse than the "naïve" result produced by Django 1.11 before order-checking was added (color-picker.js, text-editor.js, text-editor-extras.js), and the pair of files reported in the warning message seems wrong too (aren't color-picker.js and text-editor.js the wrong-ordered ones?)
Change History (24)
comment:1 by , 6 years ago
comment:3 by , 6 years ago
Yes, testing against current master (b39bd0aa6d5667d6bbcf7d349a1035c676e3f972).
comment:4 by , 6 years ago
Cc: | added |
---|---|
Triage Stage: | Unreviewed → Accepted |
comment:5 by , 6 years ago
So https://github.com/django/django/commit/959d0c078a1c903cd1e4850932be77c4f0d2294d (the fix for #30153) didn't make this case worse, it just didn't improve on it. The problem is actually the same I encountered, with the same unintuitive error message too. There is still a way to produce a conflicting order but it's harder to trigger in the administration interface now but unfortunately still easy. Also, going back to the state of things pre 2.0 was already discussed previously and rejected.
Here's a failing test and and an idea to make this particular test pass: Merge the JS sublists starting from the longest list and continuing with shorter lists. The CSS case is missing yet. The right thing to do would be (against worse is better) to add some sort of dependency resolution solver with backtracking but that's surely a bad idea for many other reasons.
The change makes some old tests fail (I only took a closer look at test_merge_js_three_way
and in this case the failure is fine -- custom_widget.js
is allowed to appear before jquery.js
.)
diff --git a/django/forms/widgets.py b/django/forms/widgets.py index 02aa32b207..d85c409152 100644 --- a/django/forms/widgets.py +++ b/django/forms/widgets.py @@ -70,9 +70,15 @@ class Media: @property def _js(self): - js = self._js_lists[0] + sorted_by_length = list(sorted( + filter(None, self._js_lists), + key=lambda lst: -len(lst), + )) + if not sorted_by_length: + return [] + js = sorted_by_length[0] # filter(None, ...) avoids calling merge() with empty lists. - for obj in filter(None, self._js_lists[1:]): + for obj in filter(None, sorted_by_length[1:]): js = self.merge(js, obj) return js diff --git a/tests/forms_tests/tests/test_media.py b/tests/forms_tests/tests/test_media.py index 8cb484a15e..9d17ad403b 100644 --- a/tests/forms_tests/tests/test_media.py +++ b/tests/forms_tests/tests/test_media.py @@ -571,3 +571,12 @@ class FormsMediaTestCase(SimpleTestCase): # was never specified. merged = widget3 + form1 + form2 self.assertEqual(merged._css, {'screen': ['a.css', 'b.css'], 'all': ['c.css']}) + + def test_merge_js_some_more(self): + widget1 = Media(js=['color-picker.js']) + widget2 = Media(js=['text-editor.js']) + widget3 = Media(js=['text-editor.js', 'text-editor-extras.js', 'color-picker.js']) + + merged = widget1 + widget2 + widget3 + + self.assertEqual(merged._js, ['text-editor.js', 'text-editor-extras.js', 'color-picker.js'])
comment:6 by , 6 years ago
Thinking some more:
sorted()
is more likely to break existing code because people probably haven't listed all dependencies in theirjs
attributes now. Yes, that's not what they should have done, but breaking peoples' projects sucks and I don't really want to do that (even if introducingsorted()
might be the least disruptive and at the same time most correct change)- wanting to handle the
jquery
,widget1
,noConflict
andjquery
,widget2
,noConflict
case has introduced an unexpected amount of complexity - introducing a complex solving framework will have a really bad impact on runtime and will introduce even more complexity and is out of the question to me
I'm happy to help fixing this but right now I only see bad and worse choices.
comment:7 by , 6 years ago
I don't think sorting by length is the way to go - it would be trivial to make the test fail again by extending the first list with unrelated items. It might be a good real-world heuristic for finding a solution more often, but that's just trading a reproducible bug for an unpredictable one.
(I'm not sure I'd trust it as a heuristic either: we've encountered this issue on Wagtail CMS, where we're making extensive use of form media on hierarchical form structures, and so those media definitions will tend to bubble up several layers to reach the top level. At that point, there's no way of knowing whether the longer list is the one with more complex dependencies, or just one that collected more unrelated files on the way up the tree...)
I'll do some more thinking on this. My hunch is that even if it does end up being a travelling-salesman-type problem, it's unlikely to be run on a large enough data set for performance to be an issue.
comment:8 by , 6 years ago
I don't think sorting by length is the way to go - it would be trivial to make the test fail again by extending the first list with unrelated items. It might be a good real-world heuristic for finding a solution more often, but that's just trading a reproducible bug for an unpredictable one.
Well yes, if the ColorPicker
itself would have a longer list of JS files it depends on then it would fail too. If, on the other hand, it wasn't a ColorPicker
widget but a ColorPicker
formset or form the initially declared lists would still be preserved and sorting the lists by length would give the correct result.
Since #30153 the initially declared lists (or tuples) are preserved so maybe you have many JS and CSS declarations but as long as they are unrelated there will not be many long sublists.
I'm obviously happy though if you're willing to spend the time finding a robust solution to this problem.
(For the record: Personally I was happy with the state of things pre-2.0 too... and For the record 2: I'm also using custom widgets and inlines in feincms3/django-content-editor. It's really surprising to me that we didn't stumble on this earlier since we're always working on the latest Django version or even on pre-release versions if at all possible)
comment:9 by , 6 years ago
Hi there,
I'm the dude who implemented the warning.
I am not so sure this is a bug. Let's try tackle this step by step.
The new merging algorithm that was introduced in version 2 is an improvement. It is the most accurate way to merge two sorted lists. It's not the simplest way, but has been reviewed plenty times.
The warning is another story. It is independent from the algorithm. It merely tells you that the a certain order could not be maintained. We figured back than, that this would be a good idea. It warns a developer about a potential issue, but does not raise an exception. With that in mind, the correct way to deal with the issue described right now, is to ignore the warning.
BUT, that doesn't mean that you don't have a valid point. There are implicit and explicit orders. Not all assets require ordering and (random) orders that only exist because of Media merging don't matter at all.
This brings me back to a point that I have [previously made](https://code.djangoproject.com/ticket/30153#comment:6). It would make sense to store the original lists, which is now the case on master, and only raise if the order violates the original list.
The current implementation on master could also be improved by removing duplicates.
Anyways, I would considers those changes improvements, but not bug fixes.
I didn't have time yet to look into this. But I do have some time this weekend. If you want I can take another look into this and propose a solution that solves this issue.
Best
-Joe
comment:10 by , 6 years ago
"Ignore the warning" doesn't work here - the order-fixing has broken the dependency between text-editor.js and text-editor-extras.js. I can (reluctantly) accept an implementation that produces false warnings, and I can accept that a genuine dependency loop might produce undefined behaviour, but the combination of the two - breaking the ordering as a result of seeing a loop that isn't there - is definitely a bug.
(To be clear, I'm not suggesting that the 2.x implementation is a step backwards from not doing order checking at all - but it does introduce a new failure case, and that's what I'm keen to fix.)
comment:11 by , 6 years ago
To summarise: Even with the new strategy in #30153 of holding on to the un-merged lists as long as possible, the final merging is still done by adding one list at a time. The intermediate results are lists, which are assumed to be order-critical; this means the intermediate results have additional constraints that are not present in the original lists, causing it to see conflicts where there aren't any. Additionally, we should try to preserve the original sequence of files as much as possible, to avoid unnecessarily breaking user code that hasn't fully specified its dependencies and is relying on the 1.x behaviour.
I think we need to approach this as a graph problem (which I realise might sound like overkill, but I'd rather start with something formally correct and optimise later as necessary): a conflict occurs whenever the dependency graph is cyclic. #30153 is a useful step towards this, as it ensures we have the accurate dependency graph up until the point where we need to assemble the final list.
I suggest we replace Media.merge
with a new method that accepts any number of lists (using *args
if we want to preserve the existing method signature for backwards compatibility). This would work as follows:
- Iterate over all items in all sub-lists, building a dependency graph (where a dependency is any item that immediately precedes it within a sub-list) and a de-duplicated list containing all items indexed in the order they are first encountered
- Starting from the first item in the de-duplicated list, backtrack through the dependency graph, following the lowest-indexed dependency each time until we reach an item with no dependencies. While backtracking, maintain a stack of visited items. If we encounter an item already on the stack, this is a dependency loop; throw a
MediaOrderConflictWarning
and break out of the backtracking loop - Output the resulting item, then remove it from the dependency graph and the de-duplicated list
- If the 'visited items' stack is non-empty, pop the last item off it and repeat the backtracking step from there. Otherwise, repeat the backtracking step starting from the next item in the de-duplicated list
- Repeat until no items remain
comment:12 by , 6 years ago
This sounds correct. I'm not sure it's right though. It does sound awfully complex for what there is to gain. Maintaining this down the road will not get easier. Finding, explaining and understanding the fix for #30153 did already cost a lot of time which could also have been invested elsewhere.
If I manually assign widget3's JS lists (see https://code.djangoproject.com/ticket/30179#comment:5) then everything just works and the final result is correct:
# widget3 = Media(js=['text-editor.js', 'text-editor-extras.js', 'color-picker.js']) widget3 = Media() widget3._js_lists = [['text-editor.js', 'text-editor-extras.js'], ['color-picker.js']]
So what you proposed first (https://code.djangoproject.com/ticket/30179#comment:1) might just work fine and would be good enough (tm).
Something like https://github.com/django/django/blob/543fc97407a932613d283c1e0bb47616cf8782e3/django/forms/widgets.py#L52
# Instead of self._js_lists = [js]: self._js_lists = list(js) if isinstance(js, set) else [js]
comment:13 by , 6 years ago
@Matthias: I think that solution will work, but only if:
1) we're going to insist that users always use this notation wherever a "non-dependency" exists - i.e. it is considered user error for the user to forget to put color-picker.js in its own sub-list
2) we have a very tight definition of what a dependency is - e.g. color-picker.js can't legally be a dependency of text-editor.js / text-editor-extras.js, because it exists on its own in ColorPicker's media - which also invalidates the [jquery, widget1, noconflict] + [jquery, widget2, noconflict] case (does noconflict depend on widget1 or not?)
I suspect you only have to go slightly before the complexity of [jquery, widget1, noconflict] + [jquery, widget2, noconflict] before you start running into counter-examples again.
comment:14 by , 6 years ago
Has patch: | set |
---|
PR: https://github.com/django/django/pull/11010
I encountered another subtle bug along the way (which I suspect has existed since 1.x): #12879 calls for us to strip duplicates from the input lists, but in the current implementation the only de-duplication happens during Media.merge
, so this never happens in the case of a single list. I've now extended the tests to cover this: https://github.com/django/django/pull/11010/files#diff-7fc04ae9019782c1884a0e97e96eda1eR154 . As a minor side effect of this extra de-duplication step, tuples get converted to lists more often, so I've had to fix up some existing tests accordingly - hopefully that's acceptable fall-out :-)
comment:15 by , 6 years ago
Matt, great work. I believe it is best to merge all lists at once and not sequentially as I did. Based on your work, I would suggest to simply use the algorithms implemented in Python. Therefore the whole merge function can be replaced with a simple one liner:
import heapq from collections import OrderedDict def merge(*sublists): return list(OrderedDict.fromkeys(heapq.merge(*sublists))) # >>> merge([3],[1],[1,2],[2,3]) # [1, 2, 3]
comment:16 by , 6 years ago
It actually behaves different. I will continue to review your pull-request.
As stated there, it would be helpful if there is some kind of resource to understand what strategy you implemented. For now I will try to review it without it.
comment:19 by , 6 years ago
Patch needs improvement: | set |
---|
As a tentative fix, I propose that media objects should explicitly distinguish between cases where we do / don't care about ordering, notionally something like:
(although using a set for this is problematic due to the need for contents to be hashable), and the result of adding two media objects should be a "don't care" so that we aren't introducing dependencies where the original objects didn't have them. We would then defer assembling them into a flat list until the final render call. I haven't worked out the rest of the algorithm yet, but I'm willing to dig further if this sounds like a sensible plan of attack...