Opened 4 weeks ago

Closed 13 days ago

Last modified 13 days ago

#37088 closed Bug (fixed)

path based MediaAsset equality

Reported by: Johannes Maron Owned by: Johannes Maron
Component: Forms Version: 6.0
Severity: Normal Keywords: Media MediaAsset
Cc: Johannes Maron Triage Stage: Ready for checkin
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: yes UI/UX: no

Description (last modified by Natalia Bidart)

This is a spin-off from here: https://github.com/django/django/pull/21239#issuecomment-4396168812

Currently MediaAsset.__eq__ compares only the path.

This approach was deliberately chosen to benefit performance.

However, in the case of link-elements and even script this approach can break.
For a link, rel=stylesheet and rel=prefetch can share the same path. Prefetching is also not too crazy a use case; someone will probably try to do it.

A compromise could be to define an explicit list of attributes to compare, instead of doing a full dictionary comparison via attributres.
Something like:

diff --git a/django/forms/widgets.py b/django/forms/widgets.py
index 6972155402..70561d2c8c 100644
--- a/django/forms/widgets.py
+++ b/django/forms/widgets.py
@@ -66,14 +66,20 @@ class MediaOrderConflictWarning(RuntimeWarning):
 class MediaAsset:
     element_template = "{path}"
 
+    # Only compare selected attributes to ensure performant comparison in Media.merge.
+    equality_attributes = ("path",)
+
     def __init__(self, path, **attributes):
         self._path = path
         self.attributes = attributes
 
     def __eq__(self, other):
-        # Compare the path only, to ensure performant comparison in
-        # Media.merge.
-        return (self.__class__ is other.__class__ and self.path == other.path) or (
+        return (
+            self.__class__ is other.__class__ and self.path == other.path and all(
+                self.attributes.get(attr, None) == other.attributes.get(attr, None)
+                for attr in self.equality_attributes
+            )
+        ) or (
             isinstance(other, str) and self._path == other
         )
 

Change History (13)

comment:1 by Natalia Bidart, 4 weeks ago

Description: modified (diff)
Keywords: Media MediaAsset added
Triage Stage: UnreviewedAccepted

Thank you Johannes, this makes sense. Though implementation wise, I would rather each class defines their own __eq__ isolating the specific semantic instead of defining a class attribute.

comment:2 by Johannes Maron, 4 weeks ago

Owner: set to Johannes Maron
Status: newassigned

I will try to do that. However, this might end up causing code duplication since it might be partially awkward to use super (because of the type check). But I will give it a try.

comment:3 by Johannes Maron, 4 weeks ago

With Stylesheet's rel being frozen now. I actually cannot think of a use case where this would still break.

Script has different attributes that change when it is loaded and how it is run. However, it doesn't change the accept headers or any other method that would result in path leading to different content.

The only remote thing I can think of would be one script asset loaded blocking the other deferred. We should then load blocking.

For Link-elements the media attribute could vary with the same path. However, <link … media="screen and print"> is a legal (and probably preferable) implementation.

Once the new stylesheet object is merged, I will do a performance benchmark. My gut feeling is that it's not worth it.

comment:4 by Johannes Maron, 4 weeks ago

Good thing I did. I found another bog, we are comparing self.path == other.path instead of self._path == other._path which causing a 1_000 x performance degradation, while adding and self.attributes == other.attributes adds almost no penalty.

Since most attributes will be empty dicts or include "rel". I believe we can safely compare the whole dict.

comment:5 by Johannes Maron, 4 weeks ago

I used the following script to benchmark:

import timeit

from django.forms.widgets import Script, Stylesheet

# Setup: create instances to compare
setup = """
import os
os.environ.setdefault("DJANGO_SETTINGS_MODULE", "tests.staticfiles_tests.settings")

import django
django.setup()
from django.forms.widgets import Script, Stylesheet

s1 = Script("vendor/jquery.js")
s2 = Script("vendor/jquery.js")
s3 = Script("vendor/other.js")

ss1 = Stylesheet("vendor/style.css")
ss2 = Stylesheet("vendor/style.css")
ss3 = Stylesheet("vendor/other.css")
"""

benchmarks = {
    "Script equal (same path)":         "s1 == s2",
    "Script not equal (diff path)":     "s1 == s3",
    "Script equal to str":              "s1 == 'vendor/jquery.js'",
    "Script not equal to str":          "s1 == 'vendor/other.js'",
    "Stylesheet equal (same path)":     "ss1 == ss2",
    "Stylesheet not equal (diff path)": "ss1 == ss3",
    "Stylesheet equal to str":          "ss1 == 'vendor/style.css'",
    "Stylesheet not equal to str":      "ss1 == 'vendor/other.css'",
}

N = 1_000_000

print(f"{'Benchmark':<40} {'Total (s)':>10} {'Per call (ns)':>14}")
print("-" * 66)
for label, stmt in benchmarks.items():
    t = timeit.timeit(stmt, setup=setup, number=N)
    print(f"{label:<40} {t:>10.4f} {t / N * 1e9:>13.1f}ns")

comment:6 by Johannes Maron, 4 weeks ago

Has patch: set
Needs documentation: set
Patch needs improvement: set

comment:7 by Johannes Maron, 3 weeks ago

Needs documentation: unset
Patch needs improvement: unset

comment:8 by blighj, 2 weeks ago

Triage Stage: AcceptedReady for checkin

comment:9 by Jacob Walls, 2 weeks ago

Needs tests: set
Triage Stage: Ready for checkinAccepted

comment:10 by Johannes Maron, 2 weeks ago

Needs tests: unset

comment:11 by Jacob Walls, 13 days ago

Triage Stage: AcceptedReady for checkin

comment:12 by Jacob Walls <jacobtylerwalls@…>, 13 days ago

Resolution: fixed
Status: assignedclosed

In bc9bed57:

Fixed #37088 -- Included attributes in media object equality.

Since in the majority of cases the MediaAsset.attributes will be
empty or small, there's only a tiny performance penalty.

However, the accidental use of the path property caused
a 1_000x performacne degredation (N=1_000_000).

comment:13 by Jacob Walls <jacobtylerwalls@…>, 13 days ago

In fb30b085:

[6.1.x] Fixed #37088 -- Included attributes in media object equality.

Since in the majority of cases the MediaAsset.attributes will be
empty or small, there's only a tiny performance penalty.

However, the accidental use of the path property caused
a 1_000x performacne degredation (N=1_000_000).

Backport of bc9bed573ce39c3b739a4a3b7848816d464b6bdc from main.

Note: See TracTickets for help on using tickets.
Back to Top