Opened 3 weeks ago

Last modified 8 days ago

#36732 assigned Cleanup/optimization

Sitemaps with i18n=True load the whole table in memory — at Version 2

Reported by: Julien Palard
Owned by:
Component: contrib.sitemaps
Version: 5.2
Severity: Normal
Keywords: sitemap, memory
Cc: Julien Palard, Roxane
Triage Stage: Accepted
Has patch: yes
Needs documentation: yes
Needs tests: yes
Patch needs improvement: no
Easy pickings: no
UI/UX: no

Description (last modified by Julien Palard)

Hi!

It's a bit like: https://code.djangoproject.com/ticket/11572

When i18n is enabled on a sitemap, _items uses the following code:

            # Create (item, lang_code) tuples for all items and languages.
            # This is necessary to paginate with all languages already considered.
            items = [
                (item, lang_code)
                for item in self.items()
                for lang_code in self.get_languages_for_item(item)
            ]

The list comprehension builds one tuple per (item, language) pair, so the whole table, multiplied by the number of languages, is held in memory at once.

We tried with a table containing two million rows and two languages: it does not fit in 16 GB of memory and the process gets killed.
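To illustrate the difference between building the list and producing pairs on demand, here is a minimal sketch in plain Python. It is not Django code and not a proposed patch: iter_pairs and tracked are hypothetical names, and a generator stands in for the table.

```python
from itertools import islice


def iter_pairs(items, languages_for_item):
    """Lazily yield (item, lang_code) pairs, in the same order as the list comprehension."""
    for item in items:
        for lang_code in languages_for_item(item):
            yield item, lang_code


consumed = []  # records which rows were actually read from the fake table


def tracked(n):
    """Hypothetical stand-in for a table of n rows, read one row at a time."""
    for i in range(n):
        consumed.append(i)
        yield i


# Ask for the first four pairs out of two million rows and two languages.
first_pairs = list(
    islice(iter_pairs(tracked(2_000_000), lambda item: ["en", "fr"]), 4)
)
```

Only the rows needed for the requested pairs are read here. A generator is not a drop-in replacement for the list, though: as the code comment says, pagination needs all languages considered up front, which in practice means knowing the total count and being able to slice.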

Change History (2)

comment:1 by Julien Palard, 3 weeks ago

While:

    def get_languages_for_item(self, item):
        """Languages for which this item is displayed."""
        return self._languages()

does not use item, something (non-trivial) could be implemented to slice the queryset at the right offset (which depends on the number of languages), and the total number of elements could be computed with a simple multiplication: the number of items times the number of languages.

But as soon as get_languages_for_item actually uses item, all of this breaks, because the number of pairs per item is no longer constant.
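The slicing idea could look roughly like the sketch below. It assumes every item has the same languages, and that items supports len() and slicing (a list here; with a queryset one would presumably use count() and a sliced query instead). page_pairs and total_pairs are hypothetical names, not part of the sitemap API.

```python
from itertools import islice


def total_pairs(items, languages):
    """Total number of (item, lang_code) pairs: a simple multiplication."""
    return len(items) * len(languages)


def page_pairs(items, languages, page, limit):
    """Return the (item, lang_code) pairs of a 1-based page, loading only the items it needs."""
    n = len(languages)
    if n == 0:
        return []
    start = (page - 1) * limit      # index of the first pair on this page
    stop = start + limit            # index just past the last pair
    first_item = start // n         # item that owns the first pair
    last_item = -(-stop // n)       # ceil(stop / n): one past the last item needed
    chunk = items[first_item:last_item]
    pairs = ((item, lang) for item in chunk for lang in languages)
    skip = start - first_item * n   # pairs of the first item that belong to the previous page
    return list(islice(pairs, skip, skip + limit))


# Compare against the fully materialized list on a small example.
items = list(range(5))
languages = ["en", "fr", "de"]
full = [(item, lang) for item in items for lang in languages]
pages = [page_pairs(items, languages, page, 4) for page in range(1, 5)]
```

Each page touches at most limit // n + 2 items, whatever the table size. This only works because the number of pairs per item is fixed.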

comment:2 by Julien Palard, 3 weeks ago

Description: modified (diff)