Opened 3 years ago
Closed 3 years ago
#33998 closed Bug (needsinfo)
Using alternates and i18n generates duplicated URLs in the sitemap
| Reported by: | brenosss | Owned by: | nobody | 
|---|---|---|---|
| Component: | contrib.sitemaps | Version: | 4.0 | 
| Severity: | Normal | Keywords: | sitemap, i18n, alternates | 
| Cc: | Florian Demmer | Triage Stage: | Unreviewed | 
| Has patch: | no | Needs documentation: | no | 
| Needs tests: | no | Patch needs improvement: | no | 
| Easy pickings: | no | UI/UX: | no | 
Description
If the i18n attribute is set to True, the Sitemap class generates a separate URL entry for each item in each of the LANGUAGES. However, I'm also using alternates, so I expected only one URL per item, for the default language, with the translated versions listed as alternate links.
The current function:
    def _items(self):
        if self.i18n:
            # Create (item, lang_code) tuples for all items and languages.
            # This is necessary to paginate with all languages already considered.
            items = [
                (item, lang_code)
                for lang_code in self._languages()
                for item in self.items()
            ]
            return items
        return self.items()
Here is an example of the result:
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" xmlns:xhtml="http://www.w3.org/1999/xhtml">
    <url>
        <loc>https://example.com/en/contact</loc>
        <changefreq>weekly</changefreq>
        <priority>0.5</priority>
        <xhtml:link rel="alternate" hreflang="en" href="https://example.com/en/contact" />
        <xhtml:link rel="alternate" hreflang="es" href="https://example.com/es/contact" />
        <xhtml:link rel="alternate" hreflang="el" href="https://example.com/el/contact" />
    </url>
    <url>
        <loc>https://example.com/es/contact</loc>
        <changefreq>weekly</changefreq>
        <priority>0.5</priority>
        <xhtml:link rel="alternate" hreflang="en" href="https://example.com/en/contact" />
        <xhtml:link rel="alternate" hreflang="es" href="https://example.com/es/contact" />
        <xhtml:link rel="alternate" hreflang="el" href="https://example.com/el/contact" />
    </url>
    <url>
        <loc>https://example.com/el/contact</loc>
        <changefreq>weekly</changefreq>
        <priority>0.5</priority>
        <xhtml:link rel="alternate" hreflang="en" href="https://example.com/en/contact" />
        <xhtml:link rel="alternate" hreflang="es" href="https://example.com/es/contact" />
        <xhtml:link rel="alternate" hreflang="el" href="https://example.com/el/contact" />
    </url>
</urlset>
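The duplication above can be reproduced outside Django with a minimal sketch. The helper names `expand_items` and `alternate_links` and the `https://example.com` base URL are illustrative assumptions, not part of `django.contrib.sitemaps`; only the list comprehension mirrors the `_items()` code quoted in this ticket.

```python
def expand_items(items, languages):
    """Mirror the i18n branch of _items(): one (item, lang_code) pair per language."""
    return [(item, lang_code) for lang_code in languages for item in items]


def alternate_links(item, languages, base="https://example.com"):
    """Hypothetical helper: every language version of the same item."""
    return [(lang, f"{base}/{lang}/{item}") for lang in languages]


languages = ["en", "es", "el"]
entries = expand_items(["contact"], languages)

# Each (item, lang_code) pair becomes its own <url> entry, and every entry
# carries the same full list of alternates, including a link to itself.
for item, lang_code in entries:
    loc = f"https://example.com/{lang_code}/{item}"
    print(loc, alternate_links(item, languages))
```

With one item and three languages this yields three entries, which matches the three `<url>` elements shown above.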
I propose checking whether alternates is True and, if so, generating the items only for the default language:
    def _items(self):
        if self.i18n:
            if self.alternates:
                lang_code = self.default_lang or settings.LANGUAGE_CODE
                items = self.items()
                items = [
                    # The URL is generated for the default language only;
                    # the translated URLs are added as alternate links.
                    (item, lang_code)
                    for item in items
                ]
                return items
            # Create (item, lang_code) tuples for all items and languages.
            # This is necessary to paginate with all languages already considered.
            items = [
                (item, lang_code)
                for lang_code in self._languages()
                for item in self.items()
            ]
            return items
        return self.items()
I would then expect a result more like:
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" xmlns:xhtml="http://www.w3.org/1999/xhtml">
    <url>
        <loc>https://example.com/en/contact</loc>
        <changefreq>weekly</changefreq>
        <priority>0.5</priority>
        <xhtml:link rel="alternate" hreflang="en" href="https://example.com/en/contact" />
        <xhtml:link rel="alternate" hreflang="es" href="https://example.com/es/contact" />
        <xhtml:link rel="alternate" hreflang="el" href="https://example.com/el/contact" />
    </url>
</urlset>
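To make the difference in entry counts concrete, here is a standalone sketch of the proposed branching. `SitemapSketch` is a stand-in class, not Django's `Sitemap`, and `default_lang` is a hypothetical attribute taken from the proposal; this only illustrates the requested behaviour and makes no claim that it is correct for search engines.

```python
class SitemapSketch:
    """Stand-in for django.contrib.sitemaps.Sitemap; not the real class."""

    i18n = True
    alternates = True
    languages = ["en", "es", "el"]
    default_lang = "en"  # hypothetical attribute, as used in the proposal

    def items(self):
        return ["contact", "about"]

    def _items(self):
        if not self.i18n:
            return self.items()
        if self.alternates:
            # Proposed: one entry per item, for the default language only.
            return [(item, self.default_lang) for item in self.items()]
        # Current behaviour: one entry per item and language.
        return [(item, lang) for lang in self.languages for item in self.items()]


with_alternates = SitemapSketch()
without_alternates = SitemapSketch()
without_alternates.alternates = False

proposed = with_alternates._items()
current = without_alternates._items()
print(len(proposed), len(current))
```

Under these assumptions the proposal produces one entry per item, while the existing expansion produces one entry per item and language.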
Does this make sense? Can I start working on it?
Change History (4)
comment:1 by , 3 years ago
| Description: | modified (diff) | 
|---|
comment:2 by , 3 years ago
| Cc: | added | 
|---|
comment:3 by , 3 years ago
comment:4 by , 3 years ago
| Resolution: | → needsinfo | 
|---|---|
| Status: | new → closed | 
Florian, thanks for checking.
Thank you for your report, but it is my understanding that the result you expect would not be correct. Each translation of a page needs a separate url entry. Here is an example showing the explicit listing of all language "pages" by themselves, with their alternates including the page itself: https://developers.google.com/search/docs/advanced/crawling/localized-versions#sitemap
Do you have any references that support your expectation?
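The layout described in this reply, one url entry per language version with each entry listing every alternate including itself, can be sketched with the standard library alone. The function `build_urlset` and the `https://example.com` base URL are assumptions made for illustration; this is not how Django renders its sitemap template.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
XHTML_NS = "http://www.w3.org/1999/xhtml"


def build_urlset(path, languages, base="https://example.com"):
    """Hypothetical helper: one <url> per language, each with all alternates."""
    ET.register_namespace("", SITEMAP_NS)
    ET.register_namespace("xhtml", XHTML_NS)
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for lang in languages:
        url = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        loc = ET.SubElement(url, f"{{{SITEMAP_NS}}}loc")
        loc.text = f"{base}/{lang}/{path}"
        # Every language version links to all versions, itself included.
        for alt in languages:
            ET.SubElement(
                url,
                f"{{{XHTML_NS}}}link",
                rel="alternate",
                hreflang=alt,
                href=f"{base}/{alt}/{path}",
            )
    return urlset


urlset = build_urlset("contact", ["en", "es", "el"])
print(ET.tostring(urlset, encoding="unicode"))
```

For three languages this produces three url entries with three alternate links each, which is the shape of the output the reporter originally observed.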