Code

Opened 3 years ago

Closed 3 years ago

#15954 closed New feature (fixed)

Django's ignorable 404 list should include iphone favicons

Reported by: PaulM Owned by: aaugustin
Component: Core (Other) Version: 1.3
Severity: Normal Keywords: middleware
Cc: Triage Stage: Ready for checkin
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX:

Description

The ignorable 404 list in global_settings should include some patterns to suppress the awful racket that iphones can make when hitting a site.

Specifically, the IGNORABLE_404_STARTS setting should include 'apple-touch-icon.png' for sure.

It's debatable if it should also include '/m/', '/mobi/, '/mobile/', '/iphone/', and /pda/. I'm not sure if those are a bot, or actually some versions of the phone itself, but 404s for those urls are pretty common.

Attachments (2)

15954.patch (10.9 KB) - added by aaugustin 3 years ago.
15954.2.patch (12.5 KB) - added by aaugustin 3 years ago.

Download all attachments as: .zip

Change History (10)

comment:1 Changed 3 years ago by aaugustin

  • Easy pickings unset
  • Needs documentation unset
  • Needs tests unset
  • Patch needs improvement unset
  • Triage Stage changed from Unreviewed to Design decision needed

The current default value is IGNORABLE_404_ENDS = ('mail.pl', 'mailform.pl', 'mail.cgi', 'mailform.cgi', 'favicon.ico', '.php').

It would make sense to ignore apple-touch-icon-precomposed.png and apple-touch-icon.png, but that will not be enough, since iDevices will try apple-touch-icon-<width>x<height>-precomposed.png and apple-touch-icon-<width>x<height>.png first starting with iOS 4.2.

See the documentation here: https://developer.apple.com/library/safari/#documentation/AppleApplications/Reference/SafariWebContent/ConfiguringWebApplications/ConfiguringWebApplications.html

While we are on this topic, maybe crossdomain.xml (Flash) and robots.txt should be ignored too.

Since all this is pretty debatable, an alternative solution would be to remove the default value for this settings. (Do I want to be notified that I did not create a robots.txt? Or a favicon.ico? Well, that completely depends on the goals of my website.) In practice, that would mean providing an empty tuple as the default value and suggesting a list of common offenders in the documentation: if you haven't created a favicon.ico, robots.txt, etc. add these files to IGNORABLE_404_ENDS.

For the record, there is also IGNORABLE_404_STARTS = ('/cgi-bin/', '/_vti_bin', '/_vti_inf'). Both settings should be handled consistently.

comment:2 Changed 3 years ago by lukeplant

One possible solution:

  • deprecate IGNORABLE_404_STARTS and IGNORABLE_404_ENDS
  • replace with IGNORABLE_404_URLS, which:
    • uses regular expressions (compiled or otherwise), allowing us to match the apple-touch-icon-* requests and others
    • defaults to an empty list, so that we are not making decisions about whether a site should have robots.txt etc. or not
    • possibly has a commented out line in the project template that includes a sensible set of defaults, so it is easy for people to enable. Alternatively we could put the list in the docs, as Aymeric (aaugustin) suggests.

A quick discussion on django-devs is needed, though, I think.

comment:3 Changed 3 years ago by aaugustin

  • Owner changed from nobody to aaugustin
  • Triage Stage changed from Design decision needed to Accepted

Given that the idea was suggested by a core developer, that there were 2 +1 from the community on django-devs, and that no one opposed the idea, I'll mark it as accepted.

Last edited 3 years ago by aaugustin (previous) (diff)

Changed 3 years ago by aaugustin

comment:4 Changed 3 years ago by aaugustin

  • Has patch set

Attached patch implements lukeplant's suggestion.

Bonus: I added tests for the 404 reporting by email—there wasn't any.

comment:5 Changed 3 years ago by lukeplant

  • Patch needs improvement set
  • Type changed from Cleanup/optimization to New feature

Good work, here's my review:

Even though it is trivial to replace IGNORABLE_404_STARTS/ENDS with IGNORABLE_404_URLS, our deprecation/backwards compatibility policy means we should keep supporting the former until we completely remove them two versions later. This is consistent with how we've handled DATABASE_* -> DATABASES (which is actually a simpler change as it doesn't change functionality).

So we need the existing settings to carry on working. We should also emit a PendingDeprecation warning if they are non-empty.

We also need this adding to the deprecation timeline under Django 1.6.

One nit in documentation: "of if you want to keep the old default value" should start or not of.

comment:6 Changed 3 years ago by aaugustin

  • Patch needs improvement unset

Here is a new patch that implements the deprecation path.

In fact, there are two places where we could raise the PendingDeprecationWarning:

  • when the settings are used: this is what I have done. A warning will be emitted for every 404 when SEND_BROKEN_LINK_EMAILS is True and DEBUG is False. It will be written in the server log.
  • when the settings are loaded: this would be a bigger change. In django/conf/__init__.py, we would emit the warning, compile IGNORABLE_404_STARTS/ENDS to regexps and append them to IGNORABLE_404_URLS. It has the advantage that the warning will also be visible under ./manage runserver, but I am not sure it is worth it.

There is still one backwards-incompatible change that will immediately appear in 1.4: the default value changes. Arguably, it's a bug fix.

Changed 3 years ago by aaugustin

comment:7 Changed 3 years ago by lukeplant

  • Triage Stage changed from Accepted to Ready for checkin

I think it's fine to put the warning in the place you've done it, putting it in django/conf/__init__.py

And given that we've decided to warn people if these old settings are non-empty, it wouldn't make sense for them to be non-empty by default, so that backwards incompatibility is also fine.

So marking as RFC accordingly.

comment:8 Changed 3 years ago by lukeplant

  • Resolution set to fixed
  • Status changed from new to closed

In [16160]:

Fixed #15954 - New IGNORABLE_404_URLS setting that allows more powerful filtering of 404s to ignore

Thanks to aaugustin for implementing this.

(Technically this doesn't fix the original report, as we've decided against
having *any* default values, but the new feature makes it possible, and the
docs have an example addressing #15954).

Add Comment

Modify Ticket

Change Properties
<Author field>
Action
as closed
as The resolution will be set. Next status will be 'closed'
The resolution will be deleted. Next status will be 'new'
Author


E-mail address and user name can be saved in the Preferences.

 
Note: See TracTickets for help on using tickets.