Code

Opened 5 years ago

Closed 3 years ago

#9857 closed New feature (fixed)

URLfield with verify_exists hangs if given an unresponsive URL

Reported by: aptiko Owned by: fabian
Component: Forms Version: master
Severity: Normal Keywords:
Cc: Triage Stage: Design decision needed
Has patch: yes Needs documentation: no
Needs tests: yes Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

Steps to reproduce:

  1. Create a model that contains a URLfield with verify_exists=True (the default), and make it adminable.
  2. In the Admin, create a new object and specify a URL whose server does not respond (it should leave your web browser in the "Connecting to [server name] ..." state).

Result: Django hangs. In my development environment, the Django development server hangs. In my production environment, I get a 500 Internal Server Error, apparently after SCGI times out while waiting for a response from Django.

Attachments (4)

9857.patch (811 bytes) - added by aptiko 5 years ago.
Patch that adds a 10-second timeout
9857-2.patch (3.0 KB) - added by trbs 5 years ago.
9857-2-alternative-using-2.6-timeout.patch (3.0 KB) - added by trbs 5 years ago.
9857_urlfield_with_timeout_argument.diff (3.9 KB) - added by fabian 3 years ago.
A patch against rev 16150 (1.4 pre alpha) that allows e.g. models.URLField(verify_exists=True, timeout=5)

Download all attachments as: .zip

Change History (18)

Changed 5 years ago by aptiko

Patch that adds a 10-second timeout

comment:1 Changed 5 years ago by aptiko

  • Needs documentation unset
  • Needs tests unset
  • Patch needs improvement unset

The patch that I submitted (works against both 1.0 and svn) adds a hardwired 10-second timeout. If there is no response after the timeout, the URL is considered broken. Although this is not ideal (it's indeterminable rather than broken), a loud failure is much better than a hang, and the user will understand immediately why Django thinks that the URL is broken because they'll test it in their web browser and see that it is unresponsive. This is much better than a hang or an Internal Server Error.

comment:2 Changed 5 years ago by jacob

  • milestone set to 1.1
  • Triage Stage changed from Unreviewed to Accepted

comment:3 Changed 5 years ago by trbs

Couple of comments:

Setting socket.setdefaulttimeout can very easily break or create problems with other running code depending on socket operations. Since the new timeout values applies to all newly created socket objects. If we must do it this way we should consider:

default_timeout = socket.getdefaulttimeout()
socket.setdefaulttimeout(settings.SOCKET_DEFAULT_TIMEOUT)

... urllib2 code ...

socket.setdefaulttimeout(default_timeout)

So not to disturb with other socket code. Still when another thread would use socket code and is executed between the first and second setdefaulttimeout() the problem still persists. This could be a potential problem specially for situations where Django runs in a threaded fcgi/wsgi enviroment ? I would imaging anybody running into a problem where suddenly his sockets would timeout to quickly would be rather dumbstruck and having a hard time figuring out why this happens. Specially since before the first verify_exists everything worked fine, while afterwards stuff started breaking. Something which could potentially happen in a development vs production kind of case, where development works fine but production magically breaks after a while because of the verify_exists check in a URLField which sets socket defaulttimeout to low. (or too high)

Another point would be that IMO this kind of setting should be controllable from settings.py although most users would never ever see the attribute. Having this kind of hard coded (unchangeable) value seems wrong.

Lastly.. when urlopen() times-out it raises URLError: <urlopen error timed out> exception. So we should be able to catch that and provide a meaningful error message, which differs (slightly) from a normal broken-url and lets the user know the url timed out.

Patch provided but still is not perfect (see threading comment).

Changed 5 years ago by trbs

comment:4 Changed 5 years ago by trbs

As of Python2.6 urllib2 allows to specify a timeout argument to the urlopen() function.
This should be the preferred way, since changes the timeout on the socket not the default for new sockets.

The alternative patch uses this approach it checking for python versions and depending on the version use
either urlopen(req, timeout=timeout) or a try-finally block to set and reset defaulttimeout.

Currently the checks are all inside the URLField but it might be 'better' (in the sane that depending on a version check seems to be something one would like to avoid in principle) to check Python versions at import time.

comment:5 Changed 5 years ago by russellm

  • Patch needs improvement set

I acknowledge the problem that exists here, but I _really_ don't like this solution. Setting the default timeout is a really nasty workaround that has potential for nasty side-effects. I don't have a better suggestion though, so we might need to hold our nose and wear this particular stink until Python 2.6 is our minimum supported version.

I'm also not particularly enthused about introducing a global setting for the timeout - this strikes me as something that should be a per-URLField setting, not a global setting.

comment:6 Changed 5 years ago by russellm

  • milestone 1.1 deleted

comment:7 Changed 5 years ago by trbs

I agree, also agree that the timeout setting (if there would be one) belongs at the URLField.

Two suggestions for dealing with this ticket without having to wait for Python 2.6 to become the minimum version:

1) What about supporting this optionally on the URLField but only for Python 2.6 and higher ?

I don't think something like this (feature which depends on python version) is done anywhere else in Django and for good reason I imaging.

2) What about adding this as a field in django-extensions ? A field with support for python2.6's timeout setting.

This way users can get the variable timeout behavior by using django-extensions. It resolves the Django from looking further at
this issue now and still gives a solution to people stuck with this problem who have no problem running python2.6.

comment:8 Changed 3 years ago by adamnelson

  • Severity set to Normal
  • Type set to New feature
  • Version changed from 1.0 to SVN

This is still a problem and merits attention (took about 45 minutes to figure it out). I would say that this should go into milestone:1.4 and that that milestone should just go ahead and make Python 2.6 the minimum - but I doubt I'll get agreement on that :-)

comment:9 Changed 3 years ago by fabian

  • Easy pickings unset

I uploaded a patch that adds a timeout argument to URLField. The default is None and it checks if urllib2.urlopen has a timeout argument (using inspect). The behavior of the URLField will be kept unless the developer explicitly defines a timeout in a model's field, raising a ValidationError with 'timed_out' code. Is that solving the questions above?

For us, not having a timeout is a blocker, since our editors sometimes have to wait forever for the admin to save the data. My latest report was caused by the bad response time of a egyptian activist's website. Those sites tend to have a rather short live span, and verify_exists _with_ a timeout would be awesome!

Last edited 3 years ago by fabian (previous) (diff)

comment:10 Changed 3 years ago by fabian

  • Easy pickings set

comment:11 Changed 3 years ago by fabian

  • Owner changed from nobody to fabian
  • Status changed from new to assigned

Changed 3 years ago by fabian

A patch against rev 16150 (1.4 pre alpha) that allows e.g. models.URLField(verify_exists=True, timeout=5)

comment:12 Changed 3 years ago by fabian

  • Needs tests set
  • Patch needs improvement unset
  • Triage Stage changed from Accepted to Design decision needed

comment:13 Changed 3 years ago by d0ugal

  • Easy pickings unset

Removed Easy Pickings since its now DDN.

comment:14 Changed 3 years ago by claudep

  • Resolution set to fixed
  • Status changed from assigned to closed
  • UI/UX unset

Mainly for security reasons, the verify_exists parameter has been deprecated in 1.3.1 and will be removed in 1.4 (or 1.5, not sure).

See changeset:16760 for more details. Note also that the patch adds a timeout on requests.

Add Comment

Modify Ticket

Change Properties
<Author field>
Action
as closed
as The resolution will be set. Next status will be 'closed'
The resolution will be deleted. Next status will be 'new'
Author


E-mail address and user name can be saved in the Preferences.

 
Note: See TracTickets for help on using tickets.