Opened 15 years ago

Closed 12 years ago

#9857 closed New feature (fixed)

URLfield with verify_exists hangs if given an unresponsive URL

Reported by: Antonis Christofides Owned by: Fabian Topfstedt
Component: Forms Version: dev
Severity: Normal Keywords:
Cc: Triage Stage: Design decision needed
Has patch: yes Needs documentation: no
Needs tests: yes Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

Steps to reproduce:

  1. Create a model that contains a URLfield with verify_exists=True (the default), and make it adminable.
  2. In the Admin, create a new object and specify a URL whose server does not respond (it should leave your web browser in the "Connecting to [server name] ..." state).

Result: Django hangs. In my development environment, the Django development server hangs. In my production environment, I get a 500 Internal Server Error, apparently after SCGI times out while waiting for a response from Django.

Attachments (4)

9857.patch (811 bytes ) - added by Antonis Christofides 15 years ago.
Patch that adds a 10-second timeout
9857-2.patch (3.0 KB ) - added by trbs 15 years ago.
9857-2-alternative-using-2.6-timeout.patch (3.0 KB ) - added by trbs 15 years ago.
9857_urlfield_with_timeout_argument.diff (3.9 KB ) - added by Fabian Topfstedt 13 years ago.
A patch against rev 16150 (1.4 pre alpha) that allows e.g. models.URLField(verify_exists=True, timeout=5)

Download all attachments as: .zip

Change History (18)

by Antonis Christofides, 15 years ago

Attachment: 9857.patch added

Patch that adds a 10-second timeout

comment:1 by Antonis Christofides, 15 years ago

The patch that I submitted (works against both 1.0 and svn) adds a hardwired 10-second timeout. If there is no response after the timeout, the URL is considered broken. Although this is not ideal (it's indeterminable rather than broken), a loud failure is much better than a hang, and the user will understand immediately why Django thinks that the URL is broken because they'll test it in their web browser and see that it is unresponsive. This is much better than a hang or an Internal Server Error.

comment:2 by Jacob, 15 years ago

milestone: 1.1
Triage Stage: UnreviewedAccepted

comment:3 by trbs, 15 years ago

Couple of comments:

Setting socket.setdefaulttimeout can very easily break or create problems with other running code depending on socket operations. Since the new timeout values applies to all newly created socket objects. If we must do it this way we should consider:

default_timeout = socket.getdefaulttimeout()
socket.setdefaulttimeout(settings.SOCKET_DEFAULT_TIMEOUT)

... urllib2 code ...

socket.setdefaulttimeout(default_timeout)

So not to disturb with other socket code. Still when another thread would use socket code and is executed between the first and second setdefaulttimeout() the problem still persists. This could be a potential problem specially for situations where Django runs in a threaded fcgi/wsgi enviroment ? I would imaging anybody running into a problem where suddenly his sockets would timeout to quickly would be rather dumbstruck and having a hard time figuring out why this happens. Specially since before the first verify_exists everything worked fine, while afterwards stuff started breaking. Something which could potentially happen in a development vs production kind of case, where development works fine but production magically breaks after a while because of the verify_exists check in a URLField which sets socket defaulttimeout to low. (or too high)

Another point would be that IMO this kind of setting should be controllable from settings.py although most users would never ever see the attribute. Having this kind of hard coded (unchangeable) value seems wrong.

Lastly.. when urlopen() times-out it raises URLError: <urlopen error timed out> exception. So we should be able to catch that and provide a meaningful error message, which differs (slightly) from a normal broken-url and lets the user know the url timed out.

Patch provided but still is not perfect (see threading comment).

by trbs, 15 years ago

Attachment: 9857-2.patch added

comment:4 by trbs, 15 years ago

As of Python2.6 urllib2 allows to specify a timeout argument to the urlopen() function.
This should be the preferred way, since changes the timeout on the socket not the default for new sockets.

The alternative patch uses this approach it checking for python versions and depending on the version use
either urlopen(req, timeout=timeout) or a try-finally block to set and reset defaulttimeout.

Currently the checks are all inside the URLField but it might be 'better' (in the sane that depending on a version check seems to be something one would like to avoid in principle) to check Python versions at import time.

comment:5 by Russell Keith-Magee, 15 years ago

Patch needs improvement: set

I acknowledge the problem that exists here, but I _really_ don't like this solution. Setting the default timeout is a really nasty workaround that has potential for nasty side-effects. I don't have a better suggestion though, so we might need to hold our nose and wear this particular stink until Python 2.6 is our minimum supported version.

I'm also not particularly enthused about introducing a global setting for the timeout - this strikes me as something that should be a per-URLField setting, not a global setting.

comment:6 by Russell Keith-Magee, 15 years ago

milestone: 1.1

comment:7 by trbs, 15 years ago

I agree, also agree that the timeout setting (if there would be one) belongs at the URLField.

Two suggestions for dealing with this ticket without having to wait for Python 2.6 to become the minimum version:

1) What about supporting this optionally on the URLField but only for Python 2.6 and higher ?

I don't think something like this (feature which depends on python version) is done anywhere else in Django and for good reason I imaging.

2) What about adding this as a field in django-extensions ? A field with support for python2.6's timeout setting.

This way users can get the variable timeout behavior by using django-extensions. It resolves the Django from looking further at
this issue now and still gives a solution to people stuck with this problem who have no problem running python2.6.

comment:8 by Adam Nelson, 13 years ago

Severity: Normal
Type: New feature
Version: 1.0SVN

This is still a problem and merits attention (took about 45 minutes to figure it out). I would say that this should go into milestone:1.4 and that that milestone should just go ahead and make Python 2.6 the minimum - but I doubt I'll get agreement on that :-)

comment:9 by Fabian Topfstedt, 13 years ago

Easy pickings: unset

I uploaded a patch that adds a timeout argument to URLField. The default is None and it checks if urllib2.urlopen has a timeout argument (using inspect). The behavior of the URLField will be kept unless the developer explicitly defines a timeout in a model's field, raising a ValidationError with 'timed_out' code. Is that solving the questions above?

For us, not having a timeout is a blocker, since our editors sometimes have to wait forever for the admin to save the data. My latest report was caused by the bad response time of a egyptian activist's website. Those sites tend to have a rather short live span, and verify_exists _with_ a timeout would be awesome!

Last edited 13 years ago by Fabian Topfstedt (previous) (diff)

comment:10 by Fabian Topfstedt, 13 years ago

Easy pickings: set

comment:11 by Fabian Topfstedt, 13 years ago

Owner: changed from nobody to Fabian Topfstedt
Status: newassigned

by Fabian Topfstedt, 13 years ago

A patch against rev 16150 (1.4 pre alpha) that allows e.g. models.URLField(verify_exists=True, timeout=5)

comment:12 by Fabian Topfstedt, 13 years ago

Needs tests: set
Patch needs improvement: unset
Triage Stage: AcceptedDesign decision needed

comment:13 by Dougal Matthews, 13 years ago

Easy pickings: unset

Removed Easy Pickings since its now DDN.

comment:14 by Claude Paroz, 12 years ago

Resolution: fixed
Status: assignedclosed
UI/UX: unset

Mainly for security reasons, the verify_exists parameter has been deprecated in 1.3.1 and will be removed in 1.4 (or 1.5, not sure).

See changeset:16760 for more details. Note also that the patch adds a timeout on requests.

Note: See TracTickets for help on using tickets.
Back to Top