Opened 16 years ago
Closed 13 years ago
#9857 closed New feature (fixed)
URLfield with verify_exists hangs if given an unresponsive URL
Reported by: | Antonis Christofides | Owned by: | Fabian Topfstedt |
---|---|---|---|
Component: | Forms | Version: | dev |
Severity: | Normal | Keywords: | |
Cc: | Triage Stage: | Design decision needed | |
Has patch: | yes | Needs documentation: | no |
Needs tests: | yes | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description
Steps to reproduce:
- Create a model that contains a URLfield with verify_exists=True (the default), and make it adminable.
- In the Admin, create a new object and specify a URL whose server does not respond (it should leave your web browser in the "Connecting to [server name] ..." state).
Result: Django hangs. In my development environment, the Django development server hangs. In my production environment, I get a 500 Internal Server Error, apparently after SCGI times out while waiting for a response from Django.
Attachments (4)
Change History (18)
by , 16 years ago
Attachment: | 9857.patch added |
---|
comment:1 by , 16 years ago
The patch that I submitted (works against both 1.0 and svn) adds a hardwired 10-second timeout. If there is no response after the timeout, the URL is considered broken. Although this is not ideal (it's indeterminable rather than broken), a loud failure is much better than a hang, and the user will understand immediately why Django thinks that the URL is broken because they'll test it in their web browser and see that it is unresponsive. This is much better than a hang or an Internal Server Error.
comment:2 by , 16 years ago
milestone: | → 1.1 |
---|---|
Triage Stage: | Unreviewed → Accepted |
comment:3 by , 16 years ago
Couple of comments:
Setting socket.setdefaulttimeout can very easily break or create problems with other running code depending on socket operations. Since the new timeout values applies to all newly created socket objects. If we must do it this way we should consider:
default_timeout = socket.getdefaulttimeout() socket.setdefaulttimeout(settings.SOCKET_DEFAULT_TIMEOUT) ... urllib2 code ... socket.setdefaulttimeout(default_timeout)
So not to disturb with other socket code. Still when another thread would use socket code and is executed between the first and second setdefaulttimeout() the problem still persists. This could be a potential problem specially for situations where Django runs in a threaded fcgi/wsgi enviroment ? I would imaging anybody running into a problem where suddenly his sockets would timeout to quickly would be rather dumbstruck and having a hard time figuring out why this happens. Specially since before the first verify_exists everything worked fine, while afterwards stuff started breaking. Something which could potentially happen in a development vs production kind of case, where development works fine but production magically breaks after a while because of the verify_exists check in a URLField which sets socket defaulttimeout to low. (or too high)
Another point would be that IMO this kind of setting should be controllable from settings.py although most users would never ever see the attribute. Having this kind of hard coded (unchangeable) value seems wrong.
Lastly.. when urlopen() times-out it raises URLError: <urlopen error timed out> exception. So we should be able to catch that and provide a meaningful error message, which differs (slightly) from a normal broken-url and lets the user know the url timed out.
Patch provided but still is not perfect (see threading comment).
by , 16 years ago
Attachment: | 9857-2.patch added |
---|
comment:4 by , 16 years ago
As of Python2.6 urllib2 allows to specify a timeout argument to the urlopen() function.
This should be the preferred way, since changes the timeout on the socket not the default for new sockets.
The alternative patch uses this approach it checking for python versions and depending on the version use
either urlopen(req, timeout=timeout) or a try-finally block to set and reset defaulttimeout.
Currently the checks are all inside the URLField but it might be 'better' (in the sane that depending on a version check seems to be something one would like to avoid in principle) to check Python versions at import time.
by , 16 years ago
Attachment: | 9857-2-alternative-using-2.6-timeout.patch added |
---|
comment:5 by , 16 years ago
Patch needs improvement: | set |
---|
I acknowledge the problem that exists here, but I _really_ don't like this solution. Setting the default timeout is a really nasty workaround that has potential for nasty side-effects. I don't have a better suggestion though, so we might need to hold our nose and wear this particular stink until Python 2.6 is our minimum supported version.
I'm also not particularly enthused about introducing a global setting for the timeout - this strikes me as something that should be a per-URLField setting, not a global setting.
comment:6 by , 16 years ago
milestone: | 1.1 |
---|
comment:7 by , 16 years ago
I agree, also agree that the timeout setting (if there would be one) belongs at the URLField.
Two suggestions for dealing with this ticket without having to wait for Python 2.6 to become the minimum version:
1) What about supporting this optionally on the URLField but only for Python 2.6 and higher ?
I don't think something like this (feature which depends on python version) is done anywhere else in Django and for good reason I imaging.
2) What about adding this as a field in django-extensions ? A field with support for python2.6's timeout setting.
This way users can get the variable timeout behavior by using django-extensions. It resolves the Django from looking further at
this issue now and still gives a solution to people stuck with this problem who have no problem running python2.6.
comment:8 by , 14 years ago
Severity: | → Normal |
---|---|
Type: | → New feature |
Version: | 1.0 → SVN |
This is still a problem and merits attention (took about 45 minutes to figure it out). I would say that this should go into milestone:1.4 and that that milestone should just go ahead and make Python 2.6 the minimum - but I doubt I'll get agreement on that :-)
comment:9 by , 14 years ago
Easy pickings: | unset |
---|
I uploaded a patch that adds a timeout argument to URLField. The default is None and it checks if urllib2.urlopen has a timeout argument (using inspect). The behavior of the URLField will be kept unless the developer explicitly defines a timeout in a model's field, raising a ValidationError with 'timed_out' code. Is that solving the questions above?
For us, not having a timeout is a blocker, since our editors sometimes have to wait forever for the admin to save the data. My latest report was caused by the bad response time of a egyptian activist's website. Those sites tend to have a rather short live span, and verify_exists _with_ a timeout would be awesome!
comment:10 by , 14 years ago
Easy pickings: | set |
---|
comment:11 by , 14 years ago
Owner: | changed from | to
---|---|
Status: | new → assigned |
by , 14 years ago
Attachment: | 9857_urlfield_with_timeout_argument.diff added |
---|
A patch against rev 16150 (1.4 pre alpha) that allows e.g. models.URLField(verify_exists=True, timeout=5)
comment:12 by , 14 years ago
Needs tests: | set |
---|---|
Patch needs improvement: | unset |
Triage Stage: | Accepted → Design decision needed |
comment:14 by , 13 years ago
Resolution: | → fixed |
---|---|
Status: | assigned → closed |
UI/UX: | unset |
Mainly for security reasons, the verify_exists parameter has been deprecated in 1.3.1 and will be removed in 1.4 (or 1.5, not sure).
See changeset:16760 for more details. Note also that the patch adds a timeout on requests.
Patch that adds a 10-second timeout