id	summary	reporter	owner	description	type	status	component	version	severity	resolution	keywords	cc	stage	has_patch	needs_docs	needs_tests	needs_better_patch	easy	ui_ux
2934	[patch] validators.isExistingURL is frequently wrong	jdunck@…	Adrian Holovaty	"The existing isExistingURL validator uses urllib2's default user agent string, which is commonly rejected by servers.

Similarly, the validator fails if a 301 or 302 is returned, though a 401 is accepted as passing.

I think it's better to send headers that claim to accept all sorts of responses, allow a configurable user agent (via settings), and accept 301 and 302 as valid.  As a philosophical issue, we could perhaps loop on 301 and 302, calling it a failure after a certain number of tries, but then you might fall into a cookied tarpit: a URL that is valid but requires a cookie store.  Semi-aside: [http://bitworking.org/projects/httplib2/ httplib2] is nice.
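
The bounded redirect loop mentioned above could look roughly like the sketch below. It is only an illustration, not part of Django or of the code that follows: `url_exists`, `fetch`, and `max_redirects` are hypothetical names, and `fetch` is assumed to be a callable that returns a `(status_code, location)` pair without following redirects itself. Relative `Location` values and cookies are deliberately ignored.

```python
# Hypothetical sketch: none of these names exist in Django.
REDIRECT_CODES = (301, 302)
ACCEPTED_CODES = (200, 401)  # a 401 only means authorization is required


def url_exists(url, fetch, max_redirects=5):
    '''Return True if url resolves within max_redirects redirect hops.

    fetch is an assumed callable that takes a URL and returns a
    (status_code, location) tuple without following redirects itself.
    '''
    # One initial request plus at most max_redirects follow-up requests.
    for _ in range(max_redirects + 1):
        status, location = fetch(url)
        if status in REDIRECT_CODES and location:
            url = location  # look somewhere else, as the server asked
            continue
        return status in ACCEPTED_CODES
    # Still being redirected after the allowed hops: call it a failure.
    return False
```

Because the HTTP layer is injected, the loop can be exercised without any network access, and a redirect cycle is reported as a failure once the hop limit is reached.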

Sorry, no patch; I'm on 0.91 and can't easily diff w/ trunk.  Even so, here's my local isExistingURL:

{{{
def isExistingURL(field_data, all_data):
    import urllib2
    try:
        headers = {
            ""Accept"": ""text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5"",
            ""Accept-Language"": ""en-us,en;q=0.5"",
            ""Accept-Charset"": ""ISO-8859-1,utf-8;q=0.7,*;q=0.7"",
            ""Connection"": ""close"",
            # URL_FETCH_USER_AGENT is meant to be configurable via settings.
            ""User-Agent"": URL_FETCH_USER_AGENT,
        }
        req = urllib2.Request(field_data, None, headers)
        u = urllib2.urlopen(req)
    except ValueError:
        raise ValidationError, _(""Invalid URL: %s"") % field_data
    except urllib2.HTTPError, e:
        # 401s are valid; they just mean authorization is required.
        # 301 and 302 are redirects; they just mean look somewhere else.
        if e.code not in (401, 301, 302):
            raise ValidationError, _(""The URL %s is a broken link."") % field_data
    except: # urllib2.URLError, httplib.InvalidURL, etc.
        raise ValidationError, _(""The URL %s is a broken link."") % field_data
}}}"	enhancement	closed	Validators	0.95	normal	fixed			Unreviewed	1	0	0	0	0	0
