Django

Ticket #5418 (new)

Opened 2 years ago

Last modified 10 months ago

Add assertNoBrokenLinks() to test system

Reported by: adrian
Assigned to: kkubasik
Milestone:
Component: Testing framework
Version: SVN
Keywords: feature
Cc: absoludity@gmail.com
Triage Stage: Accepted
Has patch: 1
Needs documentation: 0
Needs tests: 0
Patch needs improvement: 0

Description

It would be convenient and useful if the test system could automatically check every <a href> in a rendered page to make sure the links did not cause 404s.

Obviously, for pages with dozens of links, this could take a long time to run. So I suggest an only_internal keyword argument, which would be True by default and would cause only the *internal* links (i.e., those without an "https?://") to be checked, thereby using Django's internal URL resolving to check the links instead of going over HTTP.

In the documentation for this feature, we should note that people should ensure none of their pages have side effects, because this test will essentially cause all links to be "clicked on."
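
The only_internal filtering described above could be sketched roughly as follows. This is an illustration only, not code from the attached patches: the helper name links_to_check and its signature are assumptions, and it only decides which hrefs would be visited, without resolving them.

```python
import re

# Hypothetical helper for illustration; not taken from the attached patches.
# A link counts as external when it starts with "http://" or "https://".
_EXTERNAL = re.compile(r"https?://", re.IGNORECASE)


def links_to_check(hrefs, only_internal=True):
    """Return the hrefs that a link check would visit.

    With only_internal=True (the default proposed in the ticket),
    links starting with http:// or https:// are skipped.
    """
    if only_internal:
        return [h for h in hrefs if not _EXTERNAL.match(h)]
    return list(hrefs)
```

With the default, only links such as "/about/" would be passed on to the URL resolver; passing only_internal=False would keep every link, including the ones that need a real HTTP request.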

Attachments

assert_no_broken_links.diff (4.9 kB) - added by absoludity on 09/14/07 09:25:31.
Initial thoughts for feedback.
assert_no_broken_links_with_tests_and_doc.diff (12.7 kB) - added by absoludity on 09/15/07 18:00:18.
New assertNoBrokenLinks (using HTMLParser) with regression tests and docs.
assert_no_broken_links_with_tests_and_doc.2.diff (19.1 kB) - added by absoludity on 09/16/07 00:45:55.
Updated patch that also checks for blank links and internal page links (i.e., href="#content")

Change History

09/13/07 10:36:18 changed by adrian

  • needs_better_patch changed.
  • stage changed from Unreviewed to Accepted.
  • needs_tests changed.
  • needs_docs changed.

09/14/07 05:16:47 changed by anonymous

  • owner changed from nobody to anonymous.
  • status changed from new to assigned.

09/14/07 05:18:21 changed by absoludity

  • owner changed from anonymous to absoludity.
  • status changed from assigned to new.

09/14/07 05:18:53 changed by absoludity

  • status changed from new to assigned.

09/14/07 09:25:31 changed by absoludity

  • attachment assert_no_broken_links.diff added.

Initial thoughts for feedback.

09/14/07 09:28:20 changed by absoludity

I've created a patch for an initial solution, but there are a few questions (see the docstring for assertNoBrokenLinks()).

The regexes are pretty ugly too. I've tried to use a little indentation to make them clearer, but I'm guessing this goes against PEP 8?

Any feedback appreciated.

09/14/07 09:39:18 changed by Michael Radziej <mir@noris.de>

I haven't reviewed this thoroughly, but line 287

self.fail(u'The URL %s appears to be invalid.')

is missing the final "% link", so the URL is never interpolated into the message.
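
A small sketch of the difference, assuming the variable holding the URL is called link as the comment above suggests (the sample value is made up):

```python
# Sample value for illustration; in the patch this would be the href under test.
link = "/missing/"

# As written in the patch: the placeholder is left in the message verbatim.
broken = u'The URL %s appears to be invalid.'

# With the formatting operand added, the URL is interpolated.
fixed = u'The URL %s appears to be invalid.' % link
```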

09/14/07 16:08:49 changed by absoludity

Thanks Michael - I'll finish it today (and write the docs/tests) - was just keen to know whether the way I was going about it was ok.

09/14/07 21:57:26 changed by absoludity

  • cc set to absoludity@gmail.com.
  • status changed from assigned to closed.
  • has_patch set to 1.
  • resolution set to fixed.

I've added a patch with the new assertion and its own unit tests and docs. For some reason when you click on the attachment, it seems empty, but it's there when you click on the 'raw format' option.

Let me know if there are any improvements to be made! -Michael

09/14/07 23:07:49 changed by Simon G. <dev@simon.net.nz>

  • status changed from closed to reopened.
  • resolution deleted.

09/15/07 06:03:48 changed by Fredrik Lundh <fredrik@pythonware.com>

Any reason you cannot just use the standard HTMLParser module? It would eliminate all those RE:s, and also get rid of the <a id="">href issue you mentioned. The sample on this page should be helpful, I think:

http://effbot.org/librarybook/htmlparser.htm
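
For reference, a minimal sketch of the HTMLParser approach suggested above. It is not the code from the later patches; the names LinkCollector and collect_hrefs are made up for illustration, and the import uses the Python 3 location (html.parser) rather than the Python 2 HTMLParser module that was current when this ticket was written.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; tag names are lowercased.
        if tag == "a":
            for name, value in attrs:
                if name == "href":
                    self.hrefs.append(value)


def collect_hrefs(html):
    """Return all <a href> values in document order."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    return parser.hrefs
```

Because the parser hands over attributes by name, an anchor that has only an id (and no href) is simply skipped, with no special-casing needed.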

09/15/07 12:51:55 changed by adrian

  • needs_better_patch set to 1.

Yes, this patch should be reworked to use HTMLParser instead of regexes. Also, please remove the commented-out print statements in the patch.

09/15/07 14:49:23 changed by PhiR

  • keywords set to feature.

09/15/07 16:06:05 changed by absoludity

Thanks for the hints and the link. Will try to get the changes in today (au-time).

09/15/07 18:00:18 changed by absoludity

  • attachment assert_no_broken_links_with_tests_and_doc.diff added.

New assertNoBrokenLinks (using HTMLParser) with regression tests and docs.

09/15/07 18:05:37 changed by absoludity

  • needs_better_patch deleted.

Added new version of assertNoBrokenLinks patch using the HTMLParser, includes regression tests and docs.

I'm not sure what the process is - unchecking patch needs improvement?

09/15/07 18:24:36 changed by absoludity

  • needs_better_patch set to 1.

Actually, hold off on reviewing... I just thought of another test case: handling blank hrefs (<a href="">), since the url template tag fails silently and just leaves a blank href.

Not sure if this should actually be an extra arg: ignore_blank_hrefs=False.

09/16/07 00:45:55 changed by absoludity

  • attachment assert_no_broken_links_with_tests_and_doc.2.diff added.

Updated patch that also checks for blank links and internal page links (i.e., href="#content")

09/16/07 00:50:59 changed by absoludity

  • needs_better_patch deleted.

OK, I've uploaded a further patch that handles blank hrefs as well as verifying that internal page links (href="#content") are not broken. All with their own unit tests.

I didn't overwrite the previous patch in case you don't want the extra functionality.
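
The two extra checks described above (blank hrefs and in-page links such as href="#content") might look roughly like this. This is a sketch under assumptions, not the attached patch: PageScanner and find_problem_links are hypothetical names, and treating both id attributes and <a name> as valid fragment targets is a choice made here for illustration.

```python
from html.parser import HTMLParser


class PageScanner(HTMLParser):
    """Collect <a href> values and the fragment targets defined in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []
        self.anchors = set()

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        # Any element with an id can be the target of href="#...".
        if attrs.get("id"):
            self.anchors.add(attrs["id"])
        if tag == "a":
            # Older pages also use <a name="..."> as a target.
            if attrs.get("name"):
                self.anchors.add(attrs["name"])
            if "href" in attrs:
                # A valueless href attribute is reported as None; treat as blank.
                self.hrefs.append(attrs["href"] or "")


def find_problem_links(html):
    """Return (blank_hrefs, dangling_fragments) for one rendered page."""
    scanner = PageScanner()
    scanner.feed(html)
    scanner.close()
    blank = [h for h in scanner.hrefs if h.strip() == ""]
    dangling = [
        h for h in scanner.hrefs
        if h.startswith("#") and len(h) > 1 and h[1:] not in scanner.anchors
    ]
    return blank, dangling
```

Targets are gathered over the whole page before the fragment links are checked, so a link that appears before its target is not reported as broken.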

04/20/09 16:18:25 changed by kkubasik

  • owner changed from absoludity to kkubasik.
  • status changed from reopened to new.
