Opened 12 years ago

Closed 12 years ago

#17730 closed Bug (fixed)

Change django.utils.htmlparser to django.utils.html_parser to avoid possible issues with case-insensitivity and the HTMLParser stdlib module

Reported by: Val Neekman Owned by: nobody
Component: Testing framework Version: 1.4-beta-1
Severity: Release blocker Keywords: AttributeError
Cc: gldnspud Triage Stage: Accepted
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

Python Version: 2.6.5
OS: Ubuntu
Django==1.4b1
South==0.7.3
Unidecode==0.04.9
Unipath==0.2.1
Werkzeug==0.8.3
boto==2.1.1
django-debug-toolbar==0.9.4
django-extensions==0.7.1

===========
in utils/HTMLParser.py:
import HTMLParser as _HTMLParser (import succeeds)
As you see from below, _HTMLParser has an attribute called _HTMLParser.

dir(_HTMLParser)

['_HTMLParser', 'builtins', 'doc', 'file', 'name', 'package', 're']

However on line 11 of utils/HTMLParser.py, it tries to reference HTMParser without the underscore.
_HTMLParser.HTMLParser.init(self)

When I changed line 11 from the above to the following, it worked.
_HTMLParser._HTMLParser.init(self)

Image attached.

Attachments (3)

HTMLParser.png (87.9 KB ) - added by Val Neekman 12 years ago.
Debugger dump
ticket17730.diff (8.4 KB ) - added by gldnspud 12 years ago.
17730-alternative.patch (832 bytes ) - added by Aymeric Augustin 12 years ago.

Download all attachments as: .zip

Change History (17)

by Val Neekman, 12 years ago

Attachment: HTMLParser.png added

Debugger dump

comment:1 by Carl Meyer, 12 years ago

Resolution: worksforme
Status: newclosed

There is no such file django/utils/HTMLParser.py in the Django source tree or in the 1.4b1 source distribution, only django/utils/htmlparser.py. In your attached image, it appears that somehow you have both files. It seems that your django/utils/HTMLParser.py is shadowing the standard library HTMLParser module, and this is causing the import in htmlparser.py to import your stray HTMLParser.py rather than the real HTMLParser standard library module.

I have no idea how you came to have a django/utils/HTMLParser.py file, but it should be removed, and then I'd expect things to work normally.

comment:2 by Val Neekman, 12 years ago

I have verified and there is no HTMLParser.py in django/utils.
Also verified that django-debug-toolbar was causing it. When I removed it, everything works fine.
Not sure why the stack trace is a bit misleading.
Will investigate further.
Thx

comment:3 by gldnspud, 12 years ago

Resolution: worksforme
Status: closedreopened

Consider renaming django.utils.htmlparser to a different name, e.g. django.utils.html_parser.

This is very likely due to some platforms using case-insensitive filesystems. It is not limited to the use of django-debug-toolbar.

OSX is one example. When I try import django.utils.htmlparser within OSX, the version of Python appears to know about the distinction between the two, and handles imports properly. This is despite the fact that you can go into the .../django/utils directory and run cat HtMlPaRsEr.Py and it will happily show you the same output as cat htmlparser.py.

The way I ran into the error this ticket refers to is by using Parallels Desktop's shared folders feature to access my OSX filesystem from within an Ubuntu 10.04.3 virtual machine. The way Python is compiled in Linux does NOT know about the case insensitivity, yet the shared folders implementation exposes this -- you can go into the .../django/utils dir within the mounted filesystem in Linux, and type cat HtMlPaRsEr.Py and lo, there is the contents of the file.

All of this is unfortunate, but these are in fact realistic conditions for working on Django projects where you want your Django project to run inside a VM, but desire the use of an OSX-based editor and other tools.

comment:4 by gldnspud, 12 years ago

Indeed, I was able to confirm just now that doing the following allowed me to run tests and see them pass:

  1. Rename django/utils/htmlparser.py to django/utils/html_parser.py
  2. Line 7 of django/test/html.py, change from from django.utils.htmlparser import HTMLParser to from django.utils.html_parser import HTMLParser

Note: I was able to find no other references to htmlparser in the Django 1.4b1 code base (other than in EGG-INFO/SOURCES.txt which I assume to be auto-generated).

by gldnspud, 12 years ago

Attachment: ticket17730.diff added

comment:6 by gldnspud, 12 years ago

Cc: gldnspud added
Has patch: set

in reply to:  3 comment:7 by Ramiro Morales, 12 years ago

Replying to gldnspud:

using Parallels Desktop's shared folders feature to access my OSX filesystem from within an Ubuntu 10.04.3 virtual machine. The way Python is compiled in Linux does NOT know about the case insensitivity, yet the shared folders implementation exposes this [...] All of this is unfortunate, but these are in fact realistic conditions for working on Django projects where you want your Django project to run inside a VM, but desire the use of an OSX-based editor and other tools.

I must say I disagree with the realistic conditions part. That's a very uncommon development setup and calling for the kind of trouble you are finding.

comment:8 by Aymeric Augustin, 12 years ago

Renaming a module is backwards incompatible, we can't do this lightly.

comment:9 by Carl Meyer, 12 years ago

Summary: AttributeError: 'module' object has no attribute 'HTMLParser'Change django.utils.htmlparser to django.utils.html_parser to avoid possible issues with case-insensitivity and the HTMLParser stdlib module

I would be interested in the results of further investigation by un3kk. If it really is possible for django-debug-toolbar to trigger this problem, that would be a serious issue.

On the whole, I am tempted to make the change suggested by gldnspud. I agree that the "case-insensitive filesystem accessed by Python from case-sensitive OS" case is not exactly common, but it would be an easy issue to avoid entirely with a relatively simple change. Relying on case-sensitivity for correct operation seems like something it doesn't hurt to avoid. I'm not too concerned about the backwards incompatibility, given that django.utils.htmlparser was added to trunk less than a month ago, has never been in a production release, and is undocumented and used only by a test helper.

Tentatively accepting, but open to arguments that it would be a bad idea to make this change.

comment:10 by Carl Meyer, 12 years ago

Triage Stage: UnreviewedAccepted

comment:11 by Aymeric Augustin, 12 years ago

I wasn't aware that this module is a recent addition. I withdraw my previous comment.

comment:12 by Chris Beaven, 12 years ago

Severity: NormalRelease blocker

comment:13 by Aymeric Augustin, 12 years ago

The public API of the HTMLParser module is the HTMLParser class and the HTMLParseError exception.

If we want django.utils.htmlparser to mimic this interface, we could expose HTMLParseError there, and import it from that module.

I'm attaching a patch for this technique.

by Aymeric Augustin, 12 years ago

Attachment: 17730-alternative.patch added

comment:14 by Carl Meyer, 12 years ago

Resolution: fixed
Status: reopenedclosed

In [17607]:

Fixed #17730 - Renamed django.utils.htmlparser to django.utils.html_parser to avoid shadowing stdlib HTMLParser in rare case-sensitivity situations. Thanks un33k for the report and gldnspud for the patch.

Note: See TracTickets for help on using tickets.
Back to Top