Profanity filter suffers from the Scunthorpe problem
|Reported by:||Owned by:||nobody|
|Has patch:||no||Needs documentation:||no|
|Needs tests:||no||Patch needs improvement:||no|
The implementation of the profanity filter suffers from the Scunthorpe problem; i.e., it considers the town of Scunthorpe, amongst other innocuous words, to be profane.
Profanity filtering is A Hard Problem, and naïve substring-matching solutions like this one cause frustrating problems for end users.
Checking the current profanities list for false positives in a couple of word lists I had to hand also yields:
gobbledegook, snigger, Brushite, Cushite, Niggerhead, Peshito, Peshitto, Shittah, Shittah tree, Shittim, Shittim wood, Shittle, Shittlecock, Shittleness
Obviously proper names are not in my dictionary, but they cause frequent and often more annoying problems.
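For anyone who wants to repeat the check against their own word lists, a rough sketch of the scan follows. The function name and the sample data are hypothetical; the short list of mild words stands in for the real profanities list, and the comparison assumes the filter does a case-insensitive substring test.

```python
def find_false_positives(profanities, dictionary_words):
    """Return dictionary words that merely contain a listed profanity.

    A word counts as a false positive when a profanity occurs inside it
    as a substring but the word is not itself that profanity.
    """
    hits = []
    for word in dictionary_words:
        lowered = word.lower()
        for profanity in profanities:
            candidate = profanity.lower()
            if candidate in lowered and candidate != lowered:
                hits.append(word)
                break  # one match is enough to flag this word
    return hits


# Hypothetical stand-in data, not the actual profanities list.
SAMPLE_PROFANITIES = ["ass", "hell"]
SAMPLE_WORDS = ["classic", "shell", "ass", "tree"]

for flagged in find_false_positives(SAMPLE_PROFANITIES, SAMPLE_WORDS):
    print(flagged)
```

Running the same function over a full system dictionary is how a list like the one above can be produced.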
I suggest disabling the filter by default so that the scope of the problem is limited. At the very least, the filter must be restricted to whole-word matches:
re.search(r'\b' + re.escape(word) + r'\b', comment)
Users who need a stricter profanity filter should take responsibility for implementing it, and for potentially annoying their users, themselves. Django should not be doing it for them.
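A minimal sketch of the whole-word check proposed here, using only the standard `re` module. The function name `contains_profanity` and the sample word list are hypothetical and are not part of Django; the sketch also assumes case-insensitive matching is wanted.

```python
import re


def contains_profanity(text, profanities):
    """Return True if any listed word appears in text as a whole word."""
    for word in profanities:
        # re.escape guards against regex metacharacters in the word list;
        # both \b anchors need raw strings, since '\b' alone is a backspace.
        pattern = r'\b' + re.escape(word) + r'\b'
        # re.search scans the whole text; re.match would only test the start.
        if re.search(pattern, text, re.IGNORECASE):
            return True
    return False


# Hypothetical stand-in list, not the actual profanities list.
SAMPLE_PROFANITIES = ["ass", "hell"]

print(contains_profanity("A classic shell script", SAMPLE_PROFANITIES))
print(contains_profanity("What the hell is this?", SAMPLE_PROFANITIES))
```

The first call should not flag the text, because neither listed word stands alone in it, while the second should. Word boundaries still miss inflected and deliberately obfuscated forms, which is the trade-off being proposed.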