Profanity filter suffers from the Scunthorpe problem
|Reported by:||Daniel Pope <dan@…>||Owned by:||nobody|
|Has patch:||no||Needs documentation:||no|
|Needs tests:||no||Patch needs improvement:||no|
The implementation of the profanity filter suffers from the Scunthorpe problem, i.e. it considers the town of Scunthorpe, amongst other innocuous words, to be profane.
Profanity filtering is A Hard Problem, and naïve solutions like this one cause frustrating problems for end users.
Checking the current profanities list for false positives in a couple of word lists I had to hand also yields:
gobbledegook, snigger, Brushite, Cushite, Niggerhead, Peshito, Peshitto, Shittah, Shittah tree, Shittim, Shittim wood, Shittle, Shittlecock, Shittleness
Obviously proper names are not in my dictionary, but they cause frequent and often more annoying problems.
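For anyone who wants to repeat this kind of check against their own word lists, here is a minimal sketch. It assumes the filter does a case-insensitive substring test, which is my reading of the current behaviour rather than a copy of the Django code. The function name and both lists are hypothetical, and the blocked terms are deliberately mild stand-ins.

```python
def find_false_positives(words, blocked):
    """Map each word to the blocked terms it merely contains as a substring."""
    blocked_lower = [b.lower() for b in blocked]
    hits = {}
    for word in words:
        lowered = word.lower()
        # Skip exact matches: those are genuine hits, not false positives.
        matches = [b for b in blocked_lower if b in lowered and b != lowered]
        if matches:
            hits[word] = matches
    return hits


# Hypothetical stand-in data, not the real profanities list.
BLOCKED = ["ass", "hell"]
WORDS = ["class", "Shell", "hello", "passage", "tree", "ass"]

false_positives = find_false_positives(WORDS, BLOCKED)
for word, terms in false_positives.items():
    print(word, "contains", ", ".join(terms))
```

Pointing WORDS at a system dictionary file and BLOCKED at the real list should produce output of the kind quoted above.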
I suggest disabling the filter by default so that the scope of the problem is limited. At the very least, the filter must be restricted to whole-word matches, e.g. re.search(r'\b' + re.escape(word) + r'\b', text). Users who need a stricter profanity filter should take responsibility for implementing it, and for potentially annoying their own users. Django should not be doing it for them.
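A minimal sketch of the whole-word restriction, for illustration only. The helper names and the mild stand-in word list are hypothetical, and this is not a patch against the Django source. It uses re.search rather than re.match, because re.match only tests at the start of the string, and it keeps every \b inside a raw string so that it means "word boundary" rather than a backspace character.

```python
import re


def compile_filter(blocked):
    """Build one case-insensitive pattern that matches blocked terms as whole words."""
    # re.escape guards against regex metacharacters in the word list.
    escaped = [re.escape(word) for word in blocked]
    return re.compile(r"\b(?:" + "|".join(escaped) + r")\b", re.IGNORECASE)


def contains_blocked_word(text, pattern):
    """Return True if any blocked term appears in text as a standalone word."""
    return pattern.search(text) is not None


def contains_blocked_substring(text, blocked):
    """Naive substring check, shown only for comparison."""
    lowered = text.lower()
    return any(word.lower() in lowered for word in blocked)


# Hypothetical stand-in list, not the real profanities list.
BLOCKED = ["ass", "hell"]
PATTERN = compile_filter(BLOCKED)

sample = "Hello from the class in Shell Lane"
print(contains_blocked_substring(sample, BLOCKED))  # naive check
print(contains_blocked_word(sample, PATTERN))       # whole-word check
```

Whole-word matching removes the embedded-substring false positives, but it is still only a baseline: it will not catch inflected or deliberately obfuscated forms, and it still flags any innocuous word or name that happens to equal a blocked term.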
Change History (9)
comment:1 Changed 8 years ago by thejaswi_puthraya
- Needs documentation unset
- Needs tests unset
- Patch needs improvement unset
- Triage Stage changed from Unreviewed to Design decision needed
comment:2 Changed 8 years ago by adrian
- Triage Stage changed from Design decision needed to Accepted