﻿id	summary	reporter	owner	description	type	status	component	version	severity	resolution	keywords	cc	stage	has_patch	needs_docs	needs_tests	needs_better_patch	easy	ui_ux
16315	FileSystemStorage.listdir returns names with unicode normalization form that is different from names in database	philomat	nobody	"Suppose you write a function that finds files on disk that are no longer referenced in the database, and you use FileSystemStorage.listdir to compare the names it returns with the names stored in the database. The comparison fails unless the strings are normalized first, because the same Unicode character can be encoded using different normalization forms.
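
A minimal sketch of the comparison described above, assuming both sides are plain lists of names and that NFC is an acceptable common form. The helper name find_orphaned_names and the sample data are hypothetical and not part of Django; the sketch uses Python 3 string literals and only the standard unicodedata module.

```python
import unicodedata


def find_orphaned_names(names_on_disk, names_in_db, form='NFC'):
    # Hypothetical helper: return the on-disk names with no database match.
    # Normalize the database names once so each lookup is a set membership test.
    known = {unicodedata.normalize(form, name) for name in names_in_db}
    # Return the original on-disk spelling so the file can still be opened.
    return [
        name for name in names_on_disk
        if unicodedata.normalize(form, name) not in known
    ]


# 'a' + COMBINING DIAERESIS (decomposed) on disk, precomposed U+00E4 in the database.
on_disk = ['a\u0308', 'stale.txt']
in_db = ['\xe4']
print(find_orphaned_names(on_disk, in_db))  # prints ['stale.txt']
```

With Django, names_on_disk could presumably be taken from the second element of the (directories, files) tuple that FileSystemStorage().listdir returns. Where names_in_db comes from depends on the project's models.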

This problem is best demonstrated with some example code:

{{{
# Assuming that my storage root contains one folder named u'ä'
>>> import os
>>> from django.core.files.storage import FileSystemStorage
>>> import unicodedata
>>>
# listdir returns u'a' followed by 'COMBINING DIAERESIS' (U+0308)
>>> FileSystemStorage().listdir('')[0][0]
u'a\u0308'
# In the database, the same name is stored in the precomposed form (U+00E4):
>>> os.path.basename(FileSystemStorage().path(u'ä'))
u'\xe4'
# Normalizing the listdir result to NFC yields the same value:
>>> unicodedata.normalize('NFC', FileSystemStorage().listdir('')[0][0])
u'\xe4'
}}}"	Bug	closed	File uploads/storage	1.3	Normal	wontfix	storage unicode normalization		Accepted	0	0	0	0	0	0
