Opened 13 years ago

Closed 12 years ago

#1355 closed enhancement (duplicate)

Internationalisation(charset) problems with FileField file names and core.db.backend.mysql

Reported by: little Owned by: nobody
Component: Core (Other) Version:
Severity: normal Keywords:
Cc: Triage Stage: Accepted
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description (last modified by hugo)

The function django.utils.text.get_valid_filename() does not handle non-Latin file names well:

    s = s.strip().replace(' ', '_')
    return re.sub(r'[^-A-Za-z0-9_.]', '', s)

This strips every non-ASCII character, so a non-Latin file name is reduced to underscores only, for example "__________.txt".
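A minimal sketch reproducing the behaviour, assuming Python 3 and a local copy of the two lines quoted above (it does not import Django, and the sample file names are made up for illustration):

```python
import re

def get_valid_filename(s):
    # Local copy of the two lines quoted in this ticket.
    s = s.strip().replace(' ', '_')
    return re.sub(r'[^-A-Za-z0-9_.]', '', s)

# A Latin name survives, with the space turned into an underscore.
latin = get_valid_filename("my report.txt")  # "my_report.txt"

# A Cyrillic name ("мой отчёт.txt") loses every letter;
# only the underscore and the extension remain.
cyrillic = get_valid_filename(
    "\u043c\u043e\u0439 \u043e\u0442\u0447\u0451\u0442.txt"
)  # "_.txt"
```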

Let it return a Unicode object built from the string s:

    return unicode(s,'utf8')

Or make it possible for the application developer to override this function.

Change History (6)

comment:1 Changed 13 years ago by hugo

Description: modified (diff)

The problem here: we can't assume anything about the filesystem of the server besides the fact that it is possible to use us-ascii in filenames. So utf-8 won't be an option, as it might produce unreadable filenames. And since several characters have special functions, like / and ., we can't just accept any char we want, or we would open the door to filesystem traversal hackery.

One way would be to just turn non-ascii chars into a uXXXX form, so that at least the filename isn't all dashes.
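A rough sketch of that uXXXX idea, assuming Python 3. The function name escape_non_ascii_filename is hypothetical and not part of Django. Unsafe ASCII characters such as / are still dropped, as in the current function:

```python
import re

# Characters the current get_valid_filename() already lets through.
SAFE_CHAR = re.compile(r'[-A-Za-z0-9_.]')

def escape_non_ascii_filename(name):
    """Hypothetical helper: keep safe ASCII, write non-ASCII as uXXXX."""
    name = name.strip().replace(' ', '_')
    parts = []
    for ch in name:
        if SAFE_CHAR.match(ch):
            parts.append(ch)
        elif ord(ch) > 127:
            # Hex code point, zero-padded to at least four digits.
            parts.append('u%04X' % ord(ch))
        # Any other ASCII character (/, \, :, ...) is dropped.
    return ''.join(parts)
```

The result is pure us-ascii, so it should be storable on any filesystem, but the mapping is not reversible because a literal "u" followed by hex digits is also a valid input.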

I moved the database stuff into its own ticket, as that isn't i18n related, but more database backend related.

comment:2 Changed 13 years ago by little

Another good way to name files is to give them [database id].ext names
for example 12345.txt 34567.doc and so on...

File names made of numbers are better than underscores, anyway.
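A small sketch of this naming scheme, assuming Python 3 and that the object's database id is already known when the file is stored. The helper name id_based_filename is hypothetical:

```python
import os
import re

def id_based_filename(object_id, original_name):
    """Hypothetical helper: build '<database id><ext>' from an upload name."""
    _, ext = os.path.splitext(original_name)
    # Keep only ASCII letters, digits and the dot in the extension.
    ext = re.sub(r'[^A-Za-z0-9.]', '', ext)
    return '%d%s' % (object_id, ext)
```

For example, id_based_filename(12345, name) yields "12345.txt" for any name ending in ".txt", whatever script the rest of the name is written in.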

comment:3 Changed 13 years ago by hugo

Component: changed from Internationalization to Core framework
Owner: changed from hugo to Adrian Holovaty

comment:4 Changed 12 years ago by limodou@…

Yeah, I also encountered this problem. I hope that how i18n filenames are handled can be decided by the end user rather than processed automatically. Alternatively, we could add a flag to the save_FIELD_file() method, like this:

    object.save_FIELD_file(i18n_filename, content, safety=True)

This will use get_valid_filename to process the filename. If the user instead invokes:

    object.save_FIELD_file(i18n_filename, content, safety=False)

then get_valid_filename will not be used. The safety parameter can default to True to stay compatible with the existing behaviour.
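The proposed flag could look roughly like the sketch below, assuming Python 3. Both resolve_filename and the safety parameter are illustrative only. This is a standalone function, not the actual save_FIELD_file() implementation:

```python
import re

def get_valid_filename(s):
    # Local copy of the function quoted in the ticket description.
    s = s.strip().replace(' ', '_')
    return re.sub(r'[^-A-Za-z0-9_.]', '', s)

def resolve_filename(filename, safety=True):
    """Hypothetical helper mirroring the proposed safety flag."""
    if safety:
        # Default: unchanged behaviour, so existing callers are unaffected.
        return get_valid_filename(filename)
    # Opt-out: the name is used as given. The caller is then responsible
    # for rejecting path separators and other traversal tricks.
    return filename
```

With safety=False the filesystem concerns raised in comment:1 still apply, so the caller would need its own validation.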

comment:5 Changed 12 years ago by Chris Beaven

Triage Stage: changed from Unreviewed to Accepted

Accepted. Seems obvious something needs to change.

comment:6 Changed 12 years ago by James Bennett

Resolution: duplicate
Status: changed from new to closed

Closing in favor of #3119, which has a patch.
