#23982 closed Cleanup/optimization (fixed)
Clarify how to properly support both Python 2 and 3 for third-party apps with migrations
Reported by: | Luke Plant | Owned by: | Carl Meyer |
---|---|---|---|
Component: | Documentation | Version: | 1.7 |
Severity: | Normal | Keywords: | |
Cc: | Triage Stage: | Ready for checkin | |
Has patch: | yes | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description
The basic problem is that Python 2 produces migrations like this:
fields=[ ('name', models.CharField(max_length=255, verbose_name=b'First Name', blank=True)), ],
while Python 3 like this:
fields=[ ('name', models.CharField(max_length=255, verbose_name='First Name', blank=True)), ],
(i.e. notice the lack of b
prefix)
If you run 'makemigrations' under Python 3, with migrations made in the first way, it creates a new migration to fix the field.
I've attached a script that shows the problem. I've tested it against master as well as Django 1.7.1 and the bug is still present. A nice unit test for this might not be possible.
There are two things to consider:
- Fixing the generator so that it doesn't produce this problem
- Fixing the migration code so that it doesn't consider this a problem. Otherwise, going forward, all migrations created with 1.7 and 1.7.1 will produce confusion when people use Python 3, even if we fix 1.
This bug manifests itself most obviously when running makemigrations with 3rd party apps. It looks like we were lucky when it came to the migrations bundled with Django - perhaps they were created by Python 3?
Attachments (1)
Change History (14)
by , 10 years ago
Attachment: | migrations_bug_with_python2_python3.txt added |
---|
comment:1 by , 10 years ago
comment:2 by , 10 years ago
Third party applications supporting both Python 2 and 3 should consistently import unicode_literals
to avoid such behavior.
Previous changes made sure migrations generated on Python 2 can be run on Python 3 (and vice versa) but it assumes unicode_literals
were used.
The underlying issue here is that b'foo' == u'foo'
on Python 2 but not on 3.
comment:3 by , 10 years ago
I ran into a similar issue the other day with one of my 3rd party apps: https://github.com/MarkusH/django-dynamic-forms/issues/11 regarding unicode strings. And I think there is another issue somewhere that covers the byte string serialization in the migration write. I just cannot find it.
comment:4 by , 10 years ago
Component: | Migrations → Documentation |
---|---|
Triage Stage: | Unreviewed → Accepted |
Type: | Uncategorized → Cleanup/optimization |
Yes, charettes and bmispelon are right, this is a bug in the app, not a bug in migrations.
Migrations are correctly and consistently recording the verbose_name
as a bytestring when it is a bytestring, and as a text (unicode) string when it is a text string, and noticing the difference when it changes. (Note that migration files always have from __future__ import unicode_literals
, so an unmarked string is always a unicode string, on Python 2 or 3). That's the only sane behavior for migrations (and it was arrived at with a great deal of trial and error :-) ).
The app, per our https://docs.djangoproject.com/en/dev/topics/python3/#unicode-literals porting to Python 3 documentation, should also be using from __future__ import unicode_literals
so that the actual nature of the string does not change when moving between Python 2 and 3. (Or if they don't care to support 3.2, they could explicitly mark all strings with b
or u
if they prefer that for some reason.)
It would be great if we could provide some kind of hint or warning when creating the migration in this situation, but I don't see how we possibly can. I don't think we can reliably tell, when a migration is generated, whether the string we are seeing was explicitly marked or not, or whether the module containing it had from __future__ import unicode_literals
. And even if we could tell, we can't warn if not, because there's nothing wrong with unmarked strings and no __future__
import if the migration is being generated in a project that only supports Python 2 (or only 3).
I'm converting this to a Documentation issue, because the only thing I can see that we could do is add a note about the issue to this section of the docs: https://docs.djangoproject.com/en/dev/topics/migrations/#libraries-third-party-apps
comment:5 by , 10 years ago
Summary: | Migrations produced under Python 2 require fixing when run under Python 3 → Clarify how to properly support both Python 2 and 3 for third-party apps with migrations |
---|
comment:7 by , 10 years ago
I'm not convinced that playing whack-a-mole with conversions of specific model/field properties makes sense; I think you were right Tim to resist that approach in #23455. It's not clear that auto-converting _any_ such properties in migrations is safe: it introduces an unnecessary difference in behavior between the migration model state and the real models, and it's entirely possible that users will have code, particularly in Python 3, which is called from migrations in one way or another and which breaks if it gets a text string instead of a byte string.
And even if we don't care about that problem, it's still a game of whack-a-mole that doesn't end and only serves to longer conceal the real underlying problem, which will only be fixed by unicode_literals
.
comment:8 by , 10 years ago
I think when a doc fix is committed here, it should also revert the change from #23455 (before it gets into a release and causes spurious migrations to be generated). It's much simpler if we maintain "what you put in is what you get out" when it comes to text vs bytes in migration serialization.
comment:9 by , 10 years ago
Oh, looks like #23455 is already in 1.7.1 (odd that https://github.com/django/django/commit/e8a08514de6b97de1fce13b6c6b24cf72d1d60d9 doesn't list it as part of the 1.7.1 tag). And I don't think it resulted in spurious migrations generally, because on Python 3 it will usually have been a text string already, and on Python 2 unicode and bytes compare equal so the change doesn't cause a migration. But that also means we can revert it without causing spurious migrations.
comment:10 by , 10 years ago
Has patch: | set |
---|---|
Owner: | changed from | to
Status: | new → assigned |
Here's a PR for how I would address this: https://github.com/django/django/pull/3718
comment:11 by , 10 years ago
Triage Stage: | Accepted → Ready for checkin |
---|
comment:12 by , 10 years ago
Resolution: | → fixed |
---|---|
Status: | assigned → closed |
Hi,
Isn't the problem the fact that you're using "native strings" (ie byte strings on Python 2 but unicode strings on Python 3) for your
verbose_name
in your models?If you're using
verbose_name='asdf'
on Python2 and you don't havefrom __future__ import unicode_literals
at the top of your file then you are actually usingverbose_name=b'asdf'
so the migration is correct, no?