Opened 13 years ago
Last modified 6 years ago
#16508 assigned New feature
Provide real support for virtual fields
Reported by: | Vlastimil Zíma | Owned by: | Asif Saifuddin Auvi |
---|---|---|---|
Component: | Database layer (models, ORM) | Version: | dev |
Severity: | Normal | Keywords: | |
Cc: | pirosb3, michal.petrucha@… | Triage Stage: | Accepted |
Has patch: | no | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description
I have created a virtual field based on GenericForeignKey
, but I found out that it can not be properly used for model initiation.
Altered example from my test:
class Managing(models.Model): shortcut = models.CharField(max_length=20) name_cs = TranslationProxyField('name', 'cs', False) class ManagingTranslation(models.Model): name = models.CharField(max_length=20) class TranslationProxyField(object): """ Descriptor that is a virtual field that provides access to translations. """ def __init__(self, field_name, language_code=None, fallback=False): self._field_name = field_name self._language_code = language_code self._fallback = fallback names = [field_name] if language_code is not None: names.append(language_code.replace('-', '_')) if fallback: names.append(FALLBACK_FIELD_SUFFIX) self.name = '_'.join(names) def contribute_to_class(self, cls, name): if self.name != name: raise ValueError('Field proxy %s is added to class under bad attribute name.' % self) self.model = cls cls._meta.add_virtual_field(self) # Connect myself as the descriptor for this field setattr(cls, name, self) # This results in Type error from Model.__init__, because of extra kwarg Managing(shortcut='name', name_cs='jméno')
I tried to use pre_init
signal, but I found out that I will not get instance
argument, so I can not use that. This way is truly usable only for virtual fields which are like GenericForeignKey
built on real model fields of object. When it gets more complicated, it can not be used. Confusing on this path is that instance
is in providing_args
of pre_init
signal.
Currently I solve this by quite a hack which is deriving of TranslationProxyField
from property
instead of object
. This works because of setting properties is allowed in Model.__init__
.
In my opinion, a stable solution would be either providing an instance
argument in pre_init
signal or enable setting of virtual field in model initiation. Second option would also enable removal of pre_init
connection from GenericForeignKey
.
Exact examples can be found at github: https://github.com/vzima/django-multilingual-ds9
Change History (23)
comment:1 by , 13 years ago
Triage Stage: | Unreviewed → Accepted |
---|
comment:2 by , 12 years ago
Type: | Uncategorized → New feature |
---|
comment:3 by , 10 years ago
Cc: | added |
---|---|
Owner: | changed from | to
Status: | new → assigned |
Hi,
I will be taking this ticked under consideration as we are currently planning a refactor of _meta.
Dan
comment:5 by , 10 years ago
I'm planning on tackling this issue. The initial plan is to:
- Change ForeignObjectRel subclasses to real field instances. (For example, ForeignKey generates a ManyToOneRel in the related model). The Rel instances are already returned from get_field(), but they aren't yet field subclasses.
- Allow direct usage of ForeignObjectRel subclasses. In certain cases it can be advantageous to be able to define reverse relations directly. For example, see https://github.com/akaariai/django-reverse-unique.
- Partition ForeignKey to virtual relation field, and concrete data field. The former is the model.author, the latter model.author_id's backing implementation.
- Consider other cases where true virtual fields are needed.
My initial try will be a class structure like:
Field RelationField (base class for ForeignKey, ManyToOneRel, maybe also ManyToManyField) ConcreteField (base class for concrete data fields, for example IntegerField, ...)
I expect this to cause a good amount of breakage in 3rd party apps. I'm going to use the old good "lets first break things, then fix those cases that are complained about".
comment:6 by , 10 years ago
Thanks akaariai,
If you need any help on my side (given the work I did on Meta) please let me know.
Daniel
comment:7 by , 10 years ago
Could I propose the following class structure:
Field VirtualField RelationField ConcreteField
The addition of an intermediary VirtualField
class would be for future support of a CompositeField
class, which isn't logically a RelationField
, but is logically virtual (and subfields of the CompositeField
would be limited to concrete fields for simplicity of implementation).
comment:8 by , 10 years ago
I am also interested in this ticket.
See this other discussion on Django Developers: https://groups.google.com/forum/#!topic/django-developers/_FmJRK3sJGs].
What's the proposed definition of VirtualField
?
I have a proposal:
"A virtual field is a model field which it correlates to one or multiple concrete fields, but doesn't add or alter columns in the database."
Best regards
Federico
comment:9 by , 10 years ago
I believe in the _meta refactor we simply defined it as "doesn't add or alter columns in the database.", something like field.column = None. It probably does correlates to one or multiple concrete fields, but it doesn't have to.
Also, one thing to note is that we don't want to distinguish (too much) between virtual and non virtual on the higher levels of the API (like Model Forms). Model Forms shouldn't care whether a field has a column in the database or not. So in that case, it doesn't make sense to have a _meta.virtual_fields like we used to do. But of course the ORM and migrations are going to want to know that they should ignore these fields.
comment:10 by , 9 years ago
Cc: | added |
---|
I have recently started to dust off my old code from GSoC 2013, where my first step was pretty much this. Let me summarize my take on this. (I'm sorry, this is going to be a bit long, but I'm trying to organize my thoughts on how to approach this entire thing.)
My implementation from a few years ago matches the description from comment:5 (as far as the conceptual handling of ForeignKey
goes), and also comment:7 (when it comes to the actual class hierarchy). And yeah, comment:9 is also mostly spot-on, except for the last sentence – the ORM or migrations certainly can't ignore ForeignKey
once it becomes virtual; instead, migrations will have to hide any auto-generated auxiliary concrete fields to make migrations backwards-compatible.
I think that before we start working on ForeignKey
and break the entire ORM, it would be kind of nice to just get the more general VirtualField
in. The code for just the VirtualField class without any changes to ForeignKey
and friends can be seen in this Github comparison. After that I implemented the virtual ForeignKey
change, but up to that point, it was pretty isolated and self-contained.
Key observations from my implementation: GenericForeignKey
and GenericRelation
are handled specially in
- model instance initialization (using the
pre_init
signal), - they are kind of special cased in deletion cascading, and
- they are cloned in every model subclass.
Number 1 was fixed by making Model.__init__
consider virtual fields when processing keyword arguments. This made it possible to remove the usage of pre_init
from GenericForeignKey
.
Number 2 I'm not entirely certain about right now; I haven't yet found the time to fully understand how this is handled after Daniel's _meta
refactor, but there used to be a few caches of arbitrary collections of relationship objects. I handled this by including GenericRelation
in those collections, which meant it was no longer necessary to special-case _meta.virtual_fields
just to reach GenericRelation
s. I guess these days this could be achieved by making sure GenericRelation
appears in _meta.related_objects
, but I'd need to study this a bit further to be certain. (Daniel, do you have any insight here? Can we even to that now without breaking backwards compatibility?)
Number 3 is a peculiarity of GenericRelation
. The problem here is that it needs a reference to the current model in order to use the correct content type, which means it really does need to be cloned in every subclass, even in concrete model inheritance (otherwise it would use the parent's content type).
Right now, all virtual fields are cloned in concrete inheritance, but that doesn't make any sense for other virtual fields, like ForeignKey
, or CompositeField
, or whatever. Those should be handled just like any other field, as far as inheritance is concerned. This doesn't really have anything to do with GenericRelation
being a virtual field, instead, a more fitting name for this particular concept would be something like “private” field (or “local private”, as I named it in my 2013's branch). After all, we already do have support for other virtual fields that do not land in _meta.virtual_fields
. (ForeignObject
, although AFAIK it is not a public API, but still.)
With all that being said, I would like to get the ball rolling by offering PR 6329 which implements number 3 above, the rename of _meta.virtual_fields
to _meta.private_fields
.
In the meantime I'll go investigate what to do about the rest of the changes in that range of commits I linked above.
comment:11 by , 9 years ago
Big +1 to introducing these changes in smaller chunks.
For composite fields we are going to have a design problem for mutable composite values. For example, assume User.full_name is a FullName instance. What should happen in the following cases, and how to implement the wanted behavior?
u = User() u.full_name = FullName('Anssi', 'Kääriäinen') u.full_name.first_name = 'Matti' u.first_name == u.full_name.first_name OUT: ??? u1 = User() u2 = User() u1.full_name = FullName('Anssi', 'Kääriäinen') u2.full_name = u1.full_name u2.full_name.first_name = 'Matti' u1.first_name == u2.first_name OUT: ???
This is a nasty issue, and I think we should avoid it by saying that for now Django doesn't support mutable composite field values (that is, the composite field's value should be a tuple). Later on we can then decide how to tackle this issue properly.
comment:12 by , 9 years ago
Heh, this is skipping a bit further ahead; I would prefer to finish the work on ForeignKey
before moving on to CompositeField
, and ForeignKey
itself will take quite some time. However, since you have already brought it up...
As far as I recall, I implemented CompositeField
as a descriptor that dynamically constructed a CompositeValue
on read, and unpacked any tuple / iterable into individual values on write, but that was it; the CompositeValue
itself was just a namedtuple, so it didn't act as a proxy for the value on the model instance itself. So in your example, u1.full_name.first_name
would be a read-only item.
I think that is a reasonable restriction; trying to turn a composite value into some kind of proxy object might be a fun exercise, but it would require some amount of descriptor hacking, and I honestly don't see enough value in it that would make it worth the complexity.
comment:13 by , 9 years ago
Sorry for bringing this up :)
What you describe seems like a sane default. We can offer hooks for users to do all the dirty work in case they want real classes as composite values, but I don't see a need for Django to include this stuff out of the box.
comment:15 by , 9 years ago
I tried to dig into item number 1 from comment:10, but I've hit a snag. I created PR 6477 with the work-in-progress patch.
The patch breaks the tests for GenericRelatedObjectManager.add
, and the reason is that before the change, instance_pre_init
only sets the values of the content type and primary key, but it does not cache the target instance in cache_attr
. However, with the change, the regular __set__
is called, which caches the instance as well.
The problem with this is that GenericRelatedObjectManager.add
only assigns the values of the content type, and the primary key, but as far as I can see, GenericRelatedObjectManager
does not hold any reference to the GenericForeignKey
field on the other side of the relationship, not even its name (either of the two would make this problem go away). It does not really need to, as long as it only operates with the basic values, but it makes it impossible for me to invalidate the cached instance.
Arguably, this might be considered a bug in the current implementation, since it is always possible to change the raw primary key of the target, but any subsequent reads of the target object will still return the old instance.
I can see the following ways out:
- Make
GenericRelation
hold a reference to the correspondingGenericForeignKey
. This would mean that a lack ofGenericForeignKey
on the other side would become a hard failure; right now, there's a check that reports this situation as an error, but in theory, theGenericRelation
should work just fine anyway, which would no longer be the case. - Reorganize
GenericForeignKey.__get__
a little bit to check that the cached instance matches the raw values. This would mean that even if there is already a cached instance, the content type would have to be looked up on every read. This adds a bit of overhead, although at least content types are cached, so it shouldn't actually hit the database most of the time.
Presonally, I would probably prefer option 2, because it solves the more general problem of stale cached instances, but if anyone disagrees, I'd be happy to implement the first option, too.
comment:16 by , 9 years ago
I implemented option 2, and pushed it to the PR; it seems to work all right. All existing tests pass, but I'll have to write some additional tests specifically for the cached object invalidation, as well as a mention of the changes in the release notes.
comment:17 by , 9 years ago
I'm now happy with the state of PR 6477; I rebased and squashed it into two commits. I consider it ready for final review.
comment:20 by , 9 years ago
Owner: | removed |
---|---|
Status: | assigned → new |
comment:23 by , 6 years ago
Owner: | set to |
---|---|
Status: | new → assigned |
Virtual fields are also discussed shortly by Michal Petrucha for the Composite fields API [1] and in further responses; it seems like there are more use cases, but I would strongly suggest taking it up on the django-developers mailing list to further discuss it.
[1]: https://groups.google.com/d/msg/django-developers/rF79c8z65cQ/D2ZM1yZPX4cJ