Deserializer cannot handle circular and forward references to object instances
|Reported by:||russellm||Owned by:||jacob|
|Severity:||Keywords:||JSON deferrable deserialization circular forward reference|
|Cc:||Triage Stage:||Design decision needed|
|Has patch:||yes||Needs documentation:||no|
|Needs tests:||no||Patch needs improvement:||yes|
The current deserializers are not able to represent circular references between objects, or forward references to objects. This is because the reconstruction of m2o and m2m relations calls on the database API to resolve the ID's mentioned in the serialization string into object instances which can be saved. Therefore, if an object has not been saved, it cannot be successfully referenced in a serialization file. This is a serious impediment to the use of serialization files for test fixtures and database migration (see ticket #2333).
To overcome this, I submit this modification to the deserializer that bypasses the database lookup step. The revised deserializer saves the primary key values defined in the serialization file directly into the object. This is the equivalent of doing:
Article.author_id = 3
Article.author = Author.objects.get(pk=3)
As well as being more flexible, it has the advantage of being faster (in that it requires less database lookups).
In order to work for m2m relations, this patch requires that patch #3389 be applied (this patch allows assignment of m2m sets using pk values).
This relatively simple approach works fine for all database backends except one - Postgres. Because Postgres has and enforces data consistency, it is not possible to INSERT an object with a foreign key value that does not have a corresponding entry in the foreign table.
To work around this problem, I have modified table declarations to make all cross-table references DEFERRABLE INITIALLY DEFERRED. This has the effect that the row constraints and consistency checks are not executed until the end of each transaction, rather than after every insert (although if transactions are not in use, there is no difference in behaviour).
In this way, if the deserialization process is placed inside a transaction, it is possible to deserialize an arbitrarily complex graph of foreign key relationships, and consistency is only checked once the entire graph has been deserialized.
On SQLite/MySQL, which don't have such strong consistency checks, there is no difference in behaviour or table definition.
The implementation presented here is for JSON only; if the idea is blessed, I will work up an analogous version for the XML deserializer.