Code

Opened 4 years ago

Closed 9 months ago

#13252 closed New feature (fixed)

Use the natural key instead of the primary key when serializing

Reported by: SmileyChris Owned by: mtredinnick
Component: Core (Serialization) Version: master
Severity: Normal Keywords:
Cc: mrmachine, chris@…, Cerin, dev@…, charette.s@… Triage Stage: Accepted
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: yes
Easy pickings: no UI/UX: no

Description

This would allow existing objects to be updated and missing objects to be added across databases (e.g. when the autonumber primary keys may not match).

Attachments (7)

13252.diff (4.0 KB) - added by SmileyChris 4 years ago.
13252.2.diff (4.8 KB) - added by SmileyChris 4 years ago.
13252.3.diff (8.8 KB) - added by SmileyChris 4 years ago.
now with tests/docs
13252-natural-key-serializing-r14992.diff (8.8 KB) - added by mrmachine 4 years ago.
13252-natural-key-serializing-r14995.diff (39.7 KB) - added by mrmachine 4 years ago.
13252-natural-key-serializing-r16406.diff (39.1 KB) - added by claudep 3 years ago.
Updated to apply cleanly on current trunk
13252-natural-key-serializing-r17262.diff (39.7 KB) - added by claudep 3 years ago.
Updated to current trunk

Download all attachments as: .zip

Change History (35)

Changed 4 years ago by SmileyChris

comment:1 Changed 4 years ago by SmileyChris

  • Has patch set
  • Needs documentation set
  • Needs tests set
  • Patch needs improvement unset

This also fixes #11486 by proxy.

comment:2 Changed 4 years ago by russellm

  • Patch needs improvement set
  • Triage Stage changed from Unreviewed to Accepted

I accept the use case - that you should be able to say "update the existing object that matches the natural key, or create a new object if one doesn't exist". However I'm not convinced that providing a natural key for the PK value is the right way to do this.

Essentially, this approach means specifying the natural key data twice - once in the fields of the object, and once in the primary key value. This strikes me as especially un-DRY.

I haven't given this a lot of thought, but there might be a way of using the natural key without to specify it in the pk attribute. A natural key needs to be unique, so if a model defines a natural key, you shouldn't need to have the PK value specifically (or the natural key value as the primary key) in order to reconstruct the natural key for an object. This does leave the question of when the 'update or create' behavior should be invoked, but I think this will be less prone to error than reproducing the natural key.

Changed 4 years ago by SmileyChris

comment:3 Changed 4 years ago by SmileyChris

  • Patch needs improvement unset

Changed 4 years ago by SmileyChris

now with tests/docs

comment:4 Changed 4 years ago by SmileyChris

  • Needs documentation unset
  • Needs tests unset

comment:5 Changed 4 years ago by SmileyChris

  • Owner changed from nobody to mtredinnick

comment:6 Changed 4 years ago by slinkp

fwiw, patch 3 works for me in initial testing. Really useful feature!

comment:7 Changed 4 years ago by mrmachine

  • Cc mrmachine added

Changed 4 years ago by mrmachine

comment:8 Changed 4 years ago by mrmachine

Updated patch to work on r14992. Appears to work well locally. serializers_regress tests pass (on sqlite at least).

comment:9 Changed 4 years ago by Alex

  • Resolution set to fixed
  • Status changed from new to closed

Fixed by [14994].

comment:10 Changed 4 years ago by Alex

  • Resolution fixed deleted
  • Status changed from closed to reopened

Reopening.

Changed 4 years ago by mrmachine

comment:11 Changed 4 years ago by mrmachine

I've just updated the patch after feedback from freakboy3742 on IRC.

The new patch adds --natural-foreign and --natural-primary options to the dumpdata management command, maps the -n and --natural options to --natural-foreign and adds a note in the documentation that the old options are deprecated. This should retain backwards compatibility with anyone who was depending on data dumped with natural keys still containing a pk field for models that support natural keys. It also adds the corresponding use_natural_foreign_keys and use_natural_primary_keys arguments to serializers.serialize() and maps use_natural_keys to use_natural_foreign_keys.

I've left a comment in the source code where I think we should probably raise a deprecation warning (but not actually raised it) if that ends up being the desired behaviour.

One quirk is that the code beginning at line 30 in django/core/serializers/python.py appears to affect the order of fields in serialized output depending on how the data dict is constructed, even though the resulting data dict contains the same data. If I first create the data dict with model and fields items and then conditionally add a pk item (as the old patch did), then the serialized output is different to when I create the data dict with model, pk and fields items. Rather than change a tonne of hard coded tests, I construct the data dict the same way that it would have been before the patch which results in a couple of duplicated lines of code.

comment:12 Changed 3 years ago by lukeplant

  • Type set to New feature

comment:13 Changed 3 years ago by lukeplant

  • Severity set to Normal

comment:14 Changed 3 years ago by patchhammer

  • Easy pickings unset
  • Patch needs improvement set

13252-natural-key-serializing-r14995.diff fails to apply cleanly on to trunk

Changed 3 years ago by claudep

Updated to apply cleanly on current trunk

comment:15 Changed 3 years ago by claudep

  • Patch needs improvement unset
  • UI/UX unset

comment:16 Changed 3 years ago by SmileyChris

  • Patch needs improvement set

The new behaviour needs to be optional to be backwards compatible.

comment:17 Changed 3 years ago by acdha

  • Cc chris@… added

comment:18 Changed 3 years ago by Cerin

  • Cc Cerin added

Changed 3 years ago by claudep

Updated to current trunk

comment:19 Changed 3 years ago by brillgen

  • Cc dev@… added

comment:20 Changed 2 years ago by charettes

  • Cc charette.s@… added

comment:21 Changed 2 years ago by mrmachine

I've updated this patch again on a branch at GitHub.

https://github.com/thirstydigital/django/tree/tickets/13252-natural-key-serializing

SmileyChris, can you elaborate on making the new behaviour optional? Unless I'm missing something, it is? The --natural option maps to --natural-foreign (which behaves the same, with a pending deprecation warning) and we now have a new --natural-primary option as well.

I believe I've incorporated all the changes russellm requested and this should be RFC now.

comment:22 Changed 2 years ago by charettes

FWIW all tests pass on python 2.7.3rc1 sqlite3. I guess the backward compatibility or must opt-in issue SmileyChris was referring to is no more with two new flags?

comment:23 Changed 20 months ago by anonymous

I would be very pleased to see this feature enhancement reach the master branch asap 'cause I actually pretty much need it and have write myself some kind of the same..

comment:24 Changed 16 months ago by mrmachine

Rebased onto master and opened a pull request.

https://github.com/django/django/pull/894

comment:25 Changed 16 months ago by aaugustin

  • Status changed from reopened to new

comment:26 Changed 11 months ago by mrmachine

  • Patch needs improvement unset

Rebased onto master and changes for feedback integrated.

comment:27 Changed 11 months ago by timo

  • Patch needs improvement set

Left a couple more comments on the PR.

comment:28 Changed 9 months ago by Tim Graham <timograham@…>

  • Resolution set to fixed
  • Status changed from new to closed

In e527c0b6d808cb8e4bedf79ded3dc4ad1a7e17a8:

Fixed #13252 -- Added ability to serialize with natural primary keys.

Added --natural-foreign and --natural-primary options and
deprecated the --natural option to the dumpdata management
command.

Added use_natural_foreign_keys and use_natural_primary_keys
arguments and deprecated the use_natural_keys argument to
django.core.serializers.Serializer.serialize().

Thanks SmileyChris for the suggestion.

Add Comment

Modify Ticket

Change Properties
<Author field>
Action
as closed
as The resolution will be set. Next status will be 'closed'
The resolution will be deleted. Next status will be 'new'
Author


E-mail address and user name can be saved in the Preferences.

 
Note: See TracTickets for help on using tickets.