Code

Opened 3 years ago

Closed 3 years ago

Last modified 3 years ago

#16069 closed Bug (wontfix)

Object returned by get_or_create() is different from same object retrieved with get()

Reported by: ssteinerX@… Owned by: nobody
Component: Database layer (models, ORM) Version: 1.3
Severity: Normal Keywords: get_or_create
Cc: Triage Stage: Unreviewed
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX:

Description

If an IntegerField is initialized in the get_or_create 'defaults' parameter with a string, it is coerced into the database correctly, but the record returned by get_or_create returns the field as a string instead of an integer.

If you subsequently get the same record with get(), the field is correctly returned as an int.

In my opinion the records should be identical -- they're supposed to be the same record.

Attachments (0)

Change History (7)

comment:1 Changed 3 years ago by anonymous

  • Needs documentation unset
  • Needs tests unset
  • Patch needs improvement unset

To clarify, the object returned by get_or_create() is different depending on whether the database record was created.

IOW, if the record was created using the defaults, and the initial data to be used for IntegerField is a string, the field in the returned object will be a string, if the record was found and retrieved from the database, the field will be an integer.

This is very simple to demonstrate and is not covered by any of the tests I saw in the get_or_create section of the Django test suite.

comment:2 Changed 3 years ago by julien

  • Resolution set to wontfix
  • Status changed from new to closed

The value is coerced to the database as a convenience to avoid the query blowing up unnecessarily. However, at the Python level it is the developer's responsibility to either provide sane data or to coerce it into the type that they want. See #15947 for a related issue.

comment:3 Changed 3 years ago by anonymous

I disagree: the returned object should be the database record representation regardless of whether it previously existed or not.

That it coerces the data to the correct type is certainly handy and, once it is coerced, that should be the representation returned by get_or_create.

The current behaviour is unnecessarily confusing; the record should be the record.

I'm not in a position to force the issue, but I think "wont-fix" is the wrong answer; it's certainly simple enough and I'd be willing to do the work and provide tests.

comment:4 Changed 3 years ago by julien

Could you please provide some more context about your use case? For example, how are you retrieving the default values before feeding them to get_or_create()?

The defaults are used only when creating the object. This is equivalent to manually creating a new Python instance, setting those values directly to that instance, and saving it to the database. With that scenario, saving the object still preserves the values that were set at the Python level, and it is not expected for that Python object to have its fields' values automatically and implicitly cast to the "proper" format. So I don't see why it should be different using get_or_create().

I do see that this behaviour may be confusing, but if anything I'd just modify the documentation to clarify how defaults is used and behaves.

If you still disagree, please feel free to bring this up on the dev-list.

comment:5 Changed 3 years ago by ssteiner

The use case is actually quite common (for me, anyway).

I'm creating a new record based on some JSON I'm getting from an Ajax or POST request.

So, I have e.g.:

incomingJSON =
   {
   'theKey': "SOME UNIQUE KEY USUALLY CONTAINING THE DATETIME",
   'fromMode': "1",
   'toMode': "3" 
   }
theKey = incomingJSON.pop("theKey")
rec = get_or_create(theKey, defaults=incomingJSON)

if rec.fromMode == 1:     #this should work either way, but blows up 
                          #if the record is created since rec.fromMode = "1"
   ...
else:
   ...

For get_or_create to return two different representations, depending on what happened inside the function doesn't follow the "law of least astonishment" for me at all.

I don't care what happened inside; that's the whole point of the function.

The returned value should be the record as represented in the database. I shouldn't be able to get two different things depending on how the "black box" did its thing.

The result should be identical regardless of whether the record existed or not or whatever happened inside the box.

comment:6 Changed 3 years ago by julien

Thanks for clarifying your use case. I do see that one doesn't want to know what's happening inside the black box. However, providing some data that *you know* is not correctly formatted, and then expecting that those values will implicitly be correctly formatted as they come out of the black box isn't necessarily better practice IMO. FWIW, get_or_create() is a bit of a special case as its internals are actually documented in great detail: http://docs.djangoproject.com/en/dev/ref/models/querysets/#django.db.models.query.QuerySet.get_or_create

To be honest, I see this function as a "helper that's convenient in many cases but just be careful what you do with it".

So, I'm still unconvinced by your proposal. However, I don't want it to be just me standing in your way. So if you really feel that the behaviour ought to be changed, please don't hesitate to write a proof-of-concept patch and/or bring the question up on django-dev, as you'll reach a far larger audience there.

comment:7 Changed 3 years ago by jacob

  • milestone 1.4 deleted

Milestone 1.4 deleted

Add Comment

Modify Ticket

Change Properties
<Author field>
Action
as closed
as The resolution will be set. Next status will be 'closed'
The resolution will be deleted. Next status will be 'new'
Author


E-mail address and user name can be saved in the Preferences.

 
Note: See TracTickets for help on using tickets.