Opened 5 years ago

Closed 5 years ago

#30564 closed Bug (invalid)

Cannot create custom field that returns a queryset AND uses pre_save().

Reported by: Dan J Strohl Owned by: nobody
Component: Database layer (models, ORM) Version: dev
Severity: Normal Keywords:
Cc: Triage Stage: Unreviewed
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

I tried creating a custom field that would store a string that was a list of pks, then return that as a queryset. (yes, I could probably have done it using a many2many field, but I was (am) trying to reduce the number of db queries for this.

In any case, all seems to work OK until I implemented a def pre_save(self, model_instance, add) method on my custom field.

What seems to be happening is that when the queryset comes back from the pre-save, it is run through the SQLInsertCompiler.prepare_value, which checks the value to see if it has a "resolve_expression" attrubute, which it assumes is then a SQL expression and then tries checking for "contains_column_references"... The QuerySet object DOES have the "resolve_expression" attribute, but does NOT have the others that are in the SQL expression objects.

I suspect this doesn't come up much.

My trace back

Error
Traceback (most recent call last):

  File "C:\Users\strohl\Documents\Project\Who\who_db\who_db_tests\test_model_methods.py", line 101, in setUp
    self.M1 = MixinTest.objects.create(name='M1')
  File "C:\Users\strohl\Documents\VirtualEnv\Who\lib\site-packages\django\db\models\manager.py", line 82, in manager_method
    return getattr(self.get_queryset(), name)(*args, **kwargs)
  File "C:\Users\strohl\Documents\VirtualEnv\Who\lib\site-packages\django\db\models\query.py", line 422, in create
    obj.save(force_insert=True, using=self.db)
  File "C:\Users\strohl\Documents\Project\Who\who_db\models.py", line 132, in save
    super(MixinTest, self).save(*args, **kwargs)
  File "C:\Users\strohl\Documents\VirtualEnv\Who\lib\site-packages\django\db\models\base.py", line 741, in save
    force_update=force_update, update_fields=update_fields)
  File "C:\Users\strohl\Documents\VirtualEnv\Who\lib\site-packages\django\db\models\base.py", line 779, in save_base
    force_update, using, update_fields,
  File "C:\Users\strohl\Documents\VirtualEnv\Who\lib\site-packages\django\db\models\base.py", line 870, in _save_table
    result = self._do_insert(cls._base_manager, using, fields, update_pk, raw)
  File "C:\Users\strohl\Documents\VirtualEnv\Who\lib\site-packages\django\db\models\base.py", line 908, in _do_insert
    using=using, raw=raw)
  File "C:\Users\strohl\Documents\VirtualEnv\Who\lib\site-packages\django\db\models\manager.py", line 82, in manager_method
    return getattr(self.get_queryset(), name)(*args, **kwargs)
  File "C:\Users\strohl\Documents\VirtualEnv\Who\lib\site-packages\django\db\models\query.py", line 1186, in _insert
    return query.get_compiler(using=using).execute_sql(return_id)
  File "C:\Users\strohl\Documents\VirtualEnv\Who\lib\site-packages\django\db\models\sql\compiler.py", line 1331, in execute_sql
    for sql, params in self.as_sql():
  File "C:\Users\strohl\Documents\VirtualEnv\Who\lib\site-packages\django\db\models\sql\compiler.py", line 1275, in as_sql
    for obj in self.query.objs
  File "C:\Users\strohl\Documents\VirtualEnv\Who\lib\site-packages\django\db\models\sql\compiler.py", line 1275, in <listcomp>
    for obj in self.query.objs
  File "C:\Users\strohl\Documents\VirtualEnv\Who\lib\site-packages\django\db\models\sql\compiler.py", line 1274, in <listcomp>
    [self.prepare_value(field, self.pre_save_val(field, obj)) for field in fields]
  File "C:\Users\strohl\Documents\VirtualEnv\Who\lib\site-packages\django\db\models\sql\compiler.py", line 1205, in prepare_value
    if value.contains_column_references:
AttributeError: 'Query' object has no attribute 'contains_column_references'

My custom field:

class PkListField(Field):
    empty_strings_allowed = False
    description = "PK List"
    default_error_messages = {}

    def __init__(self, *args, max_recs=100, max_pk_size=5, sep=',', ordered=None, as_pks=True, model=None, manager='objects', pre_save_func=None, **kwargs):
        self.pk_list_model_manager = manager
        self.max_recs = max_recs
        self.max_pk_size = max_pk_size
        self.sep = sep
        self.as_pks = as_pks
        self.ordered = ordered
        self.pre_save_func = pre_save_func
        self._pk_list_model = model

        if not as_pks and model is None:
            raise AttributeError('Model must be specified if not returning as_pks')
        if as_pks and ordered is not None and ordered != 'pk':
            raise AttributeError('Data can only be ordered by PKs when returning as pks')

        kwargs['max_length'] = ((max_pk_size + 2) * max_recs) + 1
        kwargs['blank'] = True
        kwargs['default'] = sep + sep
        super().__init__(*args, **kwargs)

    @property
    def pk_list_model(self):
        if isinstance(self._pk_list_model, str):
            self._pk_list_model = apps.get_model(self._pk_list_model)
        return self._pk_list_model

    def deconstruct(self):
        name, path, args, kwargs = super().deconstruct()
        if self.pre_save_func is not None:
            kwargs['pre_save_func'] = self.pre_save_func
        if self.pk_list_model is not None:
            kwargs['model'] = self.pk_list_model
        if self.pk_list_model_manager is not 'objects':
            kwargs['manager'] = self.pk_list_model_manager
        if not self.as_pks:
            kwargs['as_pks'] = self.as_pks
        if self.max_recs != 100:
            kwargs['max_recs'] = self.max_recs
        if self.max_pk_size != 5:
            kwargs['max_pk_size'] = self.max_pk_size
        if self.sep != ',':
            kwargs['sep'] = self.sep
        if self.ordered is not None:
            kwargs['ordered'] = self.ordered
        del kwargs['max_length']
        del kwargs['blank']
        del kwargs['default']
        return name, path, args, kwargs

    def get_internal_type(self):
        return "CharField"

    def conv_str_to_python(self, value):
        def conv_obj(obj_in):
            if obj_in is None:
                return []
            elif isinstance(obj_in, int):
                return [obj_in]
            elif isinstance(obj_in, str):
                if self.sep in obj_in:
                    return conv_obj(obj_in.strip().strip(self.sep).split(self.sep))
                if obj_in:
                    return [int(obj_in.strip())]
                else:
                    return []
            elif issubclass(obj_in.__class__, models.Model):
                return [obj_in.pk]
            else:
                try:
                    tmp_ret = []
                    for item in obj_in:
                        tmp_ret.extend(conv_obj(item))
                    return tmp_ret
                except TypeError:
                    raise ValidationError('Invalid object type: %r' % obj_in)

        if issubclass(value.__class__, models.QuerySet):
            if self.as_pks:
                return list(value.values_list('pk', flat=True))
            else:
                return value
        if not value:
            value = []
        else:
            value = conv_obj(value)

        if self.as_pks:
            if self.ordered == 'pk':
                value.sort()
            return value
        tmp_mgr = getattr(self.pk_list_model, self.pk_list_model_manager)
        print('in conv_to_python: value= %r' % value)
        # if value:
        tmp_ret = tmp_mgr.filter(pk__in=value)
        if self.ordered:
            tmp_ret = tmp_ret.order_by(*make_list(self.ordered))
        print('in conv_to_python: returning filtered = %r' % tmp_ret)

        return tmp_ret
        # else:
        #     tmp_ret = tmp_mgr.none()
        #     print('in conv_to_python: returning none = %r' % tmp_ret)

        #     return tmp_ret

    def to_python(self, value):
        print(f'IN ({self.attname}) to_python({repr(value)})')
        tmp_ret = self.conv_str_to_python(value)
        print(f'OUT ({self.attname}) to_python({repr(tmp_ret)})')
        return tmp_ret

    def from_db_value(self, value, expression, connection):
        print(f'IN ({self.attname}) from_db_value({repr(value)})')
        tmp_ret = self.conv_str_to_python(value)
        print(f'OUT ({self.attname}) from_db_value({repr(tmp_ret)})')
        return tmp_ret

    def conv_to_str(self, value, wrapped=True):
        def conv_obj(obj_in):
            if obj_in is None:
                return []
            elif isinstance(obj_in, int):
                return [str(obj_in)]
            elif isinstance(obj_in, str):
                if self.sep in obj_in:
                    return conv_obj(obj_in.strip().strip(self.sep).split(self.sep))
                if obj_in:
                    return [obj_in.strip()]
                else:
                    return []
            elif issubclass(obj_in.__class__, models.Model):
                return [str(obj_in.pk)]
            else:
                try:
                    tmp_ret = []
                    for item in obj_in:
                        tmp_ret.extend(conv_obj(item))
                    return tmp_ret
                except TypeError:
                    raise ValidationError('Invalid object type: %r' % obj_in)

        if issubclass(value.__class__, models.QuerySet):
            value = list(value.values_list('pk', flat=True))

        value = self.sep.join(conv_obj(value))

        if wrapped:
            return self.sep + value + self.sep
        else:
            return value

    def get_db_prep_value(self, value, connection, prepared=False):
        print(f'IN ({self.attname}) get_db_prep_value({repr(value)})')
        if not prepared:
            value = self.get_prep_value(value)
        value = super(PkListField, self).get_db_prep_value(value, connection, prepared=True)
        print(f'OUT ({self.attname}) get_db_prep_value({repr(value)})')
        return value

    def get_prep_value(self, value):
        print(f'IN ({self.attname}) get_prep_value({repr(value)})')
        value = super().get_prep_value(value)
        value = self.conv_to_str(value)
        print(f'OUT ({self.attname}) get_prep_value({repr(value)})')
        return value

    def pre_save(self, model_instance, add):
        if self.pre_save_func is not None:
            value = super(PkListField, self).pre_save(model_instance, add)
            value = self.pre_save_func(value=value, field=self.attname, model=model_instance, add=add)
            setattr(model_instance, self.attname, value)
            print(f'({self.attname}) OUT pre_Save({repr(value)})')
            return value
        else:
            value = super().pre_save(model_instance, add)
            print(f'OUT ({self.attname}) pre_save({repr(value)})')
            return value

Change History (5)

comment:1 by Dan J Strohl, 5 years ago

Perhaps instead of using "hasattr(value, 'resolve_expression')", a "issubclass(value, <sql class>):" could be used?

comment:2 by Mariusz Felisiak, 5 years ago

Resolution: invalid
Status: newclosed
Summary: Cannot create custom field that returns a queryset AND uses pre_save()Cannot create custom field that returns a queryset AND uses pre_save().
Version: 2.2master

Thanks for the report, however it is not an issue in Django. It seems that you want to get help with your custom implementation.

If you're on PostgreSQL you can use ArrayField without implementing a custom field. If you're on a different database and you need to implement a custom field for that, then please use one of support channels.

if hasattr(value, 'resolve_expression') is the way that we prefer to check if something is an expression, it is used in many places.

Closing per TicketClosingReasons/UseSupportChannels.

comment:3 by Dan J Strohl, 5 years ago

Resolution: invalid
Status: closednew

Re-opening this, I think you mis-understood my point. I understand that there are other approaches I can take, and I am looking at them (and am happy for the suggestions). however, the issue as I reported it actually IS with Django. The issue is that by using if hasattr(value, 'resolve_expression'), you cannot deterministically tell the difference between a SQL expression and a Django QuerySet. perhaps there is a way around this that I haven't seen, in which case, I apologize for missing it. however by following the docs, when I return a QuerySet object, and when a pre_save method is used, the system raises an error since the QuerySet object matches your check, but is not an expression object. (or, if you ARE considering it an expression object for some reason, then the bug is that it does not have all of the OTHER required methods that the expected one does.).

I understand that using if hasattr is the preferred approach, and I am not saying that the only fix is to change that, you could so a second check after it, change the method name in the queryset, have an alternative path that doesn't check for the method... etc... (all of which seem like much more work than changing the "if" check), but there may be other issues that I don't see (I haven't looked deeply enough at the code to see what other issues that would cause). I don't know the flow or code well enough to say either way. I simply brought it up as a potential approach.

Either way though, the bug I am reporting is that when returning a django queryset, via a custom field with a pre_save, it raises an AttributeError. Since these are all normal Django objects, and I'm using documented django approaches, I see this as a bug. If the returned object was something I controlled, I would simply re-name the "resolve_expression" method so it didn't' get caught, but I can't (easily) do that for a django QuerySet.

I won't argue this again if you want to close this again, I'm not trying to get into a war here or anything. I think it's pretty much a corner case, but I do see it as a bug and wanted to make sure I was at least being clear.

comment:4 by Simon Charette, 5 years ago

The issue is that by using if hasattr(value, 'resolve_expression'), you cannot deterministically tell the difference between a SQL expression and a Django QuerySet.

FWIW even if the QuerySet class doesn't completely implement the expression API it partially behaves like one; that's what makes usage of QuerySet as subqueries work (e.g. .filter(foo__in=Foo.objects.filter(bar=1)). In that sense QuerySet is an SQL expression.

comment:5 by Mariusz Felisiak, 5 years ago

Resolution: invalid
Status: newclosed

Either way though, the bug I am reporting is that when returning a django queryset, via a custom field with a pre_save, it raises an AttributeError. Since these are all normal Django objects, and I'm using documented django approaches, I see this as a bug.

Your custom field implementation is quite complicated. Custom fields with pre_save() works properly because you can treat any Django's field as a custom field e.g. DateField. It seems (but it is hard to tell without pre_save_func()) that in your case pre_save() returns Query which is not expected by SQLInsertCompiler.prepare_value(), that's why IMO it is an issue in your implementation/approach.

Since these are all normal Django objects.

You cannot assume that Django internals will handle everything, even if these objects are defined by Django.

I think that return self.get_db_prep_value(value) in pre_save() in branch with pre_save_func() should fix your issue.

Note: See TracTickets for help on using tickets.
Back to Top