Opened 10 years ago

Closed 7 years ago

Last modified 7 years ago

#24367 closed Bug (fixed)

ORA-00918 (column ambiguously defined) when using a combination of slicing, distinct, and order_by

Reported by: skoot Owned by: nobody
Component: Database layer (models, ORM) Version: 1.7
Severity: Normal Keywords: ORA-00918 oracle queryset
Cc: Triage Stage: Unreviewed
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

The steps to reproduce the problem are fairly simple.

First, you need at least 3 models :

  • The first model has foreign keys to the other two
  • The other two have a field with a similar name.

Here is a minimalistic example (pseudo-code):

class A (Model):
  b = ForeignKey(B)
  c = ForeignKey(C)

class B (Model):
  name = CharField()

class C (Model)
  name = CharField()

So model A has foreign keys to models B and C, and models B and C both have a field called "name".

If we then select B's "name" field while ordering by C's "name" field in a distinct and sliced query, it fails with an ORA-00918 error (column ambiguously defined). In our example, the queryset would look something like this:

A.objects.values("b__name").order_by("c__name").distinct()[:10]
or
A.objects.select_related("b").order_by("c__name").distinct()[:10]

If you remove either the order_by, the distinct(), or the slicing, it works. If you have all 3, it breaks.

The generated SQL query looks something like this:

SELECT * FROM (SELECT ROWNUM AS "_RN", "_SUB".* FROM (SELECT DISTINCT "B"."NAME" AS "NAME", "C"."NAME" FROM "A" INNER JOIN "B" ON ( "A"."A_ID" = "B"."A_ID" ) INNER JOIN "C" ON ( "A"."C_ID" = "C"."C_ID" ) ORDER BY "C"."NAME" ASC) "_SUB" WHERE ROWNUM <= 10) WHERE "_RN" > 0

I've reproduced the bug both on django 1.4 and on django 1.7, and it is oracle specific (my application works fine on pgsql).

Change History (9)

comment:1 by skoot, 10 years ago

The bug is actually easier to spot when using this queryset:

A.objects.values("b__name", "c__name").order_by("c__name").distinct()[:10]

which generates this query:

SELECT * FROM (SELECT ROWNUM AS "_RN", "_SUB".* FROM (
  SELECT DISTINCT "B"."NAME" AS "NAME", "C"."NAME" AS Col1, "C"."NAME"
  FROM (...)
  ORDER BY "C"."NAME" ASC
) "_SUB" WHERE ROWNUM <= 10) WHERE "_RN" > 0

IMO, the broken code is in SQLCompiler.get_ordering():

                for table, col, order in self.find_ordering_name(field,
                        self.query.model._meta, default_order=asc):
                    if (table, col) not in processed_pairs:
                        elt = '%s.%s' % (qn(table), qn2(col))
                        processed_pairs.add((table, col))
                        if distinct and elt not in select_aliases:
                            ordering_aliases.append(elt)
                        result.append('%s %s' % (elt, order))
                        group_by.append((elt, []))

It's from django 1.4 code, but it's pretty similar in 1.7.

"elt not in select_aliases" will never be False, because get_columns() was called with the "with_aliases" parameter set to True, and so the entry in select_aliases for C's "name" is "Col1" and not "C"."NAME".

Which predicts that this query should works despite the bug:

A.objects.values("b__name", "c__name").order_by("b__name").distinct()[:10]

And it does: B's name being in first position, il will be aliased as "NAME", not "Col1". And so "elt not in select_aliases" will find it, and won't add the column again in the SELECT part of the statement.

I got a patch, but I don't like it. It solves my problem, but it doesn't prevent get_ordering form adding in the SELECT part of the statement a column that's already there. All it does is alias it unconditionally using a name (hopefully) won't be used anywhere else:

      ord = 1
      (...)
                        if distinct and elt not in select_aliases:
                            ordering_aliases.append("%s AS Ord%d" % (elt, ordn))
                            ordn += 1

Definitely not the way it should be fixed, but it may help whoever works on this understand what happens and why.

comment:2 by skoot, 10 years ago

I just checked and it's fixed in django 1.8.

comment:3 by Tim Graham, 10 years ago

Resolution: fixed
Status: newclosed

comment:4 by Kishor Pawar, 7 years ago

Resolution: fixed
Status: closednew

I have models like (result of inspectdb)

class Profile(models.Model):
    id = models.BigIntegerField(primary_key=True)
    name = models.CharField(max_length=1024, blank=True, null=True)

    class Meta:
        managed = False
        db_table = 'cpe_profile'
    
class ResourceMgmt(models.Model):
    id = models.BigIntegerField(primary_key=True)
    profile = models.ForeignKey(Profile, models.DO_NOTHING, related_name='cpe_mgmt_profile')

    class Meta:
        managed = False
        db_table = 'cpe_resource_mgmt'

Serializer like

class ResourceMgmtSerializer(serializers.ModelSerializer):
    profile_name = serializers.StringRelatedField(source='profile.name', read_only=True)
  
    class Meta:
        model = ResourceMgmt
        fields = (<tuple of fields including profile_name>)

View like

class RunningSchedules(generics.ListCreateAPIView):
        
    serializer_class = ResourceMgmtSerializer
    model = ResourceMgmt
    filter_backends = (filters.OrderingFilter, filters.SearchFilter,)
    ordering_fields = (<tuple of fields including profile_name as profile__name>)
    ordering = ('-schedule_start_time',)
    search_fields = ordering_fields

Now when I hit url like ?ordering=profile__name I get an error ORA-00918: column ambiguously defined.

comment:5 by Tim Graham, 7 years ago

Could you give a QuerySet that reproduces the problem rather than something that requires Django REST Framework?

in reply to:  5 comment:6 by Kishor Pawar, 7 years ago

Replying to Tim Graham:

Could you give a QuerySet that reproduces the problem rather than something that requires Django REST Framework?

Sorry for delayed reply. I did not get notification for your reply.

Query =

ResourceMgmt.objects.filter(is_active='Y')\
    .filter(schedule_next_run_time__lt=future)\ # future is calculated date
    .filter(schedule_end_time__gt=now + datetime.timedelta(minutes=1))\
    .extra(where=["schedule_next_run_time > sysdate - (schedule_frequency/1440)",
                 "owner_id IN ( \
                  select id from cpe_user \
                  where is_active='Y' \
                  connect by prior id = parent_id start with id = % s)"], 154)

the output looks like

{"count":4,"next":null,"previous":null,"results":[{"id":396511,"owner":385436,"owner_login":"sa@nv.com","resource_owner":"sa@nv.com","resource":396387,"resource_name":"20nov","profile":157,"profile_name":"some string here","run_remediate":"Y","schedule_start_time":"2017-11-20T06:05:00.578032Z","schedule_frequency":1,"schedule_create_time":"2017-11-20T06:02:35Z","schedule_end_time":"2017-11-20T06:05:00.578032Z","schedule_next_run_time":"2017-11-20T06:05:00.578032Z","is_active":"Y","run_cost_scan":"28.00","run_cost_remediate":"140.00","last_run":null,"created_by":"sa@nv.com","create_date":"2017-11-20T06:04:00Z","updated_by":"RAXAK3","update_date":"2017-11-20T06:05:00Z","overall_status":null,"overall_info":null,"resource_info":{"username":"raxak","os_version":"v6.3.9600","os_architecture":"6.3.9600","os_name":"some string here","ip_addr":"103.157.4.217"}},{"id":380200,"owner":379468,"owner_login":"sa17@nov.com","resource_owner":"sa17@nov.com","resource":379903,"resource_name":"WIN-EOVV3Q2I5KK","profile":157,"profile_name":"some string here","run_remediate":"N","schedule_start_time":"2017-11-17T09:49:03.235260Z","schedule_frequency":1,"schedule_create_time":"2017-11-17T09:46:38Z","schedule_end_time":"2017-11-17T09:49:03.235260Z","schedule_next_run_time":"2017-11-17T09:49:03.235260Z","is_active":"Y","run_cost_scan":"28.00","run_cost_remediate":"140.00","last_run":null,"created_by":"sa17@nov.com","create_date":"2017-11-17T09:48:03Z","updated_by":"RAXAK3","update_date":"2017-11-17T09:49:00Z","overall_status":null,"overall_info":null,"resource_info":{"username":"raxak","os_version":"v6.3.9600","os_architecture":"6.3.9600","os_name":"Microsoft Windows Server 2012 R2","ip_addr":"13.126.155.240"}},{"id":367211,"owner":354149,"owner_login":"sa@amz.com","resource_owner":"sa@amz.com","resource":359474,"resource_name":"2012_1","profile":359771,"profile_name":"prof1","run_remediate":"N","schedule_start_time":"2017-11-16T08:47:24.278393Z","schedule_frequency":1,"schedule_create_time":"2017-11-16T08:45:00Z","schedule_end_time":"2017-11-16T08:47:24.278393Z","schedule_next_run_time":"2017-11-16T08:47:24.278393Z","is_active":"Y","run_cost_scan":"0.50","run_cost_remediate":"2.50","last_run":null,"created_by":"sa@amz.com","create_date":"2017-11-16T08:46:24Z","updated_by":"RAXAK3","update_date":"2017-11-16T08:47:00Z","overall_status":null,"overall_info":null,"resource_info":{"username":"raxak","os_version":"v6.3.9600","os_architecture":"6.3.9600","os_name":"Microsoft Windows Server 2012 R2","ip_addr":"13.127.3.77"}},{"id":366341,"owner":354149,"owner_login":"sa@amz.com","resource_owner":"sa@amz.com","resource":359474,"resource_name":"2012_1","profile":157,"profile_name":"some string here","run_remediate":"Y","schedule_start_time":"2017-11-16T08:29:24.543755Z","schedule_frequency":1,"schedule_create_time":"2017-11-16T08:27:00Z","schedule_end_time":"2017-11-16T08:29:24.543755Z","schedule_next_run_time":"2017-11-16T08:29:24.543755Z","is_active":"Y","run_cost_scan":"28.00","run_cost_remediate":"140.00","last_run":null,"created_by":"sa@amz.com","create_date":"2017-11-16T08:28:24Z","updated_by":"RAXAK3","update_date":"2017-11-16T08:29:00Z","overall_status":null,"overall_info":null,"resource_info":{"username":"raxak","os_version":"v6.3.9600","os_architecture":"6.3.9600","os_name":"some string here","ip_addr":"13.127.3.77"}}]}

comment:7 by Simon Charette, 7 years ago

Resolution: fixed
Status: newclosed

Hey Kishor, could you try reproducing without using extra? I suspect this is the origin of your issue here as id could be ambiguous in your subquery's context.

The sysdate - (schedule_frequency/1440) could be replaced by something along schedule_next_run_time__gt=(Func(template='sysdate') - (F('schedule_frequency')/1440)) and the owner_id IN part by owner_id__in=....

I'll reset this ticket as fixed because while you have the same symptoms as the initial report (ORA-00918 (column ambiguously defined)) you are not using slicing, distinct, or order_by which is what this ticket is about.

If you manage to reproduce a similar failure without using extra() please submit a new ticket with reproduction steps.

in reply to:  7 comment:8 by Kishor Pawar, 7 years ago

Hey Simon, I got what you are saying. I see now what I missed. Also , thank you for improvisation tips. I will update you with results of your suggestion.

Replying to Simon Charette:

Hey Kishor, could you try reproducing without using extra? I suspect this is the origin of your issue here as id could be ambiguous in your subquery's context.

The sysdate - (schedule_frequency/1440) could be replaced by something along schedule_next_run_time__gt=(Func(template='sysdate') - (F('schedule_frequency')/1440)) and the owner_id IN part by owner_id__in=....

I'll reset this ticket as fixed because while you have the same symptoms as the initial report (ORA-00918 (column ambiguously defined)) you are not using slicing, distinct, or order_by which is what this ticket is about.

If you manage to reproduce a similar failure without using extra() please submit a new ticket with reproduction steps.

in reply to:  7 comment:9 by Kishor Pawar, 7 years ago

Hey Simon,

Thank you for pointer. When I tried without extra clause, It worked. But id wasn't the issue, issue was owner_id. So I added the dotted notation like cpe_resource_mgmt.owner_id and it worked perfect.

Replying to Simon Charette:

Hey Kishor, could you try reproducing without using extra? I suspect this is the origin of your issue here as id could be ambiguous in your subquery's context.

The sysdate - (schedule_frequency/1440) could be replaced by something along schedule_next_run_time__gt=(Func(template='sysdate') - (F('schedule_frequency')/1440)) and the owner_id IN part by owner_id__in=....

I'll reset this ticket as fixed because while you have the same symptoms as the initial report (ORA-00918 (column ambiguously defined)) you are not using slicing, distinct, or order_by which is what this ticket is about.

If you manage to reproduce a similar failure without using extra() please submit a new ticket with reproduction steps.

Note: See TracTickets for help on using tickets.
Back to Top