Opened 5 years ago

Last modified 5 months ago

#13528 new Bug

db_table truncation is applied based on the properties of the default database

Reported by: russellm Owned by:
Component: Database layer (models, ORM) Version: master
Severity: Normal Keywords:
Cc: loic84 Triage Stage: Accepted
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: yes
Easy pickings: no UI/UX: no

Description

Reported by Peter Long in django-dev:

In django/db/models/options.py (line 54), the contribute_to_class method uses the default database connection to get an ops module that is used to truncate the db_table name. The options class isn't in a position to do this truncation, and there's no guarantee that the default database has an appropriate truncation module.

Attachments (1)

13528-1.diff (987 bytes) - added by claudep 2 years ago.
Possible fix

Download all attachments as: .zip

Change History (23)

comment:1 Changed 5 years ago by russellm

  • Needs documentation unset
  • Needs tests unset
  • Patch needs improvement unset
  • Triage Stage changed from Unreviewed to Accepted

comment:2 Changed 5 years ago by russellm

  • milestone changed from 1.2 to 1.3

This isn't a trivial thing to fix, and it only affects users that have:

  1. multiple databases
  2. from different vendors
  3. where the default database has a longer maximum allowed length than the other vendors database.
  4. and there are tables whose table names exceed the shorter max_length
  5. and the db_table can't be renamed.

For example, if you have a PostgreSQL default database (max_length=64), and an Oracle secondary database (max_length=30), and you can't keep your table names to under 30 characters, you will be affected by this bug. This is a sufficiently edge case problem that I'm going to bump from 1.2.

comment:3 Changed 5 years ago by russellm

Anssi Kaariainen reports that the same problem exists with the validate command.

comment:4 Changed 5 years ago by Alex

FWIW solving this for the validate command will be a lot easier. Just loop through each DB for each model, check if the model should be synced (with the router) and go from there.

comment:5 Changed 5 years ago by andrewgodwin

  • Owner changed from nobody to andrewgodwin
  • Status changed from new to assigned

I'm working on this - my proposed solution is thus:

  • Deprecate Options.db_table (PendingDeprecationWarning for 1.3)
  • Add Options.get_db_table(using=db_alias)
  • Change all the places Django uses db_table to use get_db_table

This should both solve the problem, and allow third-party apps to continue using db_table (as it's still valid if you've only got one database).

comment:6 Changed 5 years ago by russellm

See #13711 for an analogous problem with field.colname.

comment:7 Changed 5 years ago by andrewgodwin

Update on this ticket: There's several places where the SQL is (imho incorrectly) part of the Query and other classes, so I'm waiting for the second queryset refactor work to get along to the point where the patch can be on top of that, otherwise there's going to be massive conflicts.

comment:8 Changed 4 years ago by julien

  • Severity set to Normal
  • Type set to Bug

comment:9 Changed 4 years ago by jacob

  • milestone 1.3 deleted

Milestone 1.3 deleted

comment:11 Changed 3 years ago by aaugustin

  • UI/UX unset

Change UI/UX from NULL to False.

comment:12 Changed 3 years ago by aaugustin

  • Easy pickings unset

Change Easy pickings from NULL to False.

comment:13 Changed 3 years ago by akaariai

Ive been thinking that model._meta should contain backend specific sections. This should contain at least quoted db_table name and column names. In addition to solving this problem such an arrangement should speed up SQL generation. Quoting the table and column names can consume a significant amount of time.

So, instead of doing model._meta.db_table you would do something like model._meta[conn].db_table. However, this requires somewhat extensive changes in the sql/query.py and sql/compiler.py, so I guess this has to wait...

Changed 2 years ago by claudep

Possible fix

comment:14 Changed 2 years ago by claudep

  • Has patch set
  • Needs tests set

Using router might be a solution, however I don't know if this does negatively affect performance or not.

comment:15 Changed 17 months ago by anubhav9042

claudep: In the diff you added:

db = router.db_for_read(self.model)
Option object doesn't have an attribute model

comment:16 Changed 17 months ago by anubhav9042

claudep: Making self.model to self.concrete_model works fine.

Last edited 17 months ago by anubhav9042 (previous) (diff)

comment:17 Changed 15 months ago by andrewgodwin

  • Owner andrewgodwin deleted
  • Status changed from assigned to new

comment:18 Changed 12 months ago by anubhav9042

  • Owner set to anubhav9042
  • Status changed from new to assigned

comment:19 Changed 12 months ago by anubhav9042

  • Version changed from 1.2-beta to master

As discussed with Florian on IRC, we should check for minimum length that all available dbs allow and truncate accordingly.
PR: https://github.com/django/django/pull/2765

Last edited 12 months ago by anubhav9042 (previous) (diff)

comment:20 follow-up: Changed 12 months ago by timo

  • Needs tests unset
  • Patch needs improvement set

I don't think the approach of the latest PR is good. Let's try implementing what Andrew suggested in comment 5 of this ticket.

comment:21 Changed 12 months ago by loic84

  • Cc loic84 added

comment:22 in reply to: ↑ 20 Changed 12 months ago by anubhav9042

Replying to timo:

I don't think the approach of the latest PR is good. Let's try implementing what Andrew suggested in comment 5 of this ticket.

I tried to implement the above, I think that it is infeasible, atleast at the moment for the following reason:

  • There are many places for eg. as in query creation where db_table is used but there is no knowledge of alias or connection in order to call get_db_table. Even if try to increase the function parameters it would be very hacky to continue all the way up the function calls until an alias is found and pass it down all the way where it is needed.
  • In the backends there is no alias used so we don't know using the connection to call this function will be correct or not.
  • There are many places where still connection is used instead of connections[alias] so that might continue to create similar multi-db problems as the above.

I discussed this issue a little with Loic and this is what he suggested that we can note down where and why it is possible to change the suggested behaviour and where all the alias is not used and should be so that we can move on from this and try to fix that bigger problem instead of this one and that might take considerable amount of time.

comment:23 Changed 5 months ago by timgraham

  • Owner anubhav9042 deleted
  • Status changed from assigned to new
Note: See TracTickets for help on using tickets.
Back to Top