Code

Opened 3 years ago

Closed 15 months ago

Last modified 15 months ago

#15586 closed Bug (wontfix)

Fields with missing columns populated when querying on PostgreSQL

Reported by: sebzur Owned by: nobody
Component: Documentation Version: 1.2
Severity: Normal Keywords: postgresql
Cc: lrekucki@…, botondus@… Triage Stage: Accepted
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

With PostgreSQL as backend, when I use CharField field and name it "name", i.e.:

class MyModel(models.Model):
   name = models.CharField(max_length=255, ....)

I can reference to a 'name' field even though the appropriate db column is missing... (I forgot to create it
manually that's how I've noticed that strange issue)

The issue went out when I updated the model (no 'name' field before), I did not create a DB column, but that did not
stop Django from filling the field with a value! On .save() everything is OK and expected DatabseError
exception is raised by psycopg2.

The value stored in the field looks like a tuple, namely its: '(A, B, C)' where A-C are integers.

Attachments (1)

test.sql (270 bytes) - added by lrekucki 3 years ago.
SQL query to check against.

Download all attachments as: .zip

Change History (19)

comment:1 Changed 3 years ago by ramiro

  • Needs documentation unset
  • Needs tests unset
  • Patch needs improvement unset
  • Resolution set to wontfix
  • Status changed from new to closed

I t is not completely clear to me what is the behavior you expect here.

If what you are suggesting is that Django flags the divergence between the model design and its corresponding database table structure as soon as possible (e.g. at Python model field population time) it would be a waste of resources (it would need to introspect the DB table every time). IMHO current behavior is a good compromise: Let the DB-API driver/database engine to flag the error when actual interaction with the DB is performed. Remember, we are using a RDBMS plus an ORM, not a OODB, there are two separate realms, the model objects one and the database one.

If what you are asking for is database schema evolution so it follows the model design, it is a feature that hopefully will be added in a future version and there are third party tools that implement that now (but AFAIK none of them perform automatic schema evolution so they won't be of help in cases where the developer wants to get away with changing the model design at will and hoping the software will take care of the database changes).

See also http://docs.djangoproject.com/en/1.2/faq/models/#if-i-make-changes-to-a-model-how-do-i-update-the-database

Version 0, edited 3 years ago by ramiro (next)

comment:2 Changed 3 years ago by sebzur

Of course testing the synchronization between the Django models & db schema every time would be a stupid idea - that was not my point.

Also, I was not asking for automated schema evolution, which is quite a tricky issue and in most cases, for complex DB schema updates, You just have to handle
it manually on your own.

I was just curious HOW 'SELECT' statement could retrieve something from database, when there was no column to get from?

Once again I try to describe the proces:

  • model MyMode was created and db was synced
  • to model was added a CharField and I did not create corresponding DB table on my own
  • I performed qs = MyMode.objects.all(), that I had something in: qs[0].name, qs[1].name and so on ... which I do not understand how this field was field when there was no data in DB? It looks to me like an unexpected behavior....

I would appreciate Your further comments.

comment:3 Changed 3 years ago by sebzur

sorry I made a mistake - of course in second point of process workflow should be:

to model was added a CharField and I did not create corresponding DB *column* on my own

comment:4 Changed 3 years ago by lrekucki

  • Keywords postgresql added
  • Resolution wontfix deleted
  • Status changed from closed to reopened
  • Summary changed from Field populated when no DB column present to Fields with missing columns populated when querying on PostgreSQL
  • Triage Stage changed from Unreviewed to Design decision needed

This is a PostgreSQL related issue. The query looks something like this:

SELECT "ticket15586_mymodel"."id", "ticket15586_mymodel"."name" FROM "ticket15586_mymodel"

And with absence of name column on the table, one would expect it to fail (and it does on SQLite, for example). But PostgreSQL seems to let you select the table as a ROW object (or something like that - I'm not an expert on that). So the following queries are valid:

SELECT "ticket15586_mymodel" FROM "ticket15586_mymodel";
SELECT "ticket15586_mymodel".name FROM "ticket15586_mymodel";
SELECT ("ticket15586_mymodel")::name FROM "ticket15586_mymodel";

With the second being treated as the third, which is a type cast. Don't know if there is a way to make it unambiguous that we want a table column, not some weird type thing. I'm going to reopen, so maybe someone with more PostgreSQL-related knowledge can fix this.

Last edited 3 years ago by lrekucki (previous) (diff)

comment:5 Changed 3 years ago by lrekucki

  • Cc lrekucki@… added
  • Component changed from Uncategorized to Database layer (models, ORM)

comment:6 Changed 3 years ago by bberes

  • Cc botondus@… added

comment:7 Changed 3 years ago by ramiro

  • Resolution set to needsinfo
  • Status changed from reopened to closed

I can't reproduce this. Tried with Django trunk and 1.2.5

  • PostgresSQL version is 8.3.14
  • psycopg2 version is 2.0.7-4
# t15586/models.py

from django.db import models

class MyModel(models.Model):
    name = models.CharField(max_length=255)
$ ./manage.py dbshell
Welcome to psql 8.3.14, the PostgreSQL interactive terminal.

Type:  \copyright for distribution terms
       \h for help with SQL commands
       \? for help with psql commands
       \g or terminate with semicolon to execute query
       \q to quit

dbname=> select * from t15586_mymodel;
 id
----
(0 rows)

dbname=>

$ ./manage.py shell
Python 2.5.2 (r252:60911, Jan 24 2010, 14:53:14)
Type "copyright", "credits" or "license" for more information.

IPython 0.8.4 -- An enhanced Interactive Python.
?         -> Introduction and overview of IPython's features.
%quickref -> Quick reference.
help      -> Python's own help system.
object?   -> Details about 'object'. ?object also works, ?? prints more.

In [1]: from django.db import connection

In [2]: from t15586.models import MyModel

In [3]: MyModel.objects.all()
Out[3]: ---------------------------------------------------------------------------
DatabaseError                             Traceback (most recent call last)

dtest07/<ipython console> in <module>()

/var/lib/python-support/python2.5/IPython/Prompts.pyc in __call__(self, arg)
    549
    550             # and now call a possibly user-defined print mechanism
--> 551             manipulated_val = self.display(arg)
    552
    553             # user display hooks can change the variable to be stored in

/var/lib/python-support/python2.5/IPython/Prompts.pyc in _display(self, arg)
    575             return IPython.generics.result_display(arg)
    576         except TryNext:
--> 577             return self.shell.hooks.result_display(arg)
    578
    579     # Assign the default display method:

/var/lib/python-support/python2.5/IPython/hooks.pyc in __call__(self, *args, **kw)
    133             #print "prio",prio,"cmd",cmd #dbg
    134             try:
--> 135                 ret = cmd(*args, **kw)
    136                 return ret
    137             except ipapi.TryNext, exc:

/var/lib/python-support/python2.5/IPython/hooks.pyc in result_display(self, arg)
    163
    164     if self.rc.pprint:
--> 165         out = pformat(arg)
    166         if '\n' in out:
    167             # So that multi-line strings line up with the left column of

/usr/lib/python2.5/pprint.pyc in pformat(self, object)
    109     def pformat(self, object):
    110         sio = _StringIO()
--> 111         self._format(object, sio, 0, 0, {}, 0)
    112         return sio.getvalue()
    113

/usr/lib/python2.5/pprint.pyc in _format(self, object, stream, indent, allowance, context, level)
    127             self._readable = False
    128             return
--> 129         rep = self._repr(object, context, level - 1)
    130         typ = _type(object)
    131         sepLines = _len(rep) > (self._width - 1 - indent - allowance)

/usr/lib/python2.5/pprint.pyc in _repr(self, object, context, level)
    193     def _repr(self, object, context, level):
    194         repr, readable, recursive = self.format(object, context.copy(),
--> 195                                                 self._depth, level)
    196         if not readable:
    197             self._readable = False

/usr/lib/python2.5/pprint.pyc in format(self, object, context, maxlevels, level)
    205         and whether the object represents a recursive construct.
    206         """
--> 207         return _safe_repr(object, context, maxlevels, level)
    208
    209

/usr/lib/python2.5/pprint.pyc in _safe_repr(object, context, maxlevels, level)
    290         return format % _commajoin(components), readable, recursive
    291
--> 292     rep = repr(object)
    293     return rep, (rep and not rep.startswith('<')), False
    294

django/db/models/query.py in __repr__(self)
     65
     66     def __repr__(self):
---> 67         data = list(self[:REPR_OUTPUT_SIZE + 1])
     68         if len(data) > REPR_OUTPUT_SIZE:
     69             data[-1] = "...(remaining elements truncated)..."

django/db/models/query.py in __len__(self)
     80                 self._result_cache = list(self.iterator())
     81         elif self._iter:
---> 82             self._result_cache.extend(list(self._iter))
     83         return len(self._result_cache)
     84

django/db/models/query.py in iterator(self)
    269         model = self.model
    270         compiler = self.query.get_compiler(using=db)
--> 271         for row in compiler.results_iter():
    272             if fill_cache:
    273                 obj, _ = get_cached_row(model, row,

django/db/models/sql/compiler.py in results_iter(self)
    675         fields = None
    676         has_aggregate_select = bool(self.query.aggregate_select)
--> 677         for rows in self.execute_sql(MULTI):
    678             for row in rows:
    679                 if resolve_columns:

django/db/models/sql/compiler.py in execute_sql(self, result_type)
    730
    731         cursor = self.connection.cursor()
--> 732         cursor.execute(sql, params)
    733
    734         if not result_type:

django/db/backends/util.py in execute(self, sql, params)
     13         start = time()
     14         try:
---> 15             return self.cursor.execute(sql, params)
     16         finally:
     17             stop = time()

django/db/backends/postgresql_psycopg2/base.py in execute(self, query, args)
     42     def execute(self, query, args=None):
     43         try:
---> 44             return self.cursor.execute(query, args)
     45         except Database.IntegrityError, e:
     46             raise utils.IntegrityError, utils.IntegrityError(*tuple(e)), sys.exc_info()[2]

DatabaseError: column t15586_mymodel.name does not exist
LINE 1: SELECT "t15586_mymodel"."id", "t15586_mymodel"."name" FROM "...
                                      ^

comment:8 Changed 3 years ago by lrekucki

Could you try with PostgreSQL >= 8.4 ? I was using PostgreSQL 8.4.7 and psycopg2 2.3.2. Only reference to a similar problem I could find is: http://archives.postgresql.org/pgsql-general/2010-05/msg01126.php.

Changed 3 years ago by lrekucki

SQL query to check against.

comment:9 Changed 3 years ago by sebzur

I was using Django 1.2.3, Postgres 9.0.1, psycopg2 v. 2.2.2 (dt dec ext pq3). In a few hours I'll try once again, with a raw SQL query and I'll report the results. I'll try to test it also with Postgres 8.

comment:10 Changed 3 years ago by ramiro

  • Resolution needsinfo deleted
  • Status changed from closed to reopened

I can reproduce this with PostgreSQL 8.4.7. Reopening the ticket.

comment:11 Changed 3 years ago by adrian

This is an interesting one! I was able to reproduce it in Postgres 8.4.1. The issue seems to be isolated to the column called "name". I tried selecting non-existent columns with other names besides "name", and it worked as expected, raising an error. Here's an example:

db=# CREATE TABLE test (id INTEGER PRIMARY KEY);
NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index "test_pkey" for table "test"
CREATE TABLE
db=# INSERT INTO test VALUES (1), (2), (3);
INSERT 0 3
db=# SELECT "test"."id", "test"."name" FROM "test";
 id | name 
----+------
  1 | (1)
  2 | (2)
  3 | (3)
(3 rows)

db=# SELECT "test"."id", "test"."name2" FROM "test";
ERROR:  column test.name2 does not exist

Here is a thread from the Postgres mailing list with more info: http://archives.postgresql.org/pgsql-general/2010-02/msg01038.php. "Calling t.name is the same as name(t) if a column reference is not found."

I agree with the previous comments that we should avoid introspecting the database, and I can't immediately think of a user-friendly solution to this beyond documenting it for Postgres users. Any ideas?

comment:12 Changed 3 years ago by lukeplant

  • Type set to Bug

comment:13 Changed 3 years ago by lukeplant

  • Severity set to Normal

comment:14 Changed 3 years ago by sebzur

  • Easy pickings unset

After a few weeks I'm returning to this issue. To summarize what we see, let me point out the two elements:

  • the issue has rather low probability to appear: You have to use PG backend + You have to have Your model defining non-existing (in DB schema sense) field/column
  • I feel this is also a kind of a issue in PostgreSQL itself

What we should do now, is for sure to document this behavior --- while it's not clear in DB engine itself what does it mean to select column from table, I believe Django should not be smarter in handling it in other way than the engine does it.

comment:15 Changed 2 years ago by aaugustin

  • Component changed from Database layer (models, ORM) to Documentation
  • Triage Stage changed from Design decision needed to Accepted
  • UI/UX unset

Since no one came up with a better plan, let's document it in ref/databases.txt.

We have comparable limitations documented for other databases, for instance, "don't create a field called timestamp if you're using Oracle".

comment:16 Changed 19 months ago by timo

The usefulness of adding something like the following seems very limited to me:

"If you are using PostgreSQL and add a field called 'name' to an existing model but forget to add the column in your database, SELECT statements will appear to work even though the column doesn't exist (since NAME is a Postgres function and SELECT foo.name FROM foo is equivalent to SELECT name(foo) FROM foo) ."

but since the ticket has been accepted, is that the essence of what is being suggested?

comment:17 Changed 19 months ago by lrekucki

Note that "name" is just an example here. From my understanding, you can swap it to any other datatype that PostgreSQL supports and instead of raising an error about a missing column, it will treat it as a cast of the whole row to that datatype.

Anyways, I tried this on PostgreSQL 9.2 and the behavior is no longer there. I'm guessing it was removed in version 9, but don't have an instance to verify.

comment:18 Changed 15 months ago by timo

  • Resolution set to wontfix
  • Status changed from reopened to closed

Since this doesn't happen on PostgreSQL 9.2 and seems like it has a very low probability of happening, I don't think it's worth documenting at this point. If someone really wants to see this added and can provide a patch, I'll be happy to review and commit it.

Add Comment

Modify Ticket

Change Properties
<Author field>
Action
as closed
as The resolution will be set. Next status will be 'closed'
The resolution will be deleted. Next status will be 'new'
Author


E-mail address and user name can be saved in the Preferences.

 
Note: See TracTickets for help on using tickets.